A tool for data science and MLOps work that counts the rows of CSV data in a directory tree and provides various headline auditing functions.

RowMancer: CSV/TSV Data Reporter

Description

RowMancer is a command-line interface (CLI) tool that counts rows, columns, and files in CSV/TSV datasets. Its options let you count blank files, limit the directory depth of the search, and calculate column statistics.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for more details.

Installation

From Source

  1. Clone the repository

    git clone https://github.com/TheLustriVA/Rowmancer.git
  2. Navigate to the project directory

    cd Rowmancer
  3. Install the package

    pip install .

From PyPI

You can also install the package from PyPI:

pip install RowMancer

Usage

Run the tool with no options to count all rows in all .csv and .tsv files in the current directory and its subdirectories:

RowMancer
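
To make the default behaviour concrete, here is a minimal Python sketch of a recursive row count over .csv and .tsv files. It is not RowMancer's actual implementation: the function name `count_rows` is hypothetical, and the choices to read files as UTF-8 and to ignore empty lines are assumptions made for illustration.

```python
import csv
import os


def count_rows(start_dir="."):
    """Return the total number of non-empty rows in all .csv/.tsv files under start_dir.

    Hypothetical helper for illustration only; not part of RowMancer.
    """
    total = 0
    # os.walk visits start_dir and every subdirectory below it.
    for root, _dirs, files in os.walk(start_dir):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext not in (".csv", ".tsv"):
                continue
            # Assumption: .tsv files are tab-delimited, .csv files comma-delimited.
            delimiter = "\t" if ext == ".tsv" else ","
            path = os.path.join(root, name)
            with open(path, newline="", encoding="utf-8") as handle:
                # csv.reader yields an empty list for a blank line; skip those.
                total += sum(1 for row in csv.reader(handle, delimiter=delimiter) if row)
    return total
```

Calling `count_rows(".")` walks the current directory in the same spirit as running the tool with no options. Whether RowMancer itself counts blank lines or header rows by default is not stated here, so the totals may differ.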

Options

  1. Count Files: -c, --count-files

    • Count the number of .csv and .tsv files instead of rows.
    RowMancer --count-files
  2. Blank Files: -b, --blank

    • Count the number of blank or non-parsable .csv and .tsv files.
    RowMancer --blank
  3. Readable Numbers: -l, --readable

    • Show numbers in a more readable format (e.g., 1,000 instead of 1000).
    RowMancer --readable
  4. Directory: dir (positional argument)

    • Specify the directory to start the search.
    RowMancer /path/to/directory
  5. Header Row: -H, --header-row

    • Exclude the first row from each .csv file in the count.
    RowMancer --header-row
  6. Depth: -d, --depth

    • Set the directory depth for the search.
    RowMancer --depth 2
  7. Column Stats: -x, --columns

    • Show column statistics. Pass one of MIN, MAX, MEAN, or SINGLE.
    RowMancer --columns MIN
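
The options above can be pictured with a short Python sketch. It illustrates the general techniques only (depth-limited directory walking, header exclusion, blank-file counting, thousands separators, and per-file column counts) and is not RowMancer's code. All function names are hypothetical. Two semantics are assumed rather than taken from the tool: a depth of 0 means "only the starting directory", and the column statistics are the minimum, maximum, and mean number of columns per file. SINGLE is left out because its meaning is not defined in this README.

```python
import csv
import os
from statistics import mean


def iter_data_files(start_dir=".", max_depth=None):
    """Yield .csv/.tsv paths, descending at most max_depth levels (assumed semantics)."""
    start_dir = os.path.abspath(start_dir)
    base_depth = start_dir.rstrip(os.sep).count(os.sep)
    for root, dirs, files in os.walk(start_dir):
        depth = root.rstrip(os.sep).count(os.sep) - base_depth
        if max_depth is not None and depth >= max_depth:
            dirs[:] = []  # clearing dirs in place stops os.walk from going deeper
        for name in sorted(files):
            if name.lower().endswith((".csv", ".tsv")):
                yield os.path.join(root, name)


def read_rows(path):
    """Return the non-empty rows of one file as lists of strings."""
    delimiter = "\t" if path.lower().endswith(".tsv") else ","
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter) if row]


def summarize(start_dir=".", max_depth=None, skip_header=False):
    """Collect row totals, blank-file counts, and column statistics."""
    row_total = 0
    blank_files = 0
    column_counts = []
    for path in iter_data_files(start_dir, max_depth):
        try:
            rows = read_rows(path)
        except (OSError, UnicodeDecodeError, csv.Error):
            blank_files += 1  # treat unreadable files as non-parsable
            continue
        if not rows:
            blank_files += 1
            continue
        column_counts.append(len(rows[0]))  # width of the first row
        row_total += len(rows) - 1 if skip_header else len(rows)
    columns = {}
    if column_counts:
        columns = {
            "MIN": min(column_counts),
            "MAX": max(column_counts),
            "MEAN": mean(column_counts),
        }
    return {"rows": row_total, "blank_files": blank_files, "columns": columns}


def readable(number):
    """Format an integer with thousands separators, e.g. 1000 -> '1,000'."""
    return f"{number:,}"
```

For example, `summarize("/path/to/directory", max_depth=2, skip_header=True)` returns a dictionary whose `rows` value can be passed to `readable` for display. The real tool may define depth and column statistics differently, so treat this as a reading aid rather than a specification.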

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

Author

  • KGB aka Marco Lustri - With help from GPT-4

Acknowledgments

  • Morgan Medici, who knows more than most have forgotten.
