Contents:

.. toctree::
   :maxdepth: 2

Unix-like command-line utilities. These are filters for files: they read from stdin and write to stdout.
Installation should put them on your path. To see the help for a utility, run::

    module_name.py -h

.. automodule:: rosetta.cmd.cut
.. automodule:: rosetta.cmd.subsample
.. automodule:: rosetta.cmd.split
.. automodule:: rosetta.cmd.row_filter
.. automodule:: rosetta.cmd.files_to_vw
.. automodule:: rosetta.cmd.join_csv
.. automodule:: rosetta.cmd.concat_csv
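
The filter pattern described above (read rows from one stream, write rows to another) can be illustrated with a minimal sketch. It is not the implementation of any ``rosetta.cmd`` module; the function name ``cut_columns`` and its parameters are assumptions made up for this example, and only the standard library is used.

```python
import csv
import io


def cut_columns(infile, outfile, keep, delimiter=","):
    """Copy only the columns named in `keep` from infile to outfile.

    Hypothetical helper for illustration; not part of rosetta.
    """
    reader = csv.DictReader(infile, delimiter=delimiter)
    writer = csv.DictWriter(outfile, fieldnames=keep, delimiter=delimiter,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:  # rows are handled one at a time
        writer.writerow(row)


# Demonstration on in-memory streams. Passing sys.stdin and sys.stdout
# instead would make the same function usable in a shell pipeline.
source = io.StringIO("name,age,city\nann,31,Oslo\nbob,45,Lima\n")
sink = io.StringIO()
cut_columns(source, sink, keep=["name", "city"])
print(sink.getvalue(), end="")
```

Because the function only needs file-like objects, the same code can be exercised in tests with ``io.StringIO`` and used in a pipeline with the standard streams.
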

- Wrappers for Python multiprocessing that make it easier to use
- Memory-friendly multiprocessing

.. automodule:: rosetta.parallel.parallel_easy
   :members:

.. automodule:: rosetta.parallel.pandas_easy
   :members:

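
As a rough illustration of what a memory-friendly multiprocessing wrapper can look like, the sketch below uses only the standard library. The name ``lazy_parallel_map`` and its parameters are assumptions for this example and should not be read as the API of ``rosetta.parallel``; see the module documentation above for the real functions.

```python
from multiprocessing import Pool


def lazy_parallel_map(func, iterable, n_jobs=2, chunksize=1):
    """Yield func(item) for each item, in input order.

    Hypothetical helper for illustration; not part of rosetta.
    With n_jobs=1 it runs serially, which is handy for debugging.
    """
    if n_jobs == 1:
        # Serial fallback: no worker processes are started.
        yield from map(func, iterable)
        return
    with Pool(n_jobs) as pool:
        # imap hands results back one at a time, in input order,
        # instead of returning one large list as Pool.map does.
        yield from pool.imap(func, iterable, chunksize)
```

A caller can loop over ``lazy_parallel_map(func, items, n_jobs=4)`` and handle each result as it arrives. ``func`` has to be picklable (for example, defined at module level), and on platforms that start workers by spawning a new interpreter the call should sit under an ``if __name__ == "__main__":`` guard.
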

Text-processing utilities:

- Stream text from disk into formats used by common ML tools
- Write processed text to sparse formats
- Helpers for ML tools (e.g. Vowpal Wabbit, Gensim)
- Other general utilities

.. automodule:: rosetta.text.filefilter
   :members:

.. automodule:: rosetta.text.streamers
   :members:

.. automodule:: rosetta.text.text_processors
   :members:

.. automodule:: rosetta.text.nlp
   :members:

.. automodule:: rosetta.text.vw_helpers
   :members:

.. automodule:: rosetta.text.gensim_helpers
   :members:


- General ML modeling utilities

.. automodule:: rosetta.modeling.eda
   :members:

.. automodule:: rosetta.modeling.prediction_plotter
   :members:

.. automodule:: rosetta.modeling.var_create
   :members:

.. automodule:: rosetta.modeling.fitting
   :members:

.. automodule:: rosetta.modeling.categorical_fitter
   :members:


Shared by other modules.

.. automodule:: rosetta.common
   :members:

.. automodule:: rosetta.common_math
   :members:


.. plot:: ../examples/plot_classifiers.py
   :include-source:

.. plot:: ../examples/plot_regressors.py
   :include-source:

.. plot:: ../examples/eda_examples.py
   :include-source: