This is a package for easily applying multiple machine learning models to a single problem and comparing them across various metrics. For detailed descriptions of the available models and of the metrics the tool reports, see the paper.
- Clone the repo.
- Put all of your code into the `src/` folder.
- Import modules as `import ml_battery.some_modules` or `from ml_battery import *`.
- TODO: Create a `setup.py` for a real installation.
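Until a `setup.py` exists, one way to import the package from a notebook or script that lives outside `src/` is to put that folder on the module search path first. The helper below is only a sketch: `add_src_to_path` is a hypothetical name, not part of `ml_battery`, and it assumes the package sits at `src/ml_battery` inside your clone.

```python
import sys
from pathlib import Path


def add_src_to_path(repo_root):
    """Prepend <repo_root>/src to sys.path (hypothetical helper, not in ml_battery)."""
    src = str(Path(repo_root).resolve() / "src")
    if src not in sys.path:
        # Inserting at the front makes this copy win over any other installed copy.
        sys.path.insert(0, src)
    return src


# Assumption: the current directory is the root of the cloned repository.
src_dir = add_src_to_path(".")
print("Added to module search path:", src_dir)
```

If the layout assumption holds, `import ml_battery` should resolve after this call.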
The easiest way to get started is probably to copy one of the existing Jupyter notebooks and repurpose it for your own data.
- You have to split your own data into training and test sets, and supply a codebook identifying the categorical features. See `fuel_use.ipynb` for a good example of reading in a CSV and feeding it into the pipeline.
- You can edit items in the model, but you can also run it as-is for first-pass results.
- As such, the model, fitting, and scoring lines can all be run without editing for new datasets.
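The preparation step described above (split the data, then describe which features are categorical) might look roughly like the sketch below. It uses only the standard library as a stand-in for whatever data-handling tools you prefer. Everything in it is an assumption made for illustration: the function names, the column names, and especially the codebook layout (a dict mapping column name to `"categorical"` or `"numeric"`). Check `fuel_use.ipynb` for the format the pipeline actually expects.

```python
import random


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def build_codebook(columns, categorical):
    """Hypothetical codebook layout: column name -> 'categorical' or 'numeric'."""
    return {
        name: ("categorical" if name in categorical else "numeric")
        for name in columns
    }


# Made-up example rows; in practice these would come from your own CSV.
rows = [
    {"vehicle_class": "truck", "engine_size": 5.0 + i, "fuel_use": 10.0 + i}
    for i in range(8)
]

train, test = split_rows(rows, test_fraction=0.25, seed=42)
codebook = build_codebook(rows[0].keys(), categorical={"vehicle_class"})
print(len(train), "training rows,", len(test), "test rows")
print(codebook)
```

Fixing the seed keeps the split reproducible between runs, which makes first-pass results from different models easier to compare.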
Because all of this code can run in multiple processes, it imports a handy log-to-a-socket facility from `ml_battery.log`.
In order to log to a socket, there needs to be a logger reading from that socket.
Fortunately, the `src/ml_battery/logging_server.py` script is exactly that.
Run the `logging_server.py` script from anywhere, and it will create a file called `test.log` in the working directory that logs all of the output from the `ml_battery` functions.
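To illustrate the general idea of socket logging, here is a minimal sketch built on Python's standard `logging.handlers.SocketHandler` and `socketserver`. It is not the code in `ml_battery.log` or `logging_server.py`, whose internals and port are not described here; the names `LogRecordHandler`, `ListHandler`, and `demo` are hypothetical. The receiver unpickles incoming records, so a setup like this should only listen on a trusted, local interface.

```python
import logging
import logging.handlers
import pickle
import socketserver
import struct
import threading


class LogRecordHandler(socketserver.StreamRequestHandler):
    """Read length-prefixed pickled log records, as sent by SocketHandler."""

    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) < 4:
                break  # the client closed the connection
            (size,) = struct.unpack(">L", header)
            payload = self.rfile.read(size)
            record = logging.makeLogRecord(pickle.loads(payload))
            # Forward the record to whatever handler the server was given.
            self.server.sink.handle(record)


class ListHandler(logging.Handler):
    """Collect formatted messages in memory (stand-in for a file handler)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def demo(messages):
    """Send messages through a socket and return what the receiver saw."""
    sink = ListHandler()  # a real server might use logging.FileHandler("test.log")
    server = socketserver.TCPServer(("127.0.0.1", 0), LogRecordHandler)
    server.sink = sink
    # Serve exactly one connection in the background, then stop.
    worker = threading.Thread(target=server.handle_request)
    worker.start()

    logger = logging.getLogger("socket_logging_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    client = logging.handlers.SocketHandler("127.0.0.1", server.server_address[1])
    logger.addHandler(client)
    for message in messages:
        logger.info(message)
    client.close()  # closing the socket lets the receiver finish
    logger.removeHandler(client)

    worker.join()
    server.server_close()
    return sink.lines


if __name__ == "__main__":
    print(demo(["fit started", "fit finished"]))
```

Because every worker process writes to the socket instead of directly to a file, only the receiving side ever touches the log file, which avoids interleaved or clobbered output.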
There is a Sphinx documentation framework here. To build it, go into the `docs/` folder and run `make html` (or `./make html` on Windows).
This generates documentation for the individual functions and classes available in the `ml_battery` library.