GitHub - scalarstop/scalarstop: A Python framework for managing machine learning experiments.

Keep track of your machine learning experiments with ScalarStop.

ScalarStop is a Python framework for reproducible machine learning research.

It was written and open-sourced at Neocrym, where it is used to train thousands of models every week.

ScalarStop can help you:

organize datasets and models with content-addressable names.
save/load datasets and models to/from the filesystem.
record hyperparameters and metrics to a relational database.
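
As a rough illustration of what a content-addressable name means, the sketch below derives a name by hashing a set of hyperparameters. It uses only the Python standard library. The function name, the SHA-256 digest, and the 12-character suffix are assumptions made for this example and are not ScalarStop's actual naming scheme.

```python
import hashlib
import json


def content_addressable_name(group_name: str, hyperparams: dict) -> str:
    """Derive a deterministic name from a group name and its hyperparameters.

    Hypothetical helper for illustration only; not part of ScalarStop.
    """
    # Sorting keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(hyperparams, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # A short prefix of the digest keeps names readable; the length is arbitrary.
    return f"{group_name}-{digest[:12]}"


# The same hyperparameters always map to the same name, whatever their order.
name_a = content_addressable_name("my_dataset", {"batch_size": 32, "seed": 0})
name_b = content_addressable_name("my_dataset", {"seed": 0, "batch_size": 32})
# Changing any hyperparameter value produces a different name.
name_c = content_addressable_name("my_dataset", {"batch_size": 64, "seed": 0})
```

Because the name is a function of the content, two experiments with identical settings can be recognized as the same artifact without comparing their configurations by hand.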

System requirements

ScalarStop is a Python package that requires Python 3.8 or newer.

Currently, ScalarStop only supports tracking tf.data.Dataset datasets and tf.keras.Model models. As such, ScalarStop requires TensorFlow 2.8.0 or newer.
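
If you want to confirm these minimums before installing, a small standard-library check along the following lines may help. The helper names are hypothetical. The sketch also assumes that TensorFlow is installed under the distribution name tensorflow; variants published under other names would need a different lookup.

```python
import re
import sys
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as a 3-tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups("0"))


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare versions numerically, so that 2.10 sorts after 2.8."""
    return parse_version(version) >= minimum


def check_requirements() -> dict:
    """Report whether Python and TensorFlow meet the documented minimums."""
    results = {"python": sys.version_info[:2] >= (3, 8)}
    try:
        # Assumption: the installed distribution is named "tensorflow".
        installed = metadata.version("tensorflow")
        results["tensorflow"] = meets_minimum(installed, (2, 8, 0))
    except metadata.PackageNotFoundError:
        results["tensorflow"] = False
    return results


print(check_requirements())
```

Comparing parsed tuples rather than raw strings avoids the common mistake of treating "2.10.0" as older than "2.8.0".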

We encourage anyone who would like to add support for other machine learning frameworks to contribute to ScalarStop. :)

Installation

ScalarStop is available on PyPI.

Selecting a TensorFlow package variant

If you are using TensorFlow on a CPU, you can install ScalarStop with the command:

python3 -m pip install scalarstop[tensorflow]

If you are using TensorFlow with GPUs, you can install ScalarStop with the command:

python3 -m pip install scalarstop[tensorflow-gpu]

Selecting a PostgreSQL psycopg2 package variant

If you intend to use ScalarStop with PostgreSQL, you should also install either psycopg2-binary (which works out of the box) or psycopg2 (which you compile from source).

Your installation command could therefore look like one of the following:

python3 -m pip install scalarstop[tensorflow,psycopg2]
python3 -m pip install scalarstop[tensorflow,psycopg2-binary]
python3 -m pip install scalarstop[tensorflow-gpu,psycopg2]
python3 -m pip install scalarstop[tensorflow-gpu,psycopg2-binary]

Development

If you would like to make changes to ScalarStop, you can clone the repository from GitHub.

git clone https://github.com/scalarstop/scalarstop.git
cd scalarstop
python3 -m pip install .

Usage

Read the ScalarStop Tutorial to learn the core concepts behind ScalarStop and how to structure your datasets and models.

Afterwards, you might want to dig deeper into the ScalarStop Documentation. In general, a typical ScalarStop workflow involves four steps:

1. Organize your datasets with scalarstop.datablob.

2. Describe your machine learning model architectures using scalarstop.model_template.

3. Load, train, and save machine learning models with scalarstop.model.

4. Save hyperparameters and training metrics to a SQLite or PostgreSQL database using scalarstop.train_store.
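
To make the last step more concrete, here is a minimal sketch of what recording hyperparameters and per-epoch metrics in a relational database can look like. It deliberately uses Python's built-in sqlite3 module rather than scalarstop.train_store. The table names, column names, and helper functions are all assumptions made for this example and do not reflect ScalarStop's real schema or API.

```python
import json
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (name TEXT PRIMARY KEY, hyperparams TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE model_epoch ("
    "model_name TEXT NOT NULL REFERENCES model (name), "
    "epoch_num INTEGER NOT NULL, "
    "metrics TEXT NOT NULL, "
    "PRIMARY KEY (model_name, epoch_num))"
)


def record_model(conn, name, hyperparams):
    """Store one row per model, with hyperparameters serialized as JSON text."""
    conn.execute(
        "INSERT INTO model (name, hyperparams) VALUES (?, ?)",
        (name, json.dumps(hyperparams, sort_keys=True)),
    )


def record_epoch(conn, name, epoch_num, metrics):
    """Store one row per training epoch, with metrics serialized as JSON text."""
    conn.execute(
        "INSERT INTO model_epoch (model_name, epoch_num, metrics) VALUES (?, ?, ?)",
        (name, epoch_num, json.dumps(metrics, sort_keys=True)),
    )


def best_epoch(conn, name, metric):
    """Return the epoch number with the lowest value of the given metric."""
    rows = conn.execute(
        "SELECT epoch_num, metrics FROM model_epoch WHERE model_name = ?", (name,)
    ).fetchall()
    # Metrics are decoded in Python, so no SQLite JSON functions are required.
    return min(rows, key=lambda row: json.loads(row[1])[metric])[0]


record_model(conn, "small_dense-abc123", {"units": 16, "learning_rate": 0.01})
record_epoch(conn, "small_dense-abc123", 1, {"loss": 0.92, "val_loss": 0.95})
record_epoch(conn, "small_dense-abc123", 2, {"loss": 0.61, "val_loss": 0.70})
record_epoch(conn, "small_dense-abc123", 3, {"loss": 0.48, "val_loss": 0.74})
best = best_epoch(conn, "small_dense-abc123", "val_loss")
```

Keeping hyperparameters and metrics in a database, keyed by model name, is what lets you later query across many training runs instead of searching through log files.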

Contributing to ScalarStop

We warmly welcome contributions to ScalarStop. Here are the technical details for getting started with adding code to ScalarStop.

Getting started

First, clone this repository from GitHub. All development happens on the main branch.

git clone https://github.com/scalarstop/scalarstop.git

Then, run make install to install Python dependencies in a Poetry virtualenv.

You can run make help to see the other commands that are available.

Checking your code

Run make fmt to automatically format code.

Run make lint to run Pylint and MyPy to check for errors.

Generating documentation

Documentation is important! Here is how to add to it.

Generating Sphinx documentation

You can generate a local copy of our Sphinx documentation at scalarstop.com with make docs.

The generated documentation can be found at docs/_build/dirhtml. To view it, start an HTTP server in this directory, for example:

make docs
cd docs/_build/dirhtml
python3 -m http.server 5000

Then visit http://localhost:5000 in your browser to preview changes to the documentation.

If you want to use Sphinx's ability to automatically generate hyperlinks to the Sphinx documentation of other Python projects, then you should configure intersphinx settings at the path docs/conf.py. If you need to download an objects.inv file, make sure to update the make update-sphinx command in the Makefile.

Editing the tutorial notebook

The main ScalarStop tutorial is in a Jupyter notebook. If you have made changes to ScalarStop, you should rerun the Jupyter notebook on your machine with your changes to make sure that it still runs without error.

Running unit tests

Run make test to run all unit tests.

If you want to run a specific unit test, try running python3 -m poetry run python -m unittest -k {name of your test}.

Unit tests with SQLite3

If you are running tests using a Python interpreter that does not have the SQLite3 JSON1 extension, then TrainStore unit tests involving SQLite3 will be skipped. This is likely to happen if you are using Python 3.8 on Windows. If you suspect that you are missing the SQLite3 JSON1 extension, the Django documentation has some suggestions for how to fix it.
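
One quick way to check whether your interpreter has the JSON1 extension is to call a JSON function against an in-memory database, as sketched below. The function name sqlite_has_json1 is made up for this example. The check relies on SQLite raising an OperationalError when the json() SQL function is unavailable.

```python
import sqlite3


def sqlite_has_json1() -> bool:
    """Return True if this interpreter's SQLite build provides JSON functions."""
    conn = sqlite3.connect(":memory:")
    try:
        # json() only exists when SQLite was built with JSON support.
        conn.execute("SELECT json('{\"a\": 1}')")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


print("SQLite version:", sqlite3.sqlite_version)
print("JSON1 available:", sqlite_has_json1())
```

If the second line prints False, the SQLite3 TrainStore tests are likely to be skipped on this interpreter.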

Unit tests with PostgreSQL

By default, tests involving PostgreSQL are skipped. To enable them, run make test in a shell where the environment variable TRAIN_STORE_CONNECTION_STRING is set to a SQLAlchemy database connection URL, which looks something like "postgresql://scalarstop:changeme@localhost:5432/train_store". The connection URL should point to a running PostgreSQL server with an existing database and user.
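
The general pattern of skipping database tests unless an environment variable is set can be sketched with the standard unittest module, as below. This is an illustration of the pattern only. The helper and test class are hypothetical, and ScalarStop's own test suite may implement the skip differently.

```python
import os
import unittest

CONNECTION_STRING_ENV = "TRAIN_STORE_CONNECTION_STRING"


def postgres_tests_enabled(environ=None) -> bool:
    """Return True when the variable holds something that looks like a PostgreSQL URL."""
    environ = os.environ if environ is None else environ
    return environ.get(CONNECTION_STRING_ENV, "").startswith("postgresql")


# Hypothetical test case: skipped unless the connection string is configured.
@unittest.skipUnless(postgres_tests_enabled(), f"{CONNECTION_STRING_ENV} is not set")
class ExamplePostgresTest(unittest.TestCase):
    def test_connection_string_is_postgres(self):
        url = os.environ[CONNECTION_STRING_ENV]
        self.assertTrue(url.startswith("postgresql"))
```

With this pattern, a contributor without a database still gets a passing test run, and the skipped tests are reported rather than silently omitted.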

The docker-compose.yml file in the root of this repository can set up a PostgreSQL instance on your local machine. If you have Docker and Docker Compose installed, you can start the PostgreSQL database by running docker-compose up in the same directory as the docker-compose.yml file.

Measuring test coverage

You can run make test-with-coverage to collect Python line and branch coverage information. Afterwards, run make coverage-html to generate an HTML report of unit test coverage. You can view the report in a web browser at the path htmlcov/index.html.

Credits

ScalarStop's documentation is built with Sphinx using @pradyunsg's Furo theme and is hosted by Read the Docs.
