- pdoc: Automatically create API documentation for your project
- pre-commit plugins: Automate code review and formatting
.
├── config
│   ├── main.yaml                   # Main configuration file
│   ├── model                       # Configurations for training model
│   │   └── model1.yaml             # First variation of parameters to train model
│   └── process                     # Configurations for processing data
│       └── process1.yaml           # First variation of parameters to process data
├── data
│   ├── final                       # data after training the model
│   ├── processed                   # data after processing
│   └── raw                         # raw data
├── docs                            # documentation for your project
├── .gitignore                      # ignore files that should not be committed to Git
├── Makefile                        # store useful commands to set up the environment
├── models                          # store models
├── notebooks                       # store notebooks
├── .pre-commit-config.yaml         # configurations for pre-commit
├── pyproject.toml                  # dependencies for poetry
├── README.md                       # describe your project
├── requirements.txt                # project requirements file
└── src                             # store source code
    ├── __init__.py                 # make src a Python module
    ├── process.py                  # process data before training model
    ├── train_model.py              # train model
    └── utils.py                    # store helper functions

- Install Poetry
- Activate the virtual environment:

      poetry shell

- Install dependencies:
  - To install all dependencies from pyproject.toml, run:

        poetry install

  - To install only production dependencies, run:

        poetry install --only main

  - To install a new package, run:

        poetry add <package-name>

To view the configurations associated with a Python script, run the following command:

    python src/process.py --help

Output:
    process is powered by Hydra.

    == Configuration groups ==
    Compose your configuration from those groups (group=option)

    model: model1
    process: process1

    == Config ==
    Override anything in the config (foo.bar=value)

    process:
      use_columns: sentence
      batch_size: 16
    model:
      name: Logistic regression
      parameters:
        steps: 200
    data:
      raw:
        train: ../data/raw/train.parquet
        val: ../data/raw/val.parquet
      processed:
        train: ../data/processed/train.parquet
        val: ../data/processed/val.parquet
      final: ../data/final/metrics.csv

To alter the configurations associated with a Python script from the command line, run the following:
    python src/process.py data.raw.train=sample2.csv

To auto-generate API documentation for your project, run:

    make docs_save
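
Returning to the command-line overrides described earlier: the `foo.bar=value` syntax addresses a key in the nested config by its dotted path. The sketch below illustrates that mapping in plain Python. It is not Hydra's implementation; `apply_override` is a hypothetical helper written for this example, the dict mirrors only the `data.raw` part of the printed config, and values are kept as plain strings.

```python
from copy import deepcopy

# The data.raw part of the config printed by --help, as a nested dict.
config = {
    "data": {
        "raw": {
            "train": "../data/raw/train.parquet",
            "val": "../data/raw/val.parquet",
        },
    },
}


def apply_override(cfg: dict, override: str) -> dict:
    """Return a copy of cfg with one 'dotted.key=value' override applied.

    Hypothetical helper for illustration only; the value stays a string.
    """
    dotted_key, _, value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    result = deepcopy(cfg)
    node = result
    for name in parents:
        node = node[name]  # raises KeyError if the path does not exist
    if leaf not in node:
        raise KeyError(f"unknown config key: {dotted_key}")
    node[leaf] = value
    return result


updated = apply_override(config, "data.raw.train=sample2.csv")
print(updated["data"]["raw"]["train"])  # the overridden value
print(config["data"]["raw"]["train"])   # the original config is untouched
```

Only the addressed leaf changes; sibling keys such as `data.raw.val` keep their values from the YAML files.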