This project provides a code template for the Machine Learning application development lifecycle. The template includes a folder structure as well as some boilerplate code for a typical ML project that is developed and pushed to production.
It can be used for end-to-end ML project development as well as deployment.
The code template covers the following phases of the ML project development lifecycle:
Requirement gathering
Data Collection
Model Building
Inference
Testing
Deployment
Many commercial and open source libraries are available to achieve this, but this project is meant to let you start on your own with a simple code base.
During a typical ML project lifecycle, this code template can be used in the following way:
- Keep all Requirement gathering documents in docs folder.
- Keep Data Collection and exploration notebooks in src/training folder.
- Keep datasets in data folder.
- Keep model building notebooks at src/training folder.
- Keep generated model files at src/models.
- Write and keep inference code in src/inference.
- Write Logging and configuration code in src/utility.
- Write unit test cases in tests folder (pytest, pytest-cov).
- Write performance test cases in tests folder (locust).
- Build the Docker image (Docker).
- Use and configure a code formatter (black).
- Use and configure a code linter (pylint).
- Add Git pre-commit hooks.
- Use CircleCI for CI/CD.
Clone this repo locally and add/update/delete as per your requirement.
Please note that this template is neither complete nor necessarily the best structure for your project.
It is meant only to get you started quickly, with almost all of the basic phases covered.
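As one illustration of the logging code that the list above places in src/utility, the following is a minimal sketch using only Python's standard logging module. The function name get_logger, the log format, and the default "logs" directory are assumptions made for this example; they are not taken from the template's actual code.

```python
import logging
from pathlib import Path


def get_logger(name, log_dir="logs", level=logging.INFO):
    """Return a logger that writes to <log_dir>/<name>.log and to the console.

    Hypothetical helper: the name, defaults, and format are illustrative only.
    """
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(level)

    # Attach handlers only once, so repeated calls do not duplicate log lines.
    if not logger.handlers:
        formatter = logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"
        )
        file_handler = logging.FileHandler(Path(log_dir) / f"{name}.log")
        file_handler.setFormatter(formatter)
        stream_handler = logging.StreamHandler()
        stream_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
        logger.addHandler(stream_handler)
    return logger


if __name__ == "__main__":
    log = get_logger("inference")
    log.info("Logger configured")
```

Inference and training code could then call get_logger once per module and share the same log directory.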
├── README.md <- top-level README for developers using this project.
├── pyproject.toml <- black code formatting configurations.
├── .dockerignore <- Files to be ignored in docker image creation.
├── .gitignore <- Files to be ignored in git check in.
├── .pre-commit-config.yaml <- Things to check before git commit.
├── .circleci/config.yml <- CircleCI configurations
├── .pylintrc <- Pylint code linting configurations.
├── Dockerfile <- A file to create docker image.
├── environment.yml <- stores all the dependencies of this project
├── main.py <- A main file to run API server.
├── src <- Source code files to be used by project.
│ ├── inference <- model output generator code
│ ├── model <- model files
│ ├── training <- model training code
│ ├── utility <- contains utility and constant modules.
├── logs <- log file path
├── config <- config file path
├── data <- datasets files
├── docs <- documents from requirements, team collaboration, etc.
├── tests <- unit and performance test case files.
│ ├── cov_html <- Unit test cases coverage report
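If you prefer to recreate the layout rather than clone it, the directories in the tree above could be generated with a small script. This is only a sketch: the scaffold function and the DIRECTORIES list are hypothetical helpers, and the top-level files (README.md, Dockerfile, and so on) are deliberately left out.

```python
from pathlib import Path

# Directory names copied from the tree above; files are intentionally omitted.
DIRECTORIES = [
    ".circleci",
    "src/inference",
    "src/model",
    "src/training",
    "src/utility",
    "logs",
    "config",
    "data",
    "docs",
    "tests/cov_html",
]


def scaffold(root):
    """Create the template directories under `root` and return their paths."""
    root = Path(root)
    created = []
    for relative in DIRECTORIES:
        path = root / relative
        # exist_ok=True makes the script safe to run more than once.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for created_path in scaffold("Code_Template"):
        print(created_path)
```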
Development Environment used to create this project:
Operating System: Windows 10 Home
Anaconda: 4.8.5 (Anaconda installation)
Go to location of environment.yml file and run:
conda env create -f environment.yml
The template serves ML inference from a FastAPI server that returns dummy model output.
- Go inside 'Code_Template' folder on command line.
- Run:
conda activate code_template
python main.py
- Open 'http://localhost:5000/docs' in a browser.
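To make the idea of a FastAPI server with dummy model output concrete, here is a minimal sketch of what such a main.py might look like. It is not the template's actual file: dummy_predict, create_app, the /predict route, and the use of uvicorn to run the app are all assumptions for illustration. FastAPI and uvicorn are third-party packages and would need to be present in the conda environment.

```python
def dummy_predict(value):
    """Stand-in for a trained model (hypothetical): labels the input by sign."""
    label = "positive" if value >= 0 else "negative"
    return {"label": label, "score": abs(value)}


def create_app():
    """Build the API. FastAPI is imported here so that dummy_predict can be
    imported and tested without the web framework installed."""
    from fastapi import FastAPI

    app = FastAPI(title="Code_Template inference (sketch)")

    @app.get("/predict")
    def predict(value: float):
        # `value` is read from the query string, e.g. /predict?value=1.5
        return dummy_predict(value)

    return app


if __name__ == "__main__":
    import uvicorn

    # Port 5000 matches the URL used in the steps above.
    uvicorn.run(create_app(), host="0.0.0.0", port=5000)
```

With a server like this running, the interactive documentation page at /docs lists the /predict route and lets you try it from the browser.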
- Go inside 'tests' folder on command line.
- Run:
pytest -vv
- To generate an HTML coverage report, run from the 'Code_Template' folder:
pytest --cov-report html:tests/cov_html --cov=src tests/
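For readers new to pytest, the commands above collect functions whose names start with test_ from files named test_*.py. The sketch below shows the shape such a file might take. The function predict_label is a hypothetical stand-in defined inline to keep the example self-contained; in a real project it would be imported from the inference code under src.

```python
def predict_label(score, threshold=0.5):
    """Hypothetical function under test: turn a model score into a class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return 1 if score >= threshold else 0


def test_score_above_threshold_is_positive():
    assert predict_label(0.9) == 1


def test_score_below_threshold_is_negative():
    assert predict_label(0.1) == 0


def test_custom_threshold_is_respected():
    assert predict_label(0.6, threshold=0.7) == 0


def test_out_of_range_score_is_rejected():
    # Plain try/except keeps the sketch free of pytest-specific helpers.
    try:
        predict_label(1.5)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for score outside [0, 1]")
```

Plain assert statements are enough here because pytest rewrites them to report the compared values on failure.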
- Open 2 terminals and start the main application in one terminal:
python main.py
- In the second terminal, go inside 'tests' folder on command line.
- Run:
locust -f locust_test.py
- Go inside 'Code_Template' folder on command line.
- Run:
black src
- Go inside 'Code_Template' folder on command line.
- Run:
pylint src
- Go inside 'Code_Template' folder on command line.
- Run:
docker build -t myimage .
docker run -d --name mycontainer -p 5000:5000 myimage
- Go inside 'Code_Template' folder on command line.
- Run:
pre-commit install
- Whenever the command git commit is run, the pre-commit hooks will automatically be applied.
- To test before committing, run:
pre-commit run
- Add the project on the CircleCI website, then monitor the build on every commit.
Please create a pull request for any change.
NOTE: This software depends on other packages that are licensed under different open source licenses.