Template for Machine learning projects

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Requirements

Installing Miniconda

Windows

These three commands quickly and quietly download the latest 64-bit Windows installer, rename it to a shorter file name, silently install, and then delete the installer:

curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe -o miniconda.exe
start /wait "" .\miniconda.exe /S
del miniconda.exe

After installing, open the Anaconda Prompt (miniconda3) program to use Miniconda3.

Linux

These four commands download the latest 64-bit version of the Linux installer, rename it to a shorter file name, silently install, and then delete the installer:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh

After installing, add the following line to your .bashrc so that the conda command is on your PATH in every new shell:

export PATH=~/miniconda3/bin:$PATH

Then run:

source ~/.bashrc

Now you can use Miniconda3 on your Linux system.
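
To confirm that the PATH change took effect, a quick check from Python can help. This is a minimal sketch that uses only the standard library; the function name `find_conda` is made up for illustration and is not part of this template.

```python
import shutil
from typing import Optional


def find_conda() -> Optional[str]:
    """Return the full path of the conda executable found on PATH, or None."""
    # shutil.which searches the directories listed in PATH, like the shell does.
    return shutil.which("conda")


location = find_conda()
if location is None:
    print("conda was not found on PATH; re-check the export line in ~/.bashrc")
else:
    print(f"conda found at {location}")
```

If the check reports that conda was not found, open a new terminal or run `source ~/.bashrc` again before retrying.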

Installing Cookiecutter

pip install cookiecutter

or

conda install -c conda-forge cookiecutter

Create a new project

Run the following command in the folder where you want the project generated:

cookiecutter https://github.com/kesant/projects_template
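
The same step can also be scripted through Cookiecutter's Python API, which may be convenient when generating projects from another tool. The sketch below is an illustration under assumptions: the helper name `generate_project` and the default module name `my_module` are hypothetical, and it assumes `project_module_name` can be overridden through `extra_context` because it appears as a template variable in the directory structure.

```python
TEMPLATE_URL = "https://github.com/kesant/projects_template"


def generate_project(output_dir: str = ".",
                     module_name: str = "my_module",
                     no_input: bool = True) -> str:
    """Generate a project from the template and return the created directory.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Imported lazily so this file can be loaded without cookiecutter installed.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        TEMPLATE_URL,
        no_input=no_input,  # accept template defaults instead of prompting
        extra_context={"project_module_name": module_name},  # assumed variable
        output_dir=output_dir,
    )


# Example call (requires network access and cookiecutter installed):
# project_path = generate_project(output_dir=".", module_name="churn_model")
```

With `no_input=True`, every prompt not covered by `extra_context` falls back to the template's default value; pass `no_input=False` to answer the prompts interactively, as the command-line call does.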

Resulting directory structure

├── LICENSE
├── README.md          <- The top-level README for developers using this project.
├── install.md         <- Detailed instructions to set up this project.
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models, model predictions, or model summaries.
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── environment.yml    <- The conda environment file for reproducing the analysis environment.
├── requirements.txt   <- The pip requirements file for reproducing the environment.
│
├── test               <- Unit and integration tests for the project.
│   ├── __init__.py
│   └── test_model.py  <- Example of a test script.
│
├── .here              <- Marker file for the project root: the upward search for the root
│                         stops here if none of the other criteria apply.
│
├── setup.py           <- Makes project pip installable (pip install -e .)
│                         so {{ cookiecutter.project_module_name }} can be imported.
│
└── {{ cookiecutter.project_module_name }}   <- Source code for use in this project.
    │
    ├── __init__.py             <- Makes {{ cookiecutter.project_module_name }} a Python module.
    │
    ├── config.py               <- Store useful variables and configuration.
    │
    ├── dataset.py              <- Scripts to download or generate data.
    │
    ├── features.py             <- Code to create features for modeling.
    │
    ├── modeling                
    │   ├── __init__.py 
    │   ├── predict.py          <- Code to run model inference with trained models.
    │   └── train.py            <- Code to train models.
    │
    ├── utils                   <- Scripts to help with common tasks.
    │   └── paths.py            <- Helper functions for relative file referencing across the project.        
    │
    └── plots.py                <- Code to create visualizations.
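
To illustrate how the `.here` marker and `utils/paths.py` could work together, here is a minimal sketch of a root-finding helper. It is not the code shipped in the template: the function names `find_project_root` and `data_dir` are hypothetical, and the only assumption taken from the structure above is that a `.here` file sits at the project root next to the `data` folder.

```python
from pathlib import Path
from typing import Optional, Union

MARKER = ".here"  # marker file expected at the project root


def find_project_root(start: Optional[Union[str, Path]] = None,
                      marker: str = MARKER) -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    current = Path(start) if start is not None else Path.cwd()
    current = current.resolve()
    # Check the starting location first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} file found at or above {current}")


def data_dir(kind: str, start: Optional[Union[str, Path]] = None) -> Path:
    """Return the path of data/raw, data/interim, or data/processed."""
    if kind not in {"raw", "interim", "processed"}:
        raise ValueError(f"Unknown data folder: {kind!r}")
    return find_project_root(start) / "data" / kind
```

A notebook stored in `notebooks/` could then call `data_dir("raw")` and get the same absolute path as a script run from the project root, which avoids fragile `../..` references.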

Credits

This project is heavily influenced by drivendata's Cookiecutter Data Science and andfanilo's Cookiecutter for Kaggle Conda projects.

Other links that helped shape this cookiecutter:
