A cookiecutter template for science and data science projects that include data, code, and dissemination.
- Optimized for data-based publications
- Optimized for use with VS Code
- Docker-based, version-controlled environment using VS Code Dev Containers
- Conda-based environment inside the Dev Container: just add packages to environment.yml and rebuild, so the whole team shares the same environment
- Use of Dev Container Features with pre-installed Python, oh-my-zsh, and LaTeX
- Optimized for use with Python, but could also be used with Julia and R
- Make commands for collecting data, generating figures, typesetting LaTeX, cleaning temp files, and cleaning demo files
- Use of VS Code tasks to trigger data collection, plotting, and paper compilation
- LaTeX-based paper
- Added path definitions in the project_package Python module
- Kedro-inspired data folder structure
- Filled with a demo, which can be removed with "make delete_demo"
- Used in at least 5 papers
For more detailed information, please see the README of the resulting project.
cookiecutter https://github.com/tgoelles/cookiecutter_science
├── .devcontainer  # Definition of the Docker container and environment for VS Code
│   ├── Dockerfile  # Defines the Docker container
│   ├── devcontainer.json  # Defines the devcontainer settings for VS Code
│   └── noop.txt  # Placeholder file to ensure the COPY instruction does not fail if no environment.yml exists
├── .gitattributes  # Git attributes for handling line endings and merge strategies
├── .gitignore  # Git ignore file to exclude files and directories from version control
├── Makefile  # Makefile with commands like `make data` and `make clean`
├── README.md  # Project readme
├── code  # Source code and notebooks
│   ├── notebooks  # Jupyter notebooks
│   │   └── exploratory  # Data explorations
│   │       └── 1.0-tg-example.ipynb  # Jupyter notebook following the naming convention; "tg" are the author's initials
│   ├── project_package  # Project-specific Python package
│   │   ├── __init__.py  # Makes project_package a Python module
│   │   ├── data  # Scripts to download, generate and parse data
│   │   │   ├── __init__.py
│   │   │   ├── config.py  # Project-wide path definitions
│   │   │   ├── example.py  # Example script
│   │   │   ├── import_data.py  # Functions to read raw data
│   │   │   └── make_dataset.py  # Scripts to download or generate data (used in the Makefile)
│   │   ├── tools  # Scripts and functions for general use
│   │   │   ├── __init__.py
│   │   │   └── convert_latex.py  # Functions to convert elements for use in LaTeX
│   │   └── visualization  # Scripts and functions to create visualizations
│   │       ├── __init__.py
│   │       ├── make_plots.py  # Scripts to make all plots for the publication
│   │       └── visualize.py  # Functions to produce final plots
│   └── pyproject.toml  # Configuration file for the project
├── data  # Data directories
│   ├── 01_raw  # The original, immutable data dump
│   │   └── demo.csv  # Example raw data file
│   ├── 02_intermediate  # Intermediate processed data
│   ├── 03_primary  # Cleaned data, used for the publication
│   ├── 04_feature  # For machine learning: features based on the primary data
│   ├── 05_model_input  # The final data used for machine learning
│   ├── 06_models  # Stored, serialized pre-trained machine learning models
│   ├── 07_model_output  # Output from trained machine learning models
│   └── 08_reporting  # Reporting data like log files
├── dissemination  # Materials for dissemination
│   ├── figures  # Figures for the paper, generated with Python
│   │   └── demo.png  # Example figure file
│   ├── presentations  # All related PowerPoint files, especially for deliverables
│   └── papers  # LaTeX-based papers
│       └── paper.tex  # Example LaTeX paper
├── environment.yml  # Conda environment configuration file
└── literature  # References and explanatory materials
    └── references.bib  # Bibliography file for LaTeX documents
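To illustrate what project-wide path definitions for this layout could look like, here is a minimal sketch using `pathlib`. It is not the content of the template's `config.py`: the function name `project_paths`, the dictionary keys, and the example root folder are all hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Hypothetical names throughout; the generated config.py may define paths differently.
DATA_LAYERS = (
    "01_raw", "02_intermediate", "03_primary", "04_feature",
    "05_model_input", "06_models", "07_model_output", "08_reporting",
)


def project_paths(root: Path) -> dict[str, Path]:
    """Map short names such as "raw" or "figures" to project directories."""
    data_dir = root / "data"
    # "01_raw" -> "raw", "05_model_input" -> "model_input", ...
    paths = {layer.split("_", 1)[1]: data_dir / layer for layer in DATA_LAYERS}
    paths["figures"] = root / "dissemination" / "figures"
    paths["papers"] = root / "dissemination" / "papers"
    return paths


paths = project_paths(Path("/workspace/my_project"))  # hypothetical project root
print(paths["raw"] / "demo.csv")
```

Defining the folders once and importing them everywhere keeps notebooks and scripts free of hard-coded relative paths, which is presumably the motivation for a central path module.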
[Figure: use of VS Code tasks]
Requirements:
- Git: usually included with your OS; install it otherwise
- GitHub account
- GitHub CLI
- Docker Desktop
- VS Code
- VS Code extension: Remote Development
- Cookiecutter Python package, which can be installed with pip:
pip install cookiecutter
For Mac users:
brew install cookiecutter
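Before generating a project, it can help to confirm that the command-line tools are actually reachable. The following is a minimal sketch, not part of the template: the executable names (`gh` for the GitHub CLI, `code` for VS Code) are assumptions, and `code` is only on the PATH if the VS Code shell command has been installed. Docker Desktop running and the Remote Development extension are not checked here.

```python
import shutil

# Assumed executable names; adjust them to your installation.
REQUIRED_TOOLS = ("git", "gh", "docker", "code", "cookiecutter")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required command-line tools were found.")
```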
- Navigate to the folder where you want to create the project (on your local drive):
  cookiecutter https://github.com/tgoelles/cookiecutter_science
- Answer the questions prompted by cookiecutter.
- A new VS Code window will open automatically.
- Click "OK" to reopen the folder in a container (only asked the first time).
- Read the README.md in the generated project folder.
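The project creation step can also be scripted. The sketch below only wraps the cookiecutter command line with `subprocess`; the helper names `cookiecutter_command` and `create_project` are hypothetical. It relies on the `--output-dir` and `--no-input` options of the cookiecutter CLI; with `--no-input` the template defaults are accepted instead of answering the prompts, which may not be what you want for a real project.

```python
import subprocess

TEMPLATE_URL = "https://github.com/tgoelles/cookiecutter_science"


def cookiecutter_command(output_dir=None, no_input=False):
    """Build the argument list for the cookiecutter CLI."""
    command = ["cookiecutter", TEMPLATE_URL]
    if output_dir is not None:
        command += ["--output-dir", str(output_dir)]
    if no_input:
        command.append("--no-input")  # accept the template defaults without prompting
    return command


def create_project(output_dir=None, no_input=False):
    """Run cookiecutter; raises CalledProcessError if it exits with an error."""
    subprocess.run(cookiecutter_command(output_dir, no_input), check=True)


# Only print the command here; call create_project() to actually generate a project.
print(" ".join(cookiecutter_command(output_dir="projects")))
```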
Cookiecutter can generate a GitHub repository for you. This initializes the git repo and pushes it to GitHub. You can then invite your team members to join the project.
- Each team member works on their local version of the project, regularly committing and pushing changes.
- Avoid working on the same folder over a network.
If you want to use git inside the container (recommended) on Windows, clone the repo from WSL, because Windows may corrupt the .git folder. Git inside the container uses the same .gitconfig as Windows, which is copied into the container.
Ensure user.name and user.email are set (in PowerShell):
git config --global user.name "your_name"
git config --global user.email "your_email@gmail.com"
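To verify that both settings are in place, the global git configuration can be queried from a script. This is a minimal sketch with hypothetical helper names (`git_global_value`, `missing_identity`); it uses `git config --global --get`, which exits with a non-zero status when the key is not set.

```python
import subprocess


def git_global_value(key):
    """Return the global git config value for `key`, or None if unset or git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # git is not installed or not on the PATH
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def missing_identity(lookup=git_global_value):
    """Return the identity keys for which `lookup` finds no value."""
    return [key for key in ("user.name", "user.email") if lookup(key) is None]


print("Missing git settings:", missing_identity() or "none")
```

The `lookup` parameter exists only so the check can be exercised without touching the real git configuration.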