A simple template for research project repos. Also check out data science and reproducible science cookie cutters.
Run the following command:
./install.sh YOUR_PROJECT_REPO_FOLDER
This script creates the following folders and files:
libs: for a software library for the project.
data: for datasets and scripts for downloading datasets.
exps: for timestamped experiments.
paper: for manuscripts.
workflow: for workflow scripts.
.gitignore: lists temporary and binary files to ignore (LaTeX, Python, Jupyter, data files, etc.).
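The resulting layout can be sketched in shell as follows. This only illustrates the structure listed above and is not the contents of install.sh; the helper name scaffold_project and the empty placeholder .gitignore are assumptions made for the example.

```shell
# Illustrative sketch only: the real install.sh may do more than this.
set -eu

# scaffold_project is a hypothetical helper name, not part of the template.
scaffold_project() {
    root="$1"
    # One folder per purpose, as listed above.
    for d in libs data exps paper workflow; do
        mkdir -p "$root/$d"
    done
    # Placeholder only: the template ships a populated .gitignore.
    touch "$root/.gitignore"
}

# Example: build the layout in a scratch directory and list it.
demo_root="$(mktemp -d)/my_project"
scaffold_project "$demo_root"
ls -A "$demo_root"
```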
Miniforge is preferred over Anaconda or Miniconda because it comes with mamba and uses conda-forge as the default channel.
First create a virtual environment for the project.
mamba create -n project_env_name python=3.7
mamba activate project_env_name
Install ipykernel for Jupyter and snakemake for workflow management.
mamba install -y -c bioconda -c conda-forge snakemake ipykernel numpy pandas scipy matplotlib seaborn tqdm austin
Create a kernel for the virtual environment that you can use in JupyterLab or Jupyter Notebook.
python -m ipykernel install --user --name project_env_kernel_name
Install pre-commit, then enable its Git hooks from inside the project repository:
conda install -y -c conda-forge pre-commit
pre-commit install
Create a directory for a default Snakemake profile:
mkdir -p ~/.config/snakemake/default
Then create ~/.config/snakemake/default/config.yaml with the following content:
# non-slurm profile defaults
keep-going: True
rerun-triggers: mtime
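The directory and the config file can also be created in one go from the shell. This is a minimal sketch, assuming the profile lives under $HOME/.config as described above; profile_dir is only a local variable name chosen for the example, and the guard that leaves an existing config.yaml untouched is an added precaution.

```shell
# Create the profile directory and write config.yaml if it does not exist yet.
profile_dir="$HOME/.config/snakemake/default"
mkdir -p "$profile_dir"

# Skip writing when a config.yaml is already present, to avoid clobbering it.
if [ ! -e "$profile_dir/config.yaml" ]; then
    cat > "$profile_dir/config.yaml" <<'EOF'
# non-slurm profile defaults
keep-going: True
rerun-triggers: mtime
EOF
fi
```

Profile keys correspond to Snakemake's long command-line options, so a one-off run without the profile would pass the same settings as `snakemake --keep-going --rerun-triggers mtime`.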
Finally, add the following line to your .zshrc or .bashrc file:
export SNAKEMAKE_PROFILE=default
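If you set up several machines, the export line can be appended without creating duplicates. The sketch below is one way to do that; add_profile_export is a hypothetical helper, not something the template provides, and it assumes the rc file ends with a newline.

```shell
# add_profile_export is a hypothetical helper, not part of the template.
# It appends the export line to the given rc file unless it is already there.
add_profile_export() {
    rc_file="$1"
    line='export SNAKEMAKE_PROFILE=default'
    # Make sure the file exists so grep does not fail on a missing path.
    touch "$rc_file"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Usage (pick the file that matches your shell):
#   add_profile_export "$HOME/.bashrc"
#   add_profile_export "$HOME/.zshrc"
```

Open a new terminal, or source the rc file, for the variable to take effect.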