# How to create a new project

## 1. Define repository structure

First, create a folder for your new project on your local machine.

### 1.1 Short projects

A simple data science repository for a small project could be structured as follows:

In [None]:
├── data
│   ├── processed # > data preprocessed/cleaned/results
│   └── raw # > immutable raw data in its original form
├── .gitignore # > file which specifies which files and folders should not be tracked by git
├── notebooks
│   └── template_notebook.ipynb # > example notebook.ipynb
├── README.md # > the entry point for your project, explaining goals, etc
├── reports # > structured reports in markdown format or presentations
│   ├── img # > all images used for reports should be stored here
│   │   └── pic01.jpg
│   └── template_report.md
└── src # > folder for python scripts and modules
    └── template_module.py # > example python code with properly formatted documentation

You can download a .zip version of this folder [here](https://github.com/mikjf/project_creation_setup/tree/main/template_project_download).
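
If you prefer to build the structure by hand, the tree above can be sketched with a few shell commands. This is a minimal sketch, not the template itself: the project name `my_project` and the two `.gitignore` entries are assumptions, so adjust them to your needs. The notebook file is left out on purpose, since an empty `.ipynb` file is not a valid notebook.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical project name; replace it with your own.
project="my_project"

# Create the nested folders from the tree above in one call.
mkdir -p "$project"/data/raw "$project"/data/processed \
         "$project"/notebooks "$project"/reports/img "$project"/src

# Create empty placeholder files to fill in later.
touch "$project"/README.md \
      "$project"/reports/template_report.md \
      "$project"/src/template_module.py

# Example .gitignore entries (assumed; extend as needed).
cat > "$project"/.gitignore <<'EOF'
.ipynb_checkpoints/
__pycache__/
EOF
```

Run it from the directory that should contain the new project folder.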

### 1.2 Longer projects

For longer or more complex projects, you could use [cookiecutter](https://anaconda.org/conda-forge/cookiecutter), a command-line utility that creates a Python package project from a project template.

In [None]:
# use this command to install cookiecutter if you are using anaconda
conda install -c conda-forge cookiecutter
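
Once installed, cookiecutter is run against a template. The sketch below assumes the `cookiecutter-pypackage` template hosted on GitHub; that choice is an assumption for illustration, and any other template URL can be used in its place.

```shell
# Assumed template: the cookiecutter-pypackage project on GitHub.
template_url="https://github.com/audreyfeldroy/cookiecutter-pypackage.git"

if command -v cookiecutter >/dev/null 2>&1; then
    # Asks a few questions (project name, author, ...) and then
    # generates the project folder in the current directory.
    cookiecutter "$template_url"
else
    echo "cookiecutter is not installed; install it first"
fi
```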

## 2. Create project on GitHub

Head to the repositories section of your GitHub profile and click the green "New" button to define the repository name and its public/private visibility. The folder in section 1.1 already contains a README.md file, so there is no need to let GitHub create one. Once the project has been created, you will land on the quick setup page.

### 2.1 Push local git repository onto the GitHub project

In the terminal, from the local project's working directory:

In [None]:
# initialize local directory as git repository
git init

In [None]:
# add the files in the local directory and stage them for commit
git add .

In [None]:
# commit the staged changes so they can be pushed
git commit -m 'project initialized'

In [None]:
# set the new remote (replace <SSH_paste> with the SSH URL shown on the quick setup page)
git remote add origin <SSH_paste>

In [None]:
# verify new remote URL
git remote -v

In [None]:
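# make sure the local branch is called main (older git versions default to master)
git branch -M main
# push changes in local repository up to specified remote repository (GitHub)
git push -u origin main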
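
Before pushing, it can help to check that the commit, branch name, and remote are what you expect. The sketch below walks through those checks in a throwaway directory; the user name, e-mail, and SSH URL are hypothetical placeholders, and the push itself is left out because it needs a real GitHub repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so the sketch does not touch a real project.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
echo "# demo" > README.md

git init --quiet
# Local identity for this demo only (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

git add .
git commit --quiet -m "project initialized"

# Rename the current branch to main, whatever the local default was.
git branch -M main

# Hypothetical SSH URL; adding a remote does not contact the server.
git remote add origin git@github.com:your-user/your-repo.git

# Inspect the state before pushing.
git status --short                 # no output: nothing is left uncommitted
git log --oneline                  # the commit(s) that would be pushed
git rev-parse --abbrev-ref HEAD    # the current branch name
git remote get-url origin          # the URL the push would go to
```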

## 3. Conda environment

It is good practice to create a dedicated conda environment for each project.

### 3.1 Conda environment setup

From the terminal (conda-terminal):

In [None]:
# create new conda environment
conda create --name project_name python

In [None]:
# activate created environment
conda activate project_name

In [None]:
# add the conda environment to the jupyter kernels
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=project_name
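
To confirm that the kernel was registered, you can list the kernels Jupyter knows about. This is a small sketch that assumes the `jupyter` command is available in the active environment; `project_name` should appear in the output.

```shell
if command -v jupyter >/dev/null 2>&1; then
    # Lists every registered kernel together with its location on disk.
    jupyter kernelspec list
else
    echo "jupyter not found in this environment"
fi
```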

### 3.2 Conda environment export

Adding the environment specification to a file within your project folder makes it easier for other users to replicate the project and run it properly.

In [None]:
# conda export to yaml specification file
conda env export > environment.yml

### 3.3 Create environment from .yml file

To create a new conda environment from an environment file:

In [None]:
# conda environment from existing yml
conda env create -f environment.yml

In [None]:
# update an existing environment from an environment file
# (--prune removes dependencies that are no longer listed in the file)
conda env update --name project_name --file environment.yml --prune