# Best Practices
* [Version control how-to](#Version-control-how-to)
  - [Set-up GitHub repository for backup, version-control & collaboration](#Set-up-GitHub-repository-for-backup,-version-control-&-collaboration)
  - [How to make sure no Jupyter Notebook data is leaked to GitHub](#How-to-make-sure-no-Jupyter-Notebook-data-is-leaked-to-GitHub)
* [How to save your entire computational context if you installed additional Python packages](#How-to-save-your-entire-computational-context-if-you-installed-additional-Python-packages)
* [How to safely handle secrets, passwords, tokens](#How-to-safely-handle-secrets,-passwords,-tokens)

## Version control how-to
### Set-up GitHub repository for backup, version-control & collaboration
1. Create a new **empty** repository (usually a private one, so that it is visible to Eraneos employees only) on the [AWK GitHub page](https://github.com/awkgroupag) using the green `New` button. Note the URL of your new repo, e.g. https://github.com/awkgroupag/MY-NEW-REPO
2. Open a command prompt and navigate to your source code folder (`datalab` in the diagram above)
   * :warning: Be sure to NOT have any data in the directory you are currently in! See above :warning:
3. Type (replacing the URL)
```console
$ git init
# git's default branch name is master, let's change this to GitHub's main
$ git branch -M main
$ git remote add origin https://github.com/awkgroupag/MY-NEW-REPO
# Add the entire datalab to your first commit
$ git add .
$ git commit -m "initial commit"
# Save GitHub credentials so you don't need to authenticate again and again
# (note: the "store" helper keeps them unencrypted in ~/.git-credentials)
$ git config --global credential.helper store
# Actually upload the files to GitHub.com
$ git push --set-upstream origin main
```
4. You should be prompted for your GitHub credentials after the last command above
5. Check [Atlassian's Comparing Workflows](https://www.atlassian.com/git/tutorials/comparing-workflows) to get started with `git`. See the [Git-flow-Workflow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow) to understand collaboration with other team members.
    * Use `git pull` to get the latest changes from GitHub
    * Use `git commit` and `git push` to push your changes to GitHub
    * Work with dedicated new branches for changes, do not work directly with the branch `main`!
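
The pull, branch, commit, and push cycle above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names, the branch name `feature/my-change`, and the commit message are hypothetical, and `git switch` assumes a reasonably recent git (2.23 or later). By default the helper only prints the commands instead of running them.

```python
import shlex
import subprocess


def start_branch_commands(branch):
    """Commands to run BEFORE you start editing: update main, then branch off."""
    return [
        ["git", "switch", "main"],
        ["git", "pull"],                  # get the latest changes from GitHub
        ["git", "switch", "-c", branch],  # create and switch to a dedicated branch
    ]


def publish_commands(branch, message):
    """Commands to run AFTER editing: commit and push the dedicated branch."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", "origin", branch],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing command


# Hypothetical branch name and commit message, dry run only
run_all(start_branch_commands("feature/my-change"))
run_all(publish_commands("feature/my-change", "describe my change"))
```

Splitting the commands into two groups mirrors the workflow: you create the branch first, make your changes, and only then commit and push.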


### How to make sure no Jupyter Notebook data is leaked to GitHub
This makes sure that all outputs in Jupyter Notebooks are stripped before a `git commit`, so they are never uploaded to e.g. GitHub. Your local notebook stays unchanged. Note that notebook outputs might still show up as changes in certain tools, but they will never be committed.

1. Add a filter to your existing `.git/config` file: hit `CTRL+Shift+L` and open a new Terminal. Then run the following from inside your repository (e.g. `~/work`):
```console
$ git config filter.strip-jupyter-notebook-output.clean 'jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR'
```
2. Create a `.gitattributes` file at the root of your directory. Type the following in the Terminal:
```console
$ cd ~/work
$ nano .gitattributes
```
and add the following line:
```
*.ipynb filter=strip-jupyter-notebook-output
```

<div class="alert alert-block alert-warning">
<b>WARNING</b> For this to work, you will need to use git from "within" the Jupyter Kubernetes pod, e.g. by using the "Git" window here in this notebook (symbol in the ribbon on the very left). Otherwise, you need to ensure that the command "jupyter" can be found on your host machine. 
</div>
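
To make the effect of the filter more concrete, here is a minimal sketch of what "stripping outputs" means for the notebook's JSON. It is not the actual `nbconvert` implementation; the helper names `strip_outputs` and `has_outputs` and the in-memory example notebook are assumptions for illustration only. It uses just the standard library.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with all code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def has_outputs(notebook):
    """True if any code cell still carries outputs or an execution count."""
    return any(
        cell.get("outputs") or cell.get("execution_count") is not None
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    )


# Hypothetical in-memory notebook standing in for a real *.ipynb file
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"], "metadata": {}},
        {
            "cell_type": "code",
            "source": ["print('secret data')"],
            "metadata": {},
            "execution_count": 3,
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["secret data\n"]}
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

print(has_outputs(example))                 # True
print(has_outputs(strip_outputs(example)))  # False
```

To check a real notebook, you could load it with `json.load` and pass the result to `has_outputs`.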

## How to save your entire computational context if you installed additional Python packages
> Execute these commands in your normal Jupyter Notebook (white GUI), not this controlboard (black GUI) ;-)

You might change your pod by installing new [**pip Python packages**](https://pypi.org), e.g. with `pip install <package name>`. Any such change will be lost with the pod. To quickly save your entire pip environment, including all packages, copy-paste the following into your Jupyter notebook and run it:

```console
! pip freeze > /home/jovyan/work/pip-environment.txt
```

To load your environment again from scratch, e.g. if you re-created your environment/pod:

```console
! pip install -r /home/jovyan/work/pip-environment.txt
```
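
After re-installing, you may want to confirm that the pod really matches the saved file. The following is a minimal sketch, assuming the file contains plain `name==version` lines as written by `pip freeze`; the helper names and the sample package versions are hypothetical. Other entry types (e.g. editable installs) are simply skipped.

```python
from importlib import metadata


def parse_pins(freeze_text):
    """Parse 'name==version' lines from pip freeze output into a dict."""
    pins = {}
    for line in freeze_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and entries that are not pinned
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every package that differs."""
    return {
        name: (pinned, lookup(name))
        for name, pinned in pins.items()
        if lookup(name) != pinned
    }


# Hypothetical freeze output; in practice read /home/jovyan/work/pip-environment.txt
sample = "numpy==1.26.4\n# a comment\npandas==2.2.0\n"
pins = parse_pins(sample)

# Hypothetical "installed" versions so the example does not depend on your pod
fake_installed = {"numpy": "1.26.4", "pandas": "2.1.0"}
print(find_mismatches(pins, lookup=fake_installed.get))
# prints {'pandas': ('2.2.0', '2.1.0')}
```

Calling `find_mismatches(pins)` without the `lookup` argument compares against the packages actually installed in the current environment.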

If you installed additional Python packages with [**Anaconda**](https://anaconda.org) (`conda install <package name>`), here's how to save the entire conda environment:

```console
! conda env export -n base > /home/jovyan/work/anaconda-environment.yml
```

To re-install all Anaconda packages from this file, do:

```console
! conda env update --name base --file /home/jovyan/work/anaconda-environment.yml
```

## How to safely handle secrets, passwords, tokens
If you need a password to connect to e.g. an API, it is tempting to simply hardcode it like this:

```python
insecure_password = '1234567890asdfghjkl'
```

<div class="alert alert-block alert-warning">
<b>WARNING</b> NEVER save secrets in clear text in a Notebook - it's unsafe as you will check the secret into GitHub at some point. Use a dedicated *.env file!
</div>

Create a new file, e.g. named `environment.env`, in the same directory as your Notebook. Add your sensitive information as key-value pairs like this. Note that **no \*.env file is checked into GitHub**: you're responsible for backing up this information, e.g. using a password manager.
```console
# Production settings
DOMAIN="example.org"
ADMIN_EMAIL="admin@${DOMAIN}"
ROOT_URL="${DOMAIN}/app"
SECURE_PASSWORD="1234567890asdfghjkl"
```

To use secrets in a Notebook (or any Python environment), load the env file's contents as key-value pairs into a dictionary `secure_config`:
```python
from dotenv import dotenv_values

secure_config = dotenv_values('environment.env')  # returns a dict {'DOMAIN': 'example.org', 'ADMIN_EMAIL': 'admin@example.org', ...}
secure_password = secure_config['SECURE_PASSWORD']
```
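
Because the env file is not version-controlled, a colleague's copy may lack a key. A small check can fail early with a clear message instead of a bare `KeyError` later on. This is a minimal sketch: `require_secrets` is a hypothetical helper, and a plain dict stands in for the result of `dotenv_values('environment.env')` so that the example runs without the env file.

```python
def require_secrets(config, required_keys):
    """Return only the required entries, raising if any is missing or empty."""
    missing = [key for key in required_keys if not config.get(key)]
    if missing:
        # Report key names only, never the secret values themselves
        raise LookupError(f"Missing secrets in env file: {', '.join(missing)}")
    return {key: config[key] for key in required_keys}


# Stand-in for dotenv_values('environment.env'); the values are hypothetical
secure_config = {"DOMAIN": "example.org", "SECURE_PASSWORD": "1234567890asdfghjkl"}

secrets = require_secrets(secure_config, ["DOMAIN", "SECURE_PASSWORD"])
print(sorted(secrets))  # prints the key names only: ['DOMAIN', 'SECURE_PASSWORD']

try:
    require_secrets(secure_config, ["API_TOKEN"])  # hypothetical key that is not set
except LookupError as err:
    print(err)  # Missing secrets in env file: API_TOKEN
```

Printing only key names keeps the secret values out of the notebook output.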

Alternatively, `dotenv` can load the key-value pairs from your `environment.env` as environment variables, which is very useful if you need these values somewhere else. Simply do this:
```python
from dotenv import load_dotenv
import os

# Load environment variables from environment.env - usable by any other application
# If an environment variable already exists, overwrite it
load_dotenv('environment.env', override=True)
# To use the environment variables here using Python:
secure_password = os.getenv('SECURE_PASSWORD')
```