---

<img src="logo/anchormen-logo.svg" width="500">

---

#  BASICS Lab: Python Environment 

In this lab session we will take a look at some useful tools for working with Python.

## Goals

After this lab, you will:

- Have installed Anaconda
- Know how to use conda to create and work with isolated Python environments
- Know how to install Python packages
- Know how to launch a Jupyter Notebook server in its own conda environment
- Know how to create and work with Jupyter Notebooks
- Know how to work with markdown and code cells

## A: Installing Anaconda

**Exercise**: If you haven't already done so, install Anaconda from https://www.continuum.io/downloads and choose
Python version 3.5 or higher (these instructions have been checked using Python
3.5). You are encouraged to install the English version of this software.

## B: Virtual Environments and Installing Packages

### Creating Environments

**Exercise**: Now that we have conda, let’s create an environment that uses
pandas. Because conda provides pre-compiled versions of packages, it can install
pandas even if you don’t have a C or C++ compiler. This alone is a major reason
to prefer conda over virtualenv.

> ```conda create --name mypanda pandas```

**Tip**: when using literal text from these instructions, try to copy/paste from the
digital version of this document to prevent typos.
 
Gotchas:

- It’s ```conda create```, not ```conda env create```.
- You have to name at least one package that the environment should contain. When in doubt, tell it to install ```pip```.


### Activating / Deactivating Environments

Packages installed inside an environment are not accessible outside it, and vice
versa. This means that if your program works in an environment, you can be sure
that that environment’s specification is sufficient for other people trying to run
your program.

**Exercise**: Activate and deactivate the environment and find out its effects:

>```
> activate mypanda      # On Linux: source activate mypanda
> # Now open python, and check that `import pandas` works.
>
> deactivate            # On Linux: source deactivate
> # Open python again and verify that `import pandas` no longer works
> # (it will still work if pandas is also installed in the root
> # environment, as it is with a full Anaconda install).
>```
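
To see which environment a Python session is actually running in, a small check along the following lines can help. This is a minimal sketch: the helper name `describe_environment` is made up for this lab, and it relies only on the standard library (`sys.executable`, `sys.prefix`, and `importlib.util.find_spec`). Inside an activated environment, `sys.prefix` should point at that environment's folder.

```python
import importlib.util
import sys


def describe_environment(package_name="pandas"):
    """Report which interpreter is running and whether a package is importable.

    `describe_environment` is a hypothetical helper written for this lab.
    """
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,      # path of the running python binary
        "prefix": sys.prefix,               # root folder of the active environment
        "package_available": spec is not None,
    }


info = describe_environment("pandas")
for key, value in sorted(info.items()):
    print("{}: {}".format(key, value))
```

Running this once with the environment activated and once after deactivating it should show a different `prefix` each time.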

### Installing Packages

Packages can be installed as follows.

>```
> # Using conda:
> conda install <package_name>
>
> # Using pip:
> pip install <package_name>
>```

**Exercise**: Activate the `mypanda` environment again and install two libraries from the following list of useful data science libraries:

- numpy (math functions, arrays)
- pandas (dataframes, data munging)
- scikit-learn (machine learning; imported in Python as `sklearn`)
- matplotlib (visualization)
- seaborn (visualization)
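
After installing, you may want to confirm which of these libraries the active environment can actually import. The following is a rough sketch using only the standard library; the names `LIBRARIES` and `installed_libraries` are invented for this example. Note that the machine learning library is installed under the name `scikit-learn` but imported as `sklearn`, which is why the mapping keeps both names.

```python
import importlib.util

# Install name -> import name. They differ for scikit-learn.
LIBRARIES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def installed_libraries(libraries):
    """Return {install_name: bool}, True when the import name can be found."""
    return {
        install_name: importlib.util.find_spec(import_name) is not None
        for install_name, import_name in libraries.items()
    }


for name, available in sorted(installed_libraries(LIBRARIES).items()):
    print("{:<14}{}".format(name, "installed" if available else "missing"))
```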

### Environment Specification Files

Next, we shall export an environment specification to a file, traditionally named
environment.yml and kept in the directory of the project that it belongs to.

The `.yml` extension stands for YAML (’YAML Ain’t Markup Language’). Like XML, YAML is a plain text format for nested data structures. Unlike XML, YAML was designed to be nice for humans to read and edit, which makes it a popular format for configuration files. Furthermore, by keeping this file under version control, you keep a record of your project’s requirements, which makes it easier for others to build and deploy the project.

**Exercise**: Export your virtual environment to a YAML file:

>```
>conda env export --name mypanda > environment.yml
># have a look inside using your favourite editor
>notepad environment.yml
>```
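
To get a feel for the layout of the exported file, the sketch below embeds a small sample and pulls out the entries under `dependencies:`. Everything in the sample (channel, versions, prefix path) is made up for illustration and will differ from your own export. The helper `list_dependencies` is a hypothetical, deliberately simplistic line-based reader, not a real YAML parser; for anything serious, a YAML library would be the safer choice.

```python
# Illustrative sample of an exported environment.yml.
# The channel, versions, and prefix are invented for this sketch.
SAMPLE_ENVIRONMENT_YML = """\
name: mypanda
channels:
  - defaults
dependencies:
  - numpy=1.11.3
  - pandas=0.19.2
  - pip=9.0.1
prefix: /home/user/anaconda3/envs/mypanda
"""


def list_dependencies(yaml_text):
    """Return the list items found under the top-level 'dependencies:' key.

    Simplistic reader for this flat layout only; it works whether or not
    the list items are indented, but it is not a YAML parser.
    """
    dependencies = []
    in_dependencies = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not line.startswith((" ", "-")):
            # A new top-level key starts here.
            in_dependencies = stripped == "dependencies:"
        elif in_dependencies and stripped.startswith("- "):
            dependencies.append(stripped[2:])
    return dependencies


print(list_dependencies(SAMPLE_ENVIRONMENT_YML))
```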

**Exercise**: Once the specification is created, you can use the file to create an environment in a reproducible way:

>```
>conda env create --file environment.yml --name redpanda
>```

Creating an environment from a file is done with `conda env create`; creating
an environment from scratch is done with `conda create`. This inconsistency is
a mystery.

**Note**: if the following error appears after executing `conda env create --file environment.yml --name redpanda`:

>```
>Error: prefix already exists:
>```

try solving it using:

>```
>conda install -n root _license
>```

## C: Jupyter Notebooks
 
Jupyter notebooks let you interactively write code and interleave it with its results,
and with rich text. By keeping code, results, and text together, you get
what is known as *reproducible research*. Jupyter notebooks are an offshoot of
the IPython interpreter: IPython is a user-friendly interactive interpreter, while
Jupyter focuses on the notebook technology. Thanks to the recent separation of
the codebases, Jupyter now also supports non-Python kernels such as R or Ruby.

### Exercises 

The first few steps use skills from section B of this lab: we’ll set up an environment in
which to run the notebook.

1. Create a conda environment with the jupyter package installed.
2. Activate the environment.
3. Create a folder in which you want to work, and export the environment to an `environment.yml` file in that folder.

Next, open a command line in that folder and launch a notebook server by typing the command:

> ```jupyter notebook```

This command automatically launches a browser. Click around the interface and
find out how to do the following things:

- Create a new notebook
- Open the notebook
- Type some Python code and execute it with ```Shift + Enter```
- Add a Markdown cell with ```*emphasis*, [a link](http://example.com)```, and ```## a level-two header```
- Save a checkpoint of the notebook
- Export the notebook as HTML

**Tip**: use Esc-H to show keyboard shortcuts in Jupyter.
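
If you are unsure what to type into your first code cell, something like the following would do. It is only an illustrative snippet (the variable names are arbitrary). It shows that `print` output appears below the cell, and that Jupyter also displays the value of the last expression in a cell.

```python
# A first code cell: build a small list and add up its values.
squares = [n ** 2 for n in range(5)]
total = sum(squares)

# Output from print appears directly below the cell.
print("squares:", squares)

# In a notebook, the value of the last expression is displayed as the cell's result.
total
```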

--- 

Finally, open Explorer or the command line and look in the folder where you ran
the notebook server. There will be a file with the `.ipynb` extension. You can send this file to other
people just as you would any other file. If you run a notebook server in a
folder containing this file, it will appear in the list of notebooks.
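
A notebook file is plain JSON, so it can also be inspected without Jupyter. The sketch below counts the cells of each type in a notebook; `count_cell_types` is a hypothetical helper, and the tiny `demo_notebook` written to a temporary folder is invented so the example runs on its own. To try it on a real notebook, pass the path of your own `.ipynb` file instead.

```python
import json
import os
import tempfile
from collections import Counter


def count_cell_types(notebook_path):
    """Count how many cells of each type a notebook file contains."""
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# Hypothetical minimal notebook: one Markdown cell and one code cell.
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["## a level-two header"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 0,
}

# Write the demo notebook to a temporary folder so nothing in your project is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.ipynb")
with open(demo_path, "w", encoding="utf-8") as handle:
    json.dump(demo_notebook, handle)

print(count_cell_types(demo_path))
```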

## Summary

In this lab session we installed the `Anaconda` Python distribution and learned how to create, activate, deactivate and export `virtual environments`. Furthermore, we learned how to work with the `jupyter notebook` interface in an interactive workflow.

- Head over to the next notebook to learn more about Python Basics