# Data Science Ex 00 - Preparation

20.02.2023, Lukas Kretschmar (lukas.kretschmar@ost.ch) 

## Let's have some Fun with Data Science!

Welcome to the Data Science module.
During the practical exercises, we will use an interactive environment where you can mix text and code with the awesome feature that you can execute the code.
This environment is called [Jupyter Notebooks](https://jupyter.org).
And you have several different ways to run the notebooks on your machine.

In this document, we describe how you can install the tool with the following three approaches:

- Anaconda (recommended if you are not that geeky, it's the easiest way)
- Miniconda (a smaller version of Anaconda, where we have to add some features by ourselves)
- VS Code with the Jupyter Extension

## Install Anaconda

If you want to install Anaconda, head to https://www.anaconda.com/products/individual and download the installer for your system.
Install the distribution for your operating system that uses **Python 3.9**.

**Note:** Anaconda needs up to **6 GB** of disk space. So, make sure you have this amount of storage available on your machine.

For the installation process, please follow the installation instructions provided under https://docs.anaconda.com/anaconda/install/.
If you don't want to install Anaconda to your user profile but for the whole machine, have a look at the instructions under https://docs.anaconda.com/anaconda/install/multi-user/.

### Known Issues

#### Non-ASCII characters in user name (e.g., ä, ö, ü, etc.)

We have encountered problems with students having non-ASCII characters in their installation path (e.g., ä, ö, ü, é, etc.).
This might be a problem for you if you use the default location, which points to your user profile (e.g., *C:\Users\\[your user name]*).
Please choose a location that contains only ASCII characters.

**Solution:**
- Choose a location that contains only ASCII characters
- Install Anaconda for multiple users (https://docs.anaconda.com/anaconda/install/multi-user/)

If you have nevertheless installed Anaconda to an unsuitable location, there is a simple workaround.
In this case, you have to change the default security settings of your notebook server and open the website by hand every time (or find the URL at which your notebook server is hosted).
You'll find the instructions in the section *Change the security settings* near the end of this document.
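
If you are unsure whether a path is affected, a small Python check can help. The following is a minimal sketch: the function name `non_ascii_parts` and the example paths are made up for illustration, so replace the paths with your intended installation location.

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the components of `path` that contain non-ASCII characters."""
    return [part for part in Path(path).parts if not part.isascii()]


# Hypothetical example paths; replace them with your own location.
print(non_ascii_parts("C:/Users/Jörg/anaconda3"))  # lists the problematic folder
print(non_ascii_parts("C:/Tools/anaconda3"))       # empty list: ASCII only
```

An empty list means that the path should be safe with respect to this issue.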

## Install Miniconda

Miniconda is a stripped-down version of Anaconda, containing just the necessary packages.
Using this tool, you have to install all the packages that we need throughout the exercises by yourself.
But that's not that hard.

You can download Miniconda under https://docs.conda.io/en/latest/miniconda.html#windows-installers.
Just download the version that is best suited for your machine.
You can find some further instructions for the installation under https://conda.io/projects/conda/en/latest/user-guide/install/windows.html, but basically you just have to execute the installer and everything works.

To install the needed packages, you need to open a command prompt and enter a command.
The installer from above added an **Anaconda Prompt (Miniconda)** CLI to your machine.
Open the start menu, search for it and execute it with **elevated privileges (administrator rights)**.
A command line will open.
Enter the following command:
```
conda install anaconda-navigator jupyterlab numpy pandas matplotlib seaborn scikit-learn
```
Press *ENTER* to execute the command.
When asked to proceed, accept with `y` and wait until the installer is completed.

## Install VS Code with the Jupyter Extension

If you've already installed VS Code on your machine or want to work with it, you can do so as well.
Download and install VS Code from https://code.visualstudio.com.
When the installation is completed, you also need to install Python on your machine.
Download and install Python from https://www.python.org.

When you've installed your IDE and programming language, open VS Code and search for the **Jupyter** extension.

- Name: Jupyter
- Id: ms-toolsai.jupyter
- Description: Jupyter notebook support, interactive programming and computing that supports Intellisense, debugging and more.
- Version: 2023.1.2010391206 (or newer)
- Publisher: Microsoft
- VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter

To be able to run all the notebooks, open a Terminal in VS Code and run the following command:
```
pip install numpy pandas matplotlib seaborn scikit-learn
```
Press *ENTER* to execute the command.

## Post-Installation

### Update

**Note:** This step is only needed if you've installed Anaconda or Miniconda.

After the installation is complete, you should also run an update to ensure that all packages are up-to-date.
To do so, open an **Anaconda Prompt** with **elevated privileges (administrator rights)** and enter the following command
```
conda update --all
```
Press *ENTER* to execute it.

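### Installation of additional Modules

The following additional module is needed as well:

- mlxtend

If you use Anaconda or Miniconda, install it from an **Anaconda Prompt** with
```
conda install -c conda-forge mlxtend
```

If you use VS Code with a plain Python installation, install it from the Terminal with
```
pip install mlxtend
```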
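
To verify that all packages are available, you can run a quick check in a notebook cell or a Python prompt. This is only a sketch: the helper `missing_packages` is a made-up name, and the list assumes the packages named in the install commands of this document (note that scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names of the packages installed in this document.
# scikit-learn is imported under the name "sklearn".
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn", "mlxtend"]


def missing_packages(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

If the output lists a package, install it with the matching `conda install` or `pip install` command and run the check again.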

### Configuration

Jupyter Notebooks opens the file browser in a specific directory.
By default, it's your *My Documents* folder.
You can change the starting location to a different path by editing the configuration.
So, the [following](https://stackoverflow.com/questions/35254852/how-to-change-the-jupyter-start-up-folder) steps are only necessary if you want Jupyter Notebooks to start from a specific location.

Open an **Anaconda Prompt** and enter the following
```
jupyter notebook --generate-config
```
This command will generate a configuration for your Jupyter installation at *C:\Users\yourusername\\.jupyter\jupyter_notebook_config.py* (for the nerds among you: yeah, it's a Python code file).
On a Mac, the file is probably at a similar location within your user folder.
Open the file with a text editor and search for the line
``` python
#c.NotebookApp.notebook_dir = ''
```
Remove the \# at the beginning (this is the character for code comments) and enter the path you want Jupyter to start from.
Your entry should now look like
``` python
c.NotebookApp.notebook_dir = 'path/to/your/folder'
```
**Note:**
- Use / within your path and not \\ as is common on Windows systems. Otherwise, the path might not work.
- Use single quotes (') instead of double quotes (")
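
If you copy the path from Windows Explorer, it will contain backslashes. As a small sketch, Python's `pathlib` can rewrite it with forward slashes; the folder used here is a hypothetical example, so replace it with your own.

```python
from pathlib import PureWindowsPath

# Hypothetical folder; replace it with the location of your notebooks.
windows_path = r"C:\Users\yourusername\notebooks"

# as_posix() returns the same path written with forward slashes.
config_value = PureWindowsPath(windows_path).as_posix()

# Print the line as it could be pasted into the configuration file.
print(f"c.NotebookApp.notebook_dir = '{config_value}'")
```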

### Change the security settings

**PLEASE NOTE: This step is only necessary if your notebooks won't start properly (e.g., installation at a location with non-ASCII characters).**
If JupyterLab or Jupyter Notebook starts normally, you do not need to change the security settings.

Within the configuration, you'll find the following line
``` python
# c.NotebookApp.token = '<generated>'
```
By default, a new token is generated every time you start a new server.

Now, you can either set the token to a fixed value, like
``` python
c.NotebookApp.token = 'ffed3a68-f5b2-47a3-bb11-df8711c5aab3'
```
*Note: This is just an example. You can choose your own token value.*

or to none (security is disabled)
``` python
c.NotebookApp.token = ''
```
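
If you prefer a fixed token, you do not have to invent the value yourself. The following sketch uses Python's standard `uuid` module to create a random value in the same format as the example above; any other hard-to-guess string would work as well.

```python
import uuid

# uuid4() creates a random identifier, e.g. 8-4-4-4-12 hexadecimal characters.
token = str(uuid.uuid4())

# Print the line as it could be pasted into the configuration file.
print(f"c.NotebookApp.token = '{token}'")
```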

In the first case, your server will always run at
- **JupyterLab:** http://localhost:8888/lab?token=ffed3a68-f5b2-47a3-bb11-df8711c5aab3
- **Jupyter Notebook:** http://localhost:8888/tree?token=ffed3a68-f5b2-47a3-bb11-df8711c5aab3

In the second case, your server will always run at
- **JupyterLab:** http://localhost:8888/lab
- **Jupyter Notebook:** http://localhost:8888/tree

Please note: The port (`8888`) might be incremented by `1` (e.g. `8889`) if `8888` is already in use.
Thus, if http://localhost:8888/lab is already used, the next server will be hosted at http://localhost:8889/lab.
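
If you are not sure which port your server ended up on, you can probe the candidates from Python. This is a rough sketch using only the standard `socket` module; the helper name `is_port_in_use` and the range of five ports are arbitrary choices, and a port that is in use is not necessarily used by Jupyter.

```python
import socket


def is_port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


# Check the default port and the next few candidates.
for port in range(8888, 8893):
    state = "in use" if is_port_in_use(port) else "free"
    print(f"http://localhost:{port}/lab -> port is {state}")
```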

## Run Anaconda

Check that your installation is running by starting **Anaconda Navigator**.
Start the **Anaconda Navigator** with elevated privileges, otherwise your notebooks might not run.
When running the navigator, you should be able to get to the following screen.

<img src="./AnacondaNavigator.png" style="height:600px" />

And then try to start either **JupyterLab** or **Jupyter Notebook**.
Both tools will open a new browser tab.
And you are ready for the exercises.