# Getting Started

In this notebook, I will briefly go over the steps required to start working in a Jupyter Notebook environment.

This notebook is copied partly from [https://github.com/UBC-CS/cpsc330](https://github.com/UBC-CS/cpsc330).

## Install on your local machine

- [Git](#github) for version control

- [Python and Conda](#python_conda)

- [JupyterLab](#jupyterlab)

- [Virtual Environment](#virtual_env)

- [Debugging](#debug)

## Use online servers

- [UBC Syzygy](https://ubc.syzygy.ca/)

- [Google Colab](https://colab.research.google.com/)

<img src="https://i.pinimg.com/originals/9a/01/fa/9a01fa3c079c6a8d79379f9e4f94e016.png" style="height:300px" />


----
----

# GitHub <a class="anchor" id="github"></a>

# What are git and GitHub?

GitHub uses the [git version control system](https://en.wikipedia.org/wiki/Git). Roughly speaking, git is the system that keeps track of different versions of your files and merges changes from different collaborators (and much more!). git is [free software](https://en.wikipedia.org/wiki/Free_software).

GitHub is a corporation and the name of the corporation's product. Their site github.com provides servers that host git repositories. This serves as a central place that different collaborators all sync up with. 

# GitHub Desktop

The instructions below pertain to using git from the command line. For those who are less comfortable with this sort of thing, I was told that [GitHub Desktop](https://desktop.github.com/) makes things a lot easier. So, feel free to try this route instead of the instructions below.

# Command-line git

## Setting up

GitHub is a web-based application and does not require set-up. Since you will be cloning the [course GitHub repository](https://github.com/UBC-CS/cpsc330) in order to run the lecture notebooks locally, you need git installed locally. Follow the instructions below for this. 

#### Mac Users

Open Terminal (in the Applications > Utilities folder, or search with Spotlight). From the terminal, run the command:

```
xcode-select --install
```

This will install git and many other very useful applications as well (including Make).

#### Ubuntu Users

Open the terminal and install git using your system package manager. For example

```
sudo apt-get install git
```

should do the trick on Ubuntu.

#### Windows Users

Go to http://git-scm.com. Click on the download link, and accept all defaults in the installation process. 
Installing git will also install for you a minimal UNIX environment with a bash shell and terminal window. 


If the above does not work for you, follow the [installation instructions here](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).

## Testing your git installation

Open your Terminal (or Git for Windows) and run `git --version`.  
If the command prints the version of git, your installation was successful!
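The same check can also be run from Python, which is handy once you are inside a notebook. The following is a minimal sketch, assuming only the standard library; the helper name `git_version` is made up for illustration.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if git is not usable."""
    git_path = shutil.which("git")  # look for git on the PATH
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"], capture_output=True, text=True, check=False
    )
    # An empty output is treated the same as "git not available".
    return result.stdout.strip() or None


version = git_version()
if version is None:
    print("git was not found; revisit the installation steps above.")
else:
    print(version)
```

If this prints a line starting with `git version`, the installation is visible to Python as well as to your terminal.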

## Learning git

There are many free online resources for learning git. One possibility is the [Software Carpentry git tutorial](https://swcarpentry.github.io/git-novice/). 

----
----

# Python and Conda <a class="anchor" id="python_conda"></a>

We will be using Python because it is open source and widely used in machine learning and data science. We will use Python 3 (in particular 3.9).

We recommend the Anaconda Python distribution because it comes bundled with a bunch of useful packages (NumPy, SciPy, scikit-learn, Jupyter Notebook) pre-installed. You can [download and install the version of Anaconda](https://docs.anaconda.com/anaconda/install/index.html) that is suitable for your operating system from their website.

To make sure that Anaconda and Python are correctly installed, follow the instructions below based on your operating system. 

### macOS

After installation, open Spotlight search on your Mac, type "terminal", and open the program. If you already have a terminal open, restart it. If the installation was successful, you will see `(base)` prepended to your prompt string. For example, here is what the terminal prompt looks like on my MacBook.

```
(base) kvarada@CPSC-W-KVARADA01:~$
```

To confirm that conda is working, you can ask it which version was installed:

```
conda --version
```

which should return something like: 

```
conda 4.10.3
```

Now, type 

```
python --version
``` 

which should return Python 3.9.0 or greater. 


### Windows

After installation, open the Start Menu and search for the program called `Anaconda Prompt`. When this opens you will see a prompt similar to 

```
(base) C:\Users\your_name
```

Type the following to check that your Python installation is working:

```
python --version
```

which should return Python 3.9.0 or greater. 

_Note: If instead you see Python 2.7.X, you installed the wrong version. Uninstall the Anaconda you just installed (on macOS it usually lives in the /opt directory), and try the installation again, selecting Python 3.9._
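The version requirement can also be checked from inside Python itself. This is a small sketch rather than an official check; the constant `MINIMUM` and the helper `meets_minimum` are names chosen here for illustration.

```python
import sys

MINIMUM = (3, 9)  # minimum (major, minor) version named in this notebook


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", sys.version.split()[0])
if not meets_minimum(sys.version_info):
    print("This is older than Python 3.9; check which Anaconda you installed.")
```

Tuples compare element by element, so `(3, 10)` counts as newer than `(3, 9)`, which a plain string comparison of `"3.10"` and `"3.9"` would get wrong.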


## Installing Python packages

`conda` installs Python packages from different online repositories, which are called "channels". A package needs to go through thorough testing before it is included in the default channel, which is good for stability but also means that new versions are delayed and fewer packages are available overall. There is a community-driven effort called [conda-forge](https://conda-forge.org/), which provides more up-to-date packages. To access the most up-to-date versions of the Python packages we are going to use, add the conda-forge channel by typing the following in the terminal:

```
conda config --add channels conda-forge
```
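To confirm that the channel was added, you can list the configured channels. The sketch below assumes that `conda config --show channels` prints a `channels:` header followed by one `- name` line per channel; the helper names and the sample text are made up for illustration and are not real output from your machine.

```python
import shutil
import subprocess


def parse_channels(config_text):
    """Extract channel names from `- name` lines in the given text."""
    channels = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            channels.append(stripped[2:].strip())
    return channels


def configured_channels():
    """Return the configured channels, or None if conda is not on the PATH."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    result = subprocess.run(
        [conda_path, "config", "--show", "channels"],
        capture_output=True, text=True, check=False,
    )
    return parse_channels(result.stdout)


# Sample text in the assumed output format (illustration only).
sample = "channels:\n  - conda-forge\n  - defaults\n"
print(parse_channels(sample))
print(configured_channels())
```

After running the `conda config --add` command above, `conda-forge` should appear in the list returned by `configured_channels()`.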

----
----

# JupyterLab <a class="anchor" id="jupyterlab"></a>

We will be using [JupyterLab](https://jupyter.org/) as our main coding environment, and `pandas` is one of the key data analysis packages. Install them via the following command:

```
conda install pandas jupyterlab jupyterlab-spellchecker nb_conda_kernels
```

For other packages we need, we will be creating a `conda` virtual environment. (See the instructions in the next section.)

----
----

# Virtual environment <a class="anchor" id="virtual_env"></a>

### What and Why
[A virtual environment](https://docs.python.org/3/library/venv.html) is a Python environment such that the Python interpreter, libraries and scripts installed into it are isolated from those installed in other virtual environments, and (by default) any libraries installed in a “system” Python, i.e., one which is installed as part of your operating system. For example, you may want a certain version of tensorflow for one project but another version for a different project. Virtual environments help us isolate the dependencies of different projects and make sure that any change to dependencies affects only the projects that need it.
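To see this isolation in practice, you can ask the running interpreter where it lives. The following is a minimal sketch using only the standard library; the helper name `describe_environment` is made up here, and `CONDA_DEFAULT_ENV` is an environment variable that `conda activate` sets, so it may be missing if no conda environment is active.

```python
import os
import sys


def describe_environment():
    """Collect a few facts that identify the active Python environment."""
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root directory of the active environment
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None if not set
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this in two different environments should print two different `prefix` paths, which is the isolation described above.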

### Setting up a virtual environment: Conda environments

1. Make sure that `conda` is installed by running
    ```
    conda env list
    ```
    You should see a list of environments as the output. If `conda` is not installed, you can download Miniconda (a small, bootstrap version of Anaconda) from [here](https://docs.conda.io/en/latest/miniconda.html).  
2. Download [cpsc330env.yml](https://github.com/UBC-CS/cpsc330/blob/master/docs/cpsc330env.yml) (or use the file provided to you) and put it in your working directory.
3. Create an environment by running
    ```
    conda env create -f cpsc330env.yml
    ```
    which allows `conda` to download the dependencies needed for this course and put them in a virtual environment named `cpsc330`.
    You can check that the environment was installed successfully by running `conda env list` again. `cpsc330` should show up in the output.
4. Activate the environment with
    ```
    conda activate cpsc330
    ```
    After a successful activation, something like `(cpsc330)` should show up in the terminal.
5. To deactivate the environment, run
    ```
    conda deactivate
    ```    
6. We are all set! When you want to run the lecture materials or work on your homework, start Jupyter Lab from your base environment, as shown below.

```
(base) kvarada@CPSC-W-KVARADA01:~$ jupyter lab
```

Jupyter Lab will be opened in your default browser. Navigate to the appropriate notebook in Jupyter Lab. When you open the notebook, you should see our newly created `conda` environment `cpsc330` there. See the screenshots below. Select `cpsc330` as the preferred kernel. 

<img src="https://github.com/UBC-CS/cpsc330/blob/master/docs/img/conda-kernel.png?raw=true" style="height:500px" />

<img src="https://github.com/UBC-CS/cpsc330/blob/master/docs/img/conda-env.png?raw=true" style="height:500px" />
    
For more information on conda environments, see [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).
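Once a kernel is selected, a quick way to confirm that it can see the packages you expect is to look them up without importing them. This is a sketch under assumptions: the helper `missing_packages` is a made-up name, and the list `expected` is a placeholder that you would replace with the import names of packages from your own `cpsc330env.yml`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Placeholder list: replace with the import names you expect in the kernel.
expected = ["pandas"]
missing = missing_packages(expected)
print("Missing packages:", missing if missing else "none")
```

If a package is reported as missing, the notebook is probably running on a different kernel than the one where the package was installed.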

----
----

# Debugging <a class="anchor" id="debug"></a>

If the `conda env create -f cpsc330env.yml` command above results in an error on your computer: 

- Figure out what package it's failing on from the error message.
- Get rid of the line with that package from your local copy of `cpsc330env.yml`. 
- Try creating the environment again with the modified `cpsc330env.yml`. 
- Once the environment is created, activate the environment and install the missing packages manually in the environment. You may have to install these packages using `pip install` in some cases, as the most recent version of the package might not be available via `conda` for your operating system yet. 
- If you still have trouble with the environment and running lecture notebooks on your machine, make use of office hours and tutorials. 
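
Removing the failing package from the environment file can be done by hand in a text editor, or with a few lines of Python. The sketch below works on plain text and assumes each dependency sits on its own `- name` line; the function `drop_dependency`, the package name `somepkg`, and the example file contents are all made up for illustration.

```python
import re


def drop_dependency(yml_text, package):
    """Return yml_text without the list items that name `package`.

    A line is dropped when it looks like "- package", optionally followed
    by a version specifier such as "=1.2" or "==0.1".
    """
    pattern = re.compile(
        r"^\s*-\s*" + re.escape(package) + r"\s*([=<>!~\s].*)?$"
    )
    kept = [line for line in yml_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Made-up environment file contents, for illustration only.
example = """name: demo
dependencies:
  - python=3.9
  - pandas
  - somepkg=1.2
  - pip:
    - somepkg-extra==0.1
"""
print(drop_dependency(example, "somepkg"))
```

To apply this to a real file, read it with `pathlib.Path("cpsc330env.yml").read_text()`, pass the text through the function, and write the result to a new file so that the original stays untouched.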