# Assignment 0: Build Up Your Lab
## Goals
Since most students do not have the relevant background knowledge, this guide walks you through installing the experimental environment and tools required for the course.

If you encounter any difficulties, please try to solve them yourself first (check the official documentation or search Google/Baidu). You are also welcome to come to our office hours next week (next Monday is a holiday, so we will inform you of the time later).

In this course, we will use `Python` as the programming language, together with scientific computing packages such as `scikit-learn` and `PyTorch`. Tutorials are delivered mainly through `jupyter notebook`. If you already have tools you are familiar with, feel free to use them.

# Python and Jupyter Notebooks
[Python](https://www.python.org/) can be easy to pick up whether you're a first time programmer or you're experienced with other languages. 

The Jupyter Notebook is the original web application for creating and sharing computational documents. It offers a simple, streamlined, document-centric experience. 

We recommend using the Anaconda distribution to install Python and Jupyter.

# Anaconda
Anaconda offers the easiest way to perform Python data science and machine learning on a single machine. Anaconda conveniently installs Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science. 
1. Download from [Anaconda](https://www.anaconda.com/products/distribution) or [Tsinghua Mirrors](https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/). On the mirror, sort by date and select the latest version for your platform. If you want to use Anaconda on a remote server, see [this guide](https://www.lizhihao999.cn/ubuntu%E6%9C%8D%E5%8A%A1%E5%99%A8%E5%AE%89%E8%A3%85anaconda).

    <img src="./img/anaconda.png"  alt='missing' width="600">

2. Install the version of Anaconda which you downloaded, following the instructions on the download page.
3. Congratulations, you have installed everything you need including Jupyter Notebook. 

## Basic Usage of Anaconda and conda
This section introduces the basics of Anaconda and conda. We strongly suggest that you read the [User Guide](https://docs.anaconda.com/anaconda/#anaconda-distribution) on your own to find what you need.

Conda is a powerful package manager and environment manager that you use with command line commands at the Anaconda Prompt for Windows, or in a terminal window for macOS or Linux. 

### 1. Starting conda
For full use of conda, see [this](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#getting-started-with-conda).
- Windows: From the Start menu, search for and open "Anaconda Prompt."

    <img src="https://conda.io/projects/conda/en/latest/_images/anaconda-prompt.png" alt="missing" width="400">

- macOS/Linux: Open a terminal window.

You should see a `(base)` prefix in the prompt. Type `anaconda -V` and `conda -V` to check your Anaconda and conda versions.

    <img src="./img/anaconda-v.png"  alt='missing' width="600">

### 2. Managing environment
When you begin using conda, you already have a default environment named base. You should not install project-specific programs into the base environment, though. Please remember to **keep different projects' environments isolated** from each other by creating separate environments. Conda allows you to create separate environments containing files, packages, and their dependencies that will not interact with other environments.
1. Create a new environment for this course (you can choose any name for the environment) and specify the Python version ($\ge 3.9$). When conda asks if you want to proceed, type "y" and press Enter.
    ```shell
    conda create --name matinfo python=3.9
    ```
    Apart from Python itself and its dependencies, no other packages will be installed in this environment.

2. To use, or "activate" the new environment, type the following:
    ```shell
    conda activate matinfo
    ```
    The prompt prefix should change to `(matinfo)`.

    <img src="./img/conda-env.png"  alt='missing' width="600">

3. To see a list of all your environments, type:
    ```shell
    conda info --envs
    ```
    <img src="./img/env-list.png"  alt='missing' width="600">

4. Change your current environment back to the default (base): `conda activate`

5. To deactivate an environment, type `conda deactivate`. Conda removes the path of the currently active environment from your system `PATH`.

6. To remove an environment, run:
    ```
    conda env remove --name matinfo
    ```
    <img src="./img/env-rm.png"  alt='missing' width="600">
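
The environment steps above can also be double-checked from inside Python. The following is a minimal sketch, assuming the environment was activated with `conda activate` in the same terminal that starts Python (conda then sets the `CONDA_DEFAULT_ENV` variable). The helper name `describe_environment` is made up for illustration.

```python
import os
import sys

MIN_VERSION = (3, 9)  # minimum Python version required by this course


def describe_environment():
    """Return the active conda environment name and whether Python is new enough."""
    # conda sets CONDA_DEFAULT_ENV on activation; it is absent when
    # no conda environment is active in the current shell.
    env_name = os.environ.get("CONDA_DEFAULT_ENV", "<not set>")
    version_ok = sys.version_info[:2] >= MIN_VERSION
    return env_name, version_ok


env_name, version_ok = describe_environment()
print(f"Active conda environment: {env_name}")
print(f"Python {sys.version_info.major}.{sys.version_info.minor} meets >= 3.9: {version_ok}")
```

If the first line does not show `matinfo` (or the name you chose), activate the environment and start Python again.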


### 3. Managing packages
1. To see all installed packages in the current env, type `conda list`.

    <img src="./img/package-list.png"  alt='missing'  width="600">

    As you can see, there are no additional packages in the `matinfo` env yet.

2. To install a specific package like `scikit-learn` into the current environment:
    ```
    conda install scikit-learn
    ```
    Conda will resolve the package's dependencies and install them all at once.

    <img src="./img/package-install.png"  alt='missing'  width="400">

    Use `conda list` to check:

    <img src="./img/sklearn-check.png"  alt='missing' width="400">

3. To remove a package such as `scikit-learn` from the current environment:
    ```
    conda remove scikit-learn
    ```

**Now, try to install package `matplotlib` in `matinfo`.**
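
To confirm from Python that the packages were installed, something like the following sketch can help. It uses only the standard library (`importlib.metadata`, available since Python 3.8). The helper name `installed_version` and the list of package names are assumptions for illustration, and the check relies on the packages shipping standard metadata, which is usually but not always the case for conda packages.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used by conda/pip; this list is an assumption.
required = ["numpy", "scikit-learn", "matplotlib"]

for name in required:
    version = installed_version(name)
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name:15s} {status}")
```

Run it with the `matinfo` environment active; any package reported as missing can be installed with `conda install` as described above.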

# Jupyter Notebook
We will use two types of cells in this course. 

One is `Markdown Cells` where you can use markdown formatting language. 
Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML). 
Here is [basic syntax](https://www.markdownguide.org/basic-syntax/) of markdown.

The other one is `Code Cells` where you write code.

1. To install, activate your environment and use conda: 

    ```
    conda install notebook
    ```

2. Open this folder in your terminal. First, find and copy the path of the folder, for example:
    ```
    /Users/lizhihao/Library/CloudStorage/OneDrive-HKUST(Guangzhou)/2022/FUNH5010/tutorial/introduction
    ```
    use `cd` to enter this folder:
    ```
    cd "/Users/lizhihao/Library/CloudStorage/OneDrive-HKUST(Guangzhou)/2022/FUNH5010/tutorial/introduction"
    ```
    and start the notebook server from the command line:
    ```
    jupyter notebook
    ```
3. You should see the notebook open in your browser: 

    <img src="./img/notebook-open.png"  alt='missing' width="400">

4.  Open `introduction.ipynb`:

    <img src="./img/notebook-intro.png"  alt='missing' width="400">

    Scroll to this part in the `jupyter notebook`; you can then close this PDF. From now on, we will provide tutorial materials as `jupyter notebook` files.

Try to run this `code cell` (Remember to check if `matplotlib` is installed):


```python
from matplotlib import pyplot as plt
import numpy as np

# Generate 100 random data points along 3 dimensions
x, y, scale = np.random.randn(3, 100)
fig, ax = plt.subplots()

# Map each onto a scatterplot we'll create with Matplotlib
ax.scatter(x=x, y=y, c=scale, s=np.abs(scale)*500)
ax.set(title="Some random data, created with Jupyter!")
plt.show()
```

# Python Packages

In this course, we will mainly use `scikit-learn` and `PyTorch` to implement machine learning models. 

## Scikit-learn
`Scikit-learn` is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities. Its installation was covered above. We will use `scikit-learn` in the following tutorials; see the [User Guide](https://scikit-learn.org/stable/user_guide.html) for full usage.

Try to run this example:

```python
# Code source: Jaques Grobler
# License: BSD 3 clause

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Use only one feature
diabetes_X = diabetes_X[:, np.newaxis, 2]

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes_y[:-20]
diabetes_y_test = diabetes_y[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)

# The coefficients
print("Coefficients: \n", regr.coef_)
# The mean squared error
print("Mean squared error: %.2f" % mean_squared_error(diabetes_y_test, diabetes_y_pred))
# The coefficient of determination: 1 is perfect prediction
print("Coefficient of determination: %.2f" % r2_score(diabetes_y_test, diabetes_y_pred))

# Plot outputs
plt.scatter(diabetes_X_test, diabetes_y_test, color="black")
plt.plot(diabetes_X_test, diabetes_y_pred, color="blue", linewidth=3)

plt.xticks(())
plt.yticks(())

plt.show()
```
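
To make the two printed metrics less of a black box, here is a small sketch that computes them by hand in plain Python using their standard definitions. The function names `mse_by_hand` and `r2_by_hand` and the toy numbers are made up for illustration; they are not taken from the diabetes dataset, and `scikit-learn`'s own implementations handle more cases (sample weights, multiple outputs).

```python
def mse_by_hand(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_by_hand(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - ss_res / ss_tot


# Made-up toy values, only to show the arithmetic.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print(f"MSE: {mse_by_hand(y_true, y_pred):.2f}")  # prints "MSE: 0.50"
print(f"R^2: {r2_by_hand(y_true, y_pred):.2f}")   # prints "R^2: 0.90"
```

A perfect prediction gives an MSE of 0 and a coefficient of determination of 1, which is why the comment in the example above says "1 is perfect prediction".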

## PyTorch
`PyTorch` is an open source machine learning framework that accelerates the path from research prototyping to production deployment. 

To get started locally, you can follow the [official guide](https://pytorch.org/get-started/locally/). 
1. Open this [site](https://pytorch.org/), choose your `OS`, `package`(conda), `Language`(Python) and `compute platform`(Default for CPU):

    <img src="./img/pytorch-version.png"  alt='missing' width="400">

    Then copy the generated command.

2. Open your terminal (Anaconda Prompt), activate our env `matinfo`, paste the command and run it.

    <img src="./img/pytorch-install.png"  alt='missing' width="600">

3. After installation, use `conda list` to check:

    <img src="./img/pytorch-check.png"  alt='missing' width="600">

Try to run:

```python
import torch
x = torch.rand(5, 3)
print(x)
```
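
If the import fails, or you are not sure which build was installed, a slightly more defensive check may be useful. This is a minimal sketch; the helper name `describe_torch` is made up for illustration, and it reports `cpu` whenever no CUDA-capable GPU is visible to PyTorch, including with the CPU build suggested above.

```python
def describe_torch():
    """Summarize the PyTorch installation, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."
    # torch.cuda.is_available() returns False for CPU-only builds
    # and on machines without a usable GPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"PyTorch {torch.__version__}, available device: {device}"


print(describe_torch())
```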

# IDE
In this course, we recommend two IDEs: `PyCharm`, a well-known IDE made for Python, and `Visual Studio Code`, a general-purpose editor. Both are excellent tools, and you can use either or both.

## PyCharm
PyCharm is designed specifically for Python and comes in two editions: the free `community` edition and the paid `professional` edition, which has more features. The `community` edition is enough for this course and for everyday use. You can also apply for an [educational license](https://www.jetbrains.com/edu-products/) to use `professional` for free. 
Download it from [here](https://www.jetbrains.com/pycharm/) and choose the version for your platform: 

<img src="./img/pycharm-web.png"  alt='missing' width="600">

**Note:** We only introduce basic usage here. For detailed information, see the [user guide](https://www.jetbrains.com/help/pycharm/quick-start-guide.html).

1. To open this `introduction` as a project, first open `PyCharm` and choose `open`:

    <img src="./img/pycharm-open.png"  alt='missing' width="600">

2. Choose the `introduction` folder and click `open`.

3. Open the `hello world.py` file. Find and click `Interpreter Settings...` as shown, or reach it another way:

    <img src="./img/pycharm-hello.png"  alt='missing' width="600">

4. Click to add a new local interpreter:

    <img src="./img/pycharm-env.png"  alt='missing' width="600">

5. Choose the new anaconda env `matinfo` we just created and click `OK`:

    <img src="./img/pycharm-conda.png"  alt='missing' width="600">

6. `Run` the `hello world.py` (there are lots of ways to run a file, try to find out):

    <img src="./img/pycharm-run.png"  alt='missing' width="600">

7. You should see the result:

    <img src="./img/pycharm-result.png"  alt='missing' width="600">

8. You can also use `jupyter notebook` in PyCharm without opening it from the terminal:

    <img src="./img/pycharm-jupyter.png"  alt='missing' width="600">


## Visual Studio Code
Visual Studio Code is a lightweight but powerful source code editor. With its rich set of extensions, you can use VS Code for almost anything.

Download from [here](https://code.visualstudio.com/).

<img src="./img/vscode.png"  alt='missing' width="600">

For this course, we recommend installing the `Python`, `Jupyter` and `Jupyter Keymap` extensions. You can browse the extension marketplace to try other extensions.

<img src="./img/extensions.png"  alt='missing' width="400">

**Note:** We only introduce basic usage here. For detailed information, see the [Docs](https://code.visualstudio.com/docs).

1. To open this project, open `introduction` folder.

2. Open `introduction.ipynb` file to see this jupyter notebook:

    <img src="./img/vscode-jupy.png"  alt='missing' width="600">

3. Try to change the kernel to our `matinfo`:

    <img src="./img/vscode-inter.png"  alt='missing' width="800">

4. Open `hello world.py` file and change the interpreter to `matinfo`:

    <img src="./img/vscode-env.png"  alt='missing' width="600">

5. Run it:

    <img src="./img/vscode-run.png"  alt='missing' width="600">

6. With the `Jupyter` extension, you can easily use `jupyter notebook` in VS Code without opening it from the terminal:

    <img src="./img/vscode-jupyter.png"  alt='missing' width="600">

    Remember to change the kernel to `matinfo`:

    <img src="./img/vscode-kernel.png"  alt='missing' width="600">
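
After switching the kernel or interpreter in an IDE, it can be worth verifying which Python is actually running. The sketch below is one way to do that from a code cell. It assumes a named conda environment, whose install prefix normally ends in `envs/<name>`; the helper name `kernel_uses_env` is made up for illustration. It inspects `sys.prefix` rather than environment variables, since an IDE-launched kernel may not inherit the variables set by `conda activate`.

```python
import sys
from pathlib import Path


def kernel_uses_env(env_name):
    """Return True if the running interpreter lives in a folder named env_name."""
    # For a named conda environment, sys.prefix usually ends with envs/<env_name>.
    return Path(sys.prefix).name == env_name


print("Interpreter:", sys.executable)
print("Uses matinfo:", kernel_uses_env("matinfo"))
```

If the second line prints `False`, select the `matinfo` kernel or interpreter again as shown in the screenshots above.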
 

# Acknowledgment
This notebook was written mainly on the basis of the official documentation.