# <u> For all assignments regarding the course - **Please read** </u>

1) **Code:** Python is an object-oriented, interpreted scripting language, and all modern neural stuff is written almost exclusively in Python, so we'll use this language for all coding assignments in this course as well. We will be working with the latest Python and PyTorch versions. That said, there is no issue if you want to work with older Python/PyTorch versions. What is unworkable for me, however, is having to create a new environment in order to grade each one of your deliverables. Hence, I require that your submitted code runs <u>as delivered</u> on Google Colab, to be fair and to save everybody time.

2) **Report:** Regarding your reports, one word: **LaTeX**. When I'm solo, I personally use TeXstudio (https://www.texstudio.org) for editing (with some TeX distribution like https://miktex.org), because it is native, fast, easy, and works on all OSs; otherwise, I use Overleaf (https://www.overleaf.com). Of course, you are welcome to use anything you want, as long as your report is a .pdf produced by LaTeX, or, at least, is a very presentable one. <u>Do not</u> submit your report in, e.g., some Microsoft Word format, unless you are either feeling lucky, or are some wizard that makes it look as presentable as LaTeX.

3) **Deliverables:** Provide a <u>single</u> compressed file (.zip, .rar, etc.) that includes your report, and each programming exercise in a <u>separate</u> Python file (.ipynb or .py). Once ready, attach the compressed file to an e-mail and send it to the TA. You can use https://wetransfer.com in case you encounter any issues, e.g., large size. Do not forget to include your full name and registry number in both your report and the name of the submitted file. Submission instructions will likely be given with each assignment, but, even in case they are not, you can follow the above.

These are the three main things we require from you. If there is any problem regarding the above, contact the TA during our Friday tutorials (10:00-12:00, room H.206), or via e-mail at: mrap@csd.uoc.gr
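For the deliverable in item 3, the archive can also be assembled with a few lines of Python. This is only a minimal sketch: the function name `make_submission`, the `<name>_<registry number>.zip` naming scheme, and the file names in the usage note are all assumptions made for illustration, so follow the assignment's own instructions if they differ.

```python
import zipfile
from pathlib import Path


def make_submission(name, registry_number, files, out_dir="."):
    """Pack the given files into a single .zip named after the student."""
    paths = [Path(f) for f in files]

    # Fail early, before creating the archive, if a deliverable is missing.
    for path in paths:
        if not path.is_file():
            raise FileNotFoundError(f"Missing deliverable: {path}")

    # Hypothetical naming scheme: <name>_<registry number>.zip
    archive = Path(out_dir) / f"{name.replace(' ', '_')}_{registry_number}.zip"

    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store each file at the top level of the archive.
            zf.write(path, arcname=path.name)

    return archive
```

For example, `make_submission("Jane Doe", "1234", ["report.pdf", "exercise1.py", "exercise2.ipynb"])` would write `Jane_Doe_1234.zip` in the current directory, provided those three files exist.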

## <u>Setting a Python environment with Anaconda</u>

In order to run Python scripts, we need to set up a Python environment. There are many ways to do it, and you are welcome to choose whatever suits you best. I have always used Anaconda and/or pip for this task, so I'll explain briefly how it works below.

<b>Anaconda</b> (or just Conda) is an option for creating and managing virtual environments, with either its GUI, terminal, or both. Conda allows you to manage multiple Python installations at the same time, and in each of them you can have different libraries/packages. For instance, you may want to have a separate Conda environment for each assignment depending on its specific needs, although it probably won't be necessary for this course, except, perhaps, your project.

Guides for installing Anaconda:

- Windows: https://docs.anaconda.com/anaconda/install/windows/
- Linux: &nbsp; &nbsp; &nbsp; &nbsp;https://docs.anaconda.com/anaconda/install/linux/
- macOS: &nbsp; &thinsp;&thinsp;https://docs.anaconda.com/anaconda/install/mac-os/

Don't worry if the guide mentions, like, "Python 3.7". If you follow the instructions you should be able to get the latest Python installation anyway.

### <u>Set up a Conda environment from zero</u>

In order to create an environment with all the necessary packages, you can run the following commands in order (at the *anaconda powershell prompt* in case of a Windows OS):

1) Create a new environment named "hy673" with Python 3.11 (there are some issues with Python 3.12 and Jupyter):

`conda create --name hy673 python=3.11`

2) Switch to your environment with:

`conda activate hy673`

You have to use this command every time after you start Anaconda. Otherwise, the default environment will be "base". You can work on "base", if you want, but, I would advise against it. Just create a new one for this course.

3) Install some useful packages we'll need:

`conda install numpy pandas matplotlib scikit-learn tqdm`

You can also specify the version for each package, or install packages with pip instead of conda, although, I doubt you will need it for the assignments.

4) Install the most important package for our machine learning needs: PyTorch. If you just want a regular installation which runs <u> only on CPU </u>, you can run:

`conda install pytorch torchvision torchaudio cpuonly -c pytorch`

Otherwise, you can generate the command for a custom installation, at, say, the following link:
<br> https://pytorch.org/get-started/locally/ <br>
If you can use a GPU, I would definitely recommend going for it instead of a plain CPU installation. For example (using the link above): <br>
`conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia`
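Once the steps above are done, you can check from inside Python which environment the interpreter is actually running in. The snippet below is a minimal sketch: it relies on the `CONDA_DEFAULT_ENV` environment variable, which `conda activate` sets, and the helper name `active_conda_env` is just something made up for this example.

```python
import os
import sys


def active_conda_env():
    """Return the name of the active Conda environment, or None."""
    # `conda activate` sets CONDA_DEFAULT_ENV to the environment's name.
    return os.environ.get("CONDA_DEFAULT_ENV")


env = active_conda_env()
print(f"Interpreter prefix: {sys.prefix}")

if env is None:
    print("No Conda environment appears to be active.")
elif env == "base":
    print('You are on "base"; consider running: conda activate hy673')
else:
    print(f"Active Conda environment: {env}")
```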

## <u>Set up a Conda environment with one command</u>

You can also create an environment from a file that lists all the packages/libraries you need. The command depends on the file type. For a .yml file:

`conda env create --name hy673 --file environment.yml`

For a .txt file produced by `conda list -e`:

`conda create --name hy673 --file environment.txt`

In both cases, the name of the environment will be `hy673`, and the requirements are specified in the given file. There are many ways to get such a file. For the .txt variant:

`conda list -e > environment.txt`

Alternatively, when exporting your environment, you can use the option `--from-history`. It will export just the libs you explicitly installed, and not the dependencies:

`conda env export --from-history > environment.yml`

No platform specific info will be exported this way, thus, it may prevent some headaches.

## <u>Jupyter and Python IDEs</u>

In this tutorial, and likely also some following ones, we will be using Jupyter notebooks to run Python code.
Jupyter notebooks have multiple advantages, like:
- They allow you to divide and run code in separate cells.
- Cells can either contain Python code or Markdown, allowing us to combine code, text, and the ability to write math like:
\begin{equation*}
e^{j\pi} + 1 = 0.
\end{equation*}
(Yes, I use $j$ for the imaginary unit.)
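The identity above also makes a nice first code cell, since Python happens to write the imaginary unit as `j` too. A minimal sketch using only the standard library (the variable name `residual` is just for illustration):

```python
import cmath

# Python's literal for the imaginary unit is also j.
residual = cmath.exp(1j * cmath.pi) + 1

# Floating-point rounding leaves a tiny nonzero imaginary part,
# so compare against a tolerance instead of testing for exact zero.
print(abs(residual) < 1e-12)
```

This prints `True`; printing `residual` itself shows a value on the order of 1e-16 rather than an exact zero.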

<b>Installation:</b> One way to install Jupyter is by running on your `hy673` conda environment:

`conda install jupyter`

Another option is to manually click it on Conda's GUI, or use pip, etc. Anyhow, once you install that, to start a session in Jupyter, you can either do it from the Conda GUI, or run the command:

`jupyter-notebook`

This should open automatically in your browser. If that does not happen, you can check the output on the terminal, where you should see a line like:

`Jupyter Notebook 6.4.12 is running at: http://localhost:8889/?token=****************************`

So, just copy the link into your browser.

If you are already using an IDE like PyCharm or VSCode for your Python code, you can use the Jupyter extension to open the notebooks instead of a browser:
- PyCharm (only Professional edition): <br>
https://www.jetbrains.com/help/pycharm/jupyter-notebook-support.html
- VSCode: <br>
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter

## <u>Verifying your installation</u>

Now that you have set up Anaconda, the `hy673` environment, and Jupyter, you can check that everything is working fine.
You may start by opening this Jupyter notebook and running the cell below.
If it doesn't raise any error, then you are on the right track: it means that you have installed the most important packages for this course.

```python
import os
import sys
import tqdm # This is not important; we use it to display loop progress.
import numpy as np
import pandas as pd
import sklearn
import matplotlib
import torch as tc
```

Now, you can try to run the cell below, which verifies that all the package versions you have meet a minimum requirement for the course. Even if they do not, it shouldn't be an issue as long as they are not very old.

```python
# Minimum suggested versions for the course.
require_dict = {
    'python': '3.10.9',
    'numpy': '1.23.5',
    'pandas': '1.5.2',
    'matplotlib': '3.6.2',
    'scikit-learn': '1.2.0',
    'pytorch': '1.13.1',
}

# Versions found in the current environment.
version_dict = {
    'python': sys.version.split()[0],  # Keep only the version number, e.g., "3.11.7".
    'numpy': np.__version__,
    'pandas': pd.__version__,
    'matplotlib': matplotlib.__version__,
    'scikit-learn': sklearn.__version__,
    'pytorch': tc.__version__,
}

bar = 64*'-'
all_passed = True

for key in require_dict:

    cap_key = key.capitalize()

    # Compare only the (major, minor) parts, as tuples of integers.
    ver = tuple(int(x) for x in version_dict[key].split('.')[0:2])
    req = tuple(int(x) for x in require_dict[key].split('.')[0:2])

    if ver >= req:
        print(f"{cap_key} version: {version_dict[key]} \tOK")
    else:
        all_passed = False
        print(f"Warning:   Old {cap_key} version.\nSuggested: {cap_key} >= {require_dict[key]}\nCurrent:   {cap_key} == {version_dict[key]}")

    print(bar)

if all_passed:
    print("All packages meet the minimum requirements.")
```

```
Python version: 3.11.7 	OK
----------------------------------------------------------------
Numpy version: 1.26.3 	OK
----------------------------------------------------------------
Pandas version: 2.1.4 	OK
----------------------------------------------------------------
Matplotlib version: 3.8.0 	OK
----------------------------------------------------------------
Scikit-learn version: 1.2.2 	OK
----------------------------------------------------------------
Pytorch version: 2.2.0 	OK
----------------------------------------------------------------
All packages meet the minimum requirements.
```
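If you want to run a similar check without importing the heavy packages first, the standard library can query installed versions directly. The following is a rough sketch, not part of the course material: the helper names `major_minor` and `meets_minimum` are made up here, it assumes versions start with plain numeric `major.minor` parts, and it assumes the package metadata is registered under the pip distribution name (PyTorch is published as `torch`).

```python
from importlib.metadata import PackageNotFoundError, version


def major_minor(version_string):
    """Turn a string such as '2.2.0+cu121' into the tuple (2, 2)."""
    major, minor = version_string.split(".")[0:2]
    return int(major), int(minor)


def meets_minimum(dist_name, minimum):
    """Return True/False, or None if the distribution is not installed."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        return None
    return major_minor(installed) >= major_minor(minimum)


# Distribution names as used by pip; note that PyTorch is published as "torch".
for dist_name, minimum in {"numpy": "1.23.5", "torch": "1.13.1"}.items():
    print(dist_name, meets_minimum(dist_name, minimum))
```

Comparing tuples of integers rather than raw strings matters here: as strings, `'1.9'` would sort after `'1.13'`, whereas `(1, 9) < (1, 13)` gives the intended order.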


To verify/check if your installation supports GPU:

```python
print(tc.cuda.is_available())
```

```
True
```
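In practice, the result of this check is usually used to pick the device your tensors and models live on, so that the same code runs with or without a GPU. A minimal sketch of that pattern, using the same `tc` alias as above (the tensor names and shapes are arbitrary):

```python
import torch as tc

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = tc.device("cuda" if tc.cuda.is_available() else "cpu")

# Create two small random matrices directly on the chosen device.
a = tc.randn(3, 4, device=device)
b = tc.randn(4, 2, device=device)

# The product lives on the same device as its inputs.
c = a @ b
print(c.shape, c.device)
```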


## <u>Conda cheatsheet</u>

Anaconda allows you to do a lot of things with your environments, such as cloning them, deleting them, restoring them to a previous state, etc. You will find a longer list of possible Conda commands at the following URL: https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf

Here are some notable ones:
- Create an environment:

`conda create --name [env-name]`

- List all the environments:

`conda env list`

- Activate an environment:

`conda activate [env-name]`

- Remove an environment:

`conda env remove --name [env-name]`

- Restore the environment to a previous state (revision):

`conda install --revision [revision-id]`

About this `revision-id`: Every time you make some changes to your environment, these changes will be logged as a "revision", which will be identified by some number that Conda stores. To see the list of all revisions, you can use:

`conda list --revisions`

For example, suppose you accidentally installed the Plotly package instead of Matplotlib. You can run `conda list --revisions` to display all the revisions and find the one in which Plotly was installed. Once you realize Plotly was installed at, say, revision 5, you can just run `conda install --revision 4` to restore the environment to this previous state, before the goof was made.


# <u>Google Colab</u>

You can also ignore almost everything we've said thus far, and just use Google Colab for everything, without the need to have local Python environments, IDEs, etc. It runs on Google servers and comes with most of the packages we need already installed, so you can simply import them. If you want to use a GPU, check the **Runtime** tab at the top. As the assignments get longer in lines of code, however, you may find it impractical to code there, versus some IDE. Nonetheless, you can also just use it to train your models once you've coded them, instead of, say, melting your own PC.
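If you develop locally but train on Colab, it can be handy for a notebook to know where it is running, e.g., to choose different data paths. The sketch below uses a common community idiom rather than an official API: it assumes Colab kernels have the `google.colab` module loaded at startup, and the function name `running_in_colab` is made up for this example.

```python
import sys


def running_in_colab():
    """Best-effort check: assumes Colab kernels preload the google.colab module."""
    return "google.colab" in sys.modules


if running_in_colab():
    print("Running on Google Colab.")
else:
    print("Running locally (or in another hosted notebook).")
```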

Needless to say, if you're one of those people who want to use a text editor like Vim or Emacs for absolutely everything as if you're writing C or something, go ahead—I won't stop you. The three main requirements lie at the top of this page.