# <center>Creation of a desktop environment for the _Advanced Algorithms_ course</center>

<blockquote>
    <h1><center>Tutor guide</center></h1>
    <table>
        <tr>
            <th style="text-align:center;">Version</th>
            <th style="text-align:center;">Date</th>
            <th style="text-align:left;">Designer</th>
            <th style="text-align:left;">Proofreader</th>
            <th style="text-align:left;">Comments</th>
        </tr><tr>
            <td style="text-align:center;">3.0</td>
            <td style="text-align:center;">15/12/2022</td>
            <td style="text-align:left;"><a href="mailto:bcohen@cesi.fr">Benjamin COHEN BOULAKIA</a></td>
            <td style="text-align:left;"></td>
            <td style="text-align:left;">Adaptation for students from the Integrated Undergraduate Programme (CPI).<br/>
 Addition of how to use <code>virtualenv</code> with the Notebook files.<br/>
 Addition of how to use the debugger (including in Notebook files).<br/>
 Reference to the <em>code optimization</em> exercise series.<br/>
 Addition of explanations.</td>
        </tr><tr>
            <td style="text-align:center;">2.0</td>
            <td style="text-align:center;">26/04/2019</td>
            <td style="text-align:left;"><a href="mailto:bcohen@cesi.fr">Benjamin COHEN BOULAKIA</a></td>
            <td style="text-align:left;"></td>
            <td style="text-align:left;">Adaptation for Y3 Data processing course</td>
        </tr><tr>
            <td style="text-align:center;">1.0</td>
            <td style="text-align:center;">13/11/2018</td>
            <td style="text-align:left;">Pierre HALFTERMEYER</td>
            <td style="text-align:left;"></td>
            <td style="text-align:left;">Original version for optional Y5 Data Science subject</td>
        </tr>
    </table>
</blockquote>

Most of you have already tried Python, but in this course we are going to exploit its capabilities in a much more advanced way. So you’ll have to prepare your desktop environment for this.

To begin with, all Python-based Workshops are built on Jupyter Notebook files. You must therefore deploy a local Jupyter environment on your workstation if you have not done so yet.

Besides, this entire course uses Python 3 as its programming language. Using Python 2 is strongly discouraged, even though many resources written for that version are still around. Many implementation inconsistencies and flaws were fixed in version 3, and Python 2.7, the final release of Python 2, has not been supported since 1 January 2020. Many scientific libraries (including scikit-learn, a widely used machine learning library) no longer release versions for Python 2.

Finally, to avoid conflicts between libraries, it is recommended to use a so-called _isolated_ or _virtual_ environment. Although all-in-one distributions such as Anaconda make deploying such an environment much easier, it is still recommended to learn how to do it manually with a tool like `virtualenv`, especially since Anaconda is geared towards development and is not designed for production deployment.

### Package Manager

This course requires a number of additional libraries. Python offers a utility that makes installing them easy: the `pip` command. Make sure that `python3` (version 3.9.7 or above) and `pip3` (version 1.4 or above) are installed on your machine:

```console
$ python3 --version
$ pip3 --version
```

This is important because the course uses some fairly recent Python features, which require an up-to-date Python version. You can update `pip3` if necessary:

```console
$ pip3 install --upgrade pip
```
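The same version check can also be made from Python itself. The sketch below is only an illustration: the helper name `python_is_recent_enough` is hypothetical, and the minimum version is simply the 3.9.7 stated above.

```python
import sys

# Minimum Python version required by the course (taken from the text above).
MIN_PYTHON = (3, 9, 7)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


# Show the running interpreter's version and whether it is recent enough.
print(sys.version.split()[0], python_is_recent_enough())
```

Tuples compare element by element, which is why `(3, 10, 0)` is correctly considered newer than `(3, 9, 7)`.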

Some of you might be tempted to use Anaconda’s package manager, but it is not recommended here. Firstly, as mentioned above, Anaconda is not suitable for a production environment, and it is important that you master the tools used in such environments. Secondly, mixing `pip` and `conda` (the manager in question) installations is not recommended. And finally, the packages managed by Anaconda are not always up to date. The `conda` command is especially useful in projects that mix several languages, which is not the case here. Furthermore, you will probably have to run Python scripts outside the Jupyter environment, for example as part of your project (if you launch them on a remote machine).

### Creating the virtual environment
Installing a Python product often pulls in many dependencies. By default, all packages and dependencies are downloaded and installed globally, and managing the resulting ‘pollution’ of the system-wide Python installation can be cumbersome: dependencies, package version conflicts, binary files... Moreover, some packages may be needed by a single user or product only.

To avoid this, Python lets you create and use _virtual environments_, i.e. isolated micro-containers (a bit like Docker, but without the whole OS layer) in which you can develop a particular project with its own dependencies, package versions, etc.


The tool used here is called `virtualenv`; start by installing it:

```console
$ pip3 install --user --upgrade virtualenv
```

Then create your virtual environment:

```console
$ virtualenv algo_env
```

`algo_env` is the name of your environment, and also the name of the directory in which all of its data is stored. We will now activate it; the command depends on your OS.

* On Linux/macOS:
```console
$ source algo_env/bin/activate
```
* On Windows:
```console
> algo_env\Scripts\activate.bat
```
* And for the more radical, who use PowerShell on Windows:
```console
> algo_env\Scripts\Activate.ps1
```

Now that the environment is activated, all changes to the Python installation (installing, updating, environment variables...) only apply to this `algo_env` environment. You can switch environments, test multiple packages or versions, delete an environment, etc. The recommended practice is to have one virtual environment per project (and possibly a temporary one if you want to run some unusual tests). During your studies, one environment for each course that uses Python should do the job.
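To confirm from Python that an environment really is active, one option is to compare the interpreter's prefixes. This is a minimal sketch, and the function name `in_virtual_environment` is hypothetical; it relies on `sys.base_prefix`, and on the `sys.real_prefix` attribute that older `virtualenv` releases set instead.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise, sys.prefix differs from sys.base_prefix inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("Virtual environment active:", in_virtual_environment())
print("Environment location:", sys.prefix)
```

Run with `algo_env` activated, `sys.prefix` should point to the `algo_env` directory.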

Again, Anaconda offers an alternative for managing virtual environments. It is an interesting option because it integrates more easily with Notebooks (a point discussed further below). But if you run your code directly from the OS, it is not necessary.

### Installing packages

As mentioned, packages installed via `pip` while `algo_env` is active are only installed in `algo_env`, and by default `python` only sees the packages from `algo_env` (the environment is sealed in both directions).

We can already install (or update) some modules and their dependencies:

```console
$ pip3 install --upgrade jupyter matplotlib numpy pulp
```

To check that the modules were installed correctly (the command prints nothing if every import succeeds):

```console
$ python3 -c "import jupyter, matplotlib, numpy, pulp"
```
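If you also want to know which versions were installed, the standard library's `importlib.metadata` module can report them. The sketch below is one possible way to do it; the helper name `installed_versions` is hypothetical, and the package list is simply the one installed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            result[name] = None
    return result


# Report on the packages installed in the previous step.
for name, found in installed_versions(["jupyter", "matplotlib", "numpy", "pulp"]).items():
    print(f"{name}: {found if found is not None else 'not installed'}")
```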

### Jupyter

Now it’s time to launch Jupyter so you can create or open your first _Notebook_.
```console
$ jupyter notebook
```
This document is itself a _Notebook_. Here is the proof:

```python
print("Hello world!")
```

```console
Hello world!
```


You can now open the Jupyter version of this file (`.ipynb` extension) and run the cell above.

### Python and Jupyter
You are now ready. You can either test your code in a dedicated Notebook or edit and run your script directly from the OS.

Let’s quickly return to the use of virtual environments with Jupyter. Virtual environments can be used from Notebooks, where each one appears as a _kernel_. To do this, we only need to make the environment accessible from a Notebook:
```console
$ ipython kernel install --user --name=algo_env
```

Your environment will then appear in the list of kernels available in the Notebook (a kernel is what runs the code in your Notebook), and will be accessible from the _kernel_ menu of your Notebook. If it does not show up, restart Jupyter.
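Once the kernel is selected, a quick way to check that the Notebook really runs inside your environment is to look at the path of the interpreter. This is only a sketch: the helper name `interpreter_in_env` is hypothetical, and it assumes the environment directory is named `algo_env` as above.

```python
import sys
from pathlib import Path


def interpreter_in_env(env_name, executable=sys.executable):
    """Return True if the interpreter path goes through a directory named env_name."""
    # The path is deliberately not resolved: inside an environment, the
    # interpreter is often a symbolic link to the system-wide Python.
    return env_name in Path(executable).parts


print("Interpreter:", sys.executable)
print("Running from algo_env:", interpreter_in_env("algo_env"))
```

Run in a Notebook cell, this should print `True` on the second line when the `algo_env` kernel is in use.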

You can now get started with [Jupyter’s basic functions](https://www.math.ubc.ca/~pwalls/math-python/jupyter/notebook/), find out how to format text with [Markdown](https://www.math.ubc.ca/~pwalls/math-python/jupyter/markdown/), and use the [equation format $\mathrm{\LaTeX}$](https://www.math.ubc.ca/~pwalls/math-python/jupyter/latex/).<br/>

This last point will be very useful in this course: it lets you format equations by placing them between \\$ signs (doubled for a displayed equation). For example, the code:

```latex
$$\sum_{i=1}^n n$$
```
will produce the mathematical expression:

$$\sum_{i=1}^n n$$

This feature is available in Markdown cells, but also in Python code, for example to format labels on a chart. You will need it yourself in this course to formally state the algorithmic problems you will work on.
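To give an idea of what such a formal statement can look like, here is a sketch of a classic problem, the 0/1 knapsack, written for a Markdown cell. The choice of problem and the symbols (values `v_i`, weights `w_i`, capacity `W`, decision variables `x_i`) are assumptions made for this illustration and are not taken from the course material:

```latex
$$\max \sum_{i=1}^{n} v_i x_i
\quad \text{subject to} \quad
\sum_{i=1}^{n} w_i x_i \le W,
\qquad x_i \in \{0, 1\}$$
```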

### The Python debugger, and its use in Jupyter
If you’ve ever come across bugs in your code (and most likely you have...), you’ve probably placed a lot of `print()` calls all over your functions to try to figure out what’s wrong. However, there are much more practical tools called _debuggers_, which allow you to run code step by step and to inspect (or even modify) the state of your variables while it runs.

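In Python, the standard debugger is `pdb`. This course uses `ipdb`, a third-party package that offers the same commands with IPython's conveniences (such as tab completion and syntax highlighting). Install it if necessary: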
```console
$ pip3 install --upgrade ipdb
```

Then all you have to do is import the package in your program to use it. Be careful, though: `ipdb` does not work reliably in a Notebook. There, you must use IPython's own debugger (which is used just like `ipdb`) and load it at the beginning of the cell:
```python
import IPython.core.debugger
ipdb = IPython.core.debugger.Pdb()
```

Once the debugger is available (either the basic `ipdb` or its Notebook version), it is very easy to use. Simply load it and place the `ipdb.set_trace()` statement in your code: the Python interpreter stops when it reaches this statement and waits for you to tell it what to do. Just type the command you want. Any Python statement will do, so you can display a variable or even modify it. The debugger also has its own internal commands. The main ones are:
* `n`: executes the current line and moves to the next one
* `q`: quits the debugger and aborts the program
* `c`: continues running the program until the next breakpoint (or until the end)
* `s`: steps into the function called on the current line

(By the way, avoid naming your variables `s`, `n` or similar if you plan to use the debugger: typing such a name at the prompt runs the debugger command instead of displaying the variable.)

Test it on the following code, which iterates 5 times. When you run the cell, execution stops the first time the `ipdb.set_trace()` statement is reached. Display the contents of `x` with the `print(x)` command, then do the same for `y`. Change the value of `y`, and check that the change has been taken into account. Then resume execution with the `c` command and start again at the next iteration. Once you’ve had enough, enter the command `q`.

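```python
import IPython.core.debugger

# Notebook-compatible debugger, used just like ipdb
ipdb = IPython.core.debugger.Pdb()

for x in range(5):
    y = x * 2
    # Execution pauses here at every iteration
    ipdb.set_trace()
```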

And since the breakpoint is an ordinary Python statement, nothing stops you from making it conditional, for example:
```python
if y == 6:
    ipdb.set_trace()
```
so that you can inspect the value of the other variables at that precise moment.

This tool is essential if you want to debug your code quickly and efficiently, so it is best to get used to it right away!
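The same idea carries over to a plain script run from the OS. The sketch below uses the standard library's `pdb` (which `ipdb` mirrors) and guards the conditional breakpoint with a flag, so that the script runs unattended by default. The `DEBUG` flag and the function `doubled_values` are hypothetical names introduced for this example.

```python
import pdb

# Set to True to actually stop in the debugger (assumption: a simple module-level flag).
DEBUG = False


def doubled_values(n, stop_at=None):
    """Return [0, 2, ..., 2*(n-1)], pausing when y equals stop_at if DEBUG is on."""
    result = []
    for x in range(n):
        y = x * 2
        if DEBUG and y == stop_at:
            # Conditional breakpoint: only reached for the value of interest.
            pdb.set_trace()
        result.append(y)
    return result


print(doubled_values(5, stop_at=6))  # prints [0, 2, 4, 6, 8]
```

Since Python 3.7, the built-in `breakpoint()` function can replace the explicit `pdb.set_trace()` call; by default it starts the same debugger.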

### The software suite
We have explored some of the possibilities that Jupyter offers, but you can go much further with this tool. In particular, [extensions](https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html) add interesting features, such as a built-in spellchecker. You never know, it could be useful if (for instance) you had to write a project report in the form of a Notebook and the assessment grid included a mark for writing quality...

You can now start the second exercise series of this introduction sequence. This series will let you discover some of Python’s advanced features, which make it a particularly expressive, concise, and readable programming language. On the other hand, although the [object-oriented design](https://docs.python.org/3/tutorial/classes.html) part of the language is always worth knowing, it is used very little in this course, so we will not cover it here.