## Module 4: Python


# More on Data
## REPRODUCIBILITY Tools
<br>

Asel Kushkeyeva<br>
Data Science Institute, University of Toronto<br>
2022

### Jupyter Notebook as a Slideshow

To see this notebook as a live slideshow, we need to install RISE (Reveal.js - Jupyter/IPython Slideshow Extension):

1. Insert a cell and execute the following code: `conda install -c conda-forge rise`
2. Restart the Jupyter Notebook.
3. At the top of your notebook there is now a new icon that looks like a bar chart; hover over the icon to see 'Enter/Exit RISE Slideshow'.
4. Click on the RISE icon and enjoy the slideshow.
5. You can edit the notebook in slideshow mode by double-clicking a cell.

*This is done only once. Now all your notebooks will have the RISE extension (unless you reinstall Jupyter Notebook).*

# Agenda

1. Jupyter Notebook. Markdown
2. Google Colab
3. Spyder
4. Visual Studio Code
5. Virtual Environment
6. requirements.txt

# Jupyter Notebook

You were introduced to *Anaconda's* __Jupyter Notebook__ at the beginning of this module. We typed and executed code in our notebooks. Let's talk about some of its other features, such as:
- using markup language;
- creating slides.

### Markdown

*Markdown* is a lightweight markup language that lets us format the text in a Jupyter Notebook to make it easier to read. For example, we can add:

# headers,
__bolded__ and *italicized* font,
- bulleted lists, and
[links](https://jupyter.org/documentation).

Go to __Cell -> Cell Type -> Markdown__. Now we can type in this cell as we do in a word document, run the cell and get some text rendered. Try in your notebooks!

Table 1 shows some of the essential Markdown syntax. These elements will help you create human-readable documents.

__Table 1. Markdown syntax__
<br>

|Symbol|Output|Description|
|------|------|------|
|#     |Header 1     |One hash symbol at the beginning of a line    |
|##    |Header 2     |Two hash symbols at the beginning of a line      |
|###   |Header 3     |Three hash symbols at the beginning of a line      |
|####  |Header 4     |Four hash symbols at the beginning of a line      |
|##### |Header 5     |Five hash symbols at the beginning of a line      |
|__ __ |Bold font    |Two underscores at the beginning and end of the text     |
|* *   |Italic font  |One asterisk at the beginning and end of the text     |
|-     |Bullet points|One hyphen at the beginning of a line   |
|\[click here\](url)  |Embedded links|Text between square brackets, followed by the full URL in parentheses|
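
Because Markdown is plain text, we can also generate it from Python. The sketch below builds a small table in the same pipe syntax as Table 1. The function name `markdown_table` and the sample rows are made up for this illustration.

```python
def markdown_table(headers, rows):
    """Return a Markdown table (pipe syntax) as a single string."""
    lines = [
        "|" + "|".join(headers) + "|",
        "|" + "|".join("---" for _ in headers) + "|",  # separator row
    ]
    for row in rows:
        lines.append("|" + "|".join(str(cell) for cell in row) + "|")
    return "\n".join(lines)


table = markdown_table(
    ["Symbol", "Output"],
    [["#", "Header 1"], ["-", "Bullet points"]],
)
print(table)
```

Paste the printed text into a Markdown cell and run the cell to see it rendered as a table.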

## PRACTICE IN YOUR NOTEBOOK

Explore other Markdown formatting syntax. [The Ultimate Markdown Guide (for Jupyter Notebook)](https://medium.com/analytics-vidhya/the-ultimate-markdown-guide-for-jupyter-notebook-d5e5abf728fd) on medium.com will get you started.

### Creating slides in Jupyter Notebook

To create slides:
- Go to *View -> Cell Toolbar -> Slideshow*.
- Now each cell has a *Slide Type* menu on the right.
- *Slide Type* options:
    - *Slide* turns a cell into a slide;
    - *Sub-Slide* adds a slide below the previous slide;
    - *Fragment* adds the cell as a fragment of the previous slide;
    - *Skip* hides the cell from the slide deck;
    - *Notes* adds speaker notes.

Voila! We have a working presentation.
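
Behind the scenes, the *Slide Type* we pick is saved in each cell's metadata inside the `.ipynb` file, which is a JSON document. The following is a minimal sketch of that structure, not a complete notebook: the helper `make_cell` and the cell texts are invented for the example.

```python
import json


def make_cell(source, slide_type):
    """Build a Markdown cell dict tagged with a slide type."""
    return {
        "cell_type": "markdown",
        "metadata": {"slideshow": {"slide_type": slide_type}},
        "source": source,
    }


# One cell per Slide Type option from the menu.
notebook = {
    "cells": [
        make_cell("# Agenda", "slide"),
        make_cell("More detail", "subslide"),
        make_cell("- one point at a time", "fragment"),
        make_cell("Draft text", "skip"),
        make_cell("Remember to demo RISE", "notes"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

print(json.dumps(notebook["cells"][0], indent=2))
```

Opening one of your own notebooks in a text editor should show the same `slideshow` entries for the cells you tagged.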

To run the presentation, we need to install __RISE__. Please do so now if you did not install it during the first class.

### RISE (Reveal.js - Jupyter/IPython Slideshow Extension) installation:
1. In a new cell, execute the following code: `conda install -c conda-forge rise`
2. Restart Jupyter.
3. At the top of your notebook there is now a __RISE__ icon that looks like a bar chart.
4. Click on the RISE icon and enjoy the slideshow.
5. You can edit the notebook in slideshow mode by double-clicking a cell.

*This is done only once, and all your notebooks will have the RISE extension (unless you reinstall Jupyter).*

# Google Colab

*Google Colab* is very similar to Jupyter Notebook. It's a Google product, so we can share a notebook like we would any other Google Suite document.

Go to http://colab.research.google.com.

Many example notebooks are presented there.

Click on Examples and scroll to *Altair Chart Snippets*.

In the *Altair Chart Snippets* notebook, we see several visualization code examples.

Click on the Run cell button at the beginning of a code cell to enjoy the beautiful visuals.

Similarly, we can create a new Colab notebook, or upload our own Jupyter Notebook to Google Drive and open it with Colab to use its features.

Notebooks exploring more advanced topics of machine learning are on Google's [AiHUB](https://aihub.cloud.google.com/u/0/).

## PRACTICE IN YOUR Google Colab

Access any of the Google Colab's examples and try to use the example code in your own Notebook, changing the parameters.

# Spyder

Another Python IDE that comes under the *Anaconda* umbrella is __Spyder__. If you are familiar with __RStudio__, you will find Spyder very similar to it.

Launch Spyder and let's explore.

At the top, the *Toolbar* gives quick access to opening and running scripts and adjusting panes.

At the bottom, the *Status Bar* shows the Python version and the line and column our cursor is at.

The large pane on the left, the *Editor*, is for writing code.

At the bottom right, the *Console* evaluates the code written in the *Editor*.

At the top right, the *Variable Explorer* shows defined variables with their type, size, and value.


Let's look at one of the functions we defined earlier in the module, adding *if, else* statements.

```python
def boiling_temp(x: float) -> None:
    """Print 'Boiling!' if x is greater than or equal to 100 degrees Celsius.

    Otherwise print 'Not boiling yet.'

    >>> boiling_temp(150)
    Boiling!
    >>> boiling_temp(30)
    Not boiling yet.
    """
    if x >= 100:
        print("Boiling!")
    else:
        print("Not boiling yet.")
```

1. Open a new file in Spyder (the New File icon in the toolbar).

2. You may save the file as *boiling_temp*.

3. Type or copy/paste the `boiling_temp` function into the file.

4. Run the file (Run icon in the toolbar or F5).

5. Nothing seems to be happening, right? The *Console* only shows the file's path.

6. We can call the function either in the *Editor* or in the *Console*. Try both to see the difference.

7. Start typing `boiling_temp` in the *Editor* and Spyder shows a list of available options.

8. Double-click the *function* option and type a value inside the parentheses.

9. Run the current cell.

10. The function gets evaluated in the *Console*.

11. Type `boiling_temp(` in the *Console* and Spyder shows a hint with the function's documentation. Pretty convenient, right!?
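
To make "the current cell" in the steps above concrete: in a Spyder script, a comment that starts with `# %%` begins a new cell. The sketch below shows one way the *boiling_temp* file could be laid out. The cell titles after `# %%` are free text chosen for this example.

```python
# %% Define the function (first cell)
def boiling_temp(x: float) -> None:
    """Print 'Boiling!' if x is at least 100 degrees Celsius."""
    if x >= 100:
        print("Boiling!")
    else:
        print("Not boiling yet.")


# %% Call the function (second cell)
boiling_temp(150)
boiling_temp(30)
```

Running only the second cell works once the first cell has been run, because the *Console* keeps the function definition in memory.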



## PRACTICE IN YOUR Spyder

Design a function and try its workings in Spyder.

# Visual Studio Code

*Visual Studio Code* is another code editor. VSCode supports Python and other languages such as JavaScript, JSON, HTML, C++, and Julia. 

Go to https://code.visualstudio.com/docs and install VSCode for your OS.

Go through a quick initial setup.

To work in Python, we need to install additional extensions.

Go to File -> Open and open any Jupyter Notebook.

VSCode will prompt you to install *Jupyter Notebook* and *Python* extensions.

Now you are ready to work in VSCode.

Similar to Spyder, VSCode provides us with hints and autocompletion. VSCode might feel a little overwhelming for a beginner programmer, but it comes in handy when we program in several languages.

## Optional exercise:
Go to https://code.visualstudio.com/docs/python/python-tutorial and go through the tutorial.

# Virtual Environment

In our coding journey we will soon discover that __installing various Python packages globally on our machine__ might not be the best choice. Applications or projects that need to run simultaneously might require __different versions__ of the same packages (or modules) to work properly. However, all versions of a package are installed __under the same name__. In other words, the version number (v1.0.0 or v2.3.0) is not part of the package's path, so our projects cannot tell the versions apart and may end up using the wrong one.

### How can we work around this?

### Create and start working with a virtual environment

- Open terminal/shell.
- Make a new directory to work with: type `mkdir python-virtual-environments && cd python-virtual-environments` and execute.
- Now the terminal prompt shows `python-virtual-environments` as the current directory.
- Create a new virtual environment inside the directory: type `python3 -m venv env` and execute.
- Activate the virtual environment: type `source env/bin/activate` and execute.
- After execution, we see `(env)` at the beginning of the line.
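
The same steps can be sketched from Python with the standard library's `venv` module. This is only an illustration under a few assumptions: the environment is created in a temporary directory that is deleted afterwards, `with_pip=False` is used to keep the example fast (the shell command above installs pip by default), and `in_virtual_environment` is a helper name invented here.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment():
    """Return True when this interpreter runs inside a virtual environment."""
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env"
    venv.create(env_dir, with_pip=False)  # roughly `python3 -m venv env`
    # A new environment contains a pyvenv.cfg file at its root.
    created = (env_dir / "pyvenv.cfg").exists()
    print("Environment created:", created)

print("Inside a virtual environment:", in_virtual_environment())
```

Run the script once with the environment activated and once after `deactivate` to see the last line change.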

To test our virtual environment, we deactivate it, install a new module, and run a command to confirm that the module works in the global Python. Then we activate the environment again, run the very same command, and see that the module is not available inside the environment.

- Deactivate the virtual environment: type and execute `deactivate`.
- To install `bcrypt` and use it to hash a password, type and execute the following two commands separately:
    - `pip -q install bcrypt`
    - `python -c "import bcrypt; print(bcrypt.hashpw('password'.encode('utf-8'), bcrypt.gensalt()))"`
- Activate the environment: `source env/bin/activate`.
- Try using the password hashing command: `python -c "import bcrypt; print(bcrypt.hashpw('password'.encode('utf-8'), bcrypt.gensalt()))"`
- The shell tells us that there is no 'bcrypt' module available. This proves that the virtual environment works!
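
A quieter way to run the same check is to ask Python whether a module can be found, without importing it. The sketch below assumes nothing beyond the standard library; `module_available` is a hypothetical helper name.

```python
import sys
from importlib.util import find_spec


def module_available(name):
    """Return True if the top-level module `name` can be imported here."""
    return find_spec(name) is not None


# sys.executable shows which Python (global or environment) is answering.
print("Interpreter:", sys.executable)
print("bcrypt available:", module_available("bcrypt"))
```

Running this script inside and outside the activated environment should give different answers for `bcrypt` if the steps above were followed.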

### Managing virtual environments

In your shell: 

Install a tool to manage virtual environments - `pip install virtualenvwrapper`.

Check its path - `which virtualenvwrapper.sh`.

Activate it with the following three lines, replacing the placeholder in the last line with the path printed by the previous command:

```shell
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/projects
source <path to virtualenvwrapper.sh>
```

Now there should be a directory to store all our virtual environments. To see where it is, run: `echo $WORKON_HOME`

Every time we start a new project, run: `mkvirtualenv my-new-project`

We see `(my-new-project)` at the beginning of the line in the shell.

To deactivate: `deactivate`

To list all available projects: `workon`

To activate a project: `workon project-name`
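
To get a feel for what is stored in `WORKON_HOME`, we can look at the directory from Python. This is a rough sketch rather than how `workon` itself is implemented: it assumes each environment is a sub-directory containing an `activate` script, and the function name `list_environments` is invented for the example.

```python
import os
from pathlib import Path


def list_environments(home):
    """Return sorted names of sub-directories that contain an activate script."""
    home = Path(home)
    if not home.is_dir():
        return []
    return sorted(
        p.name
        for p in home.iterdir()
        if (p / "bin" / "activate").is_file()  # macOS / Linux layout
        or (p / "Scripts" / "activate").is_file()  # Windows layout
    )


# Fall back to the directory used in the export line above.
workon_home = os.environ.get("WORKON_HOME", str(Path.home() / ".virtualenvs"))
print(list_environments(workon_home))
```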

# Requirements.txt

A *requirements* file lists all the packages required for a specific project. Maintaining such a file is another way of handling project dependencies, and it lets us run the project on a different machine or in another virtual environment.

We can list the project's packages with `pip freeze`.

To store the project's packages, run: `pip freeze > requirements.txt`.

Now, we are able to reproduce the project on another machine by running the following command on it: `pip install -r requirements.txt`.

In case you need to upgrade the packages, run: `pip install --upgrade -r requirements.txt`.
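
Before running a project, it can help to compare a requirements file with what is actually installed. The sketch below is a simplified illustration: it only understands plain `name==version` lines (the format that `pip freeze` writes), the function name `check_pins` is made up, and the sample text uses a deliberately non-existent package.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(requirements_text):
    """Compare 'name==version' lines with what is installed.

    Returns a dict mapping each package name to a (wanted, installed) tuple;
    installed is None when the package is missing.
    """
    report = {}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # this sketch only handles simple pinned lines
        name, wanted = (part.strip() for part in line.split("==", 1))
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


sample = "# hypothetical requirements file\nno-such-package-xyz==1.0.0\n"
print(check_pins(sample))
```

To check a real project, pass the contents of its file instead, for example `check_pins(open("requirements.txt").read())`.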

# References

- Brett Cannon, 2020, *A quick-and-dirty guide on how to install packages for Python*. https://snarky.ca/a-quick-and-dirty-guide-on-how-to-install-packages-for-python/
- Google Colab. https://colab.research.google.com
- Hannan Satopay, 2019, *The Ultimate Markdown Guide (for Jupyter Notebook)*. https://medium.com/analytics-vidhya/the-ultimate-markdown-guide-for-jupyter-notebook-d5e5abf728fd
- Mike Driscoll, 2018, *Creating Presentations with Jupyter Notebook*. https://www.blog.pythonlibrary.org/2018/09/25/creating-presentations-with-jupyter-notebook/
- Mark Roepke, 2019, *Tips for Creating Slideshows in Jupyter*. https://www.markroepke.me/posts/2019/06/05/tips-for-slideshows-in-jupyter.html
- Real Python, 2016, *Python Virtual Environments: A Primer*. https://realpython.com/python-virtual-environments-a-primer/
- Spyder. https://www.spyder-ide.org
- Virtual Environment. https://docs.python.org/3/tutorial/venv.html
- Virtualenvwrapper. https://virtualenvwrapper.readthedocs.io/en/latest/install.html
- Visual Studio Code. https://code.visualstudio.com/learn/get-started/basics