Author: Itay Segev  
E-mail: [itaysegev@campus.technion.ac.il](mailto:itaysegev@campus.technion.ac.il)

<img src='https://www.tng-project.org/static/data/lab_logo_tng.png'/>

The [Python](https://www.python.org/) programming language has many science-oriented libraries, such as `numpy` (a powerful and efficient multi-dimensional array library), `matplotlib` (a full-featured graphing and plotting library), and many more, making it great for interactive scientific coding. [Jupyter](https://jupyter.org/) is a non-profit organization that builds open-source software for interactive scientific coding across many programming languages, through a web UI that displays IPython notebooks, sometimes called Jupyter notebooks or simply `.ipynb` files. It is based on the [IPython project](https://ipython.org/) and supports Python out of the box.

Jupyter Lab is an IDE-like web interface capable of displaying and running code from Jupyter notebooks in a meaningful and intuitive way. It works, for the most part, like any other Python IDE. We can edit Python files, regular text files, and anything in between, with syntax highlighting supported for many file types, including Markdown (`.md`) and YAML (`.yml`), among others. The GUI functions as one would expect from an IDE: we can resize panes, change fonts, manage files in tabs, switch between light and dark themes, and more. There is even an extension manager for third-party extensions that enhance customizability and functionality.

# Install and launch

There are two ways to install Jupyter Lab on a local machine.
1. with conda - `conda install -c conda-forge jupyterlab`
2. with pip - `pip install jupyterlab`

Once installed, we can run `jupyter lab` from the terminal to start a jupyter server. By default, a new tab should open on the machine's default web browser, but if that is not the case then use one of the links provided in the command output in the terminal.
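
To confirm that the installation succeeded before launching, one option is to query the installed version from Python. The sketch below uses only the standard library; the helper name `installed_version` is an illustrative assumption, not part of Jupyter.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Look up the package in the environment that runs this code.
version = installed_version("jupyterlab")
if version is None:
    print("jupyterlab is not installed in this environment")
else:
    print("jupyterlab", version)
```

If the result is `None`, the package was probably installed into a different Python environment than the one running the check.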

# The Interface


1. Menu - general UI and engine control
2. New items - launcher, new directory, upload files, sync files with disk.
3. Directory navigation
4. File browser
5. Open tabs bar
6. Launcher tab - new notebook, code file, text file, terminal, console, ...
7. Notebook cell settings - used for creating [reveal.js](https://revealjs.com/) presentations out of notebooks
8. Debugging tools

# Notebooks

Files with the extension `.ipynb` are treated as Jupyter notebooks. These notebooks are divided into cells containing either code or text. The cell manipulation toolbar is located directly under the open tabs bar, where one can save all changes, add new cells, cut existing cells, copy/paste cells, and run/stop code cells. Jupyter notebooks support three kinds of cells: Code, Markdown, and Raw. We can set the cell type by selecting the cell we wish to edit and choosing the correct type from the drop-down menu in the cell manipulation bar.
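
Under the hood, an `.ipynb` file is a JSON document that stores the cells as a list. The sketch below builds a minimal notebook by hand, with one cell of each type, and counts the cell types after a round trip through JSON. The cell contents are made up for illustration, and the structure shown assumes notebook format version 4.

```python
import json
from collections import Counter

# A hand-written, minimal notebook with one cell of each type (hypothetical content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')"],
        },
        {"cell_type": "raw", "metadata": {}, "source": ["plain text"]},
    ],
}

# Serialize to the text that would be written to disk, then parse it back.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# Count how many cells of each type the notebook contains.
cell_types = Counter(cell["cell_type"] for cell in loaded["cells"])
print(dict(cell_types))
```

Because the format is plain JSON, notebooks can be inspected with ordinary text tools, although editing them through Jupyter is usually safer.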

## Text
We can write rich text with [markdown](https://www.markdownguide.org/) (Markdown cell type) or raw UTF-8 text (Raw cell type). Raw cells hold plain text that is left without any special formatting. Markdown cells are rendered by running the current cell, for example by pressing the triangular button in the cell manipulation bar. Markdown is a common convention for writing formatted text in plain text. For example, the following text will be rendered as a link:  
<code>\[Google\](https://www.google.com)</code>  
Rendered:  
[Google](https://www.google.com)  
Markdown is used by a large community of coders offering support via blog and forum posts, and is worth exploring in order to create more aesthetic notebooks.

Markdown cells also support [$\LaTeX$](https://www.latex-project.org/) math  notations by wrapping them with the dollar sign character '\$'. For example, the following text will be rendered as the quadratic equation:  
<code>\$x_{1,2}=\frac{-b\pm\sqrt{b^2-4ac}}{2a}\$</code>  
Rendered:  
$x_{1,2}=\frac{-b\pm\sqrt{b^2-4ac}}{2a}$  
For more information on $\LaTeX$ math notations, check out [this entry](https://www.overleaf.com/learn/latex/Mathematical_expressions).

Mixing rich and mathematical text in between code cells is useful for documentation, demonstrations, presentations, etc. Markdown cells have even more functionality, from displaying images and animated GIFs to rendering raw HTML code. Look up specific Markdown cell functionality to find out more.

## Running Code

Jupyter notebooks run using a background process called a kernel. The kernel keeps the program's memory intact until it is restarted, shut down, or crashes, much like the Python console. The kernel used by the current notebook is shown in the top right corner of the notebook. The default kernel is based on the Python version in which we installed Jupyter (with pip / conda). We can write code in "Code" type cells to get syntax highlighting and automatic code completion. Code cells are run using the notebook's live kernel and its current memory state by pressing the triangular button in the cell manipulation bar.

Whenever a cell is run, the value of the last expression in the cell is displayed as the cell's output. Anything we wish to show before that must be printed using the `print` function.

In [None]:
for i in range(3):
    print(i)

"hello everyone"

When a variable is assigned, it remains in memory until the end of the kernel's lifetime. This means that the variable can be used in any cell once it has been initialized. We can even use a variable in a cell that precedes the assignment cell, as long as we run the initialization cell first!

In [None]:
x = 2

In [None]:
def f(x):
    return 2*x

f(x)
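
What matters is the order in which cells are run, not their position in the notebook. The sketch below imitates this in plain Python by running "cells" (strings of code) against one shared dictionary. It is only an analogy under that assumption, not how the kernel is actually implemented, and the names `cells` and `kernel_memory` are made up for illustration.

```python
# Two "cells" stored as source strings (hypothetical stand-ins for notebook cells).
cells = {
    "assign": "x = 2",
    "use": "result = 2 * x",
}

# One shared namespace plays the role of the kernel's memory.
kernel_memory = {}
exec(cells["assign"], kernel_memory)  # run the assignment cell first
exec(cells["use"], kernel_memory)     # x is now available to any later run
print(kernel_memory["result"])

# A fresh namespace is like a restarted kernel: x no longer exists.
restarted_memory = {}
try:
    exec(cells["use"], restarted_memory)
except NameError as err:
    print("NameError:", err)
```

This is also why "Restart Kernel and Run All Cells" is a good final check: it confirms the notebook works when run from top to bottom.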

By using the exclamation point character '!' we can run terminal commands. For example, the cell below will output `hello from terminal`.

In [None]:
!echo "hello from terminal"

A common use for this feature is to install external packages via pip like so:  
`!pip install numpy`
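
One caveat is that `!pip` runs whichever `pip` the shell finds first, which may belong to a different Python than the one running the kernel. A hedged way around this is to call pip through the kernel's own interpreter, `sys.executable`. The sketch below only builds and prints the command; the helper name `pip_install_command` is an illustrative assumption.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


command = pip_install_command("numpy")
print(" ".join(command))

# Uncomment to actually perform the installation:
# subprocess.run(command, check=True)
```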

As in any other Python program, we can import code from built-in, installed, and local Python modules. For example, we can use the `os` module to list the contents of the current working directory.

In [None]:
import os
os.listdir('.')
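
Importing a local module works the same way as importing an installed one, provided the module's directory is on the import path. The sketch below writes a tiny module to a temporary directory and imports it; the module name `my_helpers` and its `double` function are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a small local module on disk (hypothetical file and function).
    Path(tmp, "my_helpers.py").write_text("def double(x):\n    return 2 * x\n")

    # Make the directory importable, import the module, then clean up the path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        my_helpers = importlib.import_module("my_helpers")
        result = my_helpers.double(21)
    finally:
        sys.path.remove(tmp)

print(result)  # 42
```

In a notebook, a `.py` file that sits next to the `.ipynb` file can usually be imported directly, because the kernel typically starts in the notebook's directory.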

## Magic Functions

Magic functions control the functionality of certain libraries integrated with Jupyter. We can call these functions using the percent character '%'. Some notable examples include:
- `%matplotlib inline` - Shows matplotlib figures inline with the cell output
- `%matplotlib notebook` - Live interactive plot
- `%load_ext` - Load a jupyter extension, e.g. `%load_ext autoreload`.
- `%autoreload 2` - If `autoreload` is loaded, setting it to 2 will cause changes in external `.py` files to take effect immediately.
- `%run` - Allows you to execute Python code from external .py files and other Jupyter Notebooks directly within your current notebook.
- `%load` - Allows you to insert code from an external script into the current cell of your Jupyter Notebook.

To view a list of all available magic commands, simply run the following code in a cell:

In [None]:
%lsmagic

Below are examples of the timing magics, which check how long it takes some code to run.

The `%%time` magic command provides information about a single run of the code in your cell. It measures the time taken to execute the code and displays the result.

In [None]:
%%time
import time
for _ in range(1000):
    time.sleep(0.01) # sleep for 0.01 seconds
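
Outside a notebook, where cell magics are unavailable, a similar wall-clock measurement can be sketched with the standard library. This is an approximation of what `%%time` reports, not its implementation; `%%time` also shows CPU time, which this sketch omits. The loop is shortened to 10 iterations to keep the example quick.

```python
import time

start = time.perf_counter()  # high-resolution wall-clock timer
for _ in range(10):
    time.sleep(0.01)  # sleep for 0.01 seconds
elapsed = time.perf_counter() - start

print(f"Wall time: {elapsed:.3f} s")
```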

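The `%timeit` magic command uses the Python `timeit` module to run a statement many times. Unless set with `-n`, the number of loops is chosen automatically, and the measurement is repeated several times (7 by default, adjustable with `-r`). It reports the mean and standard deviation of the time per loop.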

In [None]:
import numpy
%timeit numpy.random.normal(size=100)
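
The same kind of measurement can be reproduced with the standard `timeit` module directly, which helps clarify what the magic does. The sketch below is an approximation under stated assumptions: it times a simple built-in expression instead of the NumPy call above, and it uses 5 repeats to keep the run short.

```python
import statistics
import timeit

timer = timeit.Timer("sum(range(100))")

# Let timeit pick a loop count large enough for a stable measurement.
loops, _ = timer.autorange()

# Repeat the measurement; each entry is the total time for `loops` executions.
totals = timer.repeat(repeat=5, number=loops)
per_loop = [total / loops for total in totals]

mean_us = statistics.mean(per_loop) * 1e6
stdev_us = statistics.stdev(per_loop) * 1e6
print(f"{loops} loops per run, 5 runs")
print(f"mean {mean_us:.2f} us, stdev {stdev_us:.2f} us per loop")
```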

In [None]:
%%html
<marquee style='width: 30%; color: blue;'><b>Whee!</b></marquee>

To learn more, see [Jupyter's magics page](http://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb).


## Errors

Syntax and runtime errors will not cause the kernel to crash, just as a terminal shell will not crash if given a bad command. After the error has occurred, the kernel's memory is still intact, allowing us to access previously defined variables. For example, the next two cells will yield a syntax error and a runtime error, but the third cell can still run and yields the correct value of `x`.

In [None]:
# syntax error
x 0

In [None]:
# runtime error
x / 0

In [None]:
# x still exists in memory
x
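
The same behaviour can be imitated in a single plain-Python script by catching each error explicitly. This is only a sketch of the idea: the notebook does not need the `try` blocks, because the kernel reports the error and keeps running on its own.

```python
x = 2

# A syntax error is detected when the code is compiled, before anything runs.
try:
    compile("x 0", "<cell>", "exec")
except SyntaxError as err:
    print("syntax error:", err.msg)

# A runtime error is raised while the code is executing.
try:
    x / 0
except ZeroDivisionError as err:
    print("runtime error:", err)

# x still exists in memory
print(x)
```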

## Exporting

We can export a jupyter notebook into a large number of sharable formats out of the box. To do this, navigate via the menu to:  
File --> Save and export notebook as --> \<format\>  
Available formats include:
* HTML
* LaTeX
* PDF (requires an installation of `tex` on your local machine)
* rst
* python script
* reveal.js
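
Exports can also be scripted with the `jupyter nbconvert` command-line tool, which is handy for converting many notebooks at once. The sketch below only builds the command for a given menu entry; the notebook name `analysis.ipynb` and the helper `nbconvert_command` are hypothetical.

```python
import shutil

# Menu entry -> value passed to `jupyter nbconvert --to`
EXPORT_FORMATS = {
    "HTML": "html",
    "LaTeX": "latex",
    "PDF": "pdf",
    "rst": "rst",
    "python script": "script",
    "reveal.js": "slides",
}


def nbconvert_command(notebook, menu_format):
    """Build the nbconvert command line for one notebook and one format."""
    return ["jupyter", "nbconvert", "--to", EXPORT_FORMATS[menu_format], notebook]


command = nbconvert_command("analysis.ipynb", "HTML")  # hypothetical file name
print(" ".join(command))

if shutil.which("jupyter") is None:
    print("jupyter is not on PATH; install Jupyter before running the command")
# To run the conversion: subprocess.run(command, check=True)
```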

<p><img alt="Colaboratory logo" height="45px" src="https://www.dataquest.io/wp-content/uploads/2023/06/Google_Colaboratory_SVG_Logo.svg" align="left" hspace="10px" vspace="0px"></p>

## Google Colab


Colab is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud.

With Colaboratory you can write and execute code, save and share your analyses, and access powerful computing resources, all for free from your browser.

In [None]:
#@title Introducing Colab { display-mode: "form" }
#@markdown This 3-minute video gives an overview of the key features of Colab:
from IPython.display import YouTubeVideo
YouTubeVideo('inN8seMm7UI', width=600, height=400)

# Charting in Colab

A common use for notebooks is data visualization using charts. Colab makes this easy with several charting tools available as Python imports.

## Matplotlib

[Matplotlib](http://matplotlib.org/) is the most common charting package. See its [documentation](http://matplotlib.org/api/pyplot_api.html) for details and its [examples](http://matplotlib.org/gallery.html#statistics) for inspiration.

In [None]:
import matplotlib.pyplot as plt

x  = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y1 = [1, 3, 5, 3, 1, 3, 5, 3, 1]
y2 = [2, 4, 6, 4, 2, 4, 6, 4, 2]
plt.plot(x, y1, label="line L")
plt.plot(x, y2, label="line H")

plt.xlabel("x axis")
plt.ylabel("y axis")
plt.title("Line Graph Example")
plt.legend()
plt.show()

## Altair
[Altair](http://altair-viz.github.io) is a declarative visualization library for creating interactive visualizations in Python, and is installed and enabled in Colab by default.

For example, here is an interactive scatter plot:

In [None]:
import altair as alt
from vega_datasets import data
cars = data.cars()

alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
).interactive()

For more examples of Altair plots, see the [Altair snippets notebook](/notebooks/snippets/altair.ipynb) or the external [Altair Example Gallery](https://altair-viz.github.io/gallery/).

# Virtual Machine
The most powerful feature of Google Colab is the ability to use a cloud GPU for free. As in a desktop Jupyter environment, you can also run most bash commands by adding a `!` in front of the command.

First, turn on the GPU from `Runtime`->`Change Runtime Type`->`Hardware Acceleration`.

The entire Colab notebook runs in a cloud VM. Let's investigate the VM. You will see that the current Colab notebook is running on top of `Ubuntu 18.04.3 LTS` (at the time of this writing).

In [None]:
!ls -l
!pwd 

In [None]:
!ls -l /


In [None]:
!cat /etc/*release
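
Similar information is available from Python itself through the standard `platform` and `os` modules, which also work on machines where the shell commands above are unavailable. The sketch below is a rough equivalent; the dictionary name `vm_info` is made up for illustration.

```python
import os
import platform

# Collect basic facts about the machine running the kernel.
vm_info = {
    "system": platform.system(),          # e.g. the OS family
    "release": platform.release(),        # kernel / OS release string
    "machine": platform.machine(),        # CPU architecture
    "python": platform.python_version(),  # interpreter version
    "cwd": os.getcwd(),                   # same information as `!pwd`
}

for key, value in vm_info.items():
    print(f"{key:>8}: {value}")
```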

# GPU Details




###  Getting a GPU

You may already know what's going on when I say GPU. But if not, there are a few ways to get access to one.

| **Method** | **Difficulty to setup** | **Pros** | **Cons** | **How to setup** |
| ----- | ----- | ----- | ----- | ----- |
| Google Colab | Easy | Free to use, almost zero setup required, can share work with others as easy as a link | Doesn't save your data outputs, limited compute, subject to timeouts | [Follow the Google Colab Guide](https://colab.research.google.com/notebooks/gpu.ipynb) |
| Use your own | Medium | Run everything locally on your own machine | GPUs aren't free, require upfront cost | Follow the [PyTorch installation guidelines](https://pytorch.org/get-started/locally/) |
| Cloud computing (AWS, GCP, Azure) | Medium-Hard | Small upfront cost, access to almost infinite compute | Can get expensive if running continually, takes some time to setup right | Follow the [PyTorch installation guidelines](https://pytorch.org/get-started/cloud-partners/) |

There are more options for using GPUs but the above three will suffice for now.

Personally, I use a combination of Google Colab and my own personal computer for small scale experiments (and creating this course) and go to cloud resources when I need more compute power.

> **Resource:** If you're looking to purchase a GPU of your own but not sure what to get, [Tim Dettmers has an excellent guide](https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/).

To check if you've got access to an NVIDIA GPU, you can run `!nvidia-smi`.



In [None]:
!nvidia-smi

If you don't have an NVIDIA GPU accessible, the above will output something like:

```
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
```

In that case, go back up and follow the install steps.

If you do have a GPU, the line above will output something like:

```
Wed Jan 19 22:09:08 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.46       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
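
When a notebook should behave differently with and without a GPU, the same check can be done from Python instead of reading the table by eye. The sketch below calls `nvidia-smi` only if it is found on the machine and returns `None` otherwise; the helper name `gpu_summary` is an illustrative assumption.

```python
import shutil
import subprocess


def gpu_summary():
    """Return 'name, total memory' lines from nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # the tool is not installed, so no NVIDIA driver is usable
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # nvidia-smi exists but cannot talk to the driver
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No NVIDIA GPU detected")
```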

# Ubuntu 18.04.3 LTS and drivers

Since the VM runs Ubuntu, you can use any command that you would use in a GNU/Linux terminal. This makes life very simple.

However, for your project you have to find a way to transfer or download your data files to this VM. The easiest way is to mount your Google Drive in the VM and use git for version control of your code.

Another easy way to get your files is to use `wget` to download them from a file server or Dropbox.
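
A download can also be done from Python with the standard library, which is convenient when the URL is computed in code. The sketch below streams a resource to disk; to stay runnable offline it fetches a local `file://` URL, which stands in for a real `https://` address. The helper name `download` and the file names are made up for illustration.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download(url, destination):
    """Stream the resource at `url` into the file `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return Path(destination)


# Offline demonstration: a file:// URL stands in for a real https:// address.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "source.txt")
    source.write_text("some data\n")
    copy = download(source.as_uri(), Path(tmp, "copy.txt"))
    content = copy.read_text()

print(content)
```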



To install a Python package, use `pip` as shown below.
You can also use `apt-get` to install Ubuntu system packages.

In [None]:
%%capture
!pip install gpustat
!pip install fairseq
!apt-get install build-essential

In [None]:
!gpustat


Please check the NVIDIA driver version (see the output of `!nvidia-smi` above) and the CUDA version available on the current OS, because not every library version supports the latest CUDA, cuDNN, etc. Deep learning libraries change at a rapid pace, so make sure that your preferred deep learning library can be installed with the current NVIDIA driver, CUDA, and cuDNN versions.

In [None]:
from platform import python_version
import torch
print("Python version", python_version())
print("Pytorch - version", torch.__version__)
print("Pytorch - cuDNN version :", torch.backends.cudnn.version())
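
It is also worth checking whether PyTorch can actually see the GPU, since a driver can be present while the library runs on the CPU only. The sketch below reports one of three states and tolerates a missing PyTorch installation; the helper name `cuda_status` is an illustrative assumption.

```python
try:
    import torch
except ImportError:  # PyTorch is optional for this sketch
    torch = None


def cuda_status():
    """Describe whether PyTorch is installed and can use a CUDA device."""
    if torch is None:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"
    return f"CUDA {torch.version.cuda} on {torch.cuda.get_device_name(0)}"


status = cuda_status()
print(status)
```

If the second message appears in Colab, the runtime was most likely started without a hardware accelerator.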

# Local file system

## Uploading files from your local file system

`files.upload` returns a dictionary of the files which were uploaded.
The dictionary is keyed by the file name and values are the data which were uploaded.

In [None]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
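
Since the returned dictionary maps file names to raw bytes, its contents can be written to any folder with ordinary file operations. The sketch below shows one way to do that. To stay runnable outside Colab it uses a hand-made dictionary with the same shape as the described return value; the helper `save_uploaded` and the file name are hypothetical.

```python
import tempfile
from pathlib import Path


def save_uploaded(uploaded, target_dir):
    """Write each uploaded file (name -> bytes) into `target_dir`."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, data in uploaded.items():
        path = target / Path(name).name  # drop any directory part of the name
        path.write_bytes(data)
        saved.append(path)
    return saved


# Stand-in for the dictionary returned by files.upload() (made-up content).
fake_upload = {"example.csv": b"a,b\n1,2\n"}

with tempfile.TemporaryDirectory() as tmp:
    paths = save_uploaded(fake_upload, tmp)
    sizes = {path.name: path.stat().st_size for path in paths}

print(sizes)  # {'example.csv': 8}
```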

## Downloading files to your local file system

`files.download` will invoke a browser download of the file to your local computer.


In [None]:
from google.colab import files

with open('example.txt', 'w') as f:
  f.write('some content')

files.download('example.txt')

## Mounting Google Drive locally

The example below shows how to mount your Google Drive on your runtime using an authorization code, and how to write and read files there. Once executed, you will be able to see the new file (`foo.txt`) at [https://drive.google.com/](https://drive.google.com/).

This only supports reading, writing, and moving files; to programmatically modify sharing settings or other metadata, use one of the other options below.

**Note:** When using the 'Mount Drive' button in the file browser, no authentication codes are necessary for notebooks that have only been edited by the current user.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
with open('/content/drive/My Drive/foo.txt', 'w') as f:
  f.write('Hello Google Drive!')
!cat /content/drive/My\ Drive/foo.txt

In [None]:
drive.flush_and_unmount()
print('All changes made in this colab session should now be visible in Drive.')
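
Code that should run both inside and outside Colab can check whether Drive is mounted before touching any Drive path. The sketch below assumes the mount point and the `My Drive` folder name used in the cells above; the helper name `drive_path` is made up for illustration.

```python
from pathlib import Path


def drive_path(name, mount_point="/content/drive"):
    """Return the path of `name` inside 'My Drive', or None if Drive is not mounted."""
    root = Path(mount_point) / "My Drive"
    if not root.is_dir():
        return None
    return root / name


target = drive_path("foo.txt")
if target is None:
    print("Google Drive is not mounted; run drive.mount('/content/drive') first")
else:
    print("Drive file path:", target)
```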

# Conclusion

Jupyter Lab is a useful tool for scientific programming, with a wide range of features that let us create tidy and versatile documents that incorporate live code. It has many more features that were not covered in this notebook, not to mention the large number of extensions available. This notebook can be exported to a variety of common formats and shared with others. Jupyter notebooks have become a staple in the Python community, and Jupyter Lab is the next generation of notebook processing.

It's worth noting the distinction between using Jupyter Lab locally and using platforms like Google Colab. While both offer similar functionality, Colab provides the advantage of cloud-based computing resources, allowing for collaborative work and access to powerful hardware without the need for local installation. On the other hand, working locally with Jupyter Lab offers greater control over the environment and data privacy, albeit with potential limitations in computational resources compared to cloud-based platforms.

# Important Resources

Learn how to make the most of Python, Jupyter, Colaboratory, and related tools with these resources:

### Working with Notebooks in Colaboratory
- [Overview of Colaboratory](/notebooks/basic_features_overview.ipynb)  
- [Guide to Markdown](/notebooks/markdown_guide.ipynb)
- [Importing libraries and installing dependencies](/notebooks/snippets/importing_libraries.ipynb)
- [Saving and loading notebooks in GitHub](https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb)

### Additional Content
- [Interactive forms](/notebooks/forms.ipynb)
- [Interactive widgets](/notebooks/widgets.ipynb)


### Working with Data
- [Loading data: Drive, Sheets, and Google Cloud Storage](/notebooks/io.ipynb)
- [Charts: visualizing data](/notebooks/charts.ipynb)
