## Chapter Contents

- [Chapter 01 Summary](#1.Summary)
- [1.1 The deep learning revolution](#1.1.)
- [1.2 PyTorch for deep learning](#1.2.)
- [1.3 Why PyTorch?](#1.3.)
    - [1.3.1 The deep learning competitive landscape](#1.3.1.)
- [1.4 An overview of how PyTorch supports deep learning](#1.4.)
- [1.5 Hardware and software requirements](#1.5.)
    - [1.5.1 Using Jupyter Notebooks](#1.5.1.)
- [1.6 Exercises](#1.6.)

<a id="1.Summary"></a>
## Chapter 01 Summary

- Deep learning models automatically learn to associate inputs and desired outputs from examples.
- Libraries like PyTorch allow you to build and train neural network models efficiently.
- PyTorch minimizes cognitive overhead while focusing on flexibility and speed. It also defaults to immediate execution for operations.
- Since the release of PyTorch in early 2017, the deep learning tooling ecosystem has consolidated significantly.
- PyTorch provides a number of utility libraries to facilitate deep learning projects.

#### about "deep learning"
- deep learning is a general class of algorithms that can approximate complicated, nonlinear processes very effectively
- the *Universal Approximation Theorem* implies that, in theory, a neural network with enough hidden units can approximate any continuous function on a bounded input range to arbitrary accuracy, given the right set of weights (parameters)
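
For reference, one common single-hidden-layer form of the theorem can be written out as below. This statement comes from the general literature rather than from the book, and the symbols ($K$, $\sigma$, $N$, $\alpha_i$, $w_i$, $b_i$) are introduced here only for illustration. Let $f : K \to \mathbb{R}$ be continuous on a compact set $K \subset \mathbb{R}^n$, and let $\sigma$ be a continuous, non-polynomial activation function. Then for every $\varepsilon > 0$ there exist a width $N$, coefficients $\alpha_i, b_i \in \mathbb{R}$ and weight vectors $w_i \in \mathbb{R}^n$ such that

$$
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} \alpha_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon
$$

The theorem only guarantees that such weights exist. It says nothing about how large $N$ has to be, or whether training will actually find those weights.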

#### prerequisites for the book
- python programming experience
- willingness to go hands-on

#### structure of the book
- part 1 : foundations (basic concepts and using PyTorch)
- part 2 : end-to-end project (classifying tumors in CT scans)
- part 3 : deploying deep learning models to production (short)

<a id="1.1."></a>
## 1.1. The deep learning revolution

#### deep learning has become very powerful
- example: screenshot demo of the [GPT-2 language model](https://talktotransformer.com)
- though factually nonsense, the output sounds coherent
- a more advanced model, [GPT-3](https://openai.com/blog/openai-api/), has already been trained, but access is limited for fear of abuse and misuse, as it may be too powerful


#### how did deep learning revolutionize machine learning?
##### before: feature engineering (right)
- up until the last decade, the field of machine learning relied heavily on "feature engineering"
- e.g. hand-designed transformations such as filters that detect the edges of characters, allowing the system to differentiate digits based on the distribution of edge directions or the number of enclosed holes
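
To make the idea of a hand-crafted feature concrete, here is a minimal pure-Python sketch. The 5x5 "images", the `[-1, +1]` filter and the `edge_count` feature are all made up for illustration and are not taken from the book. A fixed filter responds to vertical edges, and the number of responses becomes a feature that happens to separate a crude "1" from a crude "0".

```python
# Hand-crafted feature: count vertical-edge responses in a tiny binary image.
# The images and the filter are hypothetical, chosen only for illustration.

def horizontal_gradient(image):
    """Apply a fixed [-1, +1] filter along each row (a vertical-edge detector)."""
    return [
        [row[c + 1] - row[c] for c in range(len(row) - 1)]
        for row in image
    ]


def edge_count(image):
    """Feature value: total magnitude of the filter responses."""
    return sum(abs(v) for row in horizontal_gradient(image) for v in row)


# A crude "1" (single vertical stroke) and a crude "0" (closed loop).
one = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
zero = [
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
]

print(edge_count(one), edge_count(zero))  # 10 16
```

In this approach a person chooses the filter. With deep learning, the filter values themselves are learned from labelled examples.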

##### now: primarily deep learning (left)
- iterate through pairs of examples and target labels to automatically refine the filters that extract features
- feature engineering is still used to inject prior knowledge into the system, but is usually less effective than learning the features from data

![](../images/dlwpt-screenshots/dlwpt-01-01.png)

<a id="1.2."></a>
## 1.2. PyTorch for deep learning

<a id="1.3."></a>
## 1.3. Why PyTorch?

- Python library that facilitates building deep learning projects
    - emphasis : library, not a framework
    - mechanics:
        - tensor (core data structure): a multidimensional array similar to NumPy arrays
        - features to perform accelerated mathematical operations on GPUs
    - benefits:
        - allows deep learning models to be expressed in idiomatic Python
        - simple : clear syntax, easy to debug and learn
        - versatile : good for real-world, high-profile work
        - fast : high-performance C++ runtime
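
A minimal sketch of the tensor as a NumPy-like array, assuming only that `torch` is installed. The values and variable names are arbitrary.

```python
import torch

# A 2x3 tensor built from nested Python lists, much like np.array(...)
points = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# NumPy-style indexing and reductions
first_row = points[0]          # tensor([1., 2., 3.])
col_sums = points.sum(dim=0)   # sum over rows, one value per column

# Elementwise math and matrix multiplication read like ordinary Python
scaled = points * 2
gram = points @ points.T       # (2x3) @ (3x2) -> 2x2

print(points.shape, points.dtype)
```

Indexing, slicing and broadcasting follow the NumPy conventions closely, which is a large part of why PyTorch code reads as idiomatic Python.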

<a id="1.3.1."></a>
### 1.3.1. The deep learning competitive landscape

#### current contenders
(the feature sets of PyTorch and TensorFlow have started to converge)
- PyTorch : developed by Facebook, easy to use -> welcomed by academics
- TensorFlow : developed by Google, more robust pipeline to production -> widely used in industry
- JAX : developed by Google (independently from TensorFlow)

(as students move into industry, PyTorch is eating into TensorFlow's market share)

#### other notable libraries in history
- Theano : ceased active development
- Keras : absorbed into TensorFlow
- Caffe2 : absorbed into PyTorch

<a id="1.4."></a>
## 1.4. An overview of how PyTorch supports deep learning

- mostly written in C++ and CUDA (NVIDIA's C++-like language for parallel processing on GPUs)
- mostly used through its Python API

- ability to track operations on tensors
- ability to analytically compute derivatives of an output with respect to any of its inputs (autograd engine)
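
A minimal sketch of operation tracking and automatic differentiation. The function `y = sum(x^2 + 3x)` is an arbitrary choice for illustration; its derivative `2x + 3` is easy to check by hand.

```python
import torch

# Leaf tensor whose operations should be tracked
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x^2 + 3x); autograd records each operation as it runs
y = (x ** 2 + 3 * x).sum()

# Backpropagate: fills x.grad with dy/dx = 2x + 3
y.backward()

print(x.grad)  # tensor([5., 7., 9.])
```

Nothing has to be declared ahead of time. The graph is built while the ordinary Python expression executes, which is what "immediate execution" means in practice.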

![](../images/dlwpt-screenshots/dlwpt-01-02.png)

- large amounts of data -> tensors -> data loading (batches) -> training loop -> trained model -> deployment

#### notable submodules
- `torch.nn` : core PyTorch modules for building neural networks 
    - fully connected layers
    - convolutional layers
    - activation functions (see chapter 05)
    - loss functions (see chapter 05)
- `torch.optim` : optimizers (see chapter 05)
- `torch.utils.data.Dataset` (see chapter 07)
- `torch.utils.data.DataLoader` (see chapter 07)
- `torch.nn.parallel.DistributedDataParallel`
- `torch.distributed`
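
How `Dataset`, `DataLoader`, `torch.nn` and `torch.optim` fit together can be sketched as a tiny training loop. Everything specific here is an assumption made for illustration and does not come from the book: the `LineDataset` class, the target line `y = 2x + 1`, the batch size, the learning rate and the number of epochs.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset


class LineDataset(Dataset):
    """Toy dataset (hypothetical): noiseless points on y = 2x + 1."""

    def __init__(self, n=64):
        self.x = torch.linspace(-1.0, 1.0, n).unsqueeze(1)  # shape (n, 1)
        self.y = 2.0 * self.x + 1.0

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


torch.manual_seed(0)
loader = DataLoader(LineDataset(), batch_size=16, shuffle=True)  # batching
model = nn.Linear(1, 1)                                          # torch.nn
loss_fn = nn.MSELoss()                                           # loss function
optimizer = optim.SGD(model.parameters(), lr=0.1)                # torch.optim

for epoch in range(200):
    for xb, yb in loader:
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)  # forward pass
        loss.backward()                # autograd computes gradients
        optimizer.step()               # update the parameters

weight = model.weight.item()
bias = model.bias.item()
print(f"learned y = {weight:.3f} x + {bias:.3f}")
```

The same four pieces (dataset, loader, model plus loss, optimizer) appear in much larger projects. Only the contents of each piece change.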

#### TorchScript
- compiles models ahead of time
- compiled models can be invoked independently from Python (e.g. from C++ programs or on mobile devices)
- models can be exported as TorchScript or in the ONNX format
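
A minimal sketch of the TorchScript round trip, using `torch.jit.script`, `torch.jit.save` and `torch.jit.load`. The `TinyNet` module is hypothetical, and an in-memory buffer stands in for a file on disk. ONNX export is not shown.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical two-layer model used only for illustration."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(torch.relu(self.fc1(x)))


model = TinyNet().eval()

# Compile the module to TorchScript
scripted = torch.jit.script(model)

# Serialize to an in-memory buffer (a file path works too) and load it back
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

# The restored model should reproduce the original outputs
x = torch.randn(3, 4)
same = torch.allclose(model(x), restored(x))
print(same)
```

The saved artifact contains both the weights and the compiled code, so the loading side does not need the `TinyNet` class definition.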

<a id="1.5."></a>
## 1.5. Hardware and software requirements

In [1]:
import platform
platform.uname()

uname_result(system='Linux', node='ntsr9pxrmo', release='5.4.0-65-generic', version='#73~18.04.1-Ubuntu SMP Tue Jan 19 09:02:24 UTC 2021', machine='x86_64', processor='x86_64')

In [2]:
!nvidia-smi

Wed Jul 14 02:52:09 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Quadro P5000        On   | 00000000:00:05.0 Off |                  Off |
| 26%   31C    P8     6W / 180W |      1MiB / 16278MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

check out the [DAWNBench page](https://dawn.cs.stanford.edu/benchmark/index.html) for benchmarks on training time and costs related to common deep learning tasks on publicly available datasets.

#### tech specifications

- [installation of PyTorch](https://pytorch.org/get-started/locally/)
- package managers such as conda, pip
- 200GB hard disk space for Part 2 (120GB storage, 80GB training)

<a id="1.5.1."></a>
### 1.5.1. Using Jupyter Notebooks

many useful online resources, e.g. [Dataquest's tutorial](https://www.dataquest.io/blog/jupyter-notebook-tutorial/)

#### some useful shortcut keys (see [here](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330) for more)

Shortcuts in both modes:
- Shift + Enter run the current cell, select below
- Ctrl + Enter run selected cells
- Alt + Enter run the current cell, insert below
- Ctrl + S save and checkpoint

While in command mode (press Esc to activate):

- Enter take you into edit mode
- H show all shortcuts
- Up select cell above
- Down select cell below
- Shift + Up extend selected cells above
- Shift + Down extend selected cells below
- A insert cell above
- B insert cell below
- X cut selected cells
- C copy selected cells
- V paste cells below
- Shift + V paste cells above
- D, D (press the key twice) delete selected cells
- Z undo cell deletion
- S Save and Checkpoint
- Y change the cell type to Code
- M change the cell type to Markdown
- P open the command palette. 
- Shift + Space scroll notebook up
- Space scroll notebook down

While in edit mode (press Enter to activate):

- Esc take you into command mode
- Tab code completion or indent
- Shift + Tab tooltip
- Ctrl + ] indent
- Ctrl + [ dedent
- Ctrl + A select all
- Ctrl + Z undo
- Ctrl + Shift + Z or Ctrl + Y redo
- Ctrl + Home go to cell start
- Ctrl + End go to cell end
- Ctrl + Left go one word left
- Ctrl + Right go one word right
- Ctrl + Shift + P open the command palette
- Down move cursor down
- Up move cursor up


#### magic commands
useful commands specific to and provided by the IPython kernel

In [3]:
# list all magic commands
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

#### notable magic commands

##### %autosave n
Changes how often your notebook autosaves to its checkpoint file: once every n seconds.

##### %matplotlib inline
Providing the inline argument instructs IPython to show Matplotlib plot images inline, within your cell outputs, enabling you to include charts inside your notebooks. Be sure to include this magic before you import Matplotlib, as it may not work if you do not; many import it at the start of their notebook, in the first code cell.

##### %pdb  /  %debug
Executing the %pdb line magic will toggle on/off the automatic triggering of pdb on error across all cells in your notebook. This exposes an interactive mode in which you can use the pdb commands.

Another handy debugging magic is %debug, which you can execute after an exception has been raised to delve back into the call stack at the time of failure

##### %load
Tasks such as importing the same set of packages over and over for every project are perfect candidates for the %load magic, which will load an external script into the cell in which it’s executed.

##### %run
Runs an external script file as part of the cell being executed.
For example, if %run myscript.py appears in a code cell, myscript.py will be executed by the kernel as part of that cell.

##### %timeit
Runs a statement in a loop many times and reports how long each run takes. Use %%timeit to time a whole code cell.
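
Outside a notebook, much the same measurement can be made with the standard-library `timeit` module. The sketch below is an approximation: `sum_of_squares` is a made-up function to time, and the loop and repeat counts are fixed by hand, whereas %timeit chooses the loop count automatically.

```python
import timeit


def sum_of_squares(n=1_000):
    """Hypothetical workload to time."""
    return sum(i * i for i in range(n))


# 5 measurements, each running the function 200 times
runs = timeit.repeat(sum_of_squares, number=200, repeat=5)

# Report the fastest measurement, per call
best_per_call = min(runs) / 200
print(f"best of 5: {best_per_call * 1e6:.1f} us per call")
```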

##### %%writefile
Saves the contents of a cell to a file.
For example, a code cell that starts with the line %%writefile myscript.py is saved as an external file called myscript.py.
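
In plain Python, the combination of saving a cell to a file and then running that file (the %run magic described earlier) looks roughly like the sketch below. This is not how IPython implements either magic. The script contents are made up, and a temporary directory is used so that nothing is left on disk.

```python
import runpy
import tempfile
from pathlib import Path

# Source text that would normally sit in a notebook cell
cell_source = "greeting = 'hello from myscript'\nanswer = 6 * 7\n"

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "myscript.py"

    # Roughly the "write" step: save the cell text to disk
    script.write_text(cell_source)

    # Roughly the "run" step: execute the file and collect its globals
    namespace = runpy.run_path(str(script))

print(namespace["greeting"], namespace["answer"])  # hello from myscript 42
```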

##### %env
report_date = %env REPORT_DATE
The %env line magic makes it easy to assign the value of an environment variable to a Python variable.
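
The plain-Python counterpart uses `os.environ`, which also works outside a notebook. The value assigned to `REPORT_DATE` below is a placeholder for the demo. Normally it would be set in the shell before the notebook server starts.

```python
import os

# Placeholder value for the demo; normally set outside the notebook
os.environ["REPORT_DATE"] = "2021-07-14"

# Counterpart of: report_date = %env REPORT_DATE
# The second argument is a fallback used when the variable is not set
report_date = os.environ.get("REPORT_DATE", "unset")
print(report_date)  # 2021-07-14
```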

##### %store
Save a variable for use in a different notebook.

##### %pwd
Print the directory path you're currently working in.

##### %%javascript  /  %%latex  /  %%HTML  /  %%sql
Runs the cell as JavaScript, LaTeX, HTML, or a SQL query. The SQL magic is not built in; it comes from an extension such as ipython-sql.
LaTeX is useful for presenting equations.


For parameterizing and executing Jupyter notebooks (like Python source files), see [Papermill](https://papermill.readthedocs.io/en/latest/).

For using SQL in notebooks, see [here](https://towardsdatascience.com/heres-how-to-run-sql-in-jupyter-notebooks-f26eb90f3259).

<a id="1.6."></a>
## 1.6. Exercises

1. Start Python to get an interactive prompt.
    1. What Python version are you using? We hope it is at least 3.6!
    1. Can you import torch? What version of PyTorch do you get?
    1. What is the result of torch.cuda.is_available()? Does it match your expectation based on the hardware you’re using?
2. Start the Jupyter notebook server.
    1. What version of Python is Jupyter using?
    1. Is the location of the torch library used by Jupyter the same as the one you imported from the interactive prompt?

In [4]:
#1a
!python3 --version

Python 3.8.6


In [5]:
#1b
import torch
torch.__version__

'1.7.0'

In [6]:
#1c
torch.cuda.is_available()

True

In [7]:
#2b
torch.__file__

'/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/__init__.py'
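
Exercise 2a has no cell above. Note also that `!python3 --version` in cell 1a reports whichever `python3` comes first on the shell's PATH, which is not necessarily the interpreter running the notebook kernel. A minimal sketch that asks the running interpreter directly:

```python
import sys

# Version of the interpreter executing this code (the kernel, inside Jupyter)
version = ".".join(str(part) for part in sys.version_info[:3])

# Path of that interpreter; compare it with the one used at the terminal prompt
interpreter = sys.executable

print(version, interpreter)
```

If this path and the `torch.__file__` location both sit under the same environment directory, Jupyter and the interactive prompt are using the same installation.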

##### Making sure that everything works

In [8]:
a = torch.ones(3, 3)
b = torch.ones(3, 3)

a + b

tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])

In [9]:
a.to('cuda') + b.to('cuda')

tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]], device='cuda:0')
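
The last cell raises an error on a machine without a CUDA-capable GPU. A device-agnostic variant of the same sanity check, written as a sketch, picks the device once and creates the tensors there directly:

```python
import torch

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(3, 3, device=device)
b = torch.ones(3, 3, device=device)
total = a + b

print(total.device.type, total.sum().item())
```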