# Recitation 0 - Introduction
Welcome to Recitation 0 for the Fall 2020 iteration of the course 11-785: Intro to Deep Learning. 
Recitation 0 is split into 3 big pictures, each divided into sub-parts. 
* Big Picture 1 (Recitation 0A, 0B, 0C) - Introduction to Python, NumPy, and PyTorch
* Big Picture 2 (Recitation 0D, 0E)     - Introduction to AWS, Google Colab, and Jupyter
* Big Picture 3 (Recitation 0F, 0G)     - Introduction to Debugging, TensorBoard, t-SNE, and Visualization


# *Recitation 0A - Introduction to Python*



### Python

All information about Python, from downloads to documentation, can be found at https://www.python.org. We recommend that you use Python 3 for the homeworks in this course.

### External Modules

Install external modules using the pip package manager (https://pypi.org/project/pip/). 
The main modules we'll be using in the course are:

*   numpy
*   torch (PyTorch)

To check that both modules were installed correctly, run the commands below in your terminal and confirm that each prints a version number.

```
python -c "import torch; print(torch.__version__)"
python -c "import numpy; print(numpy.version.version)"
```
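
The same check can also be done from inside Python. The sketch below is one possible approach using only the standard library: it asks whether a module can be located without actually importing it. The helper name `is_installed` is made up for this example.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None

# Report the status of the two modules used in this course
for name in ["numpy", "torch"]:
    status = "found" if is_installed(name) else "missing (try: pip install " + name + ")"
    print(name, "->", status)
```

If a module is reported missing here but pip says it is installed, pip and python probably belong to different environments.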



### Importing Modules code snippets

In [1]:
import numpy
a = numpy.array([1,2,3])

import numpy as np
a = np.array([1,2,3])

from math import ceil

from math import *  # imports every public name; convenient, but it can shadow names you already defined

### File Formats and Loading Data



*   .txt: plain text file
*   .pkl: python objects
*   .csv: tabular data - fields separated by commas
*   .npy: numpy arrays (saved using numpy library)
*   .npz: zipped archive of npy files

In [2]:
# traditional way of reading files with explicit open and close statements

f = open("recitation0a.txt", "r")
print(f.readlines())
f.close()

['This is the first line of text.\n', 'This is the second line of text.']


In [3]:
# the "with" statement closes the file automatically, so no explicit close() is needed

with open("recitation0a.txt", "r") as f:
    print(f.readlines())

['This is the first line of text.\n', 'This is the second line of text.']


More about file access modes: https://stackoverflow.com/questions/16208206/confused-by-python-file-mode-w
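
As a quick illustration of the three most common modes, the sketch below writes, appends, and then reads a file. It works inside a temporary directory so nothing on disk is overwritten; the file name `modes_demo.txt` is arbitrary.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # arbitrary file name

    # "w": create the file, or truncate it if it already exists
    with open(path, "w") as f:
        f.write("first line\n")

    # "a": append to the end without erasing existing content
    with open(path, "a") as f:
        f.write("second line\n")

    # "r": read-only (the default mode)
    with open(path, "r") as f:
        lines = f.readlines()

print(lines)   # ['first line\n', 'second line\n']
```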

In [5]:
# to read a csv file

import csv

with open("sampledata.csv", "r", newline="") as f:  # newline="" is what the csv docs recommend
    reader = csv.reader(f, delimiter=",")
    
    for row in reader:
        print(row)

['Year', 'Variable_name', 'Variable_category', 'Value']
['2018', 'Total income', 'Financial performance', '691859']
['2018', 'Sales, government funding, grants and subsidies', 'Financial performance', '605766']
['2018', 'Interest, dividends and donations', 'Financial performance', '63509']
['2018', 'Non-operating income', 'Financial performance', '22583']
['2018', 'Total expenditure', 'Financial performance', '597623']
['2018', 'Interest and donations', 'Financial performance', '34223']
['2018', 'Indirect taxes', 'Financial performance', '7124']
['2018', 'Depreciation', 'Financial performance', '19863']
['2018', 'Salaries and wages paid', 'Financial performance', '106351']
['2018', 'Redundancy and severance', 'Financial performance', '297']
['2018', 'Salaries and wages to self employed commission agents', 'Financial performance', '1659']
['2018', 'Purchases and other operating expenses', 'Financial performance', '418642']
['2018', 'Non-operating expenses', 'Financial performance', '976
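
Note that `csv.reader` returns every field as a string. If you would rather look fields up by column name, `csv.DictReader` is one option. The sketch below uses two rows copied from the output above, with an in-memory `io.StringIO` standing in for the real file.

```python
import csv
import io

# Two rows copied from the output above; StringIO stands in for the real file.
sample = io.StringIO(
    "Year,Variable_name,Variable_category,Value\n"
    "2018,Total income,Financial performance,691859\n"
    "2018,Depreciation,Financial performance,19863\n"
)

reader = csv.DictReader(sample)   # the first row is used as the keys
rows = list(reader)

# csv always yields strings, so convert numeric fields explicitly
values = [int(row["Value"]) for row in rows]
print(rows[0]["Variable_name"], values)   # Total income [691859, 19863]
```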

In [7]:
import pickle

mydict = {"student1": "Alice", "Student2": "Bob", "Student3": "Rachel"}

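# pickle files are binary ("wb"/"rb"); "with" makes sure each file is closed
with open("store.pkl", "wb") as f:
    pickle.dump(mydict, f)

with open("store.pkl", "rb") as f:
    loaded = pickle.load(f)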
print(loaded)

{'student1': 'Alice', 'Student2': 'Bob', 'Student3': 'Rachel'}


### Data Storing


 - lists: generic container - allow for numeric indexing
 - tuples: immutable lists
 - dictionaries: keys act as indices - keys must be unique
 - sets: group of unique elements

In [8]:
same_type_list = [1, 3, 4, 89, 23, 43, 90]
diff_type_list = [1, 3, "hello", 4.9, "c"]

print(same_type_list[3])
print(len(same_type_list))

same_type_list[0] = "I'm new"
print(same_type_list)

89
7
["I'm new", 3, 4, 89, 23, 43, 90]


In [9]:
# Concatenation
new_list = same_type_list + diff_type_list
print(new_list)

["I'm new", 3, 4, 89, 23, 43, 90, 1, 3, 'hello', 4.9, 'c']


In [10]:
new_list_2 = ["hi", "hello"] * 2
print(new_list_2)

['hi', 'hello', 'hi', 'hello']


In [11]:
same_type_tuple = (1, 10, 7)
diff_type_tuple = (1, 2, "foo") 

print(diff_type_tuple[2])
print(same_type_tuple[1])

# same_type_tuple[0] = 3   # raises TypeError: tuples are immutable

foo
10
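
To see what immutability means in practice, the short sketch below attempts the assignment that is commented out above, catches the resulting error, and then shows the usual workaround of building a new tuple. The name `updated_tuple` is just for illustration.

```python
same_type_tuple = (1, 10, 7)

try:
    same_type_tuple[0] = 3          # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# To "change" a tuple, build a new one instead
updated_tuple = (3,) + same_type_tuple[1:]
print(updated_tuple)   # (3, 10, 7)
```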


In [12]:
# some more list/tuple functions

print(max(same_type_tuple))
print(min(same_type_tuple))
print(sorted(same_type_tuple))

# print(min(same_type_list))   # raises TypeError: same_type_list now mixes str and int

10
1
[1, 7, 10]


In [18]:
my_dict = {"student1": "Alice", "student2": "Bob", "student3": "Rachel"}

print(my_dict["student1"])
# print(my_dict[0])           # KeyError: dictionaries are not indexed by position
# print(my_dict["student4"])  # KeyError: the key does not exist

# .get() returns a default value instead of raising KeyError
print(my_dict.get("student4", "student does not exist"))

Alice
student does not exist
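
A few more dictionary operations that come up often are sketched below: what a missing key does with direct indexing, membership tests, and iterating over key/value pairs.

```python
my_dict = {"student1": "Alice", "student2": "Bob", "student3": "Rachel"}

# Direct indexing raises KeyError for a missing key
try:
    my_dict["student4"]
except KeyError as err:
    print("KeyError:", err)

# Membership tests check keys, not values
print("student2" in my_dict)   # True
print("Bob" in my_dict)        # False

# Iterate over key/value pairs
for key, value in my_dict.items():
    print(key, "->", value)
```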


In [14]:
my_dict["student1"] = "Billy"
print(my_dict)

{'student1': 'Billy', 'student2': 'Bob', 'student3': 'Rachel'}


In [19]:
my_set = {"obj1", "obj2", "obj3"}
print(my_set)
# print(my_set[1])   # raises TypeError: sets are unordered and do not support indexing

{'obj1', 'obj2', 'obj3'}
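
The "unique elements" property and the basic set operations can be sketched as follows. The second set, `other_set`, is made up for this example.

```python
my_set = {"obj1", "obj2", "obj3", "obj1"}   # the duplicate "obj1" is dropped
print(len(my_set))                          # 3

other_set = {"obj3", "obj4"}                # made-up second set
print(my_set & other_set)                   # intersection: {'obj3'}
print(sorted(my_set | other_set))           # union, sorted for a stable print order

# Sets are unordered, so indexing is not supported
try:
    my_set[1]
except TypeError as err:
    print("TypeError:", err)
```

Because sets have no guaranteed order, the printed order of a set with several elements can differ from the order in which you wrote them.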


### Random

In [None]:
# pick a random key from a dictionary (list(countries) is the list of its keys)

import random
countries = {'VENEZUELA':'CARACAS', 'CANADA':'OTTAWA'}
bex = random.choice(list(countries))
print(bex)

CANADA


More about Random: https://docs.python.org/3/library/random.html
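
Since `random.choice` can return a different key on each run, it is often useful to fix the seed when you need reproducible results. A minimal sketch, with an arbitrarily chosen seed value:

```python
import random

countries = {"VENEZUELA": "CARACAS", "CANADA": "OTTAWA"}

random.seed(11785)                        # arbitrary seed value chosen for this sketch
first_pick = random.choice(list(countries))

random.seed(11785)                        # re-seeding replays the same sequence
second_pick = random.choice(list(countries))

print(first_pick == second_pick)          # True
print(first_pick, "->", countries[first_pick])
```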

### List of lists to numpy array

In [20]:
some_array = [[[64,64,32],
                 [22,23,24]],
                [[64,4,32],
                 [22,3,24]]
               ]
some_array = np.asarray(some_array)
print(some_array.shape)

another_array = list(range(3))
print(another_array)

(2, 2, 3)
[0, 1, 2]


### Filtering Lists

1. Slicing and Dicing
2. List comprehensions

In [None]:
# slicing & dicing
# general format: sliced_list = some_list[start_idx : stop_idx : step]   (stop_idx is exclusive)

some_list = [2, 5, 2, 45, 7, 9, 76, 80, 21, 53]
print(some_list[5:])
print(some_list[:3])
print(some_list[3:9:2])

[9, 76, 80, 21, 53]
[2, 5, 2]
[45, 9, 80]
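
Slices also accept negative indices and negative steps. The sketch below reuses `some_list` from the cell above to show a few common patterns, and checks that a slice is a copy rather than a view of the original list.

```python
some_list = [2, 5, 2, 45, 7, 9, 76, 80, 21, 53]

print(some_list[-3:])     # last three items: [80, 21, 53]
print(some_list[:-3])     # all but the last three: [2, 5, 2, 45, 7, 9, 76]
print(some_list[::-1])    # reversed copy: [53, 21, 80, 76, 9, 7, 45, 2, 5, 2]

# The stop index is exclusive, so the slice length is stop - start
print(len(some_list[3:9]))   # 6

# A slice of a list is a new list; changing it leaves the original untouched
list_copy = some_list[:]
list_copy[0] = -1
print(some_list[0])          # 2
```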


### Slicing with 2D and 3D arrays

3D arrays are indexed across the 3 dimensions as follows:

![3D array](3d-array-stack.png)

Where if you consider your 3D array to be a stack of matrices, i selects the matrix, j selects the row in that matrix and k selects the column in that matrix.

In [22]:
# slicing 3D array examples:

three_d_array = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
               [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
               [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

print(three_d_array)

[[[10 11 12]
  [13 14 15]
  [16 17 18]]

 [[20 21 22]
  [23 24 25]
  [26 27 28]]

 [[30 31 32]
  [33 34 35]
  [36 37 38]]]


In [27]:
# ------------ selecting a row ------------
# you want to specify the matrix, then the row
# Note the difference between a comma and a colon

print(three_d_array[0,2]) # matrix 0, row 2
print()
print(three_d_array[0:2])

[16 17 18]

[[[10 11 12]
  [13 14 15]
  [16 17 18]]

 [[20 21 22]
  [23 24 25]
  [26 27 28]]]


In [28]:
# ------------ selecting a column ------------
# you want to specify the matrix, ignore the row, and then specify the column

print(three_d_array[1, :, 1]) # matrix 1, column 1

[21 24 27]


In [29]:
# ------------ selecting a matrix ------------

print(three_d_array[2]) # matrix 2

[[30 31 32]
 [33 34 35]
 [36 37 38]]


In [30]:
# ------------ creating a row across matrices ------------
print(three_d_array[:, 1, 2]) # for every matrix, row 1, column 2 

[15 25 35]


In [31]:
# ------------ creating a matrix from rows ------------
print(three_d_array[:, 1]) # for every matrix, row 1

[[13 14 15]
 [23 24 25]
 [33 34 35]]


In [32]:
# ------------ creating a matrix from columns ------------
print(three_d_array[:, :, 1]) # for every matrix, for every row, column 1

[[11 14 17]
 [21 24 27]
 [31 34 37]]


You can also slice within rows, columns and matrices in a 3D array, the same way you would in 1D arrays.
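
For instance, ranges can be combined with fixed indices on any axis. The following sketch rebuilds the same `three_d_array` and takes a few such sub-blocks; the variable names `top_block`, `strided`, and `sub_cube` are only for illustration.

```python
import numpy as np

# Same array as in the examples above
three_d_array = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
                          [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
                          [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

# matrix 0, rows 1 onward, first two columns -> rows [13 14] and [16 17]
top_block = three_d_array[0, 1:, :2]
print(top_block)

# matrices 1 and 2, row 0, every second column -> rows [20 22] and [30 32]
strided = three_d_array[1:, 0, ::2]
print(strided)

# using a range on every axis keeps all three dimensions
sub_cube = three_d_array[:, :2, 1:]
print(sub_cube.shape)       # (3, 2, 2)
```

Each integer index removes one dimension from the result, while each range keeps it.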

### List Comprehensions

In [34]:
# general format: new_list = [expression for item in iterable if condition]

res = [num for num in same_type_list if isinstance(num, int) and num>0]
print(res)

[3, 4, 89, 23, 43, 90]


In [35]:
%%timeit 

res2 = []
res2 = [i for i in range(500)]

14.8 µs ± 409 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [36]:
%%timeit

res3 = []
for i in range(500):
    res3.append(i)

31.3 µs ± 278 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
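
`%%timeit` only works inside Jupyter/IPython. In a plain Python script, a similar comparison can be sketched with the standard `timeit` module. The function names and the repeat count of 1000 are arbitrary choices for this example, and the timings will differ from machine to machine.

```python
import timeit

def with_comprehension():
    return [i for i in range(500)]

def with_append():
    result = []
    for i in range(500):
        result.append(i)
    return result

# Both approaches build the same list
print(with_comprehension() == with_append())   # True

# number=1000 is an arbitrary repeat count; timings vary by machine
t_comp = timeit.timeit(with_comprehension, number=1000)
t_loop = timeit.timeit(with_append, number=1000)
print(f"comprehension: {t_comp:.4f}s, append loop: {t_loop:.4f}s")
```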


# Classes

You'll use classes extensively in the homeworks in this course. You will need classes for defining your models as well as your datasets. We'll consider an example of implementing the dataset class here.

In [None]:
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __getitem__(self, index):    
        return (self.x[index], self.y[index])
    
    def __len__(self):
        return len(self.x) 
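
To see what `__getitem__` and `__len__` do without installing torch, here is a plain-Python stand-in with the same two methods. `ToyDataset` and its data are made up for this sketch; it does not inherit from `torch.utils.data.Dataset`, so it only illustrates the indexing protocol, not batching or shuffling.

```python
class ToyDataset:
    """Plain-Python stand-in for MyDataset (no torch dependency)."""

    def __init__(self, x, y):
        assert len(x) == len(y), "features and labels must have the same length"
        self.x = x
        self.y = y

    def __getitem__(self, index):
        # called by dataset[index]
        return (self.x[index], self.y[index])

    def __len__(self):
        # called by len(dataset)
        return len(self.x)

# Made-up features and labels for illustration
dataset = ToyDataset(x=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], y=[0, 1, 0])

print(len(dataset))      # 3
print(dataset[1])        # ([0.3, 0.4], 1)

# An object with __getitem__ over indices 0, 1, 2, ... can also be iterated
for features, label in dataset:
    print(features, label)
```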

# Debugging

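You'll be using pdb extensively for debugging throughout the semester, especially in the Part 2s. A few useful pdb commands:

* n - next: run the current line, stepping over function calls
* s - step: run the current line, stepping into function calls
* c - continue: run until the next breakpoint
* q - quit the debugger
* b - set a breakpoint (or list all breakpoints when given no argument)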

You can read more about pdb and its commands here https://docs.python.org/3/library/pdb.html
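
In longer loops you usually do not want to stop on every iteration. One common pattern, sketched below, is to guard `pdb.set_trace()` with a condition. The `DEBUG` flag and the `running_total` function are hypothetical names for this example; with `DEBUG = False` the code runs straight through without pausing.

```python
import pdb

DEBUG = False   # hypothetical switch: set to True to drop into pdb

def running_total(values):
    total = 0
    for i, value in enumerate(values):
        # Pause only on the iteration of interest rather than on every pass
        if DEBUG and value < 0:
            pdb.set_trace()
        total += value
    return total

print(running_total([3, -1, 4]))   # 6
```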

In [None]:
# Notice the use of "n", "s" and "c" once pdb.set_trace() pauses execution.
# Using "s" on the pdb.set_trace() line steps into the set_trace() function itself.
# To leave it and run on to the next breakpoint, use "c".

import pdb

for i in range(3):
    pdb.set_trace()
    print("this is iteration " + str(i))

> <ipython-input-37-b00ef2ee1653>(8)<module>()
-> print("this is iteration " + str(i))
(Pdb) n
this is iteration 0
> <ipython-input-37-b00ef2ee1653>(6)<module>()
-> for i in range(3):
(Pdb) s
> <ipython-input-37-b00ef2ee1653>(7)<module>()
-> pdb.set_trace()
(Pdb) s
--Call--
> /usr/lib/python3.6/pdb.py(1584)set_trace()
-> def set_trace():
(Pdb) s
> /usr/lib/python3.6/pdb.py(1585)set_trace()
-> Pdb().set_trace(sys._getframe().f_back)
(Pdb) n
--Return--
> /usr/lib/python3.6/pdb.py(1585)set_trace()->None
-> Pdb().set_trace(sys._getframe().f_back)
(Pdb) c
this is iteration 1
> <ipython-input-37-b00ef2ee1653>(8)<module>()
-> print("this is iteration " + str(i))
(Pdb) c
this is iteration 2
