# CME538 - Introduction to Data Science
## Lecture 1.3 - Python Basics
### Goals
Provide a broad overview of basic Python and NumPy functionality. Further refresher material is available via [LinkedIn-Learning](https://www.linkedin.com/learning-login/).

### Lecture Structure
1. [Strings, Numbers, and Booleans](#section1)
2. [Operators](#section2)
3. [Tuples, lists, and dictionaries](#section3)
4. [Conditional statements](#section4)
5. [Loops](#section5)
6. [Functions](#section6)
7. [Variable scope](#section7)
8. [Objects and classes](#section8)
9. [List comprehension](#section9)
10. [NumPy Basics](#section10)
11. [Writing clean code](#section11)

## Setup Notebook
At the start of a notebook, we need to import the Python packages we plan to use.
* [Time](https://docs.python.org/3/library/time.html) - This module provides various time-related functions. 
* [NumPy](https://numpy.org/) - A library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy was introduced in Lecture 4 and we will learn more about its functionality in this lecture. It is customary to `import numpy as np`.

In [None]:
# Import standard library and 3rd party packages
import time
import numpy as np

<a id='section1'></a>
## 1. Strings, Numbers, and Booleans
In programming, data type is an important concept. Variables can store data of different types, and different types can do different things.

Python has the following data types built-in by default, in these categories:

- **Text Type: str**
- **Numeric Types: int, float, complex**
- Sequence Types: list, tuple, range
- Mapping Type: dict
- Set Types: set, frozenset
- **Boolean Type: bool**
- Binary Types: bytes, bytearray, memoryview

In this section we will review Strings, Numbers, and Booleans.

### Strings
Strings can contain numbers and / or characters. For example, a string might be a word, a sentence, or several sentences. 

Below are a few examples of strings.

In [None]:
var1 = 'hello world'
var2 = '5'
var3 = '5.789'
var4 = 'The quick brown fox jumps over the lazy dog.'

We can check the data type of any variable using the built-in function `type()`.

In [None]:
type(var1)

We can also convert numbers to strings using the `str()` function.

In [None]:
str(5.789)

Strings in Python are iterable sequences of characters and are often used as such. However, they are just as often treated as single atomic values rather than as sequences.

In [None]:
var1

We can use the built-in `len()` function to compute the number of characters in a string.

In [None]:
len(var1)

Strings can be indexed and sliced using the `[]` operator.

In [None]:
var1[0]

In [None]:
var1[0:8]

In [None]:
for char in var1:
    print(char)

Strings can be added together, which works like concatenation.

In [None]:
var2 = '5'
var3 = '5.789'
var2 + var3

In [None]:
var3 + var2
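
As a small sketch of the slicing syntax above, the general form is `[start:stop:step]`, where `stop` is excluded and negative values count from the end of the string. The variable names below are made up for illustration and are not used elsewhere in the lecture.

```python
# Hypothetical example string, chosen to match var1 above.
text = 'hello world'

first_word = text[:5]        # characters at positions 0 through 4
last_word = text[-5:]        # the last five characters
every_other = text[::2]      # every second character
reversed_text = text[::-1]   # a negative step walks backwards

# '+' concatenates strings; numeric addition needs a conversion first.
joined = '5' + '5.789'
total = int('5') + float('5.789')

print(first_word, last_word, every_other, reversed_text)
print(joined, total)
```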

### Numbers
Python has three built-in numerical data types:
- Integer (`int`)
- Float (`float`)
- Complex (`complex`)

#### Integers
An integer never has a decimal point. 
Native Python integers have arbitrary precision: unlike NumPy's fixed-width `int64`, their size is limited only by the available memory, not by a fixed number of bits.

In [None]:
var2 = 5
type(var2)

If a float is passed to the `int()` function, the fractional part is discarded, so the number is truncated toward zero.

In [None]:
var3 = 5.789
int(var3)
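
The behaviour of `int()` is easiest to see with a negative number. The sketch below compares it with `math.floor()` and `round()`; the variable names are illustrative only.

```python
import math

positive = 5.789
negative = -5.789

truncated_pos = int(positive)        # 5
truncated_neg = int(negative)        # -5, truncation moves toward zero
floored_neg = math.floor(negative)   # -6, floor always moves down
rounded_pos = round(positive)        # 6, the nearest integer

# Python integers grow as needed, well beyond the 64-bit range.
big = 2 ** 100

print(truncated_pos, truncated_neg, floored_neg, rounded_pos)
print(big)
```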

Python integers are signed, meaning they can be positive or negative.

In [None]:
var2 = 5
var2 = - var2
print(var2)

#### Floats

A floating point number (also known as a float) has a decimal point, even if the value after the decimal point is 0. Native Python floats are 64-bit (double precision, equivalent to NumPy's `float64`), which limits both the range and the precision of the numbers they can encode.

In [None]:
var3 = 5.789
type(var3)

When adding a float and an int, the output is a float.

In [None]:
var2 = 5
var3 = 5.789
type(var2 + var3)

Floats can also be represented in Scientific Notation.

In [None]:
1.1

In [None]:
1.1e0  # 1.1 * 10**0

In [None]:
1.1e1  # 1.1 * 10**1

In [None]:
1.1e2  # 1.1 * 10**2

In [None]:
1.1e3  # 1.1 * 10**3

In [None]:
type(1.1e3)
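
Because floats are stored in binary with finite precision, some decimal values cannot be represented exactly. The following sketch shows a common consequence and one way to compare floats safely using `math.isclose()` from the standard library.

```python
import math

total = 0.1 + 0.2

exact_match = (total == 0.3)             # False, because of binary rounding
close_match = math.isclose(total, 0.3)   # True, within a relative tolerance

# Scientific notation also works with negative exponents.
small = 2.5e-3

print(total, exact_match, close_match, small)
```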

#### Complex
In Python, you can put `j` or `J` after a number to make it imaginary.

In [None]:
var5 = 2 + 3j
type(var5)

In [None]:
type(2 + 3J)

In [None]:
var5.real

In [None]:
var5.imag

In [None]:
var5.conjugate()

### Booleans
Booleans represent one of two values: True or False. When you compare two values, Python returns a Boolean answer.

In [None]:
print(10 > 9)

In [None]:
type(10 > 9)

In [None]:
type(True)

In [None]:
type(False)

<a id='section2'></a>
## 2. Operators
### Arithmetic Operators
Arithmetic operators are used with numeric values to perform common mathematical operations.
<br>
<img src="images/arithmetic_operators.png" alt="drawing" width="400"/>
<br>
Division

In [None]:
5 / 3

Floor Division

In [None]:
5 // 3

The % operator (Modulus) yields the remainder from the division of the first argument by the second. 

In [None]:
5 % 3
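
Floor division and modulus are linked: for integers, `a == (a // b) * b + (a % b)`. A brief sketch, which also shows how both operators behave with a negative operand:

```python
# divmod() returns the floor quotient and the remainder together.
quotient, remainder = divmod(5, 3)
rebuilt = quotient * 3 + remainder   # recovers the original 5

# With a negative operand, // rounds down, not toward zero.
neg_floor = -5 // 3   # -2
neg_mod = -5 % 3      # 1, the result takes the sign of the divisor

# A common use of %: testing whether a number is even.
is_even = (10 % 2 == 0)

print(quotient, remainder, rebuilt, neg_floor, neg_mod, is_even)
```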

In [None]:
# Try out the rest for yourself!

### Assignment Operators
Assignment operators are used to assign values to variables.
<br>
<img src="images/assignment_operators.png" alt="drawing" width="400"/>
<br>

In [None]:
var3 = 5.789
var3 += 10
var3

In [None]:
var3 = 5.789
var3 = var3 + 10
var3

In [None]:
# Try out the rest for yourself!

### Comparison Operators
Comparison operators are used to compare two values.
<br>
<img src="images/comparison_operators.png" alt="drawing" width="450"/>
<br>

In [None]:
5 > 10

In [None]:
5 != 10

In [None]:
10 <= 10

In [None]:
# Try out the rest for yourself!

### Logical Operators
Logical operators are used to combine conditional statements.
<br>
<img src="images/logical_operators.png" alt="drawing" width="450"/>
<br>

In [None]:
5 > 10 and 6 != 10

In [None]:
5 > 10 or 6 != 10

In [None]:
5 > 10 or not 6 != 10

### Identity Operators
Identity operators compare objects, not to check whether they are equal, but to check whether they are the very same object stored at the same memory location.
<br>
<img src="images/identity_operators.png" alt="drawing" width="400"/>
<br>

In [None]:
x = ['car', 'boat']
y = ['car', 'boat']
z = x

In [None]:
x is y

In [None]:
x is z

In [None]:
x == y
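
The practical consequence of identity shows up when a list is modified. In this sketch, `z` is another name for the same list as `x`, so a change made through `z` is visible through `x`, while the equal-looking list `y` is unaffected.

```python
x = ['car', 'boat']
y = ['car', 'boat']   # equal contents, but a separate object
z = x                 # a second name for the same object as x

z.append('plane')     # modifies the one list shared by x and z

same_object = id(x) == id(z)   # equivalent to `x is z`

print(x)   # includes 'plane'
print(y)   # unchanged
print(same_object)
```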

### Membership Operators
Membership operators are used to test whether a value is present in a sequence or other container, such as a string, list, tuple, or dictionary.
<br>
<img src="images/membership_operators.png" alt="drawing" width="400"/>
<br>

In [None]:
'b' not in 'sandwich'

In [None]:
1 in [1, 2, 3, 4]

In [None]:
1 in 10
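
The last cell fails because an integer is not a container. The sketch below captures that error and also shows how `in` behaves with a dictionary, where it checks the keys rather than the values. The dictionary is a made-up example.

```python
movie = {'title': 'The Lion King', 'year': 1994}

key_found = 'title' in movie              # True, `in` checks the keys
value_found = 1994 in movie               # False, 1994 is a value, not a key
value_in_values = 1994 in movie.values()  # True
substring_found = 'sand' in 'sandwich'    # True, substrings count for strings

number = 10
try:
    1 in number
    raised = False
except TypeError:
    # Integers do not support membership tests.
    raised = True

print(key_found, value_found, value_in_values, substring_found, raised)
```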

<a id='section3'></a>
## 3. Tuples, lists and dictionaries
### Tuple
A tuple is a collection which is ordered and immutable (cannot be changed). In Python tuples are written with round brackets.

In [None]:
thistuple = ('high five', 1, 50.345, True, ('apple', 'yogurt'))

In [None]:
len(thistuple)

In [None]:
thistuple[0:3]

In [None]:
thistuple[0]

In [None]:
thistuple[4][0]

In [None]:
thistuple[0] = 'low five'

### List
A list is a collection which is ordered and changeable. In Python lists are written with square brackets.

In [None]:
thislist = ['high five', 1, 50.345, True, ('apple', 'yogurt')]

In [None]:
len(thislist)

In [None]:
thislist[0:3]

In [None]:
thislist[0] = 'low five'

In [None]:
print(thislist)

In [None]:
thislist[4][0]

### Dictionary
A dictionary is a collection of key-value pairs which is changeable and indexed by key. Since Python 3.7, dictionaries preserve the order in which items were inserted. In Python dictionaries are written with curly brackets, and they have keys and values.

In [None]:
thisdict = {'title': 'The Lion King', 
            'year': 1994, 
            'directors': ['Roger Allers', 'Rob Minkoff']}

In [None]:
len(thisdict)

In [None]:
thisdict['title']

In [None]:
thisdict['genres'] = 'animation'

In [None]:
print(thisdict)
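
A few other common dictionary operations, sketched with a shortened, hypothetical version of the dictionary above: `.get()` reads a key with a fallback value, and `.items()` iterates over key-value pairs in insertion order.

```python
movie = {'title': 'The Lion King', 'year': 1994}
movie['genres'] = 'animation'

# .get() returns a default instead of raising KeyError for a missing key.
rating = movie.get('rating', 'unknown')

# Iterating over a dictionary yields its keys in insertion order.
keys_in_order = list(movie)

# .items() yields (key, value) pairs.
lines = []
for key, value in movie.items():
    lines.append('{}: {}'.format(key, value))

print(rating)
print(keys_in_order)
print(lines)
```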

<a id='section4'></a>
## 4. Conditional statements
### if, elif, else statements
Allows for conditional execution of a statement or group of statements.
<br>
<img src="images/conditional_statement.png" alt="drawing" width="400"/>
<br>

In [None]:
a = 10
b = 50
if b > a:
    print('b is greater than a')
print('Finished!')

In [None]:
a = 50
b = 10
if b > a:
    print('b is greater than a')
print('Finished!')

In [None]:
a = 100
b = 50
if b > a or b * a == 5:
    print('b is greater than a')
elif a == b and a is not None:
    print('a and b are equal')
else:
    print('a is greater than b')

<a id='section5'></a>
## 5. Loops
### while loop
With the while loop we can execute a set of statements repeatedly for as long as a condition is true.

In [None]:
i = 0
while i < 6:
    i += 1
    if i == 3:
        continue
    if i == 5:
        break
    print(i)
else:
    print('i is no longer less than 6')
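
In the loop above, the `else` block never runs, because the loop is exited with `break`. The `else` clause of a `while` loop only runs when the condition becomes false. A minimal sketch contrasting the two cases, with results collected in lists so they are easy to inspect:

```python
# Case 1: the loop is exited with break, so else is skipped.
events_with_break = []
i = 0
while i < 6:
    i += 1
    if i == 5:
        break
    events_with_break.append(i)
else:
    events_with_break.append('else ran')

# Case 2: the condition becomes false, so else runs.
events_without_break = []
i = 0
while i < 3:
    i += 1
    events_without_break.append(i)
else:
    events_without_break.append('else ran')

print(events_with_break)
print(events_without_break)
```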

### for loop
A for loop is used for iterating over an iterable object, such as a list, a tuple, a dictionary, a set, or a string.

In [None]:
tasty_foods = ['pizza', 'ice cream', 'french fries']
for x in tasty_foods:
    if x == 'pizza':
        continue
    print(x)
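
Two helpers that often appear alongside `for` loops are the built-in `enumerate()`, which pairs each item with a counter, and the dictionary method `.items()`. The sketch below uses made-up prices purely for illustration.

```python
tasty_foods = ['pizza', 'ice cream', 'french fries']

# enumerate() yields (counter, item) pairs; start=1 begins counting at 1.
ranked = []
for position, food in enumerate(tasty_foods, start=1):
    ranked.append('{}. {}'.format(position, food))

# Hypothetical prices, used only to show looping over a dictionary.
prices = {'pizza': 12.5, 'ice cream': 4.0}
total = 0
for food, price in prices.items():
    total += price

# Strings are iterable too, one character at a time.
letters = []
for char in 'abc':
    letters.append(char)

print(ranked, total, letters)
```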

<a id='section6'></a>
## 6. Functions
A function is a block of code which only runs when it is called. You can pass data, known as parameters, into a function. A function can return data as a result.

In [None]:
def absolute_value(number):
    if number >= 0:
        return number
    else:
        return -number
    print('Done.')

In [None]:
absolute_value(number=10)

In [None]:
absolute_value(number=-10)

In [None]:
absolute_value(number='10')

In [None]:
def absolute_value(number):
    number = float(number)
    if number >= 0:
        return number
    else:
        return -number
    print('Done.')

In [None]:
absolute_value(number=10)

In [None]:
absolute_value(number=-10)

In [None]:
absolute_value(number='10')

In [None]:
absolute_value(number='ten')

In [None]:
def absolute_value(number):
    if isinstance(number, (int, float)):
        if number >= 0:
            return number
        else:
            return -number
    else:
        print('number is not integer or float')
        return None
    print('Done.')

In [None]:
absolute_value(number=10)

In [None]:
absolute_value(number=-10)

In [None]:
absolute_value(number='ten')
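
An alternative to printing a message and returning `None` is to raise an exception, which lets the caller decide how to handle invalid input. This is a sketch of one possible variant, not a replacement for the versions above; the name `absolute_value_strict` is hypothetical.

```python
def absolute_value_strict(number):
    """Return the absolute value, raising TypeError for non-numeric input."""
    if not isinstance(number, (int, float)):
        raise TypeError(
            'number must be an int or float, got {}'.format(type(number).__name__)
        )
    if number >= 0:
        return number
    return -number


# The caller handles the error with try / except.
try:
    absolute_value_strict('ten')
    message = None
except TypeError as error:
    message = str(error)

print(absolute_value_strict(-10))
print(message)
```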

In [None]:
def get_greeting(first_name, last_name, country='Canada'):
    greeting = 'Hello, my name is {} {} and I am from {}'.format(first_name, 
                                                                 last_name, 
                                                                 country)
    return greeting, greeting.upper()

In [None]:
get_greeting(first_name='Sebastian', last_name='Goodfellow')

In [None]:
out = get_greeting(first_name='Sebastian', 
                   last_name='Goodfellow', 
                   country='France')
out[0]

In [None]:
out1, out2 = get_greeting(first_name='Sebastian', 
                          last_name='Goodfellow', 
                          country='France')
print(out1, out2)

<a id='section7'></a>
## 7. Variable Scope
A variable is only available from inside the region it is created, which is called scope. Python has four different scopes: local, enclosing, global, and built-in. These scopes together form the basis for the `LEGB` rule used by the Python interpreter when working with variables. Read more about variable scope [here](https://python-textbok.readthedocs.io/en/1.0/Variables_and_Scope.html).
<br>
<img src="images/variable_scopes.png" alt="drawing" width="450"/>
<br>
### Local Scope
Whenever you define a variable within a function, its scope lies ONLY within the function. It is accessible from the point at which it is defined until the end of the function and exists for as long as the function is executing. This means its value cannot be changed or even accessed from outside the function.

In [None]:
def myfunc():
    var6 = 300
    print(var6)

In [None]:
myfunc()

In [None]:
print(var6)

Notice how we cannot access `var6` outside of the function. The same is true for functions. Any function defined inside of a function is not accessible outside of that function.

In [None]:
def myfunc():
    var6 = 300
    def myinnerfunc():
        print(var6)
    myinnerfunc()

In [None]:
myfunc()

In [None]:
myinnerfunc() 

### Enclosed Scope
The enclosed scope relates to the situation where we have a nested function (a function defined inside another function).

In [None]:
def outer():
    first_num = 1
    def inner():
        second_num = 2
        # Print statement 1 - Scope: Inner
        print('first_num from outer: ', first_num)
        # Print statement 2 - Scope: Inner
        print('second_num from inner: ', second_num)
    inner()
    # Print statement 3 - Scope: Outer
    print('second_num from inner: ', second_num)

outer()

### Global Scope
Whenever a variable is defined outside any function, it becomes a global variable, and its scope is anywhere within the program. This means that variables and functions defined outside of a function are accessible inside of a function.

In [None]:
def myfunc():
    print(x)

In [None]:
x = 'hello world'
myfunc()

In [None]:
print(x)

Variables and functions defined in the global scope can have the same name as those defined in a function scope but they'll be two distinct units in memory.

In [None]:
def myfunc():
    x = 200
    print(x)

In [None]:
x = 300
myfunc() 

In [None]:
print(x)

By using the `global` keyword, a new global variable can be defined that will be accessible outside of the function it was defined in.

In [None]:
def myfunc():
    global x
    x = 200 

Let's first use the `del` statement to delete the `x` variable from memory.

In [None]:
del x

Now, run the function and see if the `x` variable is available in the global scope.

In [None]:
myfunc()
print(x)

Similarly, we can change the value of a `global` variable from within a function.

In [None]:
def myfunc():
    global x
    x = 200 

In [None]:
x = 300
myfunc()
print(x)

### Built-In Scope
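This is the widest scope that exists! It holds the names that Python predefines, such as the functions `print()`, `len()`, and `range()`. We can use these names anywhere within our program without having to define or import them first. Reserved keywords (`def`, `class`, `in`, `and`, etc.) are likewise available everywhere, but they are part of the language syntax rather than names stored in a scope.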
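
To see which names actually live in the built-in scope, the standard library exposes it as the `builtins` module, while the `keyword` module lists reserved words. The sketch below also shows that a built-in name can be hidden by a local variable, which is why reusing names such as `len` is best avoided. The function name `shadow_demo` is made up for illustration.

```python
import builtins
import keyword

len_is_builtin = hasattr(builtins, 'len')   # True, len lives in the built-in scope
def_is_keyword = keyword.iskeyword('def')   # True, def is part of the syntax
def_is_builtin = hasattr(builtins, 'def')   # False, keywords are not stored as names


def shadow_demo():
    len = 'shadowed'   # a local name hides the built-in inside this function
    return len


shadowed = shadow_demo()
still_works = len([1, 2, 3])   # outside the function, len is the built-in again

print(len_is_builtin, def_is_keyword, def_is_builtin, shadowed, still_works)
```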

### The LEGB Rule
<br>
<img src="images/variable_scopes.png" alt="drawing" width="450"/>
<br>
LEGB (Local -> Enclosing -> Global -> Built-in) is the logic followed by a Python interpreter when it is executing your program.

For example, consider the function below.

In [None]:
# Global scope
x = 0

def outer():
    
    # Enclosed scope
    x = 1
    
    def inner():
        
        # Local scope
        x = 2
        print('inner x: {}'.format(x))
        
    print('outer x: {}'.format(x))
        
    inner()

Python will first look if `x` was defined locally within `inner()` (local scope). If not, the variable defined in `outer()` (enclosed scope) will be used. If it also wasn't defined there, the Python interpreter will go up another level to the global scope. Above that, you will only find the built-in scope, which contains the names that Python predefines, such as `print` and `len`.

In [None]:
outer()
print('global x: {}'.format(x))
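
In the example above every scope defines its own `x`, so the lookup never has to move outward. The sketch below removes the inner definitions one level at a time to show the fallback order. All function names are illustrative.

```python
x = 'global'


def outer():
    x = 'enclosing'

    def inner():
        # No local x here, so the lookup falls back to outer()'s x.
        return x

    return inner()


def reads_global():
    # No local or enclosing x, so the global x is used.
    return x


def uses_builtin():
    # max is not defined locally or globally; it is found in the built-in scope.
    return max(1, 2)


print(outer(), reads_global(), uses_builtin())
```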

<a id='section8'></a>
## 8. Objects and Classes
Python supports object-oriented programming (OOP), a programming paradigm based on the concept of `objects`. A class is a blueprint for creating objects.

Let's start by using the `class` keyword to define a simple **Car** `class`. 

In [None]:
from datetime import datetime

class Car:
    
    def __init__(self, manufacturer, model, year, color, num_doors, sun_roof):
        
        self.manufacturer = manufacturer
        self.model = model
        self.year = year
        self.color = color
        self.num_doors = num_doors
        self.sun_roof = sun_roof
        
    def get_age(self):
        return datetime.now().year - self.year

    def print(self):
        print('{} {} {}'.format(self.year, self.manufacturer, self.model))

Next, let's create an instance of the **Car** `class`.

In [None]:
car = Car(manufacturer='Honda', 
          model='Accord',
          year=1992,
          color='black',
          num_doors=4,
          sun_roof=False)

In [None]:
car.year

In [None]:
car.get_age()

In [None]:
car.print()
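
Each instance created from a class keeps its own copy of the attributes set in `__init__`. A short sketch using a trimmed-down version of the class above (fewer attributes, to keep the example small); the second car is a made-up example.

```python
from datetime import datetime


class Car:
    """Simplified version of the Car class above, for illustration."""

    def __init__(self, manufacturer, model, year):
        self.manufacturer = manufacturer
        self.model = model
        self.year = year

    def get_age(self):
        return datetime.now().year - self.year


first_car = Car(manufacturer='Honda', model='Accord', year=1992)
second_car = Car(manufacturer='Toyota', model='Corolla', year=2005)

# Changing an attribute on one instance does not affect the other.
second_car.year = 2010

print(first_car.year, second_car.year)
print(first_car.get_age() - second_car.get_age())
```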

### Almost Everything in Python is an Object!
Let's define a string and inspect its attributes and methods.

In [None]:
mystring = 'hello world!'

By typing a period `.` after any Python variable and pressing `tab`, Jupyter will present a list of attributes and methods available for that object.

In [None]:
mystring.

`.capitalize()` is a **string** method that will capitalize the first letter.

In [None]:
mystring.capitalize()

`.split()` will split a string according to a given delimiter.

In [None]:
mystring.split(' ')

In [None]:
mystring.split('ll')

`.endswith()` checks for a specified ending character(s).

In [None]:
mystring.endswith('!')

In [None]:
mystring.endswith('d!')

<a id='section9'></a>
## 9. List Comprehension
List comprehensions provide a concise way to create lists and consist of brackets containing an expression followed by a `for` clause.

`list_variable = [expression for item in collection]`

As a simple example, consider that we want to create a list containing the integers from `0` to `9`.

In [None]:
[integer for integer in range(10)]

We can add conditional statements to filter the list at the time of creation.

In [None]:
[integer for integer in range(10) if integer > 3]

List comprehensions are good alternatives to for loops, as they are more compact. 

In the example below, we use a `for` loop to accomplish the same task as above but requiring **4** lines of code.

In [None]:
new_list = []

for integer in range(10):
    if integer > 3: 
        new_list.append(integer) 

print(new_list)
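
The expression at the front of a comprehension does not have to be the loop variable itself; it can transform each item. A few illustrative variations:

```python
# Apply an expression to every item.
squares = [n ** 2 for n in range(5)]

# Combine a transformation with a filter.
even_squares = [n ** 2 for n in range(10) if n % 2 == 0]

# Any iterable works, not only range().
lengths = [len(word) for word in ['pizza', 'ice cream']]

print(squares)
print(even_squares)
print(lengths)
```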

<a id='section10'></a>
## 10. NumPy Basics
NumPy is a Python library used for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices. NumPy stands for Numerical Python.
<br>
<img src="images/numpy.png" alt="drawing" width="300"/>
<br>
NumPy is a third-party package that does not come preinstalled with Python. When working on the UofT JupyterHub platform, NumPy will always be installed and you can import the package as follows.

In [None]:
import numpy as np

Let's construct a simple 1D array.

In [None]:
array = np.array([1, 2, 3, 4, 5])

In [None]:
print(array)

In [None]:
type(array)

In [None]:
array.shape

We created a 1D array, but we can create N-dimensional arrays with NumPy. Let's create a 2D array below.

In [None]:
array = np.array([[1, 2, 3, 4, 5]])

In [None]:
print(array)

In [None]:
array.shape

In [None]:
array = np.array([[1, 2, 3, 4, 5], 
                  [1, 2, 3, 4, 5],
                  [1, 2, 3, 4, 5]])

In [None]:
print(array)

In [None]:
array.shape

We can also transpose an array using the built-in `.T` attribute.

In [None]:
array.shape

In [None]:
array.T.shape
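
Transposing swaps rows and columns, so the element at position `[i, j]` moves to `[j, i]`. A small sketch with a made-up array whose values make the swap easy to follow:

```python
import numpy as np

small_array = np.array([[1, 2, 3],
                        [4, 5, 6]])

transposed = small_array.T

print(small_array.shape)   # 2 rows, 3 columns
print(transposed.shape)    # 3 rows, 2 columns
print(transposed)
```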

We can index arrays as follows.

In [None]:
print(array)

In [None]:
array[0, 2]

In [None]:
array[0]

In [None]:
array[-1, 4]

Slicing is used to collect multiple values at one time.

In [None]:
print(array)

In [None]:
array[0, 0:3]

In [None]:
array[0, 3:]

In [None]:
array[1, -3:-1]

In [None]:
array[1, 1:4:2]

In [None]:
array[1, ::2]

In [None]:
array[1, 4:1:-1]
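
Slices can be applied to rows and columns at the same time, and a bare `:` selects everything along that axis. The sketch below uses a hypothetical 3 x 5 array of distinct values so that each result can be traced back to its position.

```python
import numpy as np

# Rows: [0..4], [5..9], [10..14]
grid = np.arange(15).reshape(3, 5)

second_column = grid[:, 1]        # every row, column 1
top_block = grid[0:2, 1:3]        # rows 0-1, columns 1-2
every_other = grid[1, ::2]        # row 1, every second column
backwards = grid[1, 4:1:-1]       # row 1, columns 4, 3, 2

print(second_column)
print(top_block)
print(every_other)
print(backwards)
```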

### Vectorization
Python for loops are inherently slow. NumPy offers vectorized actions on numpy arrays, which push the for loop you would usually do in Python down to the `C` level, which is much faster.

To test this, let's write a slow `for` loop in Python. Let's create two arrays that we would like to add together.

In [None]:
array_size = 1000000
array1 = np.zeros(array_size)
array2 = np.zeros(array_size)

In [None]:
def pure_python_version(array1, array2, array_size):
    array = list()
    for idx in range(array_size):
        array.append(array1[idx] + array2[idx])

We can use some built-in Jupyter `%magic` to help us time our function.

In [None]:
%timeit pure_python_version(array1, array2, array_size)

Now let's vectorize the function using NumPy.

In [None]:
def numpy_version(array1, array2):
    array = array1 + array2

In [None]:
%timeit numpy_version(array1, array2)

NumPy is over two orders of magnitude faster. This difference may not matter for `array_size = 1000000`, but try increasing it and you'll quickly become quite impatient.
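
Timing aside, it is worth confirming that the loop and the vectorized expression produce the same values. A minimal sketch with short, made-up arrays:

```python
import numpy as np

values1 = np.arange(5)         # [0, 1, 2, 3, 4]
values2 = np.arange(5) * 10    # [0, 10, 20, 30, 40]

# Element-by-element addition with a Python loop.
looped = []
for idx in range(len(values1)):
    looped.append(values1[idx] + values2[idx])

# The same addition as a single vectorized expression.
vectorized = values1 + values2

same = np.array_equal(vectorized, np.array(looped))

# Arithmetic with a scalar is applied to every element.
scaled = values1 * 2 + 1

print(vectorized, same, scaled)
```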

<a id='section11'></a>
## 11. Writing Clean Code
**"Writing clean code is what you must do in order to call yourself a professional. There is no reasonable excuse for doing anything less than your best."**
-Robert C. Martin
<br>
<img src="images/wtf.png" alt="drawing" width="500"/>
<br>
- As you hone your Data Science skills in this class and beyond, remember that you will likely be working in a team one day. Your teammates will depend on your code and you will depend on theirs.
- Be aware of and learn how to write clean code.
- It will be one of the best investments you ever make.
<br>
<img src="images/ETL.png" alt="drawing" width="800"/>
<br>
- I can still look at code I wrote a year ago and cringe.
- As for code I wrote during my Master's and PhD, I’d prefer not to talk about it.
- Writing clean code is a lifelong pursuit.

For those interested in learning how to write clean code, the book [Clean Code](https://enos.itcollege.ee/~jpoial/oop/naited/Clean%20Code.pdf) is an excellent free resource.
<br><br>
<img src="images/clean_code.png" alt="drawing" width="250"/>
<br>