# Python Tutorial for Beginners

## Outline

1. Data Types
2. Functions
3. Data Structures
4. Loops
5. Conditionals
6. List Comprehension
7. Yield Statement
8. Exception Handling
9. Classes

## 1. Data Types

### `float` and `integer`: 
These are the types that represent numerical values. An integer (`int` in Python) is a whole number, while a `float` has a fractional part.

In [1]:
# Both literals below are integers; a literal with a decimal point, such as 2.0, is a float

print(type(2))
print(type(3))

<class 'int'>
<class 'int'>
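
The cell above shows only integers. As a small complementary sketch (the variable names are illustrative and not part of the tutorial), a literal written with a decimal point is a `float`, and mixing the two types in arithmetic gives a `float`:

```python
whole = 2          # no decimal point -> int
fractional = 2.0   # decimal point -> float

print(type(whole))
print(type(fractional))

# Mixing int and float in arithmetic produces a float
mixed = whole + fractional
print(type(mixed))

# Explicit conversion between the two types
truncated = int(3.9)    # int() drops the fractional part
promoted = float(3)     # float() adds a fractional part of zero
print(truncated, promoted)
```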


- The arithmetic operators work as expected for `integer` and `float`. In most calculations either type can be used to represent a numerical value.

In [2]:
print(12 / 3)     # int / int -> float
print(17 // 3)    # floor division
print(17.0 // 3)  # return float if one of the values is float 
print(17 % 3)     # remainder
print(2 ** 7)     # 2 to the power of 7

4.0
5
5.0
2
128


- **Note**: Some functions accept only one of the two types. For example, `range` (discussed later) requires an integer.

In [3]:
range(2)

range(0, 2)

In [4]:
range(2.0)

TypeError: 'float' object cannot be interpreted as an integer
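
If a count arrives as a `float`, one way around this error is to convert it explicitly before calling `range`. A minimal sketch, assuming the value is known to be a whole number (`count` and `indices` are illustrative names):

```python
count = 2.0

# int() converts the float to an integer that range accepts
indices = list(range(int(count)))
print(indices)
```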

#### `string`: 
This is the type that represents text values. A `string` object can be created with a pair of single quotes `'` or double quotes `"`.

In [5]:
print('hello world!')
print("Python is awesome.")

hello world!
Python is awesome.


- A `string` object can be thought of as an ordered collection of characters. Each character can be accessed by its index, starting from 0:

In [6]:
# Here we use a variable to save a value
txt = 'Python is awesome!'

print(txt[0]) # This is 'Python is awesome!'[0]
print(txt[1]) # This is 'Python is awesome!'[1]
print(txt[2]) # This is 'Python is awesome!'[2]

P
y
t


- Some operators and functions can be applied to a string:

In [7]:
print(txt + " I love it!")
print(len(txt))

Python is awesome! I love it!
18
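
Beyond `+` and `len`, strings support slicing and come with their own methods. The sketch below reuses the same text; the variable names are chosen here for illustration only:

```python
txt = 'Python is awesome!'

last_char = txt[-1]      # negative indexes count from the end
first_word = txt[0:6]    # characters at positions 0 through 5
shouted = txt.upper()    # a new string; txt itself is unchanged
words = txt.split()      # split on whitespace into a list of strings

print(last_char)
print(first_word)
print(shouted)
print(words)
```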


#### `boolean`: 
This is the type whose value is either `True` or `False`.

In [8]:
# Notice that the keywords are case sensitive

print(type(True))
print(type(False))
print(type(true))

<class 'bool'>
<class 'bool'>


NameError: name 'true' is not defined

- The usual logical operators apply:

In [9]:
print(True and True)
print(True and False)
print(False and False)

True
False
False


In [10]:
print(True or True)
print(True or False)
print(False or False)

True
True
False


## 2. Functions

We have seen that using a variable helps to avoid excessive typing. Similarly, if a process is complicated and likely to be reused, one can pack it as a **function**. To call (use) a particular function:

In [11]:
abs(-1)

1

In [12]:
# Here we use a module, math
import math

print(math.exp(0))
print(math.log(1))
print(math.sin(0))
print(math.cos(0))

1.0
0.0
0.0
1.0


#### Custom functions: 

We can also define our own functions. To define a custom function, one can use either `def` or `lambda`.

In [13]:
# Note that this is our first cell of code without any value returned

import math

# def notation
def vector_length(x, y):
    return math.sqrt(x**2 + y**2)

# lambda notation
Vector_length = lambda x, y: math.sqrt(x**2 + y**2)

Calling a custom function is no different from calling a built-in function:

In [14]:
print(vector_length(3, 4))
print(Vector_length(3, 4))

5.0
5.0
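
A function defined with `def` can also give a parameter a default value, so callers may omit it. The following is a sketch only; `vector_length_3d` is a hypothetical extension of the function above and not part of the tutorial:

```python
import math

def vector_length_3d(x, y, z=0.0):
    """Return the Euclidean length of the vector (x, y, z)."""
    return math.sqrt(x**2 + y**2 + z**2)

print(vector_length_3d(3, 4))        # z falls back to its default, 0.0
print(vector_length_3d(1, 2, 2))     # all three components given
print(vector_length_3d(x=3, y=4))    # arguments can also be passed by name
```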


## 3. Data Structures

#### List: 

`list` is a widely used data type to collect elements in a linear order. To create a list, use `[]`:

In [15]:
numlst = [1,2,3,4,5]
stringlst = ['a', 'b', 'c']
mixlst = [1, 'b', True]

print(type(numlst))
print(type(stringlst))
print(type(mixlst))

<class 'list'>
<class 'list'>
<class 'list'>


- Accessing a single element in a list:

In [16]:
print(mixlst[0])
print(stringlst[1])
print(numlst[2])

1
b
3


- Slicing a list:

In [17]:
print(mixlst[1:3])
print(mixlst[1:])

['b', True]
['b', True]


In [18]:
print(numlst[0:4])
print(numlst[:4])

[1, 2, 3, 4]
[1, 2, 3, 4]


- Slicing returns a list, even when there is only one element selected:

In [19]:
print(stringlst[1:2])
print(stringlst[1])

['b']
b


- Updating: **assignment** can be used to update a list element.

In [20]:
print(mixlst)
mixlst[2] = False  
print(mixlst)

[1, 'b', True]
[1, 'b', False]


- Some built-in functions for lists:

In [21]:
print(len(stringlst))
print(max(stringlst))
print(sum(numlst))

3
c
15


In [22]:
print(all([True, True, True]))
print(all([True, True, False]))

True
False


In [23]:
print(any([False, False, False]))
print(any([True, False, False]))

False
True
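
Besides these built-in functions, a list has methods that change it in place. A short sketch with an illustrative list named `letters`:

```python
letters = ['b', 'a']

letters.append('c')       # add one element at the end
letters.insert(0, 'z')    # insert 'z' at index 0
letters.remove('a')       # remove the first occurrence of 'a'
letters.sort()            # sort in place

print(letters)
print('c' in letters)     # membership test
```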


- Iterables. To create `numlst` we typed
```
numlst = [1,2,3,4,5]
```
If we need a list from 0 to 1 million, we definitely don't want to type all of the numbers. A good way to generate them is the function `range`.

In [24]:
rng = range(1000001)

print(rng)
print(type(rng))

range(0, 1000001)
<class 'range'>


The output `range(0, 1000001)` is of the type `range`. This is an iterable type, so it can be used in a `for` loop (as we will see soon).

**Note**: `range` is not a list: the numbers are not all stored in memory, and each one is produced only when it is requested.

In [25]:
print(max(rng))
print(min(rng))
print(100 in rng)  # one can replace 100 by any number from 0 to 1 million
print(1000001 in rng)

1000000
0
True
False
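
`range` also accepts a start, a stop, and a step, and `list` turns the result into an actual list when all the numbers are needed at once. A brief sketch (the variable names are illustrative):

```python
evens = list(range(2, 11, 2))      # start at 2, stop before 11, step by 2
countdown = list(range(5, 0, -1))  # a negative step counts down

print(evens)
print(countdown)
```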


#### `tuple`

Another popular iterable is the `tuple`. The most important differences between `list` and `tuple` are:
- a `tuple` is created with `()`.
- one cannot update an element in a `tuple`.

In [26]:
tup = (1,2,3,4)

print(tup)
print(type(tup))
print(tup[0])
print(tup[1:])

(1, 2, 3, 4)
<class 'tuple'>
1
(2, 3, 4)


In [27]:
# No updating!

tup[0] = 5

TypeError: 'tuple' object does not support item assignment
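
Since a tuple cannot be changed, the usual workaround is to build a new one. Tuples can also be unpacked into separate variables. A minimal sketch, with names chosen for illustration:

```python
tup = (1, 2, 3, 4)

# Build a new tuple rather than modifying the old one
new_tup = (5,) + tup[1:]
print(new_tup)

# Unpacking assigns each element to its own variable
first, second, third, fourth = tup
print(first, fourth)
```

Note the trailing comma in `(5,)`: without it, `(5)` is just the integer 5 in parentheses.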

#### Dictionary

Another popular way to collect data is to arrange it into pairs of key and value. Instead of indexing each value by position as in a list, we name it.

In [28]:
me = {"name":"Newton", "gender":"male"}

print(type(me))
print(me)

<class 'dict'>
{'name': 'Newton', 'gender': 'male'}


- To access any value in a dictionary, we specify the key:

In [29]:
me["name"]

'Newton'

- Updating is similar to `list`:

In [30]:
print(me["name"])
me["name"]="Isaac Newton"
print(me)

Newton
{'name': 'Isaac Newton', 'gender': 'male'}


**Note**: A dictionary is accessed by key, not by position. Since Python 3.7 a dictionary is guaranteed to preserve the order in which keys were inserted (earlier versions made no such guarantee), but positional indexing still doesn't work:

In [31]:
me[0]

KeyError: 0
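
A missing key raises `KeyError` in the same way. One way to avoid that, sketched below, is to test for the key with `in` or to use `dict.get`, which returns a default instead of raising. The `"age"` and `"field"` keys are made up for this example:

```python
me = {"name": "Newton", "gender": "male"}

print("name" in me)              # membership tests look at the keys
print(me.get("age"))             # get returns None for a missing key
print(me.get("age", "unknown"))  # or a default of our choosing

me["field"] = "physics"          # assigning to a new key adds a pair
print(me)
```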

#### Built-in methods of `dictionary`

A dictionary comes with keys, values and items. `dict.items()` returns an object that behaves like a list of tuples. Since Python 3.7 the pairs come out in insertion order; in earlier versions the order should not be relied upon.

In [32]:
print(me.keys())
print(me.values())
print(me.items())

dict_keys(['name', 'gender'])
dict_values(['Isaac Newton', 'male'])
dict_items([('name', 'Isaac Newton'), ('gender', 'male')])
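
These objects are most often consumed in a `for` loop (covered in the next section). As a preview sketch, each item from `items()` can be unpacked into a key and a value; `lines` is an illustrative name:

```python
me = {"name": "Isaac Newton", "gender": "male"}

# Each item is a (key, value) tuple, unpacked here into two variables
lines = []
for key, value in me.items():
    lines.append("{}: {}".format(key, value))

print(lines)
```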


## 4. Loops

- `for`: iterate through a list, string, or dictionary
    - typical uses: accumulating, updating, Fibonacci numbers

- `while`: use when the stopping rule is not known in advance

#### Iterating through an iterable

**Ex 1**: A common way to use an iterable is with a `for` loop. Accumulating a sum is a typical example:

In [33]:
sum_ = 0

for i in numlst:
    sum_ += i   # equivalent to sum_ = sum_ + i
    
print(sum_)
print(sum(numlst))

15
15


**Ex 2**: A `for` loop can also be used to update a list.

**Note**: 
- The first cell below extracts the elements from `numlst` without changing it.
- A `range` object is used to generate the indexes in `numlst`.

In [34]:
n = len(numlst) 

for i in range(n):
    print(numlst[i] + 3)

4
5
6
7
8


In [35]:
print(numlst)

[1, 2, 3, 4, 5]


Now we try to update the list.

**Note**: No output from the first cell of code:

In [36]:
n = len(numlst) 

for i in range(n):
    numlst[i] = numlst[i] + 3

In [37]:
print(numlst)

[4, 5, 6, 7, 8]
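
The same update can be written without `range(len(...))` by using the built-in `enumerate`, which yields each index together with its element. This is an alternative sketch, starting again from the original list:

```python
numlst = [1, 2, 3, 4, 5]

# enumerate yields (index, element) pairs
for i, value in enumerate(numlst):
    numlst[i] = value + 3

print(numlst)
```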


**Ex 3**: A more complicated case is when each step depends on the previous steps. Computing a <a href="https://en.wikipedia.org/wiki/Fibonacci_number">Fibonacci number</a> is a good example. 

In [38]:
n = 8

prev = 1
fibo = 1
for i in range(n):
    tmp = fibo
    fibo = fibo + prev
    prev = tmp
    
print("The {}th Fibonacci number is {}".format(n+2, fibo))

The 10th Fibonacci number is 55
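
The loop above can be packed into a function, and tuple assignment removes the need for the temporary variable. This is one possible sketch; the name `fibonacci` and the convention F(1) = F(2) = 1 are assumptions made here:

```python
def fibonacci(k):
    """Return the k-th Fibonacci number, counting F(1) = F(2) = 1."""
    prev, curr = 0, 1                    # F(0) and F(1)
    for _ in range(k - 1):
        prev, curr = curr, prev + curr   # both updates happen at once
    return curr

print(fibonacci(10))
```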


### `while`

Another common type of loop is the `while` loop. In a `for` loop we explicitly specify the range of elements to iterate through, but very often we don't know that range in advance (at least not without more complicated math).

For example, suppose we want to sum the elements in `range(100001)` until the sum reaches or exceeds 1000. To avoid figuring out how many numbers we need, we can use a `while` loop as below:

In [39]:
collections = range(100001)

i = 0
sum_ = 0
while sum_ < 1000:  # Continue adding while sum_ is less than 1000
    sum_ += collections[i]  # Update the sum_
    i+=1                    # Update the index so that in the next round the next 
                            # element in the collections would be added
    
print("The last number added was: {}".format(i-1))
print("The final sum is {}".format(sum_))

The last number added was: 45
The final sum is 1035
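
For comparison, the same stopping rule can be expressed with a `for` loop and the `break` statement, which leaves the loop early. This is an alternative sketch, not a replacement for the `while` version; `total` and `last` are illustrative names:

```python
total = 0
last = None

for number in range(100001):
    total += number
    last = number
    if total >= 1000:   # same stopping rule as `while sum_ < 1000`
        break

print(last, total)
```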


## 5. Conditionals

Very often we want to treat different values in different ways. **Conditionals** come in handy. The basic usage is like:
```python
if c1:
    <do> A
else:
    <do> B
```

- The `if...else` statements can be nested:
```python
if c1:
    <do> A
else:
    if c2:
        <do> B
    else:
        <do> C
```
- Having an `if` follow an `else` is so common there is special syntax for it:
```python
if c1:
    <do> A
elif c2:
    <do> B
else:
    <do> C
```
- The forms can be combined. Here we do A if c1 and c2 are true, B if c1 is true but not c2, C if c1 is false but c3 is true, and D if c1 and c3 are both false:
```python
if c1:
    if c2:
        <do> A
    else:
        <do> B
elif c3:
    <do> C
else:
    <do> D
```
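
To make the last template concrete, here is a runnable sketch in which the placeholders are filled in with assumed conditions: `c1` is "x is positive", `c2` is "x is even", and `c3` is "x is zero". The function name `describe` is hypothetical:

```python
def describe(x):
    if x > 0:                         # c1
        if x % 2 == 0:                # c2
            return "positive even"    # A
        else:
            return "positive odd"     # B
    elif x == 0:                      # c3
        return "zero"                 # C
    else:
        return "negative"             # D

print(describe(4))
print(describe(3))
print(describe(0))
print(describe(-2))
```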

- Use conditionals in a `for` loop. Let's print only the even numbers:

In [40]:
n = len(numlst) 

for i in range(n):
    if numlst[i] % 2 == 0:
        print(numlst[i])
    # Nothing to do otherwise, so no `else` is needed

4
6
8


- Use conditionals in a function.

In [41]:
def classify(x):
    if x % 2 != 0:
        return "odd"
    elif x != 0:
        return "even"
    else:
        return "zero"

In [42]:
print(classify(0))
print(classify(11))
print(classify(28))

zero
odd
even


In [43]:
for i in numlst:
    if classify(i) == "even":
        print(i)

4
6
8


## 6. List Comprehension

List comprehensions are another notation for defining lists. They mimic the mathematical notation of “set comprehensions” and have a concise syntax. A list comprehension has the form: 
```
[ <expression> for <element> in <list> if <boolean> ]
```

- First consider the list comprehension that squares every element in a list, as follows:

In [44]:
[ x* x for x in range(5)]

[0, 1, 4, 9, 16]

- Combined with a conditional:

In [45]:
[ x* x for x in range(5) if x%2==0]

[0, 4, 16]

What we did above is equivalent to the for loop below:

In [46]:
result_lst = []

for x in range(5):
    if x%2 == 0:
        result_lst.append(x*x) # The function `append` appends an element to the tail of a list
        
result_lst

[0, 4, 16]

## 7. Yield Statement

Recall the difference between `range` and `list`. A `list` saves all of its elements in memory, while a `range` only remembers the rule to generate the elements. Saving a sequence of numbers from 0 to 1 million as a list takes a lot of space; saving it as a `range` reduces the memory used.

Assume we want to collect all the even numbers in the sequence from 0 to 100,000. **First let's save them in a list in two different ways**:

In [47]:
# list comprehension

even_lst_comp = [x for x in range(100001) if x%2==0]

In [48]:
# for loop

even_for_loop = []

for x in range(100001):
    if x%2==0:
        even_for_loop.append(x)

In [49]:
# They give the same result

even_lst_comp == even_for_loop

True

In [50]:
# But they take a lot of space:
import sys

print(sys.getsizeof(even_lst_comp))
print(sys.getsizeof(even_for_loop))

406496
406496


**Then let's save them as a generator**. 

To do this, write a generator expression: the same syntax as a list comprehension, but with `()` instead of `[]`:

In [51]:
even_iter = (x for x in range(100001) if x%2==0)

print(type(even_iter))
print(sys.getsizeof(even_iter)) # Much smaller!

<class 'generator'>
88


**Note**: This **generator** is like `range`: the actual numbers are not saved, but the rule to generate them is remembered so that one can extract the values when needed:

In [52]:
# Remark: a generator can be used only once!

for i in even_iter:
    print(i)
    
    if i > 10: # This is here because we don't want to print all 50 thousand numbers
        break

0
2
4
6
8
10
12


A **generator** can also be created with a `for` loop, which is a little more abstract: the loop goes inside a function that uses `yield` instead of `return`:

In [53]:
def get_gener():
    for x in range(100001):
        if x%2==0:
            yield x

even_gener = get_gener()
print(type(even_gener))
print(sys.getsizeof(even_gener)) # Much smaller!

<class 'generator'>
88


In [54]:
for i in even_gener:
    print(i)
    
    if i > 10: # This is here because we don't want to print all 50 thousand numbers
        break

0
2
4
6
8
10
12
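
The remark that a generator can be used only once can be checked directly. The sketch below uses a smaller, hypothetical generator function `get_evens` with a `limit` parameter, and steps through it with the built-in `next`:

```python
def get_evens(limit):
    """Yield the even numbers from 0 up to and including limit."""
    for x in range(limit + 1):
        if x % 2 == 0:
            yield x

gen = get_evens(6)

first = next(gen)        # runs the function body up to the first yield
second = next(gen)       # resumes where it left off
rest = list(gen)         # consumes everything that remains
again = list(gen)        # nothing is left the second time

print(first, second, rest, again)
```

To iterate again, call the function once more to obtain a fresh generator.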


## 8. Exception Handling

- Exceptions and errors are messages given by Python indicating there is a problem in the interpretation or running of a program.  
- Without exception handling these errors will cause the program to stop.  

Typical errors/exceptions include:
- FileNotFoundError: a file or folder referenced can't be found
- ZeroDivisionError: division by zero
- TypeError: incorrect object type used in a statement
- SyntaxError: statement syntax incorrectly structured
- ValueError: incorrect value used in a statement
- IndexError: index referenced does not exist  

The complete list is here: https://docs.python.org/3.6/library/exceptions.html


A common use case is streaming data, where we might not have full control over the structure. Suppose we are streaming data about scientists and would like to know who won the Nobel Prize and when. Consider the code below:

In [55]:
streaming_data = [
    {"name":"Isaac Newton", "gender":"male"},
    {"name":"Marie Curie", "gender":"female", "Nobel Prize in Physics year": 1903},
    {"name":"Albert Einstein", "gender":"male", "Nobel Prize in Physics year":1921}
]

result = []
for scientist in streaming_data:
    tup = (scientist["name"], scientist["Nobel Prize in Physics year"])
    result.append(tup)

KeyError: 'Nobel Prize in Physics year'

Here is a simple fix with `try...except`.

In [56]:
result = []

for scientist in streaming_data:
    try:
        tup = (scientist["name"], scientist["Nobel Prize in Physics year"])
        result.append(tup)
    except KeyError:
        pass  # skip records that lack the key

In [57]:
result

[('Marie Curie', 1903), ('Albert Einstein', 1921)]
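
Silently passing hides which records were skipped. A possible variation, sketched here with a shortened data set and the illustrative names `records`, `winners` and `missing`, uses the `else` clause of `try`, which runs only when no exception was raised:

```python
records = [
    {"name": "Isaac Newton"},
    {"name": "Marie Curie", "Nobel Prize in Physics year": 1903},
]

winners = []
missing = []
for scientist in records:
    try:
        year = scientist["Nobel Prize in Physics year"]
    except KeyError:
        missing.append(scientist["name"])           # remember who lacked the key
    else:
        winners.append((scientist["name"], year))   # runs only if no exception

print(winners)
print(missing)
```

Keeping only the risky lookup inside `try` makes it less likely that an unrelated `KeyError` is swallowed by accident.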

## 9. Classes

We have learned to create variables and functions to perform some tasks. Very often we would like to pack them to organize our code better. A typical way is to implement a **class**.

For example, to maintain a collection of books, we might need to save the name and the author of each book, as well as to print out these pieces of information. We may then
- create the **attributes** to save name and author
- define the **method** `print_info` to show the information

In [58]:
# Define the class as the template for the real instance

class Book(object):
    def __init__(self, name, author = None):
        self.name = name
        self.author = author
    
    def print_info(self):
        return '<%s> by %s' % (self.name, self.author)

In [59]:
# Create the actual instances (books)

book_1 = Book('The little SAS book', 'Lora D. Delwiche')
book_2 = Book('R CookBook', 'Paul Teetor')

In [60]:
print(book_1.name)
print(book_1.author)
print(book_1.print_info())
print("\n")
print(book_2.name)
print(book_2.author)
print(book_2.print_info())

The little SAS book
Lora D. Delwiche
<The little SAS book> by Lora D. Delwiche


R CookBook
Paul Teetor
<R CookBook> by Paul Teetor


#### Magic methods

Instead of `print_info`, Python allows some special magic methods, whose names start and end with double underscores. A good example is `__str__`:

In [61]:
class Book(object):
    def __init__(self, name, author = None):
        self.name = name
        self.author = author
    
    def __str__(self):
        return '<%s> by %s' % (self.name, self.author)
    
book_1 = Book('The little SAS book', 'Lora D. Delwiche')
book_2 = Book('R CookBook', 'Paul Teetor')

In [62]:
# The function `print` actually calls __str__

print(book_1)
print(book_2)

<The little SAS book> by Lora D. Delwiche
<R CookBook> by Paul Teetor
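
`__str__` is only one of several such methods. As a further sketch (these additions are assumptions for illustration, not part of the tutorial's `Book` class), `__repr__` controls how an object is displayed at the interactive prompt, and `__eq__` defines what `==` means:

```python
class Book(object):
    def __init__(self, name, author=None):
        self.name = name
        self.author = author

    def __str__(self):
        return '<%s> by %s' % (self.name, self.author)

    def __repr__(self):
        # Used by the interactive prompt and when the object sits inside a list
        return 'Book(%r, %r)' % (self.name, self.author)

    def __eq__(self, other):
        # Two books are equal when name and author both match
        if not isinstance(other, Book):
            return NotImplemented
        return (self.name, self.author) == (other.name, other.author)

book = Book('R CookBook', 'Paul Teetor')
print(repr(book))
print(book == Book('R CookBook', 'Paul Teetor'))
```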


### `inheritance`

A very common practice is to build new attributes on top of an existing class. For example, given the class `Book`, we might want to define a new `EBook` class that needs a `format` attribute which `Book` lacks. We might also need to modify the `__str__` method, since now we need to print more.

In [63]:
class EBook(Book):
    def __init__(self, name, fmt, author = None):
        Book.__init__(self, name, author) # this creates attributes we have in Book
        self.fmt = fmt      # new attribute
        
    def __str__(self):      # override __str__() method
        return super().__str__() + ', format: '+ self.fmt

In [64]:
ebook_1 = EBook(
    "An Introduction to Statistical Learning - with Applications in R", "pdf",
    "Trevor Hastie and Robert Tibshirani"
)

In [65]:
print(ebook_1)

<An Introduction to Statistical Learning - with Applications in R> by Trevor Hastie and Robert Tibshirani, format: pdf
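
An instance of a subclass is also an instance of its parent class, which `isinstance` can confirm. The self-contained sketch below repeats both classes and, as one possible variation, calls the parent constructor through `super()` instead of naming `Book` explicitly:

```python
class Book(object):
    def __init__(self, name, author=None):
        self.name = name
        self.author = author

    def __str__(self):
        return '<%s> by %s' % (self.name, self.author)

class EBook(Book):
    def __init__(self, name, fmt, author=None):
        super().__init__(name, author)   # same effect as Book.__init__(self, name, author)
        self.fmt = fmt

    def __str__(self):
        return super().__str__() + ', format: ' + self.fmt

ebook = EBook('R CookBook', 'pdf', 'Paul Teetor')

print(isinstance(ebook, EBook))
print(isinstance(ebook, Book))
print(isinstance(Book('R CookBook'), EBook))
print(ebook)
```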
