# Part 1 - Jupyter notebook and Python basics

### Jupyter notebook

#### Jupyter notebook is the environment we will be using to learn Python programming. For our purposes, there isn't much you need to learn about it. To run a cell, press Shift + Enter. Pressing Shift + Tab brings up additional information that is stored in a docstring (more about this later). Pressing Tab autocompletes the word you are typing or lists the available options. Finally, you can comment out several lines at once by selecting them and pressing Ctrl + / (more about comments later). You can find all you need about Jupyter notebooks here: https://jupyter.brynmawr.edu/services/public/dblank/Jupyter%20Notebook%20Users%20Manual.ipynb

### Comments

In [4]:
# Comments are parts of code that do not get executed when you run the code
To give you an example, If I simply run the text, Jupyter notebook (or any other IDE (Integrated Development Environment <- 
                                                                                      Thing you use to write your code))
will give you an error
# Also note that or and any are highlighted, indicating that Jupyter notebook treats them as special words
# (or is a keyword and any is a built-in function)

SyntaxError: invalid syntax (<ipython-input-4-1086e6ae2610>, line 2)

In [5]:
# On the other hand, when you put a hash sign (#) in front of the text, Python ignores whatever follows it on that line

In [7]:
# This is a single line comment as it only covers this line
It does not cover anything after that # gives an error

SyntaxError: invalid syntax (<ipython-input-7-50c2c9938cb4>, line 2)

In [8]:
# Unlike many other languages, Python does not have multi line comments, so you should use multiple single line comments to
# comment out multiple lines
# Just like that
# If the code is long, selecting individual lines to comment out a whole section might take a lot of time.

### Remember that programming is all about automation, so there's a shortcut to make multiple single line comments by simply pressing ctrl + / (located above numbers on numpad or next to the right shift)

In [10]:
# Try using it yourself and run the cell, so you don't get an error
Comment 1
Comment 2
Comment 3

SyntaxError: invalid syntax (<ipython-input-10-03416f13355e>, line 2)

In [None]:
'''
It is also possible to use a triple-quoted string for multi-line commenting; however, it is not recommended. Triple-quoted
strings are generally used as docstrings, which add information to functions. Docstrings come up again in the Libraries section.
They are created by using three single (') or three double (") quotation marks to open and the same three marks to close
'''
"""
Also works.
"""

### Libraries

#### You can think of a library as a toolbox, in the sense that it has various tools you can use. For example talib (https://pypi.python.org/pypi/TA-Lib), which stands for technical analysis library, contains various useful "tools" such as ATR, Bollinger Bands, RSI, etc. This eliminates the need for you to write your own functions, as you can simply import what you need from the appropriate library.

In [12]:
# To import a library you need to use the import statement. Some libraries have to be installed on your computer before you can use them
# To install a library on Windows, simply open command prompt and type pip install TA-Lib (to install talib as in this example)
# Some libraries come preinstalled with your Python distribution, so first try to import a library and, if you get an error, install it
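The "try to import first" advice can also be checked in code. This is only a minimal sketch using the standard library: `is_installed` is a made-up helper name, and `json` and `surely_not_a_real_library` are stand-ins picked for illustration.

```python
import importlib.util


def is_installed(library_name):
    """Return True if Python can find the library, False otherwise."""
    return importlib.util.find_spec(library_name) is not None


print(is_installed("json"))                       # json ships with Python
print(is_installed("surely_not_a_real_library"))  # made-up name, not installed
```

If the check returns False for a library you need, that is the moment to go back to the command prompt and install it with pip.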

#### Talib is a bit tricky to install, so let's install Statsmodels instead. Statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration. The online documentation is hosted at http://www.statsmodels.org.

#### To obtain the latest released version of statsmodels using pip, open your command prompt if you are on Windows and type pip install -U statsmodels . (Do not include the period/the full stop). If it doesn't work, try: pip install --upgrade --no-deps statsmodels . That's all you need to do.

In [1]:
# Now we can import statsmodels
import statsmodels
import statsmodels.api as sm #needed to use time series analysis for the example below

In [3]:
# So far we have imported the whole toolbox. To access individual elements use . (a period). First of all, go to
# http://www.statsmodels.org/dev/index.html
# Then, go down to table of contents -> time series analysis -> descriptive statistics and tests
# Notice the code: statsmodels.tsa.stattools.acovf, statsmodels.tsa.stattools.acf, statsmodels.tsa.stattools.pacf, etc.
# . is used to access the elements. So first we choose the library, then time series analysis, then
# descriptive statistics and tests, and from that we get what we want.

In [4]:
# If we wanted to describe the location of Bloor-Yonge station using code, we could do something like:
# canada.ontario.toronto.bloor-yonge
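The same "narrow it down with a period" idea works with libraries that ship with Python. Here is a small sketch using the standard `os` library; the file path is made up for illustration.

```python
import os

# os is the library, path is a module inside it, and basename is a function inside path
file_name = os.path.basename("reports/2017/prices.csv")
print(file_name)
```

Each period moves one level deeper, just like going from country to province to city.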

In [6]:
# Type statsmodels. and then press tab. You will see all options available to you. 

In [2]:
# Let's try to grab estimated partial autocorrelation
# It lies in statsmodels library -> time series analysis -> descriptive statistics and tests (stattools) -> pacf
# Try to press tab each time after . (a period)
# Firstly type statsmodels.
# Press tab
# Etc... until you get

statsmodels.tsa.stattools.pacf

# If you press shift + tab, you will see docstring info. The docstring explains what the function does and what parameters
# it takes. Press + sign in the top right corner to see more

<function statsmodels.tsa.stattools.pacf>
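To see where the Shift + Tab information comes from, here is a minimal sketch of a function with its own docstring. `add_numbers` is a made-up function for illustration, not part of statsmodels.

```python
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b


# The docstring is stored on the function itself
print(add_numbers.__doc__)
print(add_numbers(2, 3))
```

Typing add_numbers and pressing Shift + Tab in a notebook should show that same text.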

### Common libraries that are used in data analytics for Python are numpy, pandas, matplotlib. You will NEED TO INSTALL those. 
**NumPy (or Numpy)** is a linear algebra library for Python. It is so important for finance with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks. <br>Plus we will use it to generate data for our analysis examples later on! NumPy is also incredibly fast, as it has bindings to C libraries. <br>For more info on why you would want to use arrays instead of lists, check out this great StackOverflow post (https://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).<br><br>

The name **Pandas** is derived from "panel data". Pandas is built on top of NumPy. This is the main library for data analytics and we will spend most of our time on it.<br><br>
**Matplotlib** is the "grandfather" library of data visualization with Python. It was created by John Hunter to replicate the plotting capabilities of MATLAB (another programming language) in Python. So if you happen to be familiar with MATLAB, matplotlib will feel natural to you.

It is an excellent 2D and 3D graphics library for generating scientific figures. 

Some of the major Pros of Matplotlib are:

* Generally easy to get started for simple plots
* Support for custom labels and texts
* Great control of every element in a figure
* High-quality output in many formats
* Very customizable in general



In [3]:
# One last point about libraries: you can abbreviate the full name of a library with the as keyword.
import matplotlib

In [4]:
# Now, if we want to access individual elements, we need to type the whole name matplotlib each time
matplotlib.contextlib

<module 'contextlib' from 'C:\\Users\\Biarys\\Anaconda3\\lib\\contextlib.py'>

In [10]:
# By using as statement, we can assign a new name to matplotlib
import matplotlib as jelly

In [11]:
jelly.contextlib

<module 'contextlib' from 'C:\\Users\\Biarys\\Anaconda3\\lib\\contextlib.py'>

In [12]:
matplotlib.contextlib

<module 'contextlib' from 'C:\\Users\\Biarys\\Anaconda3\\lib\\contextlib.py'>

In [7]:
# Conventional abbreviations are:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt # pyplot is the plotting module you will use most often
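If those libraries are not installed yet, the same import forms can be tried with the standard `math` library. This is a small sketch; the alias `m` is an arbitrary choice for illustration.

```python
import math as m        # abbreviate the library name
from math import sqrt   # import a single tool from the toolbox

print(m.sqrt(16))  # accessed through the alias
print(sqrt(16))    # accessed directly, no prefix needed
print(m.pi)
```

Both forms reach the same function; the alias just saves typing.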

### Data types

#### Numbers
Generally, there are two types of numbers: whole numbers, called integers (int), and numbers with decimal points, called floats (float).

In [13]:
1 + 1

2

In [14]:
2 - 1

1

In [15]:
3 * 4

12

In [18]:
15/3 # In Python 3, / always returns a float, even when the result is a whole number. In Python 2, dividing two integers
# drops the remainder, so there you would make one of the numbers a float by adding .0 at the end

5.0

In [20]:
15/4 # Works in Python 3 (Python 2 would return 3)

3.75

In [21]:
15.0/4 # Works everywhere

3.75

In [22]:
15/4.0 # Works everywhere

3.75

In [24]:
15.0/4.0 # Works everywhere

3.75

In [28]:
3 ** 2 # ** is the power operator. 3 to the power of 2

9

In [31]:
5 % 4 # % is called modulo. It returns the remainder that is left after division. If we divide 5 by 4 we get 1 and 1/4.
# The numerator of the leftover 1/4, which is 1, is the result of the modulo

1

In [32]:
10 % 8

2

In [33]:
3 % 2 

1

In [34]:
10 % 5

0

In [35]:
(2 + 3) * (5 + 5)

50
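To tie the number types and the division operators together, here is a short sketch. The variable names are made up for illustration; `//` (floor division) is not used elsewhere in this notebook, but it is the standard counterpart of `%`.

```python
whole = 15
decimal = 15.0
print(type(whole))    # int
print(type(decimal))  # float

true_quotient = 15 / 4    # true division
floor_quotient = 15 // 4  # floor division: how many whole times 4 fits into 15
remainder = 15 % 4        # modulo: what is left over

print(true_quotient, floor_quotient, remainder)
```

The last two results fit together: floor_quotient * 4 + remainder gives back 15.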

#### Strings

In [36]:
'single quotes'

'single quotes'

In [37]:
"double quotes"

'double quotes'

In [38]:
" wrap lot's of other quotes"

" wrap lot's of other quotes"

In [40]:
' wrap lot"s of other quotes' # Seems to work, but looks pretty weird

' wrap lot"s of other quotes'
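When a string has to contain the same kind of quote that wraps it, a backslash can escape the quote. A small sketch with made-up example strings:

```python
escaped_single = 'lot\'s of quotes'  # backslash escapes the inner single quote
plain_double = "lot's of quotes"     # no escape needed inside double quotes
print(escaped_single == plain_double)

quote_inside = 'She said "hello"'    # double quotes inside single quotes
print(quote_inside)
```

Choosing the other quote type for the wrapper is usually easier to read than escaping.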

#### Variable assignment

In [46]:
# Use the assignment operator =
# Python is dynamically typed: a name simply refers to an object, so we do not need to declare the type of a variable. Only the name.

In [42]:
x = 'hello'

In [43]:
x

'hello'

In [44]:
print(x)

hello


In [45]:
num = 12
name = 'Sam'
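Because the type belongs to the object and not to the name, the same variable can point at different types over time. A minimal sketch, with `value` as an illustrative name:

```python
value = 12
first_type = type(value)   # int
value = 'Sam'              # the same name now refers to a string
second_type = type(value)  # str

print(first_type, second_type)
```

This is legal, though reusing one name for unrelated things tends to make code harder to follow.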

#### Lists

In [47]:
# A collection of variables. Assigning:
x1 = 1
x2 = 2
x3 = 3
# is burdensome. It gets pretty messy if we need to create a large number of variables, let's say 100

In [58]:
# Much better to use lists. Create a list using square brackets. Square brackets let Python know that the object is a list
[1,2,3]

[1, 2, 3]

In [62]:
z = [] # Empty list

In [61]:
type(z)

list

In [75]:
# Can contain multiple data types in the same list
['hi',1,[1,2]] # The [1,2] is a list inside of a list (nested list)

['hi', 1, [1, 2]]

In [52]:
# Let's assign ['hi',1,[1,2]] to my_list
my_list = ['hi', 1, [1, 2]]

In [53]:
my_list

['hi', 1, [1, 2]]

In [56]:
print(my_list) # Same

['hi', 1, [1, 2]]


In [64]:
# You can access individual elements by using [] next to the name
my_list[0] # First element

'hi'

In [66]:
my_list[2] # Third element

[1, 2]

In [70]:
# We can see that the third element has 2 other elements in it. To narrow down use second pair of [] brackets
my_list[2][0]

1

In [71]:
my_list[2][1]

2

In [73]:
my_list[0][0]

'h'

In [74]:
my_list[0][1]

'i'
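A few more things that can be done with the same list, as a rough sketch: `len`, negative indices, item assignment, and `append` are standard Python; the values are made up.

```python
my_list = ['hi', 1, [1, 2]]

print(len(my_list))  # number of elements
print(my_list[-1])   # negative indices count from the end

my_list[1] = 'changed'  # replace the second element
my_list.append(3.5)     # add a new element at the end
print(my_list)
```

Unlike strings, lists can be changed in place, which is why the assignment on the second element works.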

#### Booleans

Just a binary statement. Either True or False.

In [83]:
True

True

In [84]:
False

False

In [85]:
x = True

In [86]:
x

True

In [87]:
x = False

In [88]:
x

False

#### Comparison operators

In [89]:
1 > 2

False

In [90]:
1 < 2

True

In [91]:
1 >= 4 # Greater than or equal to

False

In [92]:
1 <= 4 # Less than or equal to

True

In [93]:
1 == 4 # Just equal. Need to use double equal sign, as a single equal sign is an assignment operator

False

In [94]:
1 == 1

True

In [100]:
x = 2
y = 2
z = 3

In [101]:
x == y

True

In [102]:
x == z

False

In [99]:
x != y # != means not equal

False

In [103]:
z != x

True

In [104]:
'hi' == 'bye'

False

In [105]:
(1 > 2) and (2 < 3)

False
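The last cell combined two comparisons with `and`. Here is a short sketch of all three logical operators side by side, with illustrative variable names:

```python
both = (1 < 2) and (2 < 3)    # True only if both sides are True
either = (1 > 2) or (2 < 3)   # True if at least one side is True
negated = not (1 > 2)         # flips False to True

print(both, either, negated)
```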

#### Precedence rule

Precedence determines the order in which the parts of an expression are evaluated. Operators have different priorities. Round brackets () have the highest priority; that is, whatever is in the brackets is evaluated first. At the other end, the logical operators not, and, and or rank below the arithmetic and comparison operators, with or the lowest of the three. More about it here: https://www.programiz.com/python-programming/precedence-associativity
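A quick sketch of precedence in action, with made-up variable names. Multiplication binds tighter than addition, and `and` binds tighter than `or`, unless brackets say otherwise.

```python
without_brackets = 2 + 3 * 4   # 3 * 4 is evaluated first
with_brackets = (2 + 3) * 4    # brackets force the addition first

logic_default = True or False and False    # and is evaluated before or
logic_grouped = (True or False) and False  # brackets change the grouping

print(without_brackets, with_brackets)
print(logic_default, logic_grouped)
```

When in doubt, adding brackets costs nothing and makes the intended order explicit.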

#### For loops, while loops, if, elif, else statements

A for loop acts as an iterator in Python: it goes through the items in a sequence or any other iterable object.
It is called a 'loop' because the same code statements are executed over and over again, once for each item,
until there are no items left.
The general format for a for loop in Python:

    

In [None]:
for item in object:
    statements to do stuff

The variable name used for the item is completely up to the coder, so use your best judgment to choose a name that makes sense and that you will be able to understand when revisiting your code. This item name can then be referenced inside your loop, for example if you wanted to use if statements to perform checks.

In [2]:
# Iterating through a list:
l = [1,2,3,4,5,6,7,8,9,10]

In [4]:
for afswe in l:
    print(afswe)

1
2
3
4
5
6
7
8
9
10


In [5]:
# Note that afswe is just a randomly chosen name. It is better to choose a name that has more meaning in the context and can
# help other programmers understand your code if they were to read it.
# A better name for the item could be number
for number in l:
    print(number)

1
2
3
4
5
6
7
8
9
10
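Typing out every number in a list gets tedious. The built-in `range` produces the numbers for you; it stops just before its end value. A minimal sketch, with `collected` and `evens` as illustrative names:

```python
collected = []
for number in range(1, 6):  # 1 up to, but not including, 6
    collected.append(number)
print(collected)

evens = list(range(0, 10, 2))  # start, end, step
print(evens)
```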


The while statement in Python is one of the most general ways to perform iteration. A while statement will repeatedly execute a single statement or group of statements as long as the condition is true. The general format of a while loop is:

In [None]:
while test:
    code statement
else:
    final code statements

In [31]:
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x+=1 # += means that we take x and add 1 to it. It is equivalent to x = x + 1. You might be wondering how it is possible
    # to use x on both sides of the assignment. Won't it cause confusion? It does not, because Python always evaluates
    # the right-hand side first. So in x = x + 1, x + 1 is computed using the old value of x, and only then is
    # the result assigned back to the variable x. More about it here: 
    # https://softwareengineering.stackexchange.com/questions/134118/why-are-shortcuts-like-x-y-considered-good-practice
    # You can also use x -= 1 for subtraction, x *= 1 for multiplication, x /= 1 for division

x is currently:  0
 x is still less than 10, adding 1 to x
x is currently:  1
 x is still less than 10, adding 1 to x
x is currently:  2
 x is still less than 10, adding 1 to x
x is currently:  3
 x is still less than 10, adding 1 to x
x is currently:  4
 x is still less than 10, adding 1 to x
x is currently:  5
 x is still less than 10, adding 1 to x
x is currently:  6
 x is still less than 10, adding 1 to x
x is currently:  7
 x is still less than 10, adding 1 to x
x is currently:  8
 x is still less than 10, adding 1 to x
x is currently:  9
 x is still less than 10, adding 1 to x


Notice how many times the print statements occurred and how the while loop kept going until its condition became False, which occurred once x == 10. It's important to note that once this occurred the code stopped. Let's see how we could add an else statement:

In [17]:
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x+=1
    
else:
    print('All Done!')

x is currently:  0
 x is still less than 10, adding 1 to x
x is currently:  1
 x is still less than 10, adding 1 to x
x is currently:  2
 x is still less than 10, adding 1 to x
x is currently:  3
 x is still less than 10, adding 1 to x
x is currently:  4
 x is still less than 10, adding 1 to x
x is currently:  5
 x is still less than 10, adding 1 to x
x is currently:  6
 x is still less than 10, adding 1 to x
x is currently:  7
 x is still less than 10, adding 1 to x
x is currently:  8
 x is still less than 10, adding 1 to x
x is currently:  9
 x is still less than 10, adding 1 to x
All Done!
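The loops above rely on x += 1. Here is a small sketch of the other shortcut assignments in a row, using `total` as an illustrative name, so you can follow how the value changes step by step.

```python
total = 10
total += 5  # same as total = total + 5
total -= 3  # same as total = total - 3
total *= 2  # same as total = total * 2
total /= 4  # same as total = total / 4, which always produces a float

print(total)
```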


if statements in Python allow us to tell the computer to perform alternative actions based on a certain set of results. <br>

Verbally, we can imagine we are telling the computer:<br>

"Hey if this case happens, perform some action"<br>

We can then expand the idea further with elif and else statements, which allow us to tell the computer:<br>

"Hey if this case happens, perform some action. Else if another case happens, perform some other action. Else-- none of the above cases happened, perform this action"<br>

Let's go ahead and look at the syntax format for if statements to get a better idea of this:<br>

In [None]:
if case1:
    perform action1
elif case2:
    perform action2
else: 
    perform action 3

In [19]:
if True:
    print('It was true!')

It was true!


In [20]:
x = False

if x: # for a Boolean x, if x is equivalent to if x == True:
    print('x was True!')
else:
    print('I will be printed in any case where x is not true')

I will be printed in any case where x is not true


In [22]:
x = False

if x == True: # Same as above
    print('x was True!')
else:
    print('I will be printed in any case where x is not true')

I will be printed in any case where x is not true


In [24]:
# Using if, elif, and else together
loc = 'Bank'

if loc == 'Auto Shop':
    print('Welcome to the Auto Shop!')
elif loc == 'Bank':
    print('Welcome to the bank!')
else:
    print("Where are you?")

Welcome to the bank!


In [28]:
# elif can be used multiple times
x = 4

if x == 1:
    print("One")
elif x == 2:
    print("Two")
elif x == 3:
    print("Three")
elif x == 4:
    print("Four")
elif x == 5:
    print("Five")
elif x == 6:
    print("Six")
else: # no need to specify a condition, since it is executed if no other condition was met
    print("Else")


Four


In [30]:
# It is important to note that:
x = 4

if x == 1:
    print("One")
elif x == 2:
    print("Two")
elif x == 3:
    print("Three")
elif x == 4:
    print("Four")
elif x == 5:
    print("Five")
elif x == 6:
    print("Six")
else: # no need to specify a condition, since it is executed if no other condition was met
    print("Else")
    
# is different from:
if x == 1:
    print("One")
if x == 2:
    print("Two")
if x == 3:
    print("Three")
if x == 4:
    print("Four")
if x == 5:
    print("Five")
if x == 6:
    print("Six")
else:
    print("Else")


Four
Four
Else


The first if-elif-else tests only as many as needed: if it finds one condition that is True, it stops and doesn't evaluate the rest. The second form if-if-if tests all conditions. In other words: if-elif-else is used when the conditions are mutually exclusive.<br>

In the example above, x = 4. In the first case, Python looks for the first True condition to be met, then exits the if statement.

In [32]:
if x == 1: # False
    print("One")
elif x == 2: # False
    print("Two")
elif x == 3: # False
    print("Three")
elif x == 4: # True, go to the next level -> print("Four")
    print("Four")

In the second case, it checks each if statement individually.

In [None]:
if x == 1: # False
    print("One")
if x == 2: # False
    print("Two")
if x == 3: # False
    print("Three")
if x == 4: # True, print("Four")
    print("Four")
if x == 5: # False
    print("Five")
if x == 6: # False
    print("Six")
else: # True, because x is not equal to 6. Print("Else")
    print("Else")

In [34]:
# We can use if statements with while and for loops.
# Print only the even numbers from previous list:
for num in l:
    if num % 2 == 0:
        print(num)

2
4
6
8
10


In [35]:
# Try to print odd numbers only


1
3
5
7
9


In [38]:
# We could have also put an else statement in there:
for num in l:
    if num % 2 == 0:
        print(num)
    else:
        print('Odd number')

Odd number
2
Odd number
4
Odd number
6
Odd number
8
Odd number
10


In [40]:
# We've used for loops with lists, how about with strings? 
# Remember strings are a sequence so when we iterate through them we will be accessing each item in that string.
for letter in 'This is a string.':
    print(letter)

T
h
i
s
 
i
s
 
a
 
s
t
r
i
n
g
.


In [41]:
# Iterating through a nested list
nest_list = [[2,4],[6,8],[10,12]]

In [42]:
for num1, num2 in nest_list:
    print(num1)

2
6
10


In [43]:
# Try to print the other number yourself

4
8
12


In [44]:
# If we use a single name in the for loop, each item is the whole inner list, because there is nothing to unpack the
# inner list into
for num in nest_list:
    print(num)

[2, 4]
[6, 8]
[10, 12]
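Unpacking with two names and indexing a single name are two routes to the same values. A short sketch that adds up each pair both ways; `pair_sums` and `same_sums` are illustrative names.

```python
nest_list = [[2, 4], [6, 8], [10, 12]]

# Unpack each inner list into two names
pair_sums = []
for first, second in nest_list:
    pair_sums.append(first + second)

# Keep the inner list whole and index into it
same_sums = []
for pair in nest_list:
    same_sums.append(pair[0] + pair[1])

print(pair_sums)
print(same_sums)
```

Unpacking only works when every inner list has exactly as many elements as there are names.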


#### break, continue, pass
We can use break, continue, and pass statements in our loops to add additional functionality for various cases. The three statements are defined by:<br>

break: Breaks out of the current closest enclosing loop.<br>
continue: Goes to the top of the closest enclosing loop.<br>
pass: Does nothing at all.

Thinking about break and continue statements, the general format of the while loop looks like this:

In [None]:
while test: 
    code statement
    if test: 
        break
    if test: 
        continue 
else:
    final code statements

break and continue statements can appear anywhere inside the loop’s body, but we will usually put them further nested in conjunction with an if statement to perform an action based on some condition.
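Here is a compact sketch of all three statements in a for loop. The list and the variable names are made up for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6, 7]

# continue: skip the rest of the body for even numbers
odd_numbers = []
for number in numbers:
    if number % 2 == 0:
        continue
    odd_numbers.append(number)

# break: stop the loop at the first number greater than 4
first_big = None
for number in numbers:
    if number > 4:
        first_big = number
        break

# pass: a placeholder where a statement is required but nothing should happen
for number in numbers:
    pass

print(odd_numbers, first_big)
```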

In [48]:
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x+=1
    if x ==3:
        print('x==3')
    else:
        print('continuing...')
        continue

x is currently:  0
 x is still less than 10, adding 1 to x
continuing...
x is currently:  1
 x is still less than 10, adding 1 to x
continuing...
x is currently:  2
 x is still less than 10, adding 1 to x
x==3
x is currently:  3
 x is still less than 10, adding 1 to x
continuing...
x is currently:  4
 x is still less than 10, adding 1 to x
continuing...
x is currently:  5
 x is still less than 10, adding 1 to x
continuing...
x is currently:  6
 x is still less than 10, adding 1 to x
continuing...
x is currently:  7
 x is still less than 10, adding 1 to x
continuing...
x is currently:  8
 x is still less than 10, adding 1 to x
continuing...
x is currently:  9
 x is still less than 10, adding 1 to x
continuing...


In [50]:
# Using break statement
x = 0

while x < 10:
    print('x is currently: ',x)
    print(' x is still less than 10, adding 1 to x')
    x+=1
    if x ==3:
        print('Breaking because x==3')
        break
    else:
        print('continuing...')
        continue

x is currently:  0
 x is still less than 10, adding 1 to x
continuing...
x is currently:  1
 x is still less than 10, adding 1 to x
continuing...
x is currently:  2
 x is still less than 10, adding 1 to x
Breaking because x==3


Note how the loop stopped as soon as x reached 3: after the break, nothing more was printed, not even 'continuing...' for the remaining values of x.<br>
**A word of caution however! It is possible to create an infinitely running loop with while statements. For example:**

In [None]:
# DO NOT RUN THIS CODE!!!! 
# This while loop runs forever because the condition is always True. To stop it, interrupt the kernel (the stop button
# in the toolbar, or Kernel -> Interrupt in the menu). If that fails, shut it down using Task Manager, or close the
# command prompt and restart Jupyter notebook
while True:
    print('Uh Oh infinite Loop!')
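If you do want a while True loop, one common safeguard is to give it an explicit way out. This is only a sketch; `max_iterations` and `iteration` are illustrative names, and 5 is an arbitrary limit.

```python
max_iterations = 5
iteration = 0

while True:
    print('Loop number', iteration)
    iteration += 1
    if iteration >= max_iterations:
        break  # guaranteed exit once the limit is reached

print('Stopped after', iteration, 'iterations')
```

This one is safe to run: the counter grows on every pass, so the break is always reached.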

#### List slicing and list comprehension

Now that you know quite a bit of Python, we can get into a bit more advanced topics.<br>

List slicing allows you to grab a specific range in a list. 

In [53]:
l = [0,1,2,3,4,5,6,7,8,9,10]

In [55]:
# Let's say we want to grab the first 5 elements:
l[:5] # Remember that in Python, the first element has an index of 0

[0, 1, 2, 3, 4]

The general structure for list slicing is:<br>
list[start:end:step]. End only goes up to that index, but does not include it. In the example above, 5 was not included

In [57]:
# Grabbing every second number in the list:
l[::2] # You don't have to specify start and end

[0, 2, 4, 6, 8, 10]

In [58]:
l[1::3]

[1, 4, 7, 10]

In [61]:
l[:9:3]

[0, 3, 6]

In [64]:
# Using in conjunction with a for loop
for i in l[4::2]: # Start at 4, go to the end, in steps of 2
    print(i)

4
6
8
10
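Slices also accept negative numbers, which count from the end of the list. A short sketch on the same list, with illustrative variable names:

```python
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

last_three = l[-3:]  # from the third-to-last element to the end
reversed_l = l[::-1] # a step of -1 walks the list backwards
middle = l[2:8:2]    # start at 2, stop before 8, in steps of 2

print(last_three)
print(reversed_l)
print(middle)
```

Slicing always builds a new list, so l itself is left unchanged.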


In addition to sequence operations and list methods, Python includes a more advanced operation called a list comprehension.<br>

List comprehensions allow us to build out lists using a different notation. You can think of it as essentially a one line for loop built inside of brackets. For a simple example:

In [65]:
# Grab every letter in string
lst = [x for x in 'word']

In [66]:
# Check
lst

['w', 'o', 'r', 'd']

In [68]:
# The above statement is equivalent to this for loop:
lst1 = []
for x in 'word':
    lst1.append(x) # append adds one element to the end of the list

In [69]:
lst1

['w', 'o', 'r', 'd']

In [70]:
# Square numbers in range and turn into list
lst = [x**2 for x in range(0,11)]

In [71]:
lst

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [72]:
# Check for even numbers in a range
lst = [x for x in range(11) if x % 2 == 0]

In [73]:
lst

[0, 2, 4, 6, 8, 10]

In [75]:
# Can also do more complicated arithmetic:
# Convert Celsius to Fahrenheit
celsius = [0,10,20.1,34.5]

fahrenheit = [ ((float(9)/5)*temp + 32) for temp in celsius ]

fahrenheit

[32.0, 50.0, 68.18, 94.1]

In [76]:
# We can also perform nested list comprehensions, for example:
lst = [ x**2 for x in [x**2 for x in range(11)]]
lst

[0, 1, 16, 81, 256, 625, 1296, 2401, 4096, 6561, 10000]

In [77]:
matrix = [[1,2],['a','c'],['hi',53]]

In [79]:
# Build a list comprehension by deconstructing a for loop within a list

first_col = [row[0] for row in matrix]
first_col

[1, 'a', 'hi']
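The same pattern reaches the other column, and two for clauses in one comprehension can flatten the whole matrix. A minimal sketch; `second_col` and `flattened` are illustrative names.

```python
matrix = [[1, 2], ['a', 'c'], ['hi', 53]]

second_col = [row[1] for row in matrix]

# Read left to right like nested for loops: for each row, for each item in that row
flattened = [item for row in matrix for item in row]

print(second_col)
print(flattened)
```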

## The end