# Basics

Learning Objectives
* * *
* Understand Jupyter notebook basics
* Become familiar with Python's primitive types
* Become familiar with boolean operators and comparison operators
* See some examples of conditional statements
* Grasp and understand flow control (e.g. `for` loops)
* Discern the differences between functions, methods and generators

### Welcome to Jupyter!  Here are a few notebook notes
<br>
<b>This is a little diagram of the anatomy of the notebook toolbar:</b><br>
<img src='https://raw.githubusercontent.com/michhar/python-jupyter-notebooks/master/general/nb_diagram.png' alt="Diagram of the Jupyter notebook toolbar" align="center">

## Shortcuts!!!
* A complete list is [here](https://sowingseasons.com/blog/reference/2016/01/jupyter-keyboard-shortcuts/23298516), but these are my favorites.  There is a *command* mode and an *edit* mode, much like the Unix editor `vi`/`vim`.  `Esc` will take you into command mode.  `Enter` (when a cell is highlighted) will take you into edit mode.

Mode  |  What  | Shortcut
------------- | ------------- | -------------
Command (Press `Esc` to enter)  | Run cell | Shift-Enter
Command  | Add cell below | B
Command | Add cell above | A
Command | Delete a cell | d-d
Command | Go into edit mode | Enter
Edit (Press `Enter` to enable) | Run cell | Shift-Enter
Edit | Indent | Ctrl-]
Edit | Unindent | Ctrl-[
Edit | Comment section | Ctrl-/
Edit | Function introspection | Shift-Tab

Try some below

**A Code cell is grey (by the way this is a Markdown cell)**

In [None]:
# This is a comment
help(print)
print('this line is Python code')

# Hit Shift+Enter at same time as a shortcut to run this cell

<b>Jupyter notebooks have tab-completion in code cells</b><br><br>
<b>To get help on any module, function or variable, pass it to `help()`</b>

In [None]:
help(print)

### The import statement

In [None]:
# This is what an import statement looks like, here we are importing the json module
import json

# We can use the 'from' syntax to import a submodule
from sklearn import datasets

# We can rename a module during import to make it easier to type later
import numpy as np

### Indentation

* Python uses indentation, rather than punctuation such as braces or semicolons, to tell the interpreter how the code is grouped.
* It is standard to use four spaces (and not recommended to use tabs)
* Whitespace like this makes for easier-to-read code

### Reserved Words

In [None]:
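# None is a reserved word, so it cannot be used as a variable name.
# Running this cell raises a SyntaxError.
None=2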

#  Calculator

In [None]:
# Addition and subtraction
print(5 + 6)
print(5 - 5)

# Multiplication and division
print(3 * 5)
print(10 / 2)

# Exponentiation
print(4 ** 2)

# Modulo
print(18 % 7)

# How much is your $20 worth after 17 years?

# Variables

Variables allow you to refer to a value by its name. We use "=" to create a variable.

We can't emphasize enough that "=" means ASSIGNMENT and not equality

In [None]:
x=10
print(x)
# create a variable called portfoliovalue with the initial
#value of 20 and check the value by printing it as above 

To calculate using variables, just use the variable name instead of the value. For example, in the compound interest calculation above you can write:

In [None]:
years=7
interestrate=0.1
(1+interestrate)**7-1
# now use years, portfoliovalue and interestrate instead
#of the numbers 

The most common data types are

    1. float (a number that has both an integer and a fractional part: 3.14345)
    2. int (an integer: ..., -3, -2, -1, 0, 1, 2, 3, ...)
    3. str (a string, i.e. text)
    you can create a string with double or single quotes (see below)
    4. bool (a boolean, a logical type that can be True or False)
    
You can figure out the type of a variable x by typing type(x).
    

In [None]:
description='future value'
#description="future value" also works
print(type(interestrate))
type(description)



### String
Strings are type <code>str</code> and all strings in Python 3 are Unicode.  There is no separate <code>character</code> class as in other languages.  Strings are surrounded by single, double or triple quotes.  Which one to use is a matter of style, but the choice becomes useful when you want to place a single-quoted string inside a double-quoted string, or the other way around.

In [None]:
"a single quoted string 'hi' inside a double quoted string"

In [None]:
'a double quoted string "hi" inside a single quoted string'

### Multiple type operation


Note that an operator can behave differently depending on the type of the values it is applied to

For example, try to 

1. add two strings
2. multiply a string by 4

What do you expect to happen?

In [None]:
description="future"+ " " + "value"+ " "
description

In [None]:
description*4


### Type conversion

A very useful tool is the conversion between types. If you write str(x), then you convert the variable x to a string. The same thing works for int(), float() and bool().

Here is an example

In [None]:
portfoliovalue=2
print("the value of your portfolio is $" + str(portfoliovalue) + " dollars")
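
Here is a short sketch of the other conversions mentioned above. The variable names are made up for illustration and are not used elsewhere in this notebook.

```python
# Converting between the basic types (all names here are illustrative)
shares_text = "42"            # a string that happens to hold digits
shares = int(shares_text)     # str -> int
price = float("3.5")          # str -> float
truncated = int(3.99)         # float -> int drops the fractional part (no rounding)
label = str(shares * price)   # float -> str

print(shares, price, truncated, label)

# bool() treats 0 and the empty string as False, and other numbers and strings as True
print(bool(0), bool(2), bool(""), bool("text"))
```

Note that int() on a float cuts off the fractional part instead of rounding; use round() if rounding is what you want.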

# Lists

To work with multiple data points we use lists. A list can contain values of any type. In particular, it can contain other lists.


In [None]:
[1.23,3.4,6.7]
portfoliosvalue=[1.23, 3.4, 6.7]
portfoliosowner=['sarah','jack','andrea']
portfolios=[['sarah',1.23],['jack',3.4],['andrea',6.7]]

What is a valid list?

    A. [1, 3, 4, 25] 
    B. [[1, 25, 3], [4, 55, 75]]
    C. [1 + 6, "sarah" * 3, 2]

# Indexing

1. the first entry has index zero
2. you can access the list from the end by using negative numbers, with -1 being the last item, -2 the one before the last, and so on

In [None]:
print(portfoliosvalue[:])


We can also recover a range of values through slicing.

We specify a range that recovers a new list that is a slice of the original list.

We write listname[a:b], where a and b are integer positions in the list. This recovers all items from position a to position b-1.

As a convention, the last item is excluded.

You can also type

1. listname[:b] to recover everything up to b (excluding b)
2. listname[a:] to recover everything starting at a (including a)
3. listname[a:b] to recover everything from a to b (excluding b)

In [None]:
portfoliosvalue[0:]
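
The slicing rules above can be tried on a small list. This is only a sketch; the list `values` is made up so that the positions are easy to follow.

```python
values = [10, 20, 30, 40, 50]   # illustrative list

first_two = values[:2]     # items at index 0 and 1
from_two = values[2:]      # items from index 2 to the end
middle = values[1:4]       # items at index 1, 2 and 3 (index 4 is excluded)
last = values[-1]          # negative indices count from the end
last_two = values[-2:]     # the final two items

print(first_two, from_two, middle, last, last_two)
```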

To subset lists of lists, you can use the same technique as before: square brackets. Try out the commands in the following code sample in the IPython Shell:

portfolios[2][0]
portfolios[2][:2]

portfolios[2] results in a list, that you can subset again by adding additional square brackets.

In [None]:
portfolios[0][1]

We can also manipulate lists!

Let's say Sarah's portfolio grew to 1.5 dollars, then we write

In [None]:
# in the case of the list of list construction
portfolios[0][1]=1.5
print(portfolios)
# or in the case of a simple list
portfoliosvalue[0]=1.5
print(portfoliosvalue)

You can also change multiple entries at once with slicing

In [None]:
portfoliosvalue

In [None]:
portfoliosvalue[0:2]=[1.4,1.7]
print(portfoliosvalue)

You can add and delete elements of a list as well:

1. to delete, just write del(listname[positiontobedeleted])
2. to add, just write listname=listname+[newelement] (the new element goes inside square brackets, because + joins two lists)

In [None]:
print(portfoliosvalue)
portfoliosvalue=portfoliosvalue+[2.5]
print(portfoliosvalue)
del(portfoliosvalue[3])
print(portfoliosvalue)



Important note regarding copying lists

b=a

does not copy the list: it makes b refer to the same list as a, so if you change a, b also changes. Under the hood, a and b point to the same list object.

To make a truly independent copy of the list you can write b=a.copy()

To see the difference:

In [None]:

p1=portfoliosvalue
p2=portfoliosvalue.copy()
print(portfoliosvalue)
print(p1)
print(p2)


In [None]:
portfoliosvalue[0]=0.1
print(portfoliosvalue)
print(p1)
print(p2)
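
One way to check whether two names point to the same list is the `is` operator. The sketch below uses made-up names (`a`, `alias`, `clone`) so it can be run on its own.

```python
a = [1.0, 2.0, 3.0]
alias = a            # a second name for the same list object
clone = a.copy()     # a new list holding the same values

a[0] = 99.0          # change the original list

print(alias is a, clone is a)   # is the name bound to the very same object?
print(alias)                    # follows the change made through a
print(clone)                    # keeps the old values
```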

### Functions vs. methods

Python has many built-in functions such as <code>print</code> and <code>help</code>.  If a function is part of the implementation of a specific type it is called a method.  Methods take their first argument before the function name followed by a period.  Here's a guide:
<table style="width:75%" align="left">
  <tr>
    <td>Function</td>
    <td>Built-in or user defined.  Syntax is name followed by arguments in parentheses.</td>
    <td>Example usage:<br> 
    `print('hello world.')`</td>		
  </tr>
  <tr>
    <td>Method</td>
    <td>Part of implementation of a specific type.  Syntax is the variable of the specific type followed by a period and then the method name with arguments in parentheses.</td>
    <td>Example usage:<br>
    `s = 'abc'`<br>
    `s.count('a')`
    </td>		
  </tr>
</table>
<br>

## Functions

Functions are built-in or custom-made snippets of code that avoid repetition. A skilled programmer will use functions instead of repeating code, as this minimizes errors.

This is often known as DRY: Don't Repeat Yourself
(https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)

Functions are also the way to tap the huge library of code available for Python

What is a function?

1. piece of reusable code: print(), type()
2. solves a specific task
3. allows you to avoid repetition

A general recipe for calling a function is

output = function_name(input)

EXAMPLES

In [None]:
# how to find the lowest portfolio value?

print(min(portfoliosvalue))

# how to find the largest?

print(max(portfoliosvalue))

# max and min are functions that solve this task. You can also implement this without using any functions. How would you do it?

#How many portfolios do we have in our list?

print(len(portfoliosvalue))




In [None]:
# yet another illustrative example is the function round, which takes two inputs as standard
print(round(1.2345,3))
# you can also just call with one input
round(1.2345)

<b>Our first user-defined function</b>
* Use the `def` syntax to define the function as follows:
```python
def func_name(args):
    ...
```

In [None]:
def fibonacci(limit):
    # This is a docstring (always good idea!):
    '''The fibonacci function returns a list of the Fibonacci numbers below limit.''' 
    
    fibs = []
    a, b = 0, 1
    while b < limit:
        # Append method works on lists, in this case fibs
        fibs.append(b)
        
        # Reset a and b with new values
        a, b = b, a + b
    
    # Our function returns the final list of requested fibonacci numbers
    return fibs
        
# Use our function and test results
result = fibonacci(10)
print(result)

In [None]:
# Get docstring like so
fibonacci.__doc__

In [None]:
# Why do both round(1.2345, 3) and round(1.2345) work?
round?

Note that:

1. help(funname) opens the help of a function with name funname. This is a very useful first step to understand what a function does.
    - In Jupyter you can also write ?funname or funname? instead of help(funname)
2. number[, ndigits] means that ndigits is an optional argument.
3. The help text also tells us what happens when ndigits is not given: the number is rounded to the nearest integer (0 decimal digits).
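
You can give your own functions optional inputs in the same way. The sketch below is only an illustration: `future_value` is a made-up function (not part of Python) that reuses the compound interest idea from earlier, with `years` defaulting to 1.

```python
def future_value(principal, rate, years=1):
    '''Return principal compounded once per year at the given rate.'''
    return principal * (1 + rate) ** years

one_year = future_value(100, 0.1)              # years falls back to its default of 1
three_years = future_value(100, 0.1, 3)        # a third input overrides the default
same_again = future_value(100, 0.1, years=3)   # the same call, naming the input

print(one_year, three_years, same_again)
```

Inputs with a default value must come after the required ones in the `def` line.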


Python also has an alternative way of specifying optional/default inputs. For example, let's consider the function sorted

In [None]:
help(sorted)

Above we see that

1. the iterable, for example a list, is required
2. if you do not specify the reverse input, it is set to False
3. if you do not specify the key input, it is set to None
4. you can pass neither of the optional inputs, only key, only reverse, or both. Examples:
    - sorted(alist)
    - sorted(alist, reverse=True)
5. the key parameter can be used to specify a function that is applied to each item before comparisons are made. For example, you can use this to specify which column you want the sort to be based on, or to apply some transformation before sorting. Ignore this for now. For more info see https://docs.python.org/3/howto/sorting.html
    


In [None]:
sorted(portfolios,reverse=True)
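
If you are curious about the key input, here is a minimal sketch. It assumes data in the same [owner, value] layout as the portfolios list; the names `holdings` and `value_of` are made up for this example.

```python
# Illustrative data in the same [owner, value] layout as the portfolios list
holdings = [['sarah', 1.5], ['jack', 3.4], ['andrea', 6.7], ['alan', 0]]

def value_of(entry):
    '''Return the value stored in the second position of an [owner, value] pair.'''
    return entry[1]

by_owner = sorted(holdings)                                    # compares owner names first
by_value = sorted(holdings, key=value_of)                      # smallest value first
by_value_desc = sorted(holdings, key=value_of, reverse=True)   # largest value first

print(by_owner)
print(by_value)
print(by_value_desc)
```

sorted returns a new list each time, so `holdings` itself keeps its original order.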

# Methods

Each object in Python comes with a bunch of methods. Methods are functions specific to an object type.

You call a method as follows:

object.method(input)

You can type object. and then press Tab to see the available methods

You can also write help(type) to see the available methods for a type

Each object type has specific methods associated with it




In [None]:
portfoliosowner.sort?



Strings in particular have several methods

In [None]:
description

In [None]:
print(description.capitalize())
print(description.replace('future','past'))
print(description.endswith('e'))
print(description.index('e'))
print(description.count('e'))


String manipulation is really useful and could fill an entire course

We will not use it much, since our data will be numbers

But by the end of the class you should be ready to learn string manipulation by yourself.


Lists also have methods

In [None]:
portfolios.append(['alan',0])
print(portfolios)
portfolios.pop(3)


## How to get help on a method?

(command)?

Help on built-in function pop:

pop(...) method of builtins.list instance
    L.pop([index]) -> item -- remove and return item at index (default last).
    Raises IndexError if list is empty or index is out of range.
    
    
Or just google: 
 - python (command) 
 - python how to ....
 

I google a lot. It is the easiest way to get help



**`try`/`except` statements**


What if we try to access an element in a list which is out of its range?
```python
letters = ['a', 'b', 'c', 'd']
for i in range(10):
    print(letters[i])
```
We get an IndexError as shown in this traceback print-out:

```python
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-17-7700b7ec9e04> in <module>()
      1 letters = list('abcde')
      2 for i in range(10):
----> 3     print(letters[i])

IndexError: list index out of range
```

<b>The syntax is as follows:</b>

```python
try:
    ...
except ErrorName as e:
    ...
```

In [None]:
# Handling exceptions with try/except statements

# A list of strings
letters = ['a', 'b', 'c', 'd']

# Here we iterate over a list from 0 up to an index of 9 (that's what range(10) does)
for i in range(10):
    try:
        # Print current letter
        print(letters[i])
    except IndexError as e:
        # IndexError is the particular error and only error we catch here
        print('Oops!  Something went wrong:', e)
        # Break out of loop now (so we don't keep getting this error)
        break

<b>Order of precedence, i.e. what gets executed first</b>

1. parentheses
2. exponentiation (`**`)
3. multiplication (`*`), division (`/`), remainder (`%`)
4. addition and subtraction
5. comparisons, membership and identity (`in`, `not in`, `is`, `is not`, and the six comparison operators `==`, `!=`, `>`, `<`, `>=`, `<=`)
6. `not x`
7. `and`
8. `or`

If you are not sure, use parentheses to make sure Python follows the order you intended!
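
A few one-liners make the ordering concrete. This is just a sketch; the names `a` to `f` are placeholders.

```python
a = 2 + 3 * 4                 # multiplication first: 2 + 12
b = (2 + 3) * 4               # parentheses first: 5 * 4
c = 2 * 3 ** 2                # exponentiation first: 2 * 9
d = 10 - 4 % 3                # remainder first: 10 - 1
e = not 1 + 1 == 3            # arithmetic, then ==, then not
f = True or False and False   # and before or: True or (False and False)

print(a, b, c, d, e, f)
```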

# Packages

 - directories of Python scripts
 - each script specifies functions, methods, types
 - Lots of packages available for specific types of tasks
 
     * matplotlib: data visualization
     * numpy: array manipulation
     * pandas: tabular data analysis
     * scipy: scientific computing, including statistics

## How to install a package?


If you have the anaconda distribution:

https://conda.io/docs/using/pkgs.html

 * conda list
 * conda install packagename
 * conda update packagename
 
 
 Once installed, you have to import the package before you can use it.
 
 The import needs to be repeated in each new Python session.
 
 This way you only load what you need
 
 There are several options
 
  - Import a particular function of the package
   
       * from numpy import array
       
  - Import the package under its own name
   
       * import numpy
       
  - Import the package under a shorter name
    
      * import numpy as np
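
The three import styles can be compared side by side. To keep this sketch runnable without installing anything, it uses the built-in `math` module instead of numpy; the alias `m` is an arbitrary choice.

```python
from math import sqrt      # import one function; call it as sqrt(...)
import math                # import the whole module; call it as math.sqrt(...)
import math as m           # the same module under a shorter name

print(sqrt(16), math.sqrt(16), m.sqrt(16))
print(m is math)           # the alias and the original name refer to one module
```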
      

# NUMPY

NumPy is a first-rate library for numerical programming

Widely used in academia, finance and industry

Mature, fast, stable and under continuous development

In this lecture we introduce NumPy arrays and the fundamental array processing operations provided by NumPy

Important Notes

We’ll be using the syntax A @ B for matrix multiplication, as opposed to the older np.dot(A, B)

This works with Python 3.5 or later

The essential problem that NumPy solves is fast array processing

For example, suppose we want to create an array of 1 million random draws from a uniform distribution and compute the mean

If we did this in pure Python it would be orders of magnitude slower than C or Fortran

This is because
 - Loops in Python over Python data types like lists carry significant overhead
 - C and Fortran code contains a lot of type information that can be used for optimization
 - Various optimizations can be carried out during compilation, when the compiler sees the instructions as a whole

However, for a task like the one described above there’s no need to switch back to C or Fortran

Instead we can use NumPy, where the instructions look like this:

In [None]:
from numpy import array
x=array([2,3,4])
print(x.cumprod())
import numpy 
x=numpy.array([2,3,4])
print(x.mean())
import numpy as np
x=np.array([2,3,4])
print(x.cumsum())
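
For the task described above (1 million uniform draws and their mean), a rough comparison could look like the sketch below. The timing approach with `time.perf_counter` and all variable names are my own choices for illustration; the actual times depend on your machine, so none are quoted here.

```python
import random
import time

import numpy as np

n = 1_000_000

# Pure Python: draw one number at a time and average them
start = time.perf_counter()
total = 0.0
for _ in range(n):
    total += random.uniform(0, 1)
python_mean = total / n
python_seconds = time.perf_counter() - start

# NumPy: draw the whole array at once and call its mean method
start = time.perf_counter()
draws = np.random.uniform(0, 1, n)
numpy_mean = draws.mean()
numpy_seconds = time.perf_counter() - start

print(python_mean, python_seconds)
print(numpy_mean, numpy_seconds)
```

Both means should land close to 0.5; compare the two elapsed times on your own computer.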


Let's start by creating a vector of random numbers

Here we are sampling from a normal distribution

$$X\sim N(0.1,0.2^2)$$

In [None]:
X=np.random.normal(0.1,0.2,120)
X

In [None]:
[X.mean(),X.max(),X.min(),X.std()]

# A Comment on Vectorization

NumPy is great for operations that are naturally vectorized

Vectorized operations are precompiled routines that can be sent in batches, like
 - matrix multiplication and other linear algebra routines
 - generating a vector of random numbers
 - applying a fixed transformation (e.g., sine or cosine) to an entire array


# NumPy Arrays

The most important thing that NumPy defines is an array data type formally called a numpy.ndarray

NumPy arrays power a large proportion of the scientific Python ecosystem

To create a NumPy array containing only zeros we use np.zeros

In [None]:
a = np.zeros(3)
a

In [None]:
type(a)

NumPy arrays are somewhat like native Python lists, except that
 - Data must be homogeneous (all elements of the same type)
 - These types must be one of the data types (dtypes) provided by NumPy

The most important of these dtypes are:
 - float64: 64 bit floating point number
 - int64: 64 bit integer
 - bool: 8 bit True or False

There are also dtypes to represent complex numbers, unsigned integers, etc

On modern machines, the default dtype for arrays is float64

In [None]:
type(a[0])

# Shape and Dimension

Consider the following assignment

In [None]:
z = np.zeros(10)

Here z is a flat, one-dimensional array: it is neither a row vector nor a column vector

The dimension is recorded in the shape attribute, which is a tuple

In [None]:
z.shape

In [None]:
z = np.zeros((10,2))
z.shape

We can easily reshape an array

In [None]:
z.shape=(5,4)
z
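
An alternative to assigning to the shape attribute is the `reshape` method, which hands back a reshaped array. The sketch below uses made-up names (`flat`, `grid`) and also shows what happens when the number of elements does not fit the new shape.

```python
import numpy as np

flat = np.arange(6)          # the integers 0 to 5 in a flat array of shape (6,)
grid = flat.reshape(2, 3)    # 2 rows, 3 columns; same six numbers
print(flat.shape, grid.shape)
print(grid)

try:
    flat.reshape(4, 2)       # 8 slots cannot be filled from 6 elements
except ValueError as e:
    print('Cannot reshape:', e)
```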

# Creating arrays

As we’ve seen, the np.zeros function creates an array of zeros

You can probably guess what np.ones creates



In [None]:
np.array([1,3,4])

In [None]:
np.array([[3,4],[5,6]])
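
A few more ways to create arrays, as a quick sketch. The variable names are illustrative.

```python
import numpy as np

ones = np.ones((2, 3))          # 2 rows, 3 columns, every entry 1.0
counts = np.arange(5)           # the integers 0, 1, 2, 3, 4
points = np.linspace(0, 1, 5)   # 5 evenly spaced points from 0 to 1, both ends included
identity = np.identity(2)       # 2 x 2 identity matrix

print(ones)
print(counts)
print(points)
print(identity)
```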

# Array Methods

Arrays have useful methods, all of which are carefully optimized

A.sort()              # Sorts A in place
A.sum()               # Sum
A.mean()              # Mean
A.max()               # Max
A.argmax()            # Returns the index of the maximal element
A.cumsum()            # Cumulative sum of the elements of A
A.cumprod()           # Cumulative product of the elements of A
A.var()               # Variance
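
Here is a small runnable sketch of these methods on a three-element array. The values are made up so the results are easy to check by hand. Note that `var` divides by the number of elements by default.

```python
import numpy as np

A = np.array([3.0, 1.0, 2.0])   # illustrative values

total = A.sum()                 # 3 + 1 + 2
average = A.mean()
largest_at = A.argmax()         # position of the largest element
running_sum = A.cumsum()
running_product = A.cumprod()
variance = A.var()              # average squared distance from the mean

print(total, average, largest_at)
print(running_sum, running_product, variance)

A.sort()                        # sorts in place; the method itself returns None
print(A)
```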

# Operations on Arrays

The algebraic operators +, -, *, / and ** all act element-wise on arrays

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
a + b

In [None]:
a*b

The two dimensional arrays follow the same general rules

In [None]:
A = np.ones((2, 2))
B = np.ones((2, 2))*2
A + B

# Matrix Multiplication

Note that A * B is not the matrix product; it is an element-wise product

With Python 3.5 and above, one can use the @ symbol for matrix multiplication, as follows:

In [None]:
A = np.array((1, 2))
B = np.array((10, 20))
A @ B
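
With two-dimensional arrays the difference between `*` and `@` is easier to see. The matrices below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B       # multiplies matching entries
matrix_product = A @ B    # rows of A times columns of B

print(elementwise)
print(matrix_product)
print(np.array_equal(matrix_product, np.dot(A, B)))   # @ agrees with np.dot here
```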

This will be very useful when we construct portfolios!

# Important observation about NUMPY

NumPy is great for doing vector arithmetic. However, some things have changed relative to Python lists.
1. NumPy arrays cannot contain elements with different types.
2. If you try to build such an array, some of the elements' types are changed to end up with a homogeneous array.
3. The typical arithmetic operators, such as +, -, * and /, have a different meaning for NumPy arrays.

Example:

In [None]:
print([3,4,7]+[2,4,6])
print(np.array([3,4,7])+np.array([2,4,6]))
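
The type change can be seen by asking an array for its `dtype`. A minimal sketch with made-up inputs:

```python
import numpy as np

mixed_numbers = np.array([1, 2.5, True])     # int, float and bool together
print(mixed_numbers, mixed_numbers.dtype)    # everything is stored as a float

with_text = np.array([1, 'a'])               # a number and a string together
print(with_text, with_text.dtype.kind)       # kind 'U' means Unicode text
```

In the second array the number 1 has been turned into the string '1', so arithmetic on it no longer works.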


# Visualization

To plot data, the library matplotlib is great.

First import it and tell Jupyter to draw the plots in the notebook

Here is everything you might need: https://matplotlib.org/

Here is a crash course: https://matplotlib.org/users/pyplot_tutorial.html

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
# This line makes sure that the plot is shown in the notebook

In [None]:
# here is a histogram
plt.hist(X,bins=30);

In [None]:
# here is a time-series plot
plt.plot(X)
plt.legend(['returns'])

In [None]:
# here are some bar plots 
plt.subplot(1,2,1)
plt.bar(range(0,120),X)
plt.subplot(1,2,2)
plt.plot(range(0,120),X)
# why does the risk-free asset have positive variance?
# is this risk? What is the difference between the variance of the stock market factors and the variance of the risk-free rate?


Let's create another variable with a varying degree of correlation with X

In [None]:

rho=0.5
Y=rho*X+np.sqrt(1-np.power(rho,2))*np.random.normal(0.1,0.2,120)

# here are some scatter plots 
plt.scatter(X,Y)
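
Because X and the new noise term have the same variance and are drawn independently, this construction should give a correlation close to rho. The sketch below checks that with `np.corrcoef`. It uses a seeded generator and a much larger sample than 120, both of which are my own choices to make the check stable; the names `X_big`, `Y_big` and `noise` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)     # seeded generator so the run is repeatable
n = 100_000                        # larger than 120 so the estimate is tight
rho = 0.5

X_big = rng.normal(0.1, 0.2, n)
noise = rng.normal(0.1, 0.2, n)    # independent of X_big, same variance
Y_big = rho * X_big + np.sqrt(1 - rho ** 2) * noise

sample_corr = np.corrcoef(X_big, Y_big)[0, 1]   # off-diagonal entry of the 2 x 2 matrix
print(sample_corr)
```

With only 120 draws, as in the cell above, the sample correlation will wander further from rho.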




# Conditional operators

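1. Comparison operators: ==, !=, >, <, >=, <=
2. Boolean operators: and, or, not (the symbols & and | are bitwise operators: they behave like and/or on True and False and work element-wise on NumPy arrays, but they bind more tightly than comparisons; Python has no ! operator for not)
3. Conditional statements: if, elif, else (while repeats a block as long as a condition holds)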

In [None]:
#comparing numbers

print(2==1+1)
print(2!=1+1)
print(2>1+1)
print(2>=1+1)

#Booleans
print(2==1+1 and 2>1+1)

print(2==1+1 or 2>1+1)

# & binds more tightly than ==, so each comparison needs its own parentheses
print((2==1+1) & (3==2+1))

print(not 2>1+1)
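
Mixing & with comparisons without parentheses can give a surprising answer, because & is evaluated before ==. A small sketch (the variable names are illustrative):

```python
chained = 2 == 1 + 1 & 3 == 2 + 1        # read by Python as 2 == (2 & 3) == 3
grouped = (2 == 1 + 1) & (3 == 2 + 1)    # True & True
with_and = 2 == 1 + 1 and 3 == 2 + 1     # and waits for both comparisons

print(chained, grouped, with_and)
```

For plain True/False logic, `and`, `or` and `not` are the safer choice.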

## Program Flow

- Like IKEA furniture assembly instructions, a program is a sequence of steps to be done in order
- Some steps are conditional: they will only be executed under certain conditions
- Some steps are repeated multiple times
- Some steps are stored and reused throughout our program



# Sequential steps

When a program is running, it flows from one step to the next.  As programmers, we set up “paths” for the program to follow.

Program

x=2

print(x)

x=x+1

print(x)


# Conditional Steps

Program:

x = 5

if x < 10:
    
    print('Smaller')
    
if x > 20:
    
    print('Bigger')
    

print('Finis')






In [None]:
x=22
if x < 10:
    print('Smaller')
    
if x > 20:
    print('Bigger')
    

print('Finis')

# Indentation

In Python indentation has meaning!

4 spaces imply that the indented line belongs to the statement above it

Note what happens when the indentation is missing (the cell below raises an IndentationError):

In [None]:
x=22
if x < 10:
print('Smaller')
    
if x > 20:
print('Bigger')
    

print('Finis')

So the 4 white spaces have meaning!

#### Does every white space have meaning?


No, not in general. Only the indentation level of your statements is significant (i.e. the whitespace at the very left of your statements). Everywhere else, whitespace is not significant and can be used as you like, just like in any other language. You can also insert empty lines that contain nothing (or only arbitrary whitespace) anywhere.

But you can also write in line


In [None]:
x=22
if x < 10: print('Smaller')
    
if x > 20: print('Bigger')
    

print('Finis')

Jupyter will help you out here.

Let's try typing code

# DO NOT USE TAB TO INDENT!!!!

This can make the code change its meaning as you change computer/platform

Always use 4 spaces

You can also use Ctrl+] to indent and Ctrl+[ to unindent

Only relative indentation matters!

In [None]:
x=4
if x < 10: 
    print('Small')
    if x < 5: 
        print('and tiny')

Note that the second if statement "belongs" to the first. It is only executed if the program reaches that branch

In [None]:
x=20
if x < 10: 
    print('Small')
    if x > 15: 
        print('and tiny')
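
When the options are mutually exclusive, `elif` and `else` let you pick exactly one branch. A minimal sketch with made-up thresholds:

```python
x = 12
if x < 10:
    size = 'Small'
elif x < 20:          # only checked when the first condition is False
    size = 'Medium'
else:                 # runs when none of the conditions above is True
    size = 'Big'
print(size)
```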

# Repeated Steps


Program:

n = 5
while n > 0 :

    print(n)       
    
    n = n - 1
    
print('Blastoff!')

Or also 

n=5

for i in range(0,5):
    
    print(5-i)
    
print('Blastoff')    

Python has two main ways to do repeated steps

1. while: repeats a block of code as long as some condition is satisfied.
2. for: goes through a prespecified list and repeats the code once for each item until the list ends



In [None]:
n = 5
while n > 0 :
    print(n)       
    n = n-1
print('Blastoff!')


for i in range(0,5):
    print(5-i)
print('Blastoff') 
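
A for loop can also walk through any list directly, not only through range. The sketch below uses made-up owner and value lists in the spirit of the portfolio examples.

```python
owners = ['sarah', 'jack', 'andrea']   # illustrative data
values = [1.5, 3.4, 6.7]

# Loop over the items themselves
for owner in owners:
    print(owner)

# Loop over positions when two lists have to be read together
total = 0
for i in range(len(owners)):
    print(owners[i], values[i])
    total = total + values[i]
print(total)

# Count down directly by giving range a negative step
countdown = list(range(5, 0, -1))
print(countdown)
```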


### Indentation is always key!