[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Humboldt-WI/adams/blob/master/exercises/Ex01-Python-Primer.ipynb) 


# Python primer #1: basics
Here is the outline of our first tutorial introducing Python and Jupyter notebooks.

1. Jupyter notebook
2. Python basics
3. Packages and installation
4. numpy
5. pandas
6. loading data
7. matplotlib


## Working with a Jupyter notebook
### Managing cells
Jupyter has two modes. When the cell marker is blue, you can select cells with the keyboard arrows and handle them by creating new cells, moving, copying or deleting them. When you press Enter while selecting a cell, you jump into editing mode and can change the content of a cell, e.g. by writing code.

The types of actions you can perform are listed under Help > Keyboard shortcuts 

- New cells are inserted with *a* (above) or *b* (below).
- Cells can be *c* copied, *x* cut and *v* pasted
- Cells are deleted by pressing *d* two times.
- To run a cell and select the next cell: Shift + Enter
- While typing, use Tab to finish the object name for you and Shift + Tab to see the documentation

It is worth remembering the shortcuts to work efficiently. Also check out notebook *magic* commands starting with % when things get more interesting.

### Cell types ###
A Jupyter notebook usually consists of text that describes the problem and the code, alongside the code itself. Text is written in Markdown cells, which allow nice formatting and LaTeX-style formulas, while code is written in code cells. There, the code can be run interactively and the output is displayed in the notebook.

You can change the cell type with 'm' for Markdown and 'y' for code. Remember that you need to be in selection (or command) mode.

##functionality demo

## Python  ## 

### Object types

1. _Numbers_: 1234, 3.1415, 3+4j, Decimal

2. _Strings_: 'ourstring', "text or comment"

3. _Lists_: [1, [2, 'three'], 4]

4. _Dictionaries_: {'employment': 'fulltime', 'gender': 'male'}

5. _Tuples_: (1,'spam', 4, 'U')

6. _Files_: myfile = open('goodcode', 'r')

also booleans, functions, classes, etc


## Lists and assignment rules
Unlike R, Python is primarily a programming language, so its standard object classes behave a little differently. When indexing lists and similar objects, always remember that Python starts counting indices at 0. When slicing x[start:end], the end value is *not* included in the output. You can index from the end using a minus sign, as in x[0:-2].

The biggest difference is that many objects can be modified in-place, i.e. without reassignment common in R like x <- x[1:2]. This increases speed and memory efficiency, but be careful when running code multiple times or in a different order. Also be aware that objects are not copied when reassigned, but instead linked to the original content, which may change. If you'd like to copy a list, use a complete index [:].



In [2]:
# Lists
x = [1, 2, 3, 4]
print(x)

print(x[1:3])

print(x)

[1, 2, 3, 4]
[2, 3]
[1, 2, 3, 4]


In [3]:
# Delete an element of x in-place
# Also note that you can index from the end by using -
del x[-2]
print(x)

[1, 2, 4]


In [7]:
y = x
z = x[:]
# The [:] slice makes z a copy of list x, whereas y = x makes y another name for the very same list as x
print(y)
print(z)
del z[0]
print(z)
print(x)

[1, 2, 4]
[1, 2, 4]
[2, 4]
[1, 2, 4]


In [None]:
del y[0]
y

In [None]:
# See that x has changed when we deleted y[0]
print(x); print(z)
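
The difference between a second name and a copy can also be checked directly. The following is a minimal sketch with hypothetical variable names (`original`, `alias`, `shallow`) that are not part of the cells above; it uses `is` to test whether two names refer to the same object.

```python
# Hypothetical names, chosen only for this illustration
original = [1, 2, 4]
alias = original        # a second name for the same list object
shallow = original[:]   # a new list holding the same elements

print(alias is original)    # True: both names point to one object
print(shallow is original)  # False: the slice created a new object
print(shallow == original)  # True: the contents are still equal

alias.append(8)             # in-place change made through the alias
print(original)             # [1, 2, 4, 8]
print(shallow)              # [1, 2, 4]
```

Note that `[:]` only copies the outer list; nested lists inside it would still be shared.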

### Methods 

Python is an object-oriented programming language, so in addition to functions, it has *methods*. Methods are associated with objects. That means they are called for a specific object instead of passing the object as an explicit argument and have access to information saved within the object.

Objects, in turn, are an instantiation of a class. Think of classes as templates, which define all the data and behavior (i.e., methods) of an entity. For example, a class _student_ could provide a programmer with means to store data such as _name_, _email_, and _student id_. In addition, the class could define some behavior that students exhibit, such as _enrollForCourse_, _registerForExam_, etc. The way to implement behavior is to write a corresponding method. Using the __class__ _student_, that is our template, we could then create __objects__ that represent individual students, such as a student with _name_ Eve and _student id_ 12345, and a different student with _name_ Paul and _student id_ 67890. 
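
To make the student example a bit more concrete, here is a minimal sketch of how such a class might look. The attribute and method names (`student_id`, `courses`, `enroll_for_course`) and the course name are assumptions made for illustration only.

```python
class Student:
    """Template for students (hypothetical example)."""

    def __init__(self, name, student_id):
        # Data stored inside each individual object
        self.name = name
        self.student_id = student_id
        self.courses = []

    def enroll_for_course(self, course):
        """Behavior: add a course to this student's list."""
        self.courses.append(course)


# Two objects created from the same class
eve = Student("Eve", 12345)
paul = Student("Paul", 67890)

eve.enroll_for_course("Statistics")
print(eve.name, eve.courses)    # Eve ['Statistics']
print(paul.name, paul.courses)  # Paul []
```

Each object keeps its own data, so enrolling Eve does not affect Paul.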

In [None]:
# A method for class string
a = "hello world"
print(type(a))

a.upper()

In [None]:
a.split()

In [None]:
# Find all methods for an object
dir(a)

In [None]:
# For lists, one of the most frequently used methods is append (when we want to add something to the list)
out = [1,2,3,4,5]
out.append(100)
print(out)

We will look more closely at methods when we talk about packages.

## Dictionaries
Dictionaries are similar to named lists in R, in that they save a value filed under a name (=key). Python uses different brackets to create different object types: square brackets for lists, curly braces for dictionaries and parentheses for tuples. You will need dictionaries for data management.

In [None]:
# Dictionary
d = {"a":2, "b":"cheese", "c":[1,2,3]}
print(d)
print(d["c"])
print(d["b"])

In [None]:
d[3] = 'NEW'
d

Mind the difference! The new key 3 is an integer, not a string, so d[3] and d['3'] refer to different entries.

You can get the keys, the values, or both (as key-value tuples) at any time from the dictionary. Note that these 'functions' are methods that are part of each dictionary, so we use d.keys() instead of keys(d). We will talk more about the difference later.

In [None]:
print(d.keys())
print(d.values())
print(d.items())
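
As a small sketch of working with these methods, the result of `keys()` can be turned into a plain list, and `in` tests whether a key exists. The dictionary is recreated here so that the example runs on its own.

```python
d = {"a": 2, "b": "cheese", "c": [1, 2, 3]}
d[3] = "NEW"            # integer key, as in the cell above

keys = list(d.keys())   # convert the keys view into a list
print(keys)             # ['a', 'b', 'c', 3]

print(3 in d)           # True: the integer 3 is a key
print("3" in d)         # False: the string "3" is not
```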

## Conditions and Loops
Conditional structures are very important for writing efficient programs. The logic resembles R, but there are a few things you should be aware of, above all __indentation__ (4 spaces per indentation level). Unlike R, Python uses indentation to mark code blocks, so it changes what the code does.

In [None]:
language="Python"
if language == 'Python':
 print('Mind the dents!')
else:
 print("Ah, do whatever you want")
# A single space also works as long as the indentation is consistent within a block; the Style Guide for Python Code (PEP 8) recommends 4 spaces per level
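
To see what happens when indentation is inconsistent, the sketch below hands a deliberately broken snippet to the built-in `compile` function and catches the resulting error. The snippet and the variable names are made up for this illustration.

```python
# The second print is dedented to 2 spaces, which matches neither
# the 4-space block nor the outer level
bad_code = "if True:\n    print('a')\n  print('b')\n"

message = "no error"
try:
    compile(bad_code, "<example>", "exec")
except IndentationError as err:
    message = f"{type(err).__name__}: {err.msg}"

print(message)
```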

Sometimes we need a certain logical flow to be executed, that is where _else_ and _elif_ would be useful:

In [None]:
if 1 == 2:
    print('first')
elif 3 == 3:
    print('middle')
else:
    print('Last')

Let's apply an action to every element of a list using the looping command __for__:

In [None]:
ourstring=[5,8,9,7]
for thingy in ourstring:
    print(thingy)
#You won't be able to do much with the list - that is why we will need Numpy or a function

In [None]:
#Some other actions are also possible
for dingy in ourstring:
    print(dingy+dingy)

__while__ loops, and __for__ loops over a __range__, are further examples of looping commands:

In [None]:
j = -1
while j < 8:
    print('j is: {}'.format(j))
    j = j+1

In [None]:
for z in range(7):
    print(z)

It is often useful to loop over several values at the same time, for example over each key and value pair in a dictionary. 

In [None]:
for key, value in d.items():
    print(f"This is one of the keys: {key}")
    print(f"And this is its value: {value}")
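
The same idea works for lists. As a sketch, the built-in functions `zip` and `enumerate` let you loop over two lists in parallel and keep track of the position; the lists below are hypothetical example data.

```python
names = ["Eve", "Paul"]
student_ids = [12345, 67890]

# zip pairs up the elements; enumerate adds a running index
for position, (name, student_id) in enumerate(zip(names, student_ids)):
    print(position, name, student_id)

pairs = list(zip(names, student_ids))
print(pairs)  # [('Eve', 12345), ('Paul', 67890)]
```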

Very often, we can use comprehensions that return an object with the results directly.

In [None]:
v=[1,5,9,19]
[x*2 for x in v]

In [None]:
d.items()

In [None]:
{key:value=="cheese" for key,value in d.items()}
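
Comprehensions can also filter elements with an `if` clause. A short sketch, reusing the list `v` from above (redefined here so the example is self-contained):

```python
v = [1, 5, 9, 19]

# Keep only values greater than 4, then double them
doubled_large = [x * 2 for x in v if x > 4]
print(doubled_large)  # [10, 18, 38]

# A dictionary comprehension mapping each value to its square
squares = {x: x ** 2 for x in v}
print(squares)        # {1: 1, 5: 25, 9: 81, 19: 361}
```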

### Functions
Functions work as in R. You create new functions by *defining* them with the keyword def. As before, the structure of the program is indicated by the layout of the code, i.e. the function body is indented by a tab or four spaces.


In [None]:
# General template of a function definition
def ourFunction(arg1, arg2, arg3):
    # program statement 1
    # program statement 2
    # program statement 3
    return None

In [None]:
def happyBirthdayAlisa(): # defining the function prints nothing until it is called
    print("Happy Birthday to you!")
    print("Happy Birthday to you!")
    print("Happy Birthday, dear Alisa.")
    print("Happy Birthday to you!")

In [None]:
happyBirthdayAlisa # without parentheses the function is not called; we only see the function object

In [None]:
happyBirthdayAlisa()

In [None]:
def happyBirthday(person):
    print("Happy Birthday to you!")
    print("Happy Birthday to you!")
    print("Happy Birthday, dear " + person + ".")
    print("Happy Birthday to you!")
happyBirthday("Christian")

In [None]:
def multpl(x, y):
    return x * y

multpl(5,3)

In [None]:
# In simple cases like the one above we can use an anonymous function like the ones we know from R:

r = lambda x, y: x * y
r(120, 5) 

In [None]:
# Let's look more into the structure
def example_function(x, y = 2):
    x += y # This is short for add and reassign x = x + y
    return x

In [None]:
example_function(1)
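
The default value `y = 2` is only used when no second argument is given. The sketch below repeats the function so it runs on its own and shows positional and keyword calls; the variable `start` is a hypothetical name added for illustration.

```python
def example_function(x, y=2):
    x += y
    return x

print(example_function(1))         # 3: y falls back to its default
print(example_function(1, 5))      # 6: y passed by position
print(example_function(x=1, y=5))  # 6: arguments passed by keyword

start = 10
result = example_function(start)
print(start, result)  # 10 12: the integer outside is not modified
```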

In [None]:
#You can build your whole code as execution of certain functions
def main():
    happyBirthday('Bobby')
    happyBirthday('Anna')
main()

### Methods as functions attached to a class

In [None]:
class person():
    """Class person for human interaction"""
    def __init__(self, name, age=None):
        """
        Cool, we can write help text like this starting and ending 
        with three quotations.
        
        name : str
          The name of the person
        """
        self.name = name
        self.age = age
        
    def happy_birthday(self):
        """Wish a happy birthday"""
        print(f"Happy Birthday, dear {self.name}.")
        if self.age is not None:
            self.age += 1
            print(f"{self.age} years, eh?")
        return None
    
    def greeting(self):
        """Greet the person"""
        print(f"Hi {self.name}! How are you doing?")
        return None

In [None]:
example = person("Johannes", 20)
example.greeting()

In [None]:
example.happy_birthday()

In [None]:
example.age
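
To illustrate why methods can be seen as functions attached to a class, here is a small sketch with a hypothetical class `Greeter`. Calling the method on the object is equivalent to calling the function stored on the class and passing the object as `self`.

```python
class Greeter:
    """Minimal hypothetical class for this illustration."""

    def __init__(self, name):
        self.name = name

    def greeting(self):
        return f"Hi {self.name}!"


g = Greeter("Johannes")
via_object = g.greeting()        # the usual method call
via_class = Greeter.greeting(g)  # same function, object passed explicitly
print(via_object)
print(via_object == via_class)   # True
```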

## Loading libraries aka packages
Most of the functionality that we need is provided by libraries that extend core Python. You can install them in several ways:
 1. Install manually via Anaconda Navigator (Environments)
 2. Install via Anaconda terminal (advised at set-up stage) **- recommended**
 3. Install in Jupyter notebook (!{sys.executable} -m pip install numpy __OR__ !conda install --yes --prefix {sys.prefix} numpy)
 4. Install in Jupyter notebook via !pip install  - not recommended

In [None]:
import sys
!conda install --yes --prefix {sys.prefix} numpy

At the beginning of the code, we need to load these packages. Python is stricter than R when it comes to namespaces. When calling a function from a package, the package name is usually included in the function call. For convenience, package names are therefore often abbreviated using import [...] as [...]. It is also possible to import only specific functions from a library using from [package] import [function].
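
The three import styles can be sketched with the `math` module from the standard library, which is used here only because it needs no installation. The nickname `m` is an arbitrary choice for this example.

```python
import math              # full package name in every call
import math as m         # abbreviated nickname
from math import sqrt    # import one specific function

print(math.sqrt(16))  # 4.0
print(m.sqrt(16))     # 4.0
print(sqrt(16))       # 4.0
```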

The top 5 packages to remember for now are:
 - numpy (numeric manipulation)
 - pandas (data management)
 - matplotlib (plots)
 - seaborn (advanced plots, stat visualizations)
 - scikit-learn (standard package for machine learning in python)
 
When loading packages, you can give them a nickname in order to save some typing time. There usually is a convention for each package's nickname, but you can choose a different one.

In [None]:
# Math functionality
import numpy as np
# Data frame capabilities similar to data.table in R
import pandas as pd
# Plotting functionality
import matplotlib.pyplot as plt
%matplotlib inline
# Statistical data visualization
import seaborn as sns
# scikit-learn (sklearn) is *the* standard package for machine learning in python

## Numpy

NumPy handles most of the linear algebra operations we will need along the way. We will look into:
- arrays
- indexing
- operations

In [None]:
import numpy as np

# numpy array is one of the most basic object types, we can get it by converting a basic list

In [None]:
ourlist = [1,2,3,4,5]
ourlist

Note how we specify the package by its nickname before calling a function from the package

In [None]:
np.array(ourlist)

In [None]:
#Some basic operations
print(np.sqrt(ourlist),np.exp(ourlist))
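
The following sketch contrasts a plain list with a numpy array. The name `ourarray` is introduced here for illustration; the point is that arithmetic on an array works element by element, while the same operator on a list repeats or concatenates it.

```python
import numpy as np

ourlist = [1, 2, 3, 4, 5]
ourarray = np.array(ourlist)

print(ourlist * 2)          # the list is repeated: 10 elements
print(ourarray * 2)         # [ 2  4  6  8 10]
print(ourarray + ourarray)  # [ 2  4  6  8 10]
```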

In [None]:
listoflists=[[1,2,3],[4,5,6],[7,8,9]]
listoflists

In [None]:
np.array(listoflists) # for our neural networks it will be the most basic type

In [None]:
#Learning to operate them would be important
np.arange(0,10)

In [None]:
np.arange(0,12,3)

In [None]:
np.ones((3,4)) # generates a 3x4 matrix of ones; np.zeros((3,4)) gives zeros instead

In [None]:
np.linspace(0,5,10) # evenly spaced numbers in specified interval

In [None]:
#identity matrix
np.eye(6)

In [None]:
#Setting a value with index range (Broadcasting)
ar = np.arange(0,20)
ar[0:5]=0
ar

In [None]:
#Important note!
slice_of_ar = ar[0:4]
slice_of_ar

In [None]:
slice_of_ar[:]=999
slice_of_ar

Note that change occurs in initial array!

In [None]:
ar

We can copy objects actively to avoid this

In [None]:
slice_of_ar=ar[0:4].copy()
slice_of_ar

In [None]:
slice_of_ar[:]=0
ar
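
Whether two arrays share the same underlying data can be checked with `np.shares_memory`. A minimal sketch, with the hypothetical names `view` and `copied`:

```python
import numpy as np

ar = np.arange(0, 20)
view = ar[0:4]           # a slice is a view on the original data
copied = ar[0:4].copy()  # an independent copy

print(np.shares_memory(ar, view))    # True
print(np.shares_memory(ar, copied))  # False

copied[:] = 999          # leaves ar untouched
print(ar[0:4])           # [0 1 2 3]
```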

Indexing is similar to R and works by integer index or logical (Boolean) index

In [None]:
x = 10
ar[ar>x]
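
The comparison inside the brackets produces a Boolean array, often called a mask. As a sketch, masks can be stored, counted and combined with `&` (and) or `|` (or); each condition needs its own parentheses. A fresh array is created here so the example is self-contained.

```python
import numpy as np

ar = np.arange(0, 20)

mask = ar > 10      # Boolean array of the same length as ar
print(mask.sum())   # 9 values are greater than 10

# Combine two conditions element by element
print(ar[(ar > 10) & (ar % 2 == 0)])  # [12 14 16 18]
```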

In [None]:
#Randomization will matter a lot for us
np.random.randn(2,4)

In [None]:
#Return random integers from `low` (inclusive) to `high` (exclusive).
np.random.randint(1,100,5) 

In [None]:
arr = np.arange(25)
arr #arr.TAB will show you which methods you can use

In [None]:
arr.max()

In [None]:
arr.reshape(5,5) # another important method 

In [None]:
arr.shape

In [None]:
arr.reshape(25,1)

In [None]:
arr.shape # mind that reshape returned a new array; arr itself was not changed in-place

In [None]:
arr.reshape(25,1).shape
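
Two related points, sketched below: a dimension given as `-1` is inferred by numpy from the remaining ones, and the result has to be assigned to a name if you want to keep it. The names `column` and `square` are chosen for this example only.

```python
import numpy as np

arr = np.arange(25)

column = arr.reshape(-1, 1)  # -1 lets numpy infer the 25 rows
print(column.shape)          # (25, 1)

square = arr.reshape(5, 5)   # keep the result under a new name
print(square.shape)          # (5, 5)
print(arr.shape)             # (25,): the original is unchanged
```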

In [None]:
arr.dtype # for neural networks a specific type of float would be necessary, we will get back to  it later

In [None]:
w =np.array([12,11,10])
v = np.array([1,2,3])

In [None]:
# Inner product of vectors;
print(v.dot(w))
print(np.dot(v, w))

In [None]:
# Matrix/vector multiplication
import numpy as np
A = np.array([[ 5, 1 ,3], [ 1, 1 ,1], [ 1, 2 ,1]])
b = np.array([1, 2, 3])
print (A.dot(b))
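
The `@` operator is an alternative way to write the same matrix product, whereas `*` multiplies element by element. A short sketch with the arrays from the cell above, repeated so it runs on its own:

```python
import numpy as np

A = np.array([[5, 1, 3], [1, 1, 1], [1, 2, 1]])
b = np.array([1, 2, 3])

print(A @ b)  # [16  6  8], same as A.dot(b)
print(A * b)  # element-wise: each row of A is multiplied by b
```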

In [None]:
#Just a quick reminder about the element selection
print(A[0,0])
A

In [None]:
#Let's say we have A x = b, we need to find x

A = np.array([[2,1,-2],[3,0,1],[1,1,-1]])
b = np.transpose(np.array([[-3,5,-2]]))
                 
#To solve the system we do
x=np.linalg.solve(A,b)
x
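
A quick way to check a solution is to plug it back into the system. The sketch below repeats the setup and uses `np.allclose`, which compares floating point numbers up to a small tolerance.

```python
import numpy as np

A = np.array([[2, 1, -2], [3, 0, 1], [1, 1, -1]])
b = np.array([[-3], [5], [-2]])   # column vector, as above

x = np.linalg.solve(A, b)
print(x.ravel())                  # the solution as a flat array

# A x should reproduce b (up to rounding error)
print(np.allclose(A @ x, b))      # True
```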

That was it for our first session. See you soon for another round of programming.
