## Python Hands-on 

### Variables and types

#### Creating a new variable with python

The syntax is similar to Matlab:

``` VARNAME =  VALUE ```



In [None]:
a = 'hello'

Printing is similar to Matlab's `disp` command, and arguably more intuitive than in SAS:

(SAS)
```SAS
proc print;
run;
```

In [None]:
#show a value
print(a)

#### Dynamic type
Python and Matlab are interpreted languages with strong dynamic typing. What does this mean?
Let's look at some examples.

In [None]:
#defining a variable a 
a = 1
#showing evaluation of a
print(a)

Types are dynamic in Python (as in Matlab). SAS, by contrast, is built around structured data manipulation, so you cannot assign a value to a variable when the value's format differs from the variable's.

In Python, the type changes dynamically to match the last value assigned to the variable.

In [None]:
a = 1
print(type(a))

a = 1.0 
print(type(a))
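The two halves of "strong dynamic typing" can be illustrated with a minimal sketch. The variable names are illustrative only. A name can be rebound to a value of any type (dynamic), but Python refuses to mix unrelated types silently (strong).

```python
# Dynamic: the same name can be rebound to values of different types.
value = 1
type_before = type(value)
value = "one"
type_after = type(value)
print(type_before, type_after)

# Strong: mixing incompatible types raises an error instead of guessing.
try:
    result = "1" + 1
except TypeError as err:
    result = None
    print("TypeError:", err)

# An explicit conversion is required to combine the two values.
total = int("1") + 1
print(total)
```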

#### The variable type in python

Unlike SAS or Matlab, Python does not distinguish between decimal, float and double.
Numeric values are either `float` or `int`.

When specific kinds of numbers or types are needed, you usually have to call a constructor explicitly or use a library. For instance, complex numbers are created with the built-in `complex` function, and the `cmath` library provides the mathematical functions that operate on them.

In [None]:
#illustration
import math #import of a library to call Pi value.
print(type(3.14)) #double or decimal expected in other language
print(type(math.pi)) #float 
print(type(1))
print(type(complex(1,2)))
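As a short sketch of how the built-in `complex` type and the standard `cmath` library fit together (the variable names are only illustrative):

```python
import cmath

z = complex(3, 4)            # same value as the literal 3 + 4j
print(z.real, z.imag)        # real and imaginary parts are stored as floats

modulus = abs(z)             # the built-in abs() returns the modulus
root = cmath.sqrt(-1)        # math.sqrt(-1) would raise a ValueError instead
print(modulus, root)
```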

The difference between integer and floating-point values is quite important in Python (especially in version 2.X). For instance, when you divide two numbers in Python 2.X and both of them are integers, the result is the Euclidean (integer) division.

In [None]:
#Euclidean division in Python 2.X
a = 1
b = 10
print(a/b) #prints 0 in Python 2.X, 0.1 in Python 3.X

In the previous operation, 0.1 was expected. As mentioned above, Python 2.X computes the Euclidean division when the numerator AND the denominator are integers.
Several tricks can be used to get the expected result.

Note that in Python 3.X you no longer need to worry about this: `/` always performs float division.

In [None]:
# real division
a = 1
b = 10
print((a+0.0)/b)
print(float(a)/b)
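In Python 3.X the Euclidean division is still available through a dedicated operator. The sketch below assumes Python 3 and uses illustrative variable names.

```python
a = 7
b = 2

true_quotient = a / b          # float division
floor_quotient = a // b        # integer (floor) division
remainder = a % b              # remainder of the division
negative_floor = -7 // 2       # floor division rounds toward minus infinity

# divmod returns the quotient and the remainder in one call
quotient, rest = divmod(a, b)
print(true_quotient, floor_quotient, remainder, negative_floor)
```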

More generally, Python has a few built-in types that you can use without loading a specific library:

* integer: `a = 1` 
* float: `a = 1.0`
* string: `a="hello"`
* list: `a = [ 1, 2, 3]`
* tuple: `a = ( 0, 1, 2)`
* dictionary: `a={ 'k1':1, 'k2':2 }`

In Python, everything is an object. This means that you can easily discover the functions (methods) that apply to an object by using auto-completion: write the object name followed by '.' and press TAB. A list of methods should pop up.

In [None]:
#let's try it
a = 1.0
#a.
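When TAB completion is not available (for instance outside a notebook), the built-in `dir` function gives a similar list programmatically. A minimal sketch, with illustrative variable names:

```python
a = 1.0

# dir() lists every attribute name; names starting with "_" are internal
public_methods = [name for name in dir(a) if not name.startswith("_")]
print(public_methods)

# any listed method can then be called with the dot syntax
whole = a.is_integer()
print(whole)
```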

###  Dictionary, tuples and lists

Unlike Matlab, Python is not a matrix language by default. We will see later how to manipulate matrices in Python. However, Python has several objects for storing information. At first glance they look pretty similar. Let's see how to use them.

#### List

This is similar to an array. Elements are stored in order and can be accessed by their index.

Be careful: in python indexes start at 0.

In [None]:
a = [ 1, 2, 3 ]
print(a[1])

In [None]:
#once defined you can change a value by calling the index
a[1] = 25
print(a)

A list can contain elements of any type. Since elements are accessed by their index, you can mix different types inside a single list.

In [None]:
a = [ 1, "hello", [2,3] ] 
a[2]

Several operations are available on a list object. Here are the most useful ones.

In [None]:
a = [] #empty list

In [None]:
#adding a new element to a list. The element is added at the end by default.
a.append(1)
print(a)

In [None]:
a[0]

In [None]:
#get the length of a list
len(a)

In [None]:
#removing one element with a specific value (if several exist, the first one found is removed).
#Note that removing an element can change the indexes of the remaining elements
a = [ 1, 2, 1 ]
a.remove(1)
print(a)

In [None]:
#sorting a list: sorted() returns a new list; reverse=True means descending order
a = [5,2,8,3,6,1,4,7]
sorted(a,reverse=True)

In [None]:
#example to define a sequence of numbers from 5 to 10 with a list
list(range(5,11))

In [None]:
list(range(11))
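The `range` function excludes its upper bound and accepts an optional step. A short sketch (variable names are illustrative):

```python
# range(start, stop) stops just before `stop`
one_to_ten = list(range(1, 11))

# a third argument sets the step
evens = list(range(0, 11, 2))

# a negative step counts down
countdown = list(range(5, 0, -1))

print(one_to_ten, evens, countdown)
```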

In [None]:
#concatenate two lists
[1,2] + [3,4]

**Be careful**

The `+` operator on lists does not add elements term by term: it concatenates, since the two lists may have different lengths.
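If a term-by-term sum of two plain lists is what you need, one possible approach (a sketch, not the only way) is to pair the elements with the built-in `zip` function:

```python
x = [1, 2, 3]
y = [10, 20, 30]

concatenated = x + y                                # the lists are joined end to end
elementwise = [xi + yi for xi, yi in zip(x, y)]     # pairs are summed one by one

print(concatenated)
print(elementwise)
```

Note that `zip` stops at the end of the shorter list, so lists of different lengths are silently truncated.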

#### Tuple

This object is quite similar to a list, with the exception that once defined, a tuple cannot be modified. It is useful for protecting information in your code and ensuring that indexes won't change.

In [None]:
t = ( 1, "toto", [1,2])

In [None]:
t[0]

In [None]:
#You cannot modify an element: this raises a TypeError
t[1] = 2
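The error can be caught to see exactly what happens. The sketch below also shows a caveat: the tuple itself is fixed, but a mutable object stored inside it (here a list) can still be changed in place.

```python
t = (1, "toto", [1, 2])

try:
    t[1] = 2
    modified = True
except TypeError as err:
    modified = False
    print("TypeError:", err)

# The slot t[2] still points to the same list, but that list can grow.
t[2].append(3)
print(t)
```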

#### Dictionary

The dictionary is a very common object in Python. Technically, it is a hash map: each key is mapped to a value. Unlike with lists and tuples, elements are accessed by key rather than by position, and a key can be a string or a numeric value.

In Matlab, this is similar to the `containers.Map` object.

```Matlab 
c = containers.Map
c('foo') = 1
c(' not a var name ') = 2
```

This type of object is really convenient for storing unstructured data. It is similar to a JSON file.
In other words, each time you face data that cannot really be put into a data table (SAS table), this type of storage should work since it is very flexible.

In [None]:
mydict = { "key1": 123, 5: ["a","b","c"]}
mydict["key1"]

In [None]:
mydict["A"] = 5

In [None]:
mydict

In [None]:
mydict[5]

In [None]:
# once your dict is instantiated, you can easily add new elements.
# if a key does not exist, assigning to it creates a new mapping
mydict["newkey"] = "a new element"

In [None]:
mydict

In [None]:
#get the keys of my object
mydict.keys()

In [None]:
#get values of my object
mydict.values()

Note that in Python versions before 3.7, the order in which a dictionary returns its elements is arbitrary. Since Python 3.7, elements are returned in insertion order.
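Beyond `keys()` and `values()`, a few other dictionary operations are commonly useful. The sketch below reuses the same `mydict` content; the other names are illustrative.

```python
mydict = {"key1": 123, 5: ["a", "b", "c"]}

# test whether a key exists before reading it
has_key = "key1" in mydict

# get() returns a default value instead of raising KeyError
missing = mydict.get("unknown", 0)

# items() yields (key, value) pairs, convenient in a loop
pairs = []
for key, value in mydict.items():
    pairs.append((key, value))

print(has_key, missing, pairs)
```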

### Functions and loops

The python syntax is quite simple. 

* No braces or semicolons are needed.
* Indentation plays the role of braces and delimits a logical block (loop, if, function).
* The line that opens a logical block always ends with a colon ":".
* A change in indentation level means that a block starts (or ends).

#### Loops

The python syntax is the following

```python

for i in iterators:
    #YOUR OPERATION
```

For comparison, the equivalents are:

(Matlab)

```Matlab
for v = iterators
    %YOUR OPERATION
end
```


(SAS)
```SAS
data [dataName];
do i = [start_] to [stop_];
   #YOUR OPERATION
   output;
end;
run;
```
Be careful about the indentation. After the ":" sign, indentation is expected and every instruction at the same indentation level will be executed in the loop. Once the loop is finished, the following lines (non indented) are executed.

The object you iterate over can be a list, a matrix, a tuple or a set.

In [None]:
#example
for i in range(10):
    print(i)

In [None]:
for i in [5,6,111]:
    print(min(i,50))

In [None]:
#outside the loop, the loop variable keeps its last value.
for i in [5,6,111]:
    print(min(i,50))
print(i)
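Loops work the same way over other containers. As a small sketch (names are illustrative), `enumerate` gives the index together with the element, and a set can be looped over like a list, although its iteration order is not guaranteed:

```python
letters = ("a", "b", "c")

# enumerate yields (index, element) pairs, starting at 0
indexed = []
for position, letter in enumerate(letters):
    indexed.append((position, letter))

# looping over a set: every element is visited once, in no guaranteed order
total = 0
for number in {1, 2, 3}:
    total += number

print(indexed, total)
```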

#### Functions
The syntax is fairly intuitive and follows the same spirit as Matlab, except that the outputs are given at the end (with `return`) rather than in the function signature.
In SAS, the equivalent would be `%macro`, but functions that are not data-based seem harder to define in SAS.

Example: let's implement the following Matlab code in python

(Matlab)
```Matlab 

function [m,s] = stat(x)
n = length(x);
m = sum(x)/n;
s = sum((x-m).^2/n);
end

```


(SAS)
```SAS 

%macro stat(x,dataName);
    proc sql noprint;
    select mean(&x) format 6.2 into: m
    from &dataName;

    select mean((&x-mean(&x))*(&x-mean(&x))) format 6.2 into: s
    from &dataName;
    quit;
%mend stat;
```


In [None]:
def stat(x):
    n = len(x)
    m = float(sum(x)) / n #float() avoids Euclidean division in Python 2.X
    s = sum((x-m)**2*1.0/n)
    return(m,s)

stat([1,2])

Python is not a matrix language by default: the previous cell fails because a float cannot be subtracted from a list. We need to state explicitly that we are working with arrays. For now, just accept that the following code is correct. We will explain how to manipulate matrices in the next part.

In [None]:
import numpy as np
def stat(x):
    x = np.array(x)
    n = len(x)
    m = float(sum(x)) / n #float() avoids Euclidean division in Python 2.X
    s = sum((x-m)**2*1.0/n)
    return(m,s)
stat([1,2])
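For comparison, the same statistics can be computed without any library by looping over the elements explicitly. This is only a sketch assuming Python 3, and `stat_pure` is a hypothetical name chosen to avoid clashing with `stat`. It also shows how the two returned values can be unpacked into separate variables.

```python
def stat_pure(x):
    n = len(x)
    m = sum(x) / n
    # the subtraction is applied element by element inside the generator
    s = sum((xi - m) ** 2 for xi in x) / n
    return m, s

# a function returning several values returns a tuple that can be unpacked
mean, variance = stat_pure([1, 2])
print(mean, variance)
```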

To summarize, a function definition:

1. starts with the keyword `def`,
2. followed by the function name,
3. then the inputs in parentheses, followed by a colon,
4. and ends with `return(output_name)`.

Here is the template

```python

def function_name(arg1, arg2):
    #YOUR OPERATION
    output = ...
    return(output)
```


**Exercise 1**

Define a function that takes a list as an argument and returns the concatenation of its elements as a string. Example:

INPUT : ["a",1,"b"]

(expected) OUTPUT : "a1b"

In [None]:
def concat(x):
    #COMPLETE HERE
    return(output)

#### If/Else condition

The syntax follows the same rules and is quite intuitive. Compare it with the SAS and Matlab syntax.

(python)
```python

if CONDITION1 and CONDITION2:
    #YOUR OPERATION
elif CONDITION3: #can be skipped
    #YOUR OPERATION
else:
    #YOUR OPERATION
```

(Matlab)
```Matlab
if (CONDITION1) && (CONDITION2)
    %YOUR OPERATION;
elseif CONDITION3 
    %YOUR OPERATION;
else
    %YOUR OPERATION;
end
```

(SAS)
```SAS
if CONDITION1 and CONDITION2 then
   do;
      #YOUR OPERATION;
   end;
else if CONDITION3 then
   do;
       #YOUR OPERATION;
   end;
else
   do;
       #YOUR OPERATION;
   end;
```

In [None]:
#Example test if a number is odd or even
x = 11
if x % 2 == 0:
    print("even")
else:
    print("odd")
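The `elif` branch from the template can be seen in action in a slightly longer sketch. The function name `sign` is illustrative only.

```python
def sign(x):
    # branches are tested in order; only the first true one is executed
    if x > 0:
        return "positive"
    elif x < 0:
        return "negative"
    else:
        return "zero"

labels = [sign(v) for v in [-3, 0, 7]]
print(labels)
```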

**Exercise 2** 

Define a function that takes two lists as input and returns a dictionary object:
* The first list contains the keys of the dictionary.
* The second list contains the values of the dictionary.
* If the two arguments have different lengths, return the message "ERROR".

Example: `build_dict(["a","b"], [1,2])` should return `{'a': 1, 'b': 2}`.

In [None]:
def build_dict(keys, values):
    #COMPLETE HERE
    if len(keys) != len(values):
        return "ERROR"
    
    output = dict(zip(keys, values))
    return(output)

In [None]:
build_dict(["a","b"], [1,2]) # Output: {'a': 1, 'b': 2}

In [None]:
build_dict(["a"], [1,2]) # it should return 'ERROR'

#### Objects in Programming

Python is an **object-oriented programming language**. This means that instead of writing code with a focus on what the computer needs to do, we focus our attention on what data we are working with and how it should interact. Objects are an important concept in programming: they allow modular implementation and make maintenance and development easier.

What is an object? As the name implies, the initial idea of object programming is to represent physical objects. Every physical object can be described by its properties (attributes) and can interact with the outside (methods).

Thus, a programming object can be pictured as an atomic unit described by:

* Its name (type).
* Its attributes (properties).
* Its methods (functions you can apply to it).

So far, we manipulated objects without really being aware of it. For instance, when we defined a list:

```python
mylist = [1, 2, 3]
```

* `mylist` is the name of the object.
* Attributes are the values of items inside the list.
* Methods are all the functions you can call on a list. For example:

```python
mylist.append(4)
```

We will then be able to define our own objects with Python. Many libraries are object-oriented, so mastering objects in Python is quite important. Let's look at the following example. Imagine we want to implement an Employee as an object.

An Employee can be described by his/her name and different properties. For example:

* Name
* Age
* Time he/she has spent in the company
* Helpfulness (rating between 1 and 10)

An Employee also has some skills like being able to compute the cross-product or add two numbers. Imagine also they have a special skill to return the square of the sum of their Age and the time they have spent in the company. The methods can be then listed as:

* cross_product
* addition
* special_skill

In Python, the syntax to define an object is:

```python
class object_name():
    def __init__(self, attribute1, attribute2): # constructor: present in every object
        self.v1 = attribute1
        self.v2 = attribute2

    def methods1(self, x1, x2):
        result = self.v1 + x1 * x2
        return result
```

Once defined, below is an example of how to declare a new object and use it.

```python
# construction
myobj = object_name(attribute1, attribute2) # arguments required by the __init__ function. Note that self is not passed since it is the object itself

# call a method
myobj.methods1(x1, x2) # call of methods1 associated with the object

# access to attributes
myobj.v1
myobj.v2

# change attribute v1
myobj.v1 = new_value # new_value is a placeholder for the new value
```

1. Every object definition starts with the keyword `class` followed by the object name.
2. Every object has a constructor, the `__init__` method (by convention).
3. Within the class, `self` refers to the object itself.
4. The arguments of the `__init__` method (other than `self`) are optional.
5. Defined methods can only be called through the object: you cannot call a function defined in the object without using the object. This avoids conflicts between function names, since when objects are used correctly we always know which object a function comes from.
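The rules above can be seen on a small runnable example. The `Rectangle` class below is purely hypothetical and unrelated to the Employee exercise. It is only a sketch of the template with concrete attributes and methods.

```python
class Rectangle:
    def __init__(self, width, height):   # constructor
        self.width = width
        self.height = height

    def area(self):
        # self gives access to the attributes of the object itself
        return self.width * self.height

    def scale(self, factor):
        # a method can also modify the attributes
        self.width = self.width * factor
        self.height = self.height * factor

rect = Rectangle(2, 3)          # construction: self is not passed explicitly
initial_area = rect.area()      # method call through the object
rect.scale(2)
scaled_area = rect.area()
rect.width = 10                 # direct change of an attribute
print(initial_area, scaled_area, rect.width)
```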

**Exercise 3** 

1. Implement the Employee object.
2. Declare Christophe as a new Employee of age 30, seniority 25, helpfulness 8
3. Test your object

In [None]:
#beginning together
class Employee:
    def __init__(self, age, seniority, helpfulness):
        self.age = age
        self.seniority = seniority
        self.helpfulness = helpfulness

In [None]:
christophe = Employee(age=30, seniority=25, helpfulness=8)

In [None]:
christophe.age