# Introduction to Python

NOTE: This is a Jupyter Notebook

- Combine text and code easily
- Reproducible outcome
- Easy to share
- Alternative: a plain Python script (see example)

## Fundamentals

- Console for simple calculations
- Variable assignment: `=`
- Different basic data types: integers, floats, strings, booleans
- As you execute commands, you accumulate a namespace

In [1]:
score = 12

In [2]:
score

12

In [3]:
course = "algebra"

In [4]:
passed = True

In [5]:
dir()

['In',
 'Out',
 '_',
 '_2',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_sh',
 'course',
 'exit',
 'get_ipython',
 'passed',
 'quit',
 'score']
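As a small sketch of the basic data types listed above, the built-in `type()` reports the type of each value. The name `average` is introduced here only for illustration and is not part of the notebook's namespace.

```python
# Variables as defined in the cells above, plus one float for illustration
score = 12           # integer
average = 14.25      # float (hypothetical variable, added for this example)
course = "algebra"   # string
passed = True        # boolean

# Print each value next to the name of its type
for value in (score, average, course, passed):
    print(value, type(value).__name__)
```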

## Data structures

### Lists

- 1-dimensional
- heterogeneous content
- Create and subset with `[]`
- Can't name elements
- No element-wise calculations

In [6]:
scores = [12, 17, 19, 9]

In [7]:
scores

[12, 17, 19, 9]

In [8]:
courses = ["algebra", "physics", "management", "religion"]

In [9]:
courses

['algebra', 'physics', 'management', 'religion']

In [11]:
courses[0]  # zero-based indexing

'algebra'

In [72]:
courses[-1]

'religion'

In [73]:
courses[0:2]  # the end index is excluded

['algebra', 'physics']

In [12]:
other_scores = [10, 18, 18, 14]

In [13]:
scores - other_scores

TypeError: unsupported operand type(s) for -: 'list' and 'list'
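Plain lists do not support subtraction, but the same result can be computed by pairing the elements explicitly. A minimal sketch using `zip` and a list comprehension (the name `differences` is chosen here for illustration):

```python
scores = [12, 17, 19, 9]
other_scores = [10, 18, 18, 14]

# zip pairs the elements position by position: (12, 10), (17, 18), ...
differences = [a - b for a, b in zip(scores, other_scores)]
print(differences)  # [2, -1, 1, -5]
```

This works, but it is more verbose than a true element-wise operation.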

### Dictionaries

- Lookup table of key-value pairs
- Easy to name elements
- Can't do vectorized calculations
- Can easily be nested


In [14]:
grades = {'algebra': 12, 'physics': 17, 'management': 19, 'religion': 9}

In [15]:
grades

{'algebra': 12, 'management': 19, 'physics': 17, 'religion': 9}

In [16]:
grades['algebra']

12
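To illustrate nesting, a dictionary value can itself be a dictionary. The sketch below uses a hypothetical `report` dictionary; its name and layout are assumptions made for this example.

```python
# Each course maps to another dictionary holding its details
report = {
    'algebra': {'score': 12, 'passed': True},
    'religion': {'score': 9, 'passed': False},
}

# Chain the keys to reach a nested value
print(report['religion']['passed'])  # False

# .get() returns a default instead of raising KeyError for a missing key
print(report.get('physics', 'no entry'))  # no entry
```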

### Combining dictionaries and lists

In [17]:
summ = [{'class': 'algebra', 'score': 12, 'passed': True },
        {'class': 'physics', 'score': 17, 'passed': True },
        {'class': 'management', 'score':19, 'passed': True },
        {'class': 'religion', 'score': 9, 'passed': False }]

In [18]:
summ

[{'class': 'algebra', 'passed': True, 'score': 12},
 {'class': 'physics', 'passed': True, 'score': 17},
 {'class': 'management', 'passed': True, 'score': 19},
 {'class': 'religion', 'passed': False, 'score': 9}]
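One way to work with such a list of dictionaries is to loop over it and pick out the records of interest. A minimal sketch (the name `passed_classes` is chosen here for illustration):

```python
summ = [{'class': 'algebra', 'score': 12, 'passed': True},
        {'class': 'physics', 'score': 17, 'passed': True},
        {'class': 'management', 'score': 19, 'passed': True},
        {'class': 'religion', 'score': 9, 'passed': False}]

# Keep the class name of every record whose 'passed' value is True
passed_classes = [row['class'] for row in summ if row['passed']]
print(passed_classes)  # ['algebra', 'physics', 'management']
```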

### Better alternatives for Data Science: NumPy

- Single-type array
- Element-wise calculations
- Very fast (optimized in C)



In [19]:
import numpy as np
scores = np.array([12, 17, 19, 9])
other_scores = np.array([10, 18, 18, 14])
scores - other_scores

array([ 2, -1,  1, -5])
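Element-wise behaviour also applies to comparisons, which gives a compact way to select values. A short sketch, assuming a threshold of 10 that is chosen purely for illustration:

```python
import numpy as np

scores = np.array([12, 17, 19, 9])

# Every element shares one type; the exact integer type depends on the platform
print(scores.dtype)

# A comparison is applied element by element and yields a boolean array
above_ten = scores > 10
print(above_ten)

# A boolean array can be used to select the matching elements
print(scores[above_ten])
```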

### Better alternatives for Data Science: Pandas DataFrame

- To store tabular data
- Rows and columns
- Easy indexing of rows and columns


In [20]:
import pandas as pd
summ = [{'class': 'algebra', 'score': 12, 'passed': True },
        {'class': 'physics', 'score': 17, 'passed': True },
        {'class': 'management', 'score':19, 'passed': True },
        {'class': 'religion', 'score': 9, 'passed': False }]
df = pd.DataFrame(summ)
df

        class  passed  score
0     algebra    True     12
1     physics    True     17
2  management    True     19
3    religion   False      9


In [21]:
df['class']

0       algebra
1       physics
2    management
3      religion
Name: class, dtype: object

In [84]:
df.iloc[1, 2]

17

In [85]:
df['score'][1]

17
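Position-based indexing with `iloc` depends on the column order, which may differ between pandas versions when a DataFrame is built from dictionaries. As a sketch of a label-based alternative, `loc` selects by row label and column name, and a boolean column can be used to filter rows (the name `passed_df` is chosen here for illustration):

```python
import pandas as pd

summ = [{'class': 'algebra', 'score': 12, 'passed': True},
        {'class': 'physics', 'score': 17, 'passed': True},
        {'class': 'management', 'score': 19, 'passed': True},
        {'class': 'religion', 'score': 9, 'passed': False}]
df = pd.DataFrame(summ)

# Label-based lookup: row label 1, column 'score'
print(df.loc[1, 'score'])  # 17

# Keep only the rows where the 'passed' column is True
passed_df = df[df['passed']]
print(passed_df['class'].tolist())  # ['algebra', 'physics', 'management']
```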

## Functions and packages

### Functions
- Solve a particular, well-defined problem
- Black-box principle
- Tons of them are available in packages
- Possible to write your own


In [25]:
def my_mean(lst):
    # Use `total` rather than `sum` to avoid shadowing the built-in sum()
    total = 0.0
    for el in lst:
        total += el
    return total / len(lst)


In [26]:
scores = [12, 17, 19, 9]
my_mean(scores)

14.25

### Packages

- Google is your friend
- Packages for practically everything
- Easy installation with `pip`

In [24]:
import numpy as np
scores = np.array([12, 17, 19, 9])
np.mean(scores)

14.25
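Not every package needs to be installed with `pip`: Python also ships with a standard library. As a brief sketch, the `statistics` module from the standard library computes the same mean on a plain list (the name `result` is chosen here for illustration):

```python
import statistics

scores = [12, 17, 19, 9]

# statistics.mean works directly on a regular Python list
result = statistics.mean(scores)
print(result)  # 14.25
```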

## Let's practice!

Go to https://www.datacamp.com/courses/2311



## Other programming concepts

### Control Structures


In [89]:
temp_celsius = 68
if temp_celsius <= 0:
    print("Brrrr!")
else:
    print("It's not freezing.")

It's not freezing.


In [90]:
hobbies = ["cycling", "movies", "data science"]
for h in hobbies:
    print(h)

cycling
movies
data science
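The two control structures above can be combined. The sketch below loops over a dictionary of grades and applies a condition to each one; the `pass_mark` of 10 and the `results` list are assumptions made for this example, not values stated in the notebook.

```python
grades = {'algebra': 12, 'physics': 17, 'management': 19, 'religion': 9}
pass_mark = 10  # assumed threshold, chosen for illustration

results = []
# .items() yields each key together with its value
for course, score in grades.items():
    if score >= pass_mark:
        outcome = "passed"
    else:
        outcome = "failed"
    results.append((course, outcome))
    print(course, outcome)
```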
