# Lecture Notes 1: Python Basics

## Hello world

In [16]:
print('hello world')

hello world


In [17]:
'hello world'

'hello world'

## Operators, Types and Casting

In [18]:
4.0 / 3.0, type(4.0), type(3.0), type(4.0 / 3.0)

(1.3333333333333333, float, float, float)

In [19]:
4 / 3, type(4), type(3), type(4 / 3)

(1.3333333333333333, int, int, float)

In [20]:
int(4.0) / int(3.0), int(4.0 / 3.0)

(1.3333333333333333, 1)

In [21]:
type(False), type([1, 2, 3]), type((1, 2, 3)), type('hello world')

(bool, list, tuple, str)

In [22]:
int(True), int(False)

(1, 0)
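
Casting also works between strings and numbers. The following is a minimal sketch with illustrative values chosen here (they are not from the lecture); it also contrasts `int()`, which truncates toward zero, with the floor-division operator `//`.

```python
# Parsing text into numbers and back
n = int("42")            # str -> int
x = float("3.5")         # str -> float
s = str(4) + "2"         # int -> str, then string concatenation

# int() truncates toward zero, while // rounds down (floor division)
truncated = int(-7 / 2)  # int(-3.5)
floored = -7 // 2

# bool() is False for zero and empty containers, True otherwise
truthy = [bool(0), bool(""), bool("False"), bool([0])]

print(n, x, s, truncated, floored, truthy)
```

Note that `bool("False")` is `True`: any non-empty string is truthy, regardless of its content.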

Operators can be applied to more complex types of objects, and the way they apply depends on these types:

In [23]:
1 + 2

3

In [24]:
[1, 2, 3] + [2, 3, 4]

[1, 2, 3, 2, 3, 4]
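
To illustrate how the meaning of `+` follows the operand types, here is a small sketch with values picked for illustration. It also shows that mixing incompatible types raises a `TypeError` rather than silently converting.

```python
# The same operator dispatches on the types of its operands.
int_sum = 1 + 2                   # numeric addition
list_sum = [1, 2] + [3]           # list concatenation
str_sum = "ab" + "cd"             # string concatenation
tuple_sum = (1, 2) + (3,)         # tuple concatenation

# Mixing a list and a tuple is not defined for +.
try:
    [1, 2] + (3,)
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(int_sum, list_sum, str_sum, tuple_sum, mixed_ok)
```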

## Booleans

In [25]:
a = True
not a

False

In [26]:
True or False, True and False

(True, False)

In [27]:
# == and != compare values; `is not` compares object identity
2 == 2, 2 == 4, 2 != 4, 2 is not 4

(True, False, True, True)

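In [28]:
# `is` tests object identity, not equality; that two equal string
# literals are the same object is a CPython implementation detail
"hello" is "world", "hello" is "hello"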

(False, True)
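
The difference between equality (`==`) and identity (`is`) is easier to see with mutable objects. A minimal sketch, using lists chosen for illustration:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a second list with the same contents
c = a           # a second name for the same list object

same_value = (a == b)     # equal contents
same_object = (a is b)    # two distinct objects
alias = (a is c)          # both names refer to one object

c.append(4)               # modifying through c is visible through a
print(same_value, same_object, alias, a)
```

As a rule of thumb, `==` is the right choice for comparing values, and `is` is mostly used for checks such as `x is None`.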

## Lists

In [29]:
# Basic indexing
l = [4, 2, 1, 5, 3]
print(l[1])

2


In [30]:
# Slicing
print(l[1:3], l[:2], l[2:])

[2, 1] [4, 2] [1, 5, 3]


In [31]:
# Negative indices
print(l[-2])
print(l[:-1])

5
[4, 2, 1, 5]


In [32]:
# Repetition
3 * [1, 2]

[1, 2, 1, 2, 1, 2]

In [33]:
# Number of elements
print(len(l))

5


In [34]:
# Different datatypes
["Hello world", True, 4]

['Hello world', True, 4]

## Strings

In [35]:
# Concatenation
"hello" + " " + "world"

'hello world'

In [36]:
# Repetition
3 * "Python"

'PythonPythonPython'

In [37]:
# String formatting
"Today is {}, {}th of {}".format("Monday", 16, "April")

'Today is Monday, 16th of April'

In [38]:
print("{:.2f}".format(4/3))
print("{:04d}".format(15))

1.33
0015
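
The same format specifications can be written with f-strings, available from Python 3.6 on. This sketch assumes the variable names `day`, `date` and `month`, which are introduced here for illustration only.

```python
day, date, month = "Monday", 16, "April"

# f-string equivalents of the str.format calls
sentence = f"Today is {day}, {date}th of {month}"
rounded = f"{4 / 3:.2f}"   # two decimal places
padded = f"{15:04d}"       # zero-padded to width 4

print(sentence)
print(rounded, padded)
```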


In [39]:
# Number of characters
len("Python")

6

In [40]:
# Contains substring
"ell" in "hello"

True

In [41]:
# Indexing
s = "Hello world"
s[4], s[:5]

('o', 'Hello')

## Precedence of operators

In [42]:
1 * 2 + 3 * 4

14

In [43]:
1 * (2 + 3) * 4

20

Exhaustive list:

![Source: thepythonguru.com](precedence.jpg)

In case you are not sure, add parentheses.
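
A few cases where precedence commonly surprises, as a short sketch with illustrative values. Each comment shows how Python groups the expression.

```python
# Exponentiation binds tighter than unary minus.
neg_square = -2 ** 2          # parsed as -(2 ** 2)
square_of_neg = (-2) ** 2

# Comparisons bind tighter than `not`.
a, b = 1, 2
not_equal = not a == b        # parsed as not (a == b)

# `and` binds tighter than `or`.
mixed = True or False and False   # parsed as True or (False and False)

print(neg_square, square_of_neg, not_equal, mixed)
```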

## Functions

In [44]:
def f(x, y):
    z = (x ** 2 + y ** 2) ** 0.5
    return z

In [45]:
f(3, 4)

5.0

A function is an object like any other, so it can be stored in a variable:

In [46]:
g = lambda x, y: (x ** 2 + y ** 2) ** 0.5

In [47]:
g(3, 4)

5.0

In [48]:
# Reassign function to variable
my_function = g
my_function(3, 4)

5.0

A function does not even need a name; a `lambda` expression can be called directly:

In [49]:
(lambda x, y: (x ** 2 + y ** 2) ** 0.5)(3, 4)

5.0
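
Because functions are ordinary objects, they can also be passed as arguments to other functions. The sketch below uses two hypothetical helpers, `apply_twice` and `norm`, together with the built-in `sorted`, whose `key` parameter expects a function.

```python
def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

def norm(point):
    """Euclidean norm of a 2D point, the same formula as f and g."""
    x, y = point
    return (x ** 2 + y ** 2) ** 0.5

points = [(3, 4), (1, 0), (0, 2)]

# A named function or a lambda can be handed to sorted() as the sort key.
by_norm = sorted(points, key=norm)
by_first = sorted(points, key=lambda p: p[0])
doubled_twice = apply_twice(lambda v: v * 2, 5)

print(doubled_twice)
print(by_norm)
print(by_first)
```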

## Dictionaries

**Create a data point (e.g. a fruit)**

In [50]:
x = {
    'color': 'green',
    'size': 'medium'
}

In [51]:
type(x)

dict

**Analyze this data point**

In [52]:
x['color']

'green'
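
A few more common dictionary operations, sketched on a copy of the data point (named `fruit` here to keep the example self-contained). Accessing a missing key with `[]` raises a `KeyError`, so `in` and `get` are useful when a key may be absent.

```python
fruit = {'color': 'green', 'size': 'medium'}

has_taste = 'taste' in fruit               # membership tests the keys
taste = fruit.get('taste', 'unknown')      # default instead of a KeyError

fruit['shape'] = 'round'                   # add a new key
fruit['size'] = 'big'                      # overwrite an existing key

for key, value in fruit.items():           # iterate over key/value pairs
    print(key, '->', value)

print(has_taste, taste, sorted(fruit))
```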

## Classifying Fruits: Conditional Expressions

![](fruits.png)

**A decision tree for watermelon vs. apple vs. other**

In [53]:
def classify(x):
    if x['color'] == 'green':
        if x['size'] == 'big':
            decision = 'watermelon'
        elif x['size'] == 'medium':
            decision = 'apple'
        else:
            decision = 'other'
    else:
        decision = 'other'
    return decision

In [54]:
x_new = {'color': 'green', 'size': 'big'}
classify(x_new)

'watermelon'

In [55]:
classify({'color': 'green', 'size': 'medium'})

'apple'

In [56]:
classify({'color': 'red', 'size': 'small'})

'other'

In [57]:
# Ternary operator
def compare(x, y):
    return "same" if x == y else "different"

print(compare(1, 2))
print(compare(1, 1))

different
same


## Iterators

A `for` loop visits the elements of a sequence one by one, which lets us make predictions for multiple observations.

In [58]:
for i in range(5):
    print(i)

0
1
2
3
4


In [59]:
for i in [2, 1, 4]:
    print(i)

2
1
4


In [73]:
data = [
  {'color': 'green', 'size': 'big'},
  {'color': 'yellow', 'shape': 'round', 'size': 'big'},
  {'color': 'red', 'size': 'medium'},
  {'color': 'green', 'size': 'big'},
  {'color': 'red', 'size': 'small', 'taste': 'sour'},
  {'color': 'green', 'size': 'small'}
]
type(data), type(data[0])

(list, dict)

In [61]:
results = list()
for x in data:
    results.append(classify(x))
print(results)

['watermelon', 'other', 'other', 'watermelon', 'other', 'other']


The same can be achieved with list comprehensions:

In [62]:
print([classify(x) for x in data])

['watermelon', 'other', 'other', 'watermelon', 'other', 'other']


This can also be combined with conditions:

In [63]:
print([classify(x) for x in data if x['color'] == 'green'])

['watermelon', 'watermelon', 'other']
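
A condition at the end of a comprehension filters elements, whereas a conditional expression at the front transforms every element. The sketch below shows both forms next to the equivalent explicit loop, using a hypothetical list `sizes`.

```python
sizes = ['big', 'big', 'medium', 'big', 'small', 'small']

# Comprehension with a filter ...
big_upper = [s.upper() for s in sizes if s == 'big']

# ... and the equivalent explicit loop.
big_upper_loop = []
for s in sizes:
    if s == 'big':
        big_upper_loop.append(s.upper())

# A conditional expression keeps every element and only changes its value.
flags = ['B' if s == 'big' else '-' for s in sizes]

print(big_upper, big_upper_loop, flags)
```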


## Counting the number of objects "watermelon" in the data

In [64]:
result = [classify(x) for x in data]

count = 0
for r in result:
    if r == 'watermelon':
        count = count + 1
print(count)

2


Or in the "pythonic" way, summing a list comprehension of booleans (`True` counts as 1 and `False` as 0):

In [75]:
sum([classify(x) == 'watermelon' for x in data])

2
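
The standard library offers further ways to count. This sketch uses `list.count` and `collections.Counter` on a list `labels` that copies the classification results printed above, so that it runs on its own.

```python
from collections import Counter

# Labels copied from the classification results shown earlier
labels = ['watermelon', 'other', 'other', 'watermelon', 'other', 'other']

n_watermelon = labels.count('watermelon')           # count one value
n_by_sum = sum(l == 'watermelon' for l in labels)   # booleans add up as 0/1
counts = Counter(labels)                            # count every value at once

print(n_watermelon, n_by_sum)
print(counts['watermelon'], counts['other'], counts['apple'])
```

A `Counter` returns 0 for labels that never occur, such as `'apple'` here, instead of raising a `KeyError`.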

## Reading Data from a File

Content of the file `scores.txt`, which lists the performance of players at a certain game:

`80,55,16,26,37,62,49,13,28,56`

`43,45,47,63,43,65,10,52,30,18`

`63,71,69,24,54,29,79,83,38,56`

`46,42,39,14,47,40,72,43,57,47`

`61,49,65,31,79,62,9,90,65,44`

`10,28,16,6,61,72,78,55,54,48`

The following program reads the file and stores the scores in a list.

The `with` statement makes sure the file is closed again once the block ends, even if an error occurs.

In [80]:
with open('scores.txt', 'r') as f:
    D = list()
    for line in f:
        # strip() removes the trailing newline (if any) before splitting
        D.extend([float(x) for x in line.strip().split(',')])
print(D)

[80.0, 55.0, 16.0, 26.0, 37.0, 62.0, 49.0, 13.0, 28.0, 56.0, 43.0, 45.0, 47.0, 63.0, 43.0, 65.0, 10.0, 52.0, 30.0, 18.0, 63.0, 71.0, 69.0, 24.0, 54.0, 29.0, 79.0, 83.0, 38.0, 56.0, 46.0, 42.0, 39.0, 14.0, 47.0, 40.0, 72.0, 43.0, 57.0, 47.0, 61.0, 49.0, 65.0, 31.0, 79.0, 62.0, 9.0, 90.0, 65.0, 44.0, 10.0, 28.0, 16.0, 6.0, 61.0, 72.0, 78.0, 55.0, 54.0, 48.0]
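
Real files may contain blank lines or lack a final newline. The following is a minimal sketch of a reader that tolerates both. It writes its own small sample file to a temporary directory, so the file name and contents are assumptions made for this example and not the lecture's `scores.txt`.

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
# It contains a blank line, and the last line has no trailing newline.
sample = "80,55,16\n43,45,47\n\n63,71,69"
path = os.path.join(tempfile.mkdtemp(), 'scores_sample.txt')
with open(path, 'w') as f:
    f.write(sample)

scores = []
with open(path, 'r') as f:
    for line in f:
        line = line.strip()      # drop the newline and surrounding spaces
        if not line:             # skip blank lines
            continue
        scores.extend(float(x) for x in line.split(','))

print(scores)
```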


Writing results back to a file:

In [67]:
import os
try:
    # Make sure not to overwrite an existing file
    outfile = 'scores_new.txt'
    if os.path.exists(outfile):
        raise Exception("File '{}' already exists.".format(outfile))
        
    with open(outfile, 'w') as f:
        f.write(str(D))
except Exception as e:
    print("Exception occurred: {}".format(e))
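
An alternative to checking for the file beforehand is the exclusive-creation mode `'x'` of `open`, which raises `FileExistsError` if the file is already there. A minimal sketch, writing to a temporary directory and using a hypothetical list `values` in place of `D`:

```python
import os
import tempfile

outfile = os.path.join(tempfile.mkdtemp(), 'scores_new.txt')
values = [80.0, 55.0, 16.0]   # hypothetical stand-in for D

messages = []
for _ in range(2):            # the second attempt finds the file in place
    try:
        # Mode 'x' creates the file and fails if it already exists.
        with open(outfile, 'x') as f:
            f.write(','.join(str(v) for v in values))
        messages.append('written')
    except FileExistsError:
        messages.append('exists')

print(messages)
```

Checking and creating in one step avoids the gap between `os.path.exists` and `open`, during which another program could create the file.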

## Classes

Let's separate our data into training data (the first 20 scores) and test data (the remaining scores):

In [68]:
Dtrain = D[:20]
Dtest  = D[20:]

Classes are useful for modeling anything that has an internal state, for example, machine learning models. The model below labels each score as above or below the average of the training data.

In [69]:
class Classifier:
    def train(self, X):
        self.avg = sum(X) / len(X)
        
    def predict(self, X):
        return ['above' if x > self.avg else 'below' for x in X]

Build the classifier:

In [70]:
c = Classifier()

Train the classifier and inspect what the classifier has learned:

In [71]:
c.train(Dtrain)
print(c.avg)

41.9


Applying the model to the test data verifies that it works correctly:

In [72]:
Ytest = c.predict(Dtest)
list(zip(Dtest[:5], Ytest[:5]))

[(63.0, 'above'),
 (71.0, 'above'),
 (69.0, 'above'),
 (24.0, 'below'),
 (54.0, 'above')]
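
The internal state of a class is usually initialized in `__init__`. The sketch below is a hypothetical variant, `MeanClassifier`, that starts without a learned average and refuses to predict before training. It is one possible design under these assumptions, not the lecture's implementation.

```python
class MeanClassifier:
    """Variant of Classifier that guards against predicting before training."""

    def __init__(self):
        self.avg = None              # no state learned yet

    def train(self, X):
        self.avg = sum(X) / len(X)   # the learned state is the mean score
        return self                  # allows MeanClassifier().train(X)

    def predict(self, X):
        if self.avg is None:
            raise RuntimeError("call train() before predict()")
        return ['above' if x > self.avg else 'below' for x in X]

# Two instances keep independent state.
low = MeanClassifier().train([10.0, 20.0, 30.0])
high = MeanClassifier().train([60.0, 70.0, 80.0])

print(low.avg, high.avg)
print(low.predict([25.0, 65.0]), high.predict([25.0, 65.0]))
```

Each instance stores its own `avg`, so the same input can receive different labels depending on which data the instance was trained on.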