# Python Crash Course

## Variables

Python is a *dynamically typed* language: the data type of the variable is determined at runtime and does not have to be specified explicitly.

In [1]:
x = 5/2

In [2]:
x

2.5

In [3]:
type(x)

float

In [4]:
x = 5

In [5]:
x

5

In [6]:
type(x)

int

In [7]:
type(x)==float, isinstance(x, float)

(False, False)
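The two checks above do not always agree. As a small sketch (the variable names are only illustrative): `bool` is a subclass of `int`, so `isinstance` accepts a boolean where an exact `type` comparison does not.

```python
flag = True

# Exact type comparison: bool is not the same type as int
exact_match = type(flag) == int

# isinstance also accepts subclasses, and bool is a subclass of int
subclass_match = isinstance(flag, int)

print(exact_match, subclass_match)  # False True
```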

Variables can change type after they have been set.

In [8]:
x = "Computational Linguistics" # x is now of type str
x

'Computational Linguistics'

In [9]:
type(x)

str

## Arithmetic Operators

In [11]:
600+3     # Addition

603

In [12]:
3*(2+3)+2

17

In [13]:
17 / 3    # Division

5.666666666666667

In [14]:
2**6      # Exponentiation

64

In [15]:
17 // 3   # Floor division

5

In [16]:
-17 // 3


-6

In [17]:
17 % 3    # Modulus

2
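Floor division and modulus belong together: floor division rounds toward negative infinity, and the remainder is chosen so that both results recombine to the original number. A minimal sketch with the values from above:

```python
a, b = -17, 3

quotient = a // b    # rounds toward negative infinity: -6
remainder = a % b    # takes the sign of the divisor: 1

# The two results always recombine to the original number
print(quotient, remainder, quotient * b + remainder)  # -6 1 -17

# divmod() returns both values in one call
print(divmod(a, b))  # (-6, 1)
```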

## Logical Operators

In [18]:
x = 10

In [19]:
x > 1 and x < 20 # Returns True if both statements are true

True

In [20]:
x > 5 or x > 20  # Returns True if one of the statements is true

True

In [21]:
not(x > 1 and x < 20)

False
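Two details are worth knowing here, shown as a short sketch: range checks like the one above can be written as a chained comparison, and `and` / `or` stop evaluating as soon as the result is known.

```python
x = 10

# Chained comparison: equivalent to (1 < x) and (x < 20)
in_range = 1 < x < 20
print(in_range)  # True

# Short-circuit evaluation: the right-hand side is only evaluated if needed,
# so the division by zero below is never executed
safe = x > 100 and (1 / 0) > 0
print(safe)  # False
```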

## Modules

Python code is organized in *modules*. If we want to use a particular module, we have to *import* it first:

In [22]:
# load the module that is responsible for the calculation of random numbers
import random

In [23]:
# Select a random integer from 10 to 19 (the upper bound 20 is excluded)
random.randrange(10,20)

11

We can import only selected names from a module by using the `from` keyword.

In [24]:
from random import randrange
randrange(10)

6
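Because the numbers are random, the outputs above will differ from run to run. If reproducible results are needed, the generator can be seeded first. A minimal sketch (the seed value 42 is an arbitrary choice):

```python
import random

random.seed(42)   # fix the starting state of the generator
first = [random.randrange(10, 20) for _ in range(3)]

random.seed(42)   # same seed again, so the same sequence follows
second = [random.randrange(10, 20) for _ in range(3)]

print(first == second)  # True
```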

## Control structures

Python's syntax is __indentation-based__ (as opposed to, for example, the bracket-based syntax of many other languages), so code formatting matters!

Python has the usual control structures `if`, `while` and `for`:

###### if

In [25]:
x = random.randrange(10)
y = random.randrange(10)
print(f"x = {x}")
print(f"y = {y}")

x = 1
y = 4


In [26]:
if x > y : 
    print("x is greater than y")
elif x < y:
    print("y is greater than x")
else:
    print("x and y are equal")

y is greater than x


<div class="alert alert-warning">

**Python f-Strings** allow generating strings out of variables with automated formatting. See below.

</div>

#### Conditional Expression

Introduced in PEP 308, and often referred to as a ternary operator:

```python
x = x_if_true if condition else x_if_false
```
which is the succinct version of:

```python
if condition:
    x = x_if_true
else:
    x = x_if_false
```

In [27]:
sun_shining = True
x = 35 if sun_shining else -4
print(x)

35


###### while

In [28]:
x = 1
while x / 2 > 0 :
    x /= 2
print(f"{x} is the smallest positive number I know.")

5e-324 is the smallest positive number I know.


In [29]:
type(x)

float

###### for 

In [34]:
for i in range(10,15):
    print(i)
    print(i**2)

10
100
11
121
12
144
13
169
14
196


In [35]:
list(range(10,15,2))

[10, 12, 14]

## Lists, Tuples, Sets, and Dictionaries

There are four collection data types in the Python programming language:

- **List** is a collection which is ordered and changeable. Allows duplicate members.
- **Tuple** is a collection which is ordered and unchangeable. Allows duplicate members.
- **Set** is a collection which is unordered and unindexed. No duplicate members.
- **Dictionary** is a collection of key-value pairs which is changeable and indexed by key. No duplicate keys.
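The properties listed above can be seen side by side in a short sketch (the variable names are only illustrative). The same three items are stored in each collection type, with one value deliberately repeated:

```python
as_list = ["a", "b", "a"]            # ordered, changeable, duplicates kept
as_tuple = ("a", "b", "a")           # ordered, unchangeable, duplicates kept
as_set = {"a", "b", "a"}             # duplicates collapse into one item
as_dict = {"a": 1, "b": 2, "a": 3}   # a repeated key keeps only the last value

print(len(as_list), len(as_tuple), len(as_set), len(as_dict))  # 3 3 2 2
print(as_dict["a"])  # 3
```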

In [36]:
my_list = ['one', 'two', 'three', 'four', 'five']
my_list

['one', 'two', 'three', 'four', 'five']

In [37]:
len(my_list)

5

In [38]:
my_list[0]

'one'

In [39]:
my_list[-2]

'four'

In [40]:
my_list[2:4]

['three', 'four']

To add an item to the end of the list, we can use the `append()` method:

In [41]:
my_list.append('six')
my_list

['one', 'two', 'three', 'four', 'five', 'six']

In [42]:
my_list.append(3.14)
my_list

['one', 'two', 'three', 'four', 'five', 'six', 3.14]

In [43]:
my_list.extend(["seven", "eight"])
my_list

['one', 'two', 'three', 'four', 'five', 'six', 3.14, 'seven', 'eight']

A **Tuple** is a collection which is ordered and unchangeable.

In [44]:
a = ('eins', 'zwei', 'drei')

In [45]:
a[0]

'eins'

In [46]:
a[1] = 'vier' # This will throw an error

TypeError: 'tuple' object does not support item assignment

A **set** is a collection which is unordered and unindexed. Once a set is created, you cannot change its items in place, but you can add new items and remove existing ones.

In [47]:
set1 = {"a", "b","c"} 
set1

{'a', 'b', 'c'}

In [48]:
["a", "b","c", "b","c"]

['a', 'b', 'c', 'b', 'c']

In [49]:
set2 = set(["a", "b","c", "b","c"]) 
set2

{'a', 'b', 'c'}

In [50]:
l = ["a", "b","c", "b","c"]
l = list(set(l))
l

['c', 'b', 'a']

In [51]:
# Add an item to a set
set2.add("f")
set2

{'a', 'b', 'c', 'f'}

In [52]:
# Add multiple items to a set (update() expects an iterable of items)
set2.update(["g", "h"])
set2

{'a', 'b', 'c', 'f', 'g', 'h'}
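Items can also be taken out of a set again. The sketch below uses a separate example set (`letters` is an illustrative name) to show the two standard methods, which differ in how they treat a missing item:

```python
letters = {"a", "b", "c"}

letters.remove("a")     # removes the item; raises KeyError if it is missing
letters.discard("z")    # removes the item if present; otherwise does nothing

print(sorted(letters))  # ['b', 'c']
```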

The `union()` method returns a new set with all items from both sets:

In [53]:
things = {"book", "table", "chair"}
colors = {"blue", "red", "green"}
words = colors.union(things) # or colors | things
words

{'blue', 'book', 'chair', 'green', 'red', 'table'}

In [54]:
words - colors # OR words.difference(colors)

{'book', 'chair', 'table'}

In [55]:
# Intersection
words & colors # OR words.intersection(colors)

{'blue', 'green', 'red'}

A **dictionary** is a key-value collection which is changeable and indexed. In Python, dictionaries are written with curly brackets, and each key is separated from its value by a colon.

In [56]:
d = { 'abc': 10, 'def' : True }

In [57]:
d['def']

True

In [58]:
d['z'] = ['Python', 'is', 'a', 'great', 'language']

In [59]:
d

{'abc': 10, 'def': True, 'z': ['Python', 'is', 'a', 'great', 'language']}

In [60]:
len(d)

3

In [61]:
#  Iterate by key:
for x in d.keys():
    print(f"key {x}: value {d[x]}")

key abc: value 10
key def: value True
key z: value ['Python', 'is', 'a', 'great', 'language']


In [62]:
#  Iterate by value:
for x in d.values():
    print(x)

10
True
['Python', 'is', 'a', 'great', 'language']


In [63]:
#  Iterate by key and value
for key, value in d.items():
    print(key, value)

abc 10
def True
z ['Python', 'is', 'a', 'great', 'language']


Accessing a missing key raises an error. The `in` operator and the `get()` method are useful if you are uncertain about the set of keys:

In [64]:
d["l"] # This will throw an error

KeyError: 'l'

In [65]:
"z" in d, "l" in d

(True, False)

In [66]:
d.get("abc", "default_value"), d.get("a", "default_value")

(10, 'default_value')
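A common use of `get()` with a default value is counting. The following sketch counts the letters of an example word without checking first whether a key already exists:

```python
word = "banana"
counts = {}
for letter in word:
    # get() returns 0 for a letter that has not been seen yet
    counts[letter] = counts.get(letter, 0) + 1

print(counts)  # {'b': 1, 'a': 3, 'n': 2}
```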

## List Comprehensions

List comprehensions are a tool for transforming one list into another list. During this transformation, elements can be conditionally included in the new list and each element can be transformed as needed.

Every list comprehension can be rewritten as a for loop, but not every for loop can be rewritten as a list comprehension.

```python
new_list = []
for ITEM in old_list:
    if condition_based_on(ITEM):
        new_list.append(function(ITEM))
```

The above *for* loop can be rewritten as a list comprehension like this:
```python
new_list = [function(ITEM) for ITEM in old_list if condition_based_on(ITEM)]
```

In [67]:
numbers = [1, 2, 3, 4, 5]
doubled_odds = []
for n in numbers:
    if n % 2 == 1:
        doubled_odds.append(n * 2)
doubled_odds

[2, 6, 10]

In [68]:
doubled_odds = [n * 2 for n in numbers if n % 2 == 1]
doubled_odds

[2, 6, 10]
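As an illustration of a loop that does not translate directly into a list comprehension, consider the sketch below (the `limit` value is an arbitrary assumption). Each step depends on a running total from earlier iterations, and the loop stops early with `break`:

```python
numbers = [1, 2, 3, 4, 5]

# Collect numbers until the running total would exceed a limit
limit = 6
total = 0
selected = []
for n in numbers:
    if total + n > limit:
        break          # stop early: no later element is looked at
    total += n
    selected.append(n)

print(selected)  # [1, 2, 3]
```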

### Nested Loops
Here’s a for loop that flattens a matrix (a list of lists):

In [69]:
matrix = [[1,2,3,4], [5,6,7,8], [9,10,11,12]]
flattened = []
for row in matrix:
    for n in row:
        flattened.append(n)
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

In [70]:
flattened = [n for row in matrix for n in row]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

## Strings

A string is a Python object that represents a sequence of characters.

In [71]:
s = 'Python is a great programming language'
print(s)

Python is a great programming language


In [72]:
s = "Python is a great programming language"
print(s)

Python is a great programming language


In [73]:
# Multiline Strings
s = """Python is a great
 programming language."""

print(s)

# Note: the line breaks are inserted at the same position as in the code.

Python is a great
 programming language.


In effect, a string behaves very much like a list: square brackets can be used to access elements of the string.

In [74]:
s = "Python is a great programming language!"
s[0]

'P'

In [75]:
# String Length
len(s)

39

In [76]:
# Slicing
s[18:29] # Specify the start index and the end index, separated by a colon, to return a part of the string.

'programming'

In [77]:
# Use negative indexes to start the slice from the end of the string
s[-10:-2]

' languag'

In [78]:
# Indexes can also be omitted if they 
# correspond to the beginning or end
s[:15]

'Python is a gre'

In [79]:
s[15:]

'at programming language!'

In [80]:
s[-15:]

'mming language!'

In [81]:
s[:-15]

'Python is a great progra'
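One difference from lists is that strings are immutable: indexing and slicing read from a string but cannot change it. A minimal sketch (the names `s` and `t` are only illustrative):

```python
s = "Python"

try:
    s[0] = "J"    # strings do not support item assignment
except TypeError as error:
    print(f"TypeError: {error}")

# Build a new string instead of modifying the old one
t = "J" + s[1:]
print(t)  # Jython
```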

In [82]:
# Iterating over the string returns individual characters:
for c in s:
    print(c)

P
y
t
h
o
n
 
i
s
 
a
 
g
r
e
a
t
 
p
r
o
g
r
a
m
m
i
n
g
 
l
a
n
g
u
a
g
e
!


Python has a set of built-in methods that you can use on strings.

In [83]:
s.lower() # returns the string in lower case

'python is a great programming language!'

In [84]:
s.upper() # returns the string in upper case

'PYTHON IS A GREAT PROGRAMMING LANGUAGE!'

In [85]:
s.split(" ") # Splits the string at the specified separator (in this example the white space), and returns a list

['Python', 'is', 'a', 'great', 'programming', 'language!']

In [86]:
s = 'Das ist ein Beispiel'
ss = ' '.join([s, s.upper(), s.lower()])
print(ss)

ss = "_".join(["ein", "beispiel", "hier"])
print(ss)

Das ist ein Beispiel DAS IST EIN BEISPIEL das ist ein beispiel
ein_beispiel_hier


In [87]:
s = "Python "
s.isalpha() # Returns True if all characters in the string are in the alphabet

False

In [88]:
s = "Python"
s.isalpha()

True

In [89]:
s = "Python is a great programming language!"
print(s.split(' '))
[w.isalpha() for w in s.split(' ')]

['Python', 'is', 'a', 'great', 'programming', 'language!']


[True, True, True, True, True, False]

In [90]:
[w for w in s.split(' ') if 'i' in w]

['is', 'programming']

## f-strings

A modern and easy way to format strings, mostly for output purposes.

In [91]:
s = f"A pythonic f-string"
s

'A pythonic f-string'

In [92]:
result = 3.14
number = 42

print(f"The result is {result} and the number is {number}. Added up it's {result + number}")

The result is 3.14 and the number is 42. Added up it's 45.14
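The expressions in curly braces can also carry a format specification after a colon. The sketch below shows a few common cases; the values are arbitrary examples:

```python
result = 3.14159
number = 42

two_decimals = f"{result:.2f}"   # fixed-point notation with two decimal places
padded = f"{number:05d}"         # integer padded with zeros to width 5
percent = f"{0.256:.1%}"         # multiplied by 100 and shown as a percentage

print(two_decimals, padded, percent)  # 3.14 00042 25.6%
```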


## Sorting, Minimum and Maximum

In [93]:
my_list = [2,4,7,12,6,8,2,3,4,5]

In [94]:
min(my_list)

2

In [95]:
max(my_list)

12

In [96]:
sorted(my_list, reverse=False)

[2, 2, 3, 4, 4, 5, 6, 7, 8, 12]

In [97]:
my_list = 'This a sentence that serves as an example for splitting strings.'.split(' ')
my_list

['This',
 'a',
 'sentence',
 'that',
 'serves',
 'as',
 'an',
 'example',
 'for',
 'splitting',
 'strings.']

In [98]:
min(my_list)

'This'

In [99]:
max(my_list)

'that'

In [100]:
sorted(my_list)

['This',
 'a',
 'an',
 'as',
 'example',
 'for',
 'sentence',
 'serves',
 'splitting',
 'strings.',
 'that']

All three functions above accept an additional parameter `key`: a function that is applied to each element to compute the value used for comparison:

In [101]:
# Sort words by length
sorted(my_list, key=len)

['a',
 'as',
 'an',
 'for',
 'This',
 'that',
 'serves',
 'example',
 'sentence',
 'strings.',
 'splitting']

In [102]:
sorted(my_list, key=len, reverse = True) # Reversing the Sort Order

['splitting',
 'sentence',
 'strings.',
 'example',
 'serves',
 'This',
 'that',
 'for',
 'as',
 'an',
 'a']

In [103]:
# Ignore case when sorting
sorted(my_list, key=lambda w: w.lower())

['a',
 'an',
 'as',
 'example',
 'for',
 'sentence',
 'serves',
 'splitting',
 'strings.',
 'that',
 'This']

<div class="alert alert-warning">
A <b>lambda</b> function is a small anonymous function. It can take any number of arguments, but can only have one expression.
    </div>

In [104]:
x = lambda a : a.lower()
print(x("Berlin"))

berlin


In [105]:
def myfunc(n):
    return lambda a : a * n

mydoubler = myfunc(2)
mytripler = myfunc(3)

print(mydoubler(11))
print(mytripler(11))

22
33
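The `key` parameter works the same way for `min()` and `max()`. A short sketch with a shortened word list (the list `words` is only an example):

```python
words = ['This', 'a', 'sentence', 'that', 'serves', 'as', 'an', 'example']

shortest = min(words, key=len)   # first word with the smallest length
longest = max(words, key=len)    # first word with the largest length

print(shortest, longest)  # a sentence
```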


### Sorting Dictionaries

In [106]:
prices = {'Apple': 1.99, 'Banana': 0.99, 'Orange': 1.49, 'Cantaloupe': 3.99, 'Grapes': 0.39}
sorted(prices) # List of Sorted Keys

['Apple', 'Banana', 'Cantaloupe', 'Grapes', 'Orange']

In [107]:
sorted(prices.keys()) # List of Sorted Keys

['Apple', 'Banana', 'Cantaloupe', 'Grapes', 'Orange']

In [108]:
sorted(prices.values()) # List of Sorted Values

[0.39, 0.99, 1.49, 1.99, 3.99]

In [109]:
sorted(prices.items()) # Sorting by Keys

[('Apple', 1.99),
 ('Banana', 0.99),
 ('Cantaloupe', 3.99),
 ('Grapes', 0.39),
 ('Orange', 1.49)]

In [110]:
sorted(prices.items(), key = lambda x : x[1]) # Sorting by Values

[('Grapes', 0.39),
 ('Banana', 0.99),
 ('Orange', 1.49),
 ('Apple', 1.99),
 ('Cantaloupe', 3.99)]

In [111]:
sorted(prices.items(), key = lambda x : x[1], reverse=True) # Reversing the Sort Order

[('Cantaloupe', 3.99),
 ('Apple', 1.99),
 ('Orange', 1.49),
 ('Banana', 0.99),
 ('Grapes', 0.39)]
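`sorted()` always returns a list. If a dictionary is needed again, the sorted pairs can be passed to `dict()`. A minimal sketch with a shortened price table, relying on the fact that dictionaries keep insertion order (guaranteed since Python 3.7):

```python
prices = {'Apple': 1.99, 'Banana': 0.99, 'Orange': 1.49}

# Sort the (key, value) pairs by value and turn them back into a dictionary
by_price = dict(sorted(prices.items(), key=lambda item: item[1]))

print(by_price)  # {'Banana': 0.99, 'Orange': 1.49, 'Apple': 1.99}
```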

## Functions

In Python a function is defined using the `def` keyword:

In [112]:
def pow_self(x):
    return x**x

for i in range(10):
    print(pow_self(i))

1
1
4
27
256
3125
46656
823543
16777216
387420489


Since Python is a dynamically typed language, problems often occur when data types behave similarly: methods and functions do **not** fail as long as the operations they use exist and are meaningful for the given type. A useful feature of Python is _type hinting_, which indicates what data types a function expects and returns:

In [113]:
def int_pow_self(x: int) -> int:
    return x**x

for i in range(10):
    print(int_pow_self(i))

1
1
4
27
256
3125
46656
823543
16777216
387420489


This is useful for documentation, but the hints are not enforced: the function does **not** fail when given another data type, unless the operation itself is undefined for that type:

In [114]:
int_pow_self(float(i))


387420489.0

In [115]:
int_pow_self(str(i))

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'str'
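If a function should really reject other types at runtime, the check has to be written explicitly. The following is one possible sketch; the function name `checked_pow_self` and the error message are assumptions made for this example:

```python
def checked_pow_self(x: int) -> int:
    # Type hints are not enforced, so the check is done explicitly
    if not isinstance(x, int):
        raise TypeError(f"expected int, got {type(x).__name__}")
    return x**x

print(checked_pow_self(3))  # 27

try:
    checked_pow_self(3.0)
except TypeError as error:
    print(error)  # expected int, got float
```

Alternatively, a static type checker such as mypy can report mismatches between hints and calls before the code is run.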

## File Handling

The key function for working with files in Python is the `open()` function. It takes two main parameters: `filename` and `mode`. There are four different modes for opening a file:

- "r" - Read - Default value. Opens a file for reading, error if the file does not exist
- "a" - Append - Opens a file for appending, creates the file if it does not exist
- "w" - Write - Opens a file for writing, creates the file if it does not exist, overwrites the existing content otherwise
- "x" - Create - Creates the specified file, returns an error if the file exists

In [116]:
fs = open("bible.txt","r") 
contents = fs.readlines() # read()
print(contents[:50])
fs.close()

['1:1 In the beginning God created the heaven and the earth.\n', '\n', '1:2 And the earth was without form, and void; and darkness was upon\n', 'the face of the deep. And the Spirit of God moved upon the face of the\n', 'waters.\n', '\n', '1:3 And God said, Let there be light: and there was light.\n', '\n', '1:4 And God saw the light, that it was good: and God divided the light\n', 'from the darkness.\n', '\n', '1:5 And God called the light Day, and the darkness he called Night.\n', 'And the evening and the morning were the first day.\n', '\n', '1:6 And God said, Let there be a firmament in the midst of the waters,\n', 'and let it divide the waters from the waters.\n', '\n', '1:7 And God made the firmament, and divided the waters which were\n', 'under the firmament from the waters which were above the firmament:\n', 'and it was so.\n', '\n', '1:8 And God called the firmament Heaven. And the evening and the\n', 'morning were the second day.\n', '\n', '1:9 And God said, Let the waters un

In [117]:
f = open("test.txt", "w") # Overwrite the content
f.write("Woops! I have deleted the content!") 
f.close()

In [118]:
f = open("test.txt", "a") # Append content to the file
f.write("This is a new Content!") 
f.close()

In [119]:
fs = open("test.txt","r") 
contents = fs.read()
print(contents)
fs.close()

Woops! I have deleted the content!This is a new Content!
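In the examples above, every file has to be closed by hand with `close()`. A common alternative is the `with` statement, which closes the file automatically when the block ends, even if an error occurs. The sketch below writes to a throwaway file in a temporary directory (the file name `example.txt` is an assumption for illustration):

```python
import os
import tempfile

# Use a temporary directory so that no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")

with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")

with open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)  # ['first line', 'second line']
```

Note that `write()` does not add a line break on its own, which is why the two sentences in the output above run together.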


## Exercises
<div class="alert alert-info">

### Exercise 1
A) Using a list comprehension, create a new list called "newlist" out of the list "numbers", which contains only the positive numbers from the list.
</div>

In [None]:
numbers = [34.6, -203.4, 44.9, 68.3, -12.2, 44.6, 12.7]
newlist= []
print(newlist)

<div class="alert alert-info">
B) Rewrite the following for loop using a list comprehension


```python
number_list = []
for x in range(100):
    if x % 3 == 0:
        if x % 5 == 0:
            number_list.append(x)
print(number_list)
```
</div>

In [None]:
number_list=[]
print(number_list)

<div class="alert alert-info">

### Exercise 2
Write a function to remove all punctuation from a string.

You can use:

```python
import string
string.punctuation
```

</div>

In [None]:
sentence = "he said: 'what are you doing?'"


# Output should be 'he said what are you doing'

<div class="alert alert-info">
    
### Exercise 3

In the folder *Data* you will find a file *bible.txt*. Write Python code to:

- Calculate the average word length
- Calculate the average verse length
- Print the number of unique words in the text
- Get a set of unique words in the text and save them into *uniqueWords.txt*
- Get the most frequent words and save them to a file called *statistics.txt*

</div>

# References

- https://docs.python.org/3/ 
- https://www.w3schools.com/python/