# Python for Actuaries Part 2

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DeutscheAktuarvereinigung/Python_fuer_Aktuare/blob/main/02b_python_objects.ipynb) [![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/DeutscheAktuarvereinigung/Python_fuer_Aktuare/blob/main/02b_python_objects.ipynb)
## Agenda
In this notebook, we will cover:
- Strings
- Lists
- Sets
- Dictionaries

# Built-in Data Types in Python
In addition to user-defined classes, Python offers a variety of **built-in data types** that are extremely useful in everyday programming. These data types are also based on classes, which means they have specific methods and properties that we can work with directly.

The most important built-in data types include:
- **Strings**: A sequence of characters that stores text.
- **Lists**: An ordered collection of elements.
- **Sets**: An unordered collection of unique elements.
- **Dictionaries**: A collection of key-value pairs.

These data types are built into Python, so we can use them directly without defining or importing anything. They are flexible and let us solve many everyday tasks efficiently.

In the following sections, we will look at how these data types work and how we can use them in our programs.

## Strings
Python's basic data types also include `str`, the string type used for text.

In [None]:
# strings can be written with single or double quotes
(
    'hello world!',
    "hello world!"
)

In [None]:
# mixing the two quote styles lets us use quotes inside a string without escaping
print('she said, "how are you?"')
print("that's right!")

In [None]:
(
    'hello' + ' ' + 'world',
    'thinking... ' * 3,
    '*' * 3
)

Strings are one of Python's [*sequence* types](https://docs.python.org/3.5/library/stdtypes.html#typesseq).

The other sequence types are *ranges* and *tuples* (both immutable, like strings) and *lists* (mutable).

All sequences support the [common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations), and mutable sequences additionally support the [mutable sequence operations](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types).
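The difference between immutable and mutable sequences can be illustrated with a minimal sketch. The names `word`, `letters`, and `new_word` are chosen here for illustration only.

```python
# Strings are immutable: assigning to an index raises a TypeError.
word = 'hello'
try:
    word[0] = 'J'
except TypeError as err:
    error_message = str(err)
    print('strings cannot be changed in place:', error_message)

# Lists are mutable: the same kind of assignment works.
letters = list(word)
letters[0] = 'J'
print(letters)

# To "change" a string, build a new one from slices instead.
new_word = 'J' + word[1:]
print(new_word)
```

The original string `word` keeps its value throughout; only the list and the newly built string differ from it.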

In [None]:
# indexing
greeting = 'hello there'
(
    greeting[0],
    greeting[6],
    len(greeting),
    greeting[len(greeting)-1]
)

In [None]:
# negative indexes
(
    greeting[-1],
    greeting[-2],
    greeting[-len(greeting)]
)

In [None]:
# "slices"
greeting = 'hello there'
(
    greeting[0:11],
    greeting[0:5],
    greeting[6:11]
)

In [None]:
# default slice ranges
(
    greeting[:11],
    greeting[6:],
    greeting[:]
)

In [None]:
# slice "steps"
(
    greeting[0:11:2],
    greeting[::3],
    greeting[6:11:2]
)

In [None]:
# negative steps
greeting[::-1]

In [None]:
# other sequence ops
greeting = 'hello there!'
(
    greeting.count('e'),
    greeting.index('e'),
    greeting.index('e', 2),
    'e' in greeting,
    'ü' not in greeting,
    min(greeting),
    max(greeting)
)

Strings also have a set of [type-specific methods](https://docs.python.org/3/library/stdtypes.html#string-methods).
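A few of these methods are sketched below. The sample text in `raw` is made up for illustration. Because strings are immutable, every method returns a new string and leaves the original untouched.

```python
raw = '  Term Life Insurance  '

cleaned = raw.strip()                       # remove leading and trailing whitespace
upper = cleaned.upper()                     # all characters in upper case
lower = cleaned.lower()                     # all characters in lower case
replaced = cleaned.replace('Term', 'Whole') # substitute a substring
starts = cleaned.startswith('Term')         # prefix test, returns a bool
position = cleaned.find('Life')             # index of first match, -1 if absent

print(cleaned, upper, lower, replaced, starts, position, sep='\n')
```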

### Format Strings

Inserting the values of variables or expressions into a string is easy with *format strings* (f-strings): prefix the string literal with `f` and put each expression in curly braces.

In [None]:
adjective = 'frigid'
adverb = 'hastily'
noun = 'Alfred'
number = 8
verb = 'eat'

sentence = f'It was a {adjective} day when {noun} decided to {adverb} {verb} {number*100} lines of code.'

print(sentence)

The same result can be achieved with the string method `format`, which fills the `{}` placeholders in order:

In [None]:
sentence = 'It was a {} day when {} decided to {} {} {} lines of code.'.format(adjective, noun, adverb, verb, number*100)

print(sentence)
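Both f-strings and `format` accept a format specification after a colon, which is handy for presenting numbers. The following is a small sketch; the variable names and values (`premium`, `lapse_rate`, `policies`) are invented for illustration.

```python
# Illustrative values only (assumptions, not taken from the notebook)
premium = 1234.5678
lapse_rate = 0.0372
policies = 1250000

two_decimals = f'{premium:.2f}'      # fixed-point with two decimal places
with_commas = f'{premium:,.2f}'      # comma as thousands separator
as_percent = f'{lapse_rate:.1%}'     # multiply by 100 and append a percent sign
right_aligned = f'{policies:>12,}'   # right-aligned in a field of width 12

print(two_decimals, with_commas, as_percent, right_aligned, sep='\n')

# The same format specifications work with the format method
via_format = 'Premium: {:,.2f}'.format(premium)
print(via_format)
```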

# Lists, Ranges, Dicts, and Sets

In addition to the basic data types, Python provides several powerful built-in container types.

We start with **ranges**.

## Range

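A range is a sequence of integers defined by a start value, an end value, and optionally a step size. The start value is included, the end value is not; if only one argument is given, the range starts at 0.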

In [None]:
r = range(10)

for i in r:
    print(i)

In [None]:
r = range(3,15)

for i in r:
    print(i)

In [None]:
r = range(1,1000,150)

for i in r:
    print(i)
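The behaviour of the three arguments can be checked without a loop by converting a range to a list. This is a minimal sketch; the variable names are chosen for illustration.

```python
first_five = list(range(5))            # start defaults to 0, end is excluded
stepped = list(range(3, 15, 4))        # start 3, step 4, stop before 15
countdown = list(range(10, 0, -3))     # a negative step counts downwards
size = len(range(1, 1000, 150))        # ranges know their length without a loop
end_included = 15 in range(3, 15)      # membership test: the end value is excluded

print(first_five, stepped, countdown, size, end_included, sep='\n')
```

A range stores only its start, end, and step, so even a very large range uses very little memory until it is converted to a list.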

## Lists

**Lists** are one of the most widely used data types in Python: they are versatile and easy to use, yet offer powerful functionality.

In [None]:
l = [1,2,3,4,1]
print(l)

In [None]:
# length of a list
len(l)

In [None]:
# Indexing
l[2]

In [None]:
# setting a value
l[0] = "Hello"
l

In [None]:
# concatenation with + returns a new list
l = ["Hello"]
l + ["World"]

In [None]:
# the original list is unchanged
l
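While `+` builds a new list, the mutable sequence operations change a list in place. The sketch below uses a separate, illustrative variable `words` so that it does not interfere with `l`.

```python
words = ['Hello']
combined = words + ['World']     # new list, words itself is unchanged

words.append('World')            # add one element at the end
words.extend(['of', 'Python'])   # add all elements of another list
words.insert(0, 'Oh')            # insert an element at a given index
last = words.pop()               # remove and return the last element
words.remove('of')               # remove the first occurrence of a value

print(combined)
print(words)
print(last)
```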

Lists can also be created from other objects:

In [None]:
l = list(range(6))
l

In [None]:
l = list("European Actuaries")
l

In [None]:
'I love Advanced Programming'.split()

In [None]:
'apples, bananas, cats, dogs'.split(',')

In [None]:
# also, strings from lists of strings
'-'.join(['a', 'e', 'i', 'o', 'u'])

In [None]:
' 👏 '.join('this is a beautiful day'.split())

### Building Lists Simply (List Comprehensions)

In [None]:
[x for x in range(10)]

In [None]:
adjs = ('hot', 'blue', 'quick')
nouns = ('table', 'fox', 'sky')
[adj + ' ' + noun for adj in adjs for noun in nouns]

In [None]:
# pythagorean triples
n = 50
[(a,b,c) for a in range(1,n) 
         for b in range(a,n) 
         for c in range(b,n) 
         if a**2 + b**2 == c**2]
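A list comprehension is a compact way of writing a loop that appends to a list. The following sketch shows a comprehension with a filter next to the equivalent `for` loop; the word list is made up for illustration.

```python
terms = ['premium', 'risk', 'age', 'reserve']

# Comprehension: expression, loop, optional condition
long_upper = [w.upper() for w in terms if len(w) > 3]

# Equivalent explicit loop
loop_result = []
for w in terms:
    if len(w) > 3:
        loop_result.append(w.upper())

print(long_upper)
print(long_upper == loop_result)
```

When several `for` clauses appear in one comprehension, as in the Pythagorean triples above, they nest from left to right like the corresponding nested loops.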

## Sets

A [set](https://docs.python.org/3.7/library/stdtypes.html#set-types-set-frozenset) is a data structure that represents an *unordered* collection of unique elements, just like a mathematical set.

In [None]:
s = {1, 2, 1, 1, 2, 3, 3, 1}

In [None]:
s

In [None]:
t = {2, 3, 4, 5}

In [None]:
s.union(t)

In [None]:
s | t

In [None]:
s.difference(t)

In [None]:
s - t

In [None]:
s.intersection(t)

In [None]:
s & t
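Beyond union, difference, and intersection, sets are often used to remove duplicates and to test membership. The sketch below uses made-up claim numbers in `claim_ids` for illustration.

```python
# Removing duplicates from a list
claim_ids = [101, 102, 101, 103, 102, 101]
unique_ids = set(claim_ids)
print(len(claim_ids), len(unique_ids))

# Membership test and adding elements
contains_102 = 102 in unique_ids
unique_ids.add(104)
unique_ids.add(101)              # already present, so nothing changes
print(sorted(unique_ids))        # sorted() returns a list in a defined order

# Symmetric difference and subset test
s = {1, 2, 3}
t = {2, 3, 4, 5}
sym_diff = s ^ t                 # elements in exactly one of the two sets
is_subset = {2, 3} <= s
print(sym_diff, is_subset)

# Note: {} creates an empty dict, an empty set is written set()
empty_set = set()
print(type({}), type(empty_set))
```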

## Dicts

A [dictionary](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict) is a data structure that consists of key-value pairs.

In [None]:
d = {
    'Superman':  'Clark Kent',
    'Batman':    'Bruce Wayne',
    'Spiderman': 'Peter Parker',
    'Wonder Woman' : 'Diana Prince',
    'Ironman':   'Tony Stark'
}

In [None]:
d['Ironman']

In [None]:
d['Ironman'] = 'James Rhodes'

In [None]:
d

In [None]:
del d['Ironman']
d

In [None]:
for k in d:
    print(f'{k} => {d[k]}') 

In [None]:
for k in d.keys():
    print(f'{k} => {d[k]}')

In [None]:
for v in d.values():
    print(v)

In [None]:
for k,v in d.items():
    print(f'{k} => {v}')
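Accessing a key that does not exist raises a `KeyError`. A minimal sketch of the usual ways to deal with this follows; it uses a separate, illustrative dictionary `heroes` so that `d` stays as it is.

```python
heroes = {'Superman': 'Clark Kent', 'Batman': 'Bruce Wayne'}

# Membership tests look at the keys
has_batman = 'Batman' in heroes
has_ironman = 'Ironman' in heroes

# get returns None, or a default of our choice, instead of raising an error
missing = heroes.get('Ironman')
fallback = heroes.get('Ironman', 'unknown')

# Direct indexing with a missing key raises a KeyError
try:
    heroes['Ironman']
    raised = False
except KeyError as err:
    raised = True
    print('missing key:', err)

print(has_batman, has_ironman, missing, fallback)
```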

### Dictionary Comprehensions

In [None]:
{e:2**e for e in range(0,100,10)}

In [None]:
{x:y for x in range(3) for y in range(10)}
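Keys in a dictionary are unique, so in the last example each key is assigned ten times and only the final value survives. The sketch below makes this explicit and shows two ways of keeping every combination. The ages and rates are invented illustrative numbers, and `zip` is a built-in function that pairs up the elements of two sequences.

```python
# Each key 0, 1, 2 is overwritten for every y; the last value y = 9 remains
overwritten = {x: y for x in range(3) for y in range(10)}
print(overwritten)

# Using a tuple as key keeps every (x, y) combination
products = {(x, y): x * y for x in range(2) for y in range(3)}
print(products)

# Pairing two lists element by element (illustrative values)
ages = [30, 40, 50]
rates = [0.001, 0.002, 0.005]
rate_by_age = {age: rate for age, rate in zip(ages, rates)}
print(rate_by_age)
```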

# Tasks
Please complete the following tasks.

## Task 1 (Palindrome Check)
Write a program that checks whether a given string is a palindrome. A palindrome is a word or a sentence that reads the same forwards and backwards (spaces and case are ignored).

Examples of palindromes:
- Madam I m Adam
- A man a plan a canal panama
- Was it a car or a cat I saw

In [None]:
# Initialization
text = "Was it a car or a cat I saw"

# Palindrome check
is_palindrome = False
pass

print(f"'{text}' is a palindrome: {is_palindrome}")

## Task 2 (Square Numbers in a Range)

Write a program that calculates the square numbers in a given range and outputs them as a list. Use a list comprehension for this purpose.

In [None]:
# Initialization
start = 1
end = 10

# Calculate square numbers
squares = None  # add a suitable list comprehension here

print(squares)


## Task 3 (Reading Hegel)
In this task, we will read the [Phenomenology of Spirit](https://en.wikipedia.org/wiki/Phenomenology_of_Spirit) by Georg Wilhelm Friedrich Hegel. Fortunately, "reading" does not always mean "understanding." 
Your task is to find out:
- How many words did Hegel write?
- How many unique words did Hegel write?
- What is Hegel's most frequently used word?

In [None]:
# Run this cell to download the text of Hegel's "Phenomenology of Spirit"
import urllib.request

hegel_url = 'https://www.gutenberg.org/cache/epub/6698/pg6698.txt'
pdg_text = urllib.request.urlopen(hegel_url).read().decode()

Take a look at the text. For example, here is the first description of the Hegelian synthesis:

In [None]:
print(pdg_text[7837:8449])

How do you get from the individual characters in `pdg_text` to the words of the Phenomenology of Spirit? (*Hint:* the string method `split` might help.)

In [None]:
pdg_words = None # complete appropriately here
len(pdg_words)

And now, what about the unique words?

In [None]:
pdg_words_unique = None
len(pdg_words_unique)

Now to the frequency of the individual words: how often does each word occur? (*Hint:* use a `dict`.)

In [None]:
pdg_word_count = {}

for word in pdg_words:
    pass 


In [None]:
print(pdg_word_count)

In [None]:
(pdg_word_count["sein"], pdg_word_count["dasein"])

Now you just need to sort the dictionary by word count. That's a bit tricky, because we sort the key-value pairs rather than the dictionary itself:

In [None]:
pdg_word_count_sorted = sorted(pdg_word_count.items(), key=lambda t: t[1], reverse=True)
pdg_word_count_sorted[:10]
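How the sorting step works can be seen on a small hand-made dictionary. This is a sketch with invented counts in `toy_counts`, not the actual word frequencies of the text.

```python
# Invented counts for illustration only
toy_counts = {'der': 5, 'geist': 2, 'und': 7}

# items() yields (word, count) tuples
pairs = list(toy_counts.items())
print(pairs)

# key=lambda t: t[1] tells sorted to compare the tuples by their second
# element (the count); reverse=True puts the largest count first
by_count = sorted(toy_counts.items(), key=lambda t: t[1], reverse=True)
print(by_count)

# The first tuple holds the most frequent word and its count
top_word, top_count = by_count[0]
print(top_word, top_count)
```

The result of `sorted` is a list of tuples, which is why slicing with `[:10]` returns the ten most frequent entries.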