# 02. More on Data Types

In the previous notebook we went over numbers, text, booleans, dictionaries and lists. Here we are going to expand on other ways of representing information.

Let's start with **tuples**. Tuples are like lists in that they store items in order and each item can be accessed by its position in the tuple. But there is one very important difference from lists: tuples are immutable (we'll expand on this in a later notebook), so once a tuple is created its contents cannot be changed.

In [1]:
my_tuple = (1, 'a', True)

print(my_tuple[0])
print(my_tuple[2])

1
True
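
To make the immutability point concrete, here is a small sketch of what happens when we try to change an item of a tuple. The `failed` flag is only there for illustration; the important part is that Python raises a `TypeError` on the assignment.

```python
my_tuple = (1, 'a', True)

try:
    my_tuple[0] = 2  # tuples do not support item assignment
except TypeError as error:
    print('Cannot modify a tuple:', error)
    failed = True
else:
    failed = False

# The tuple still holds its original contents
print(my_tuple)
```

The same assignment on a list would succeed, which is the practical difference between the two.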


Another useful data structure is the **set**. A set is like a dictionary with only keys and no values. Sets are useful for storing items without repetition, and for checking whether we have already stored an item. On the other hand, ordering is not guaranteed.

In [2]:
set_a = {1, 3, 5, 7, 9}
set_b = {2, 4, 6, 8, 10}
set_c = {1, 2, 3, 5, 8}

print(1 in set_a)     # is 1 in set_a
print(5 in set_b)     # is 5 in set_b
print(set_a & set_b)  # intersection of set_a and set_b
print(set_a & set_c)  # intersection of set_a and set_c
print(set_a | set_b)  # union of set_a and set_b

True
False
set()
{1, 3, 5}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
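
As a rough illustration of "storing without repeating", the sketch below builds a set from a list that contains duplicates. The names in `visitors` are made-up data for this example.

```python
# A list with repeated names (made-up data for illustration)
visitors = ['ana', 'bob', 'ana', 'carl', 'bob']

# Converting to a set keeps only one copy of each name
unique_visitors = set(visitors)

print(len(visitors))             # number of items in the list
print(len(unique_visitors))      # number of distinct names
print('ana' in unique_visitors)  # membership check
print('dan' in unique_visitors)
```

We print the sizes rather than the set itself because the order in which a set displays its items is not guaranteed.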


## Custom Data Types

Until now we have been using the default data structures offered to us by Python, the language. But sometimes we want more control or just personalized structures. Python allows us to do this with classes.

Imagine we want to store a date (year, month, day). With the data structures we have already seen we could do it with a tuple or a dictionary:

In [3]:
my_date_dict = {
    'year': 1920,
    'month': 5,
    'day': 13
}

my_date_tuple = (1920, 5, 13)

That looks nice! What about accessing the information stored in our date?

In [4]:
print(my_date_dict['year'])

print(my_date_tuple[0])

1920
1920


Well, not so nice.

With our dictionary representation we need to add all the `[]` and the `''`, and on top of that we need to remember the exact name of the key: a misspelled key raises a `KeyError` instead of giving us a value.

With our tuple representation it is even worse! Which position is the year at? The first one? The last one?
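
A short sketch of both problems, assuming a deliberately misspelled key (`'yaer'`) and a wrong guess about the tuple position:

```python
my_date_dict = {'year': 1920, 'month': 5, 'day': 13}

try:
    print(my_date_dict['yaer'])  # misspelled key
except KeyError as error:
    print('Unknown key:', error)

# dict.get returns None instead of raising when the key is missing
print(my_date_dict.get('yaer'))

my_date_tuple = (1920, 5, 13)
# A valid index, so no error, but this is the day and not the year
print(my_date_tuple[2])
```

The tuple case is the more dangerous one: nothing fails, we just silently read the wrong field.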

Can we do better? Yes, we can define our own data classes:

In [5]:
from dataclasses import dataclass

@dataclass
class Date:
    year: int
    month: int
    day: int

my_date = Date(1920, 5, 13)

print(my_date)

Date(year=1920, month=5, day=13)


Wow! That is a lot of stuff in one go! So, what's happening here?

First, we import the `dataclass` decorator. This decorator does a lot of things for us: it generates the basic functionality of our data class, such as the constructor and the readable printed form seen above. For now it is fine as it is, we'll go more in-depth with classes in a later notebook.

Second, we define our data class `Date`. It has three fields: `year`, `month` and `day`, each of which is an `int`.

Then, we create a new instance of `Date`, named `my_date`, that represents the 13th of May 1920.

Cool! How can we access the information in our data class? Very simple:

In [6]:
print(my_date.year)

1920
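
A few more things the decorator gives us, shown as a minimal sketch. The `Date` class is repeated so the cell runs on its own; the variable names `a` and `b` are only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Date:
    year: int
    month: int
    day: int

# Fields can be passed by name, which makes the call easier to read
a = Date(year=1920, month=5, day=13)
b = Date(1920, 5, 13)

print(a == b)  # data classes are compared field by field

a.day = 14     # unlike tuples, fields can be reassigned by default
print(a)
print(a == b)
```

Note that the type annotations are not checked at runtime, so it is still up to us to pass sensible values.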


## Composing Data Structures

Up until now we have created custom data structures whose member values are of types provided by default by the language, Python.

We can also create custom data structures that contain our other custom data types:

In [7]:
@dataclass
class DateInterval:
    start: Date  # The Date class we declared earlier
    end: Date
        
date_interval = DateInterval(Date(1920, 4, 21), Date(1924, 10, 12))
print(date_interval)

DateInterval(start=Date(year=1920, month=4, day=21), end=Date(year=1924, month=10, day=12))
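
Accessing the nested information works by chaining the dots. The sketch below repeats both classes so it runs on its own; `years_spanned` is a hypothetical name used only for this example.

```python
from dataclasses import dataclass

@dataclass
class Date:
    year: int
    month: int
    day: int

@dataclass
class DateInterval:
    start: Date
    end: Date

date_interval = DateInterval(Date(1920, 4, 21), Date(1924, 10, 12))

# Reach into the inner Date through the outer DateInterval
print(date_interval.start.year)
print(date_interval.end.month)

# Difference between the two year fields (ignores months and days)
years_spanned = date_interval.end.year - date_interval.start.year
print(years_spanned)
```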


## Recursive Data Structures

A recursive data structure is one that can contain an instance of the same type as itself:

In [8]:
from typing import Any, Optional

@dataclass
class LinkedList:
    value: Any  # value can contain any data type
    # next is either another LinkedList or None (the end of the list).
    # The class name is quoted because LinkedList is still being declared here.
    next: Optional['LinkedList']
        
# Creating and populating the recursive data structure
l = LinkedList('a', None)
l.next = LinkedList('b', None)
l.next.next = LinkedList('c', None)

# print all values contained in the recursive data structure
k = l
while k is not None:
    print(k.value)
    k = k.next

a
b
c
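
Chaining `.next.next` by hand gets tedious quickly. As a sketch, the two helper functions below (`from_values` and `to_list`, both hypothetical names introduced here) build a linked list from any sequence and collect its values back into a regular list. This version also assumes a default of `None` for `next`, which is a small change from the class above.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LinkedList:
    value: Any
    next: Optional['LinkedList'] = None  # None marks the end of the list

def from_values(values):
    """Build a linked list from a sequence; returns None if it is empty."""
    head = None
    # Walk backwards so each new node can point at the one built before it
    for value in reversed(list(values)):
        head = LinkedList(value, head)
    return head

def to_list(node):
    """Collect the values of a linked list into a regular Python list."""
    items = []
    while node is not None:
        items.append(node.value)
        node = node.next
    return items

letters = from_values(['a', 'b', 'c'])
print(to_list(letters))
```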


## Exercises

**Exercise 1:** Create a tuple containing the first five prime numbers and print the third one.

In [None]:
prime_numbers = ...

print(...)

**Exercise 2:** Create a set with three names in it and print the result of checking whether `John` is in it.

In [None]:
names = ...

print(...)

**Exercise 3:** Create a data class that represents a `Person`: name, surname, birth date, height, weight.

In [None]:
class Person:
    ...