<figure>
  <IMG SRC="https://raw.githubusercontent.com/mbakker7/exploratory_computing_with_python/master/tudelft_logo.png" WIDTH=250 ALIGN="right">
</figure>

# 3.1 Data Structures


In this Section you will tackle a data management problem! In the first module you learned how to create variables, which is cool. But when you create a lot of variables, or you want to store and access them within one entity, you need a data structure.

There are plenty of them, and they differ in their use cases and complexity. Today we will tackle some of the standard Python built-in data structures. The most popular of those are: <b><code>list</code></b>, <b><code>dict</code></b> and <b><code>tuple</code></b>.

## <code>list</code>

First, the easiest and most popular data structure in Python: <b><code>list</code></b> (which is similar to the arrays you may have seen in other programming languages).

You can create a list in the following ways:

In [None]:
# 1). Creating an empty list, option 1

empty_list1 = []
print('Type of empty_list1 object', type(empty_list1))
print('Contents of empty_list1', empty_list1)
print('--------------------')

# 2). Creating an empty list, option 2 - using the class constructor

empty_list2 = list()
print('Type of empty_list2 object', type(empty_list2))
print('Contents of empty_list2', empty_list2)
print('--------------------')

# 3). Creating a list from existing data - option 1

my_var1 = 5
my_var2 = "hello"
my_var3 = 37.5

my_list = [my_var1, my_var2, my_var3]
print('Type of my_list object', type(my_list))
print('Contents of my_list', my_list)
print('--------------------')

# 4). Creating a list from existing data - option 2

cool_rock = "sandstone" # remember that a string is a collection of characters

list_with_letters = list(cool_rock)

print('Type of list_with_letters object', type(list_with_letters))
print('Contents of list_with_letters', list_with_letters)
print('--------------------')


As you can see, in all four cases we created a list; only the way we created it was slightly different:

- the first method (option 1) uses the bracket notation,
- the second method (option 2) uses the class constructor.

Both methods also apply to the other data structures.
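
As a minimal sketch of that last point, the snippet below builds a list, a tuple, and a dictionary in both ways. The variable names are made up for illustration; tuples and dictionaries are introduced later in this Section.

```python
# Bracket (literal) notation versus the class constructor.
# All variable names here are illustrative only.

list_literal = [1, 2, 3]
list_constructed = list((1, 2, 3))      # the constructor accepts any iterable

tuple_literal = (1, 2, 3)
tuple_constructed = tuple([1, 2, 3])

dict_literal = {'a': 1, 'b': 2}
dict_constructed = dict(a=1, b=2)       # keyword arguments become string keys

# Both approaches give equal objects
print(list_literal == list_constructed)
print(tuple_literal == tuple_constructed)
print(dict_literal == dict_constructed)
```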

Now, we have a list — what can we do with it?

Well... we can access and modify any element of an existing list. To access a list element, put the index of the element you want inside square brackets <b><code>[]</code></b>. Sounds easy, but keep in mind that Python uses zero-based indexing (as mentioned in Section 1.4 in Notebook 1).

As a reminder, zero-based indexing means that the first element has index 0 (not 1), the second element has index 1 (not 2) and the n-th element has index n - 1 (not n)!
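
A short sketch of what this means in practice, using an illustrative list that is not part of the other examples: the last valid index is <code>len(...) - 1</code>, and asking for one index further raises an <code>IndexError</code>.

```python
# Illustrative list; the names are assumptions made for this sketch
rocks = ['sandstone', 'shale', 'granite']
n = len(rocks)

print('First element:', rocks[0])
print('Last element:', rocks[n - 1])

# Index n is one past the end, so Python raises an IndexError
try:
    rocks[n]
except IndexError as error:
    print('IndexError:', error)
```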

In [None]:
# len() function returns the length of a collection (string, list, tuple, etc.)
print(len(my_list))

# We have 3 elements, thus we can access 0th, 1st, and 2nd elements
print('First element of my list:', my_list[0])

print('Last element of my list:', my_list[2])

# After the element is accessed, it can be used as any variable,
# the list only provides a convenient storage

summation = my_list[0] + my_list[2]
print(f'Sum of {my_list[0]} and {my_list[2]} is {summation}')

# Since it is a storage - we can easily alter and swap list elements
my_list[0] += 7
my_list[1] = "My new element"

print(my_list)

# However, we can only access data we have - Python raises an IndexError for the following

my_list[10] = 199

We can also add new elements to a list, or remove them! Adding is done with the <b><code>append</code></b> method, and removing an element uses the <b><code>del</code></b> keyword.

In [None]:
# adding a new element to the end of the list
my_list.append("new addition to my variable collection!")
print(my_list)

# we can also store a list inside a list - list inception! Useful for matrices, images etc
my_list.append(['another list', False, 1 + 2j])
print(my_list)

# Let's remove 37.5

del my_list[2]

print(my_list)

Lists also have other useful functionalities, as you can see from the <a href="https://docs.python.org/3/tutorial/datastructures.html">official documentation</a>. Since lists are objects, you can also apply operators such as <code>+</code> and <code>*</code> to them.

In [None]:
lst1 = [2, 4, False]
lst2 = ['second list', 0, 222]

#what will happen?
lst1 = lst1 + lst2
print(lst1)

lst2 = lst2 * 4
print(lst2)

lst2[3] = 5050
print(lst2)

As you can see, adding lists together concatenates them, and multiplying a list by an integer repeats it (the list is concatenated with itself that many times, just like repeated addition in math).
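
One detail worth knowing, sketched below with made-up variable names: <code>+</code> and <code>*</code> build a new list and leave the originals untouched. Multiplication also repeats references rather than copying the items, which matters once a list contains other lists.

```python
a = [1, 2]
b = [3]

c = a + b          # a new list; a and b are unchanged
print(a, b, c)

zeros = [0] * 4    # a common way to pre-fill a list
print(zeros)

# Multiplying a nested list repeats references to the SAME inner list
grid = [[0, 0]] * 3
grid[0][0] = 9
print(grid)        # every row shows the change

# Writing the rows out separately gives independent inner lists
safe_grid = [[0, 0], [0, 0], [0, 0]]
safe_grid[0][0] = 9
print(safe_grid)   # only the first row changes
```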

Additionally, you can also use the <b><code>in</code></b> keyword to check the presence of a value inside a list.

In [None]:
print(lst1)

if 222 in lst1:
    print('We found 222 inside lst1')
else:
    print('Nope, nothing there....')

## <code>tuple</code>

If you understood how <code>list</code> works, then you already understand 95% of <b><code>tuple</code></b>. Tuples are just like lists, with some small differences.

1. To create a tuple you use parentheses <b><code>()</code></b>, commas, or the <b><code>tuple</code></b> class constructor.
2. You can change the content of a list; <b>tuples</b>, however, <b>are immutable</b> (just like strings).
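
A small sketch of the first point, with illustrative names: it is the comma, not the parentheses, that makes a tuple. This matters most when a tuple has only one element.

```python
not_a_tuple = (5)              # just the integer 5 wrapped in parentheses
one_element = (5,)             # the trailing comma makes it a tuple
from_constructor = tuple([5])  # the constructor accepts any iterable

print(type(not_a_tuple))
print(type(one_element))
print(one_element == from_constructor)
```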



In [None]:
# Creating an empty tuple - a bit useless, since you cannot change it

tupl1 = tuple() # option 1 with class constructor
print('Type of tupl1', type(tupl1))
print('Content of tupl1', tupl1)

tupl2 = () # option 2 with ()
print(type(tupl2), tupl2)

In [None]:
# Creating a non-empty tuple using brackets

my_var1 = 26.5
my_var2 = 'Oil'
my_var3 = False

my_tuple = (my_var1, my_var2, my_var3, 'some additional stuff', 777)
print('my tuple', my_tuple)

# Creating a non-empty tuple using comma

comma_tuple = 2, 'hi!', 228
print('A comma made tuple', comma_tuple)

# now, let's try to access an element
print('4th element of my_tuple:', my_tuple[3])

# but, can we change it?
my_tuple[3] = 'will I change?'

Since tuples are immutable, they have no <b><code>append()</code></b> method, nor any other method that alters them.

You might think that the tuple is a useless class. However, there are some reasons for it to exist:

1. Storing constants and objects which shouldn't be changed.
2. Saving memory (a tuple uses less memory than a list to store the same data).



In [None]:
#creating a list and a tuple from the same data

my_name = 'Vasyan'
my_age = 27
is_student = True

a = (my_name, my_age, is_student)
b = [my_name, my_age, is_student]

print('size of a =', a.__sizeof__(), 'bytes') # .__sizeof__() returns the size of the object itself in bytes
print('size of b =', b.__sizeof__(), 'bytes')

## <code>dictionary</code>

After seeing lists and tuples, you may think:

"Wow, storing all my variables within another variable is cool and gnarly! But... sometimes it's boring & inconvenient to access my data by using its position within a tuple/list. Is there a way that I can store my object within a data structure but access it via something meaningful, like a keyword...?"

Don't worry if you had this exact same thought. Python had it as well!

Dictionaries are suited especially for that purpose: you give each element you want to store a nickname (i.e., a key) and use that key to access the value you want.

In [None]:
# Creating an empty dictionary - we used () for tuples and [] for lists.
# Now, it's time to use {}.

empty_dict1 = {}
print('Type of empty_dict1', type(empty_dict1))
print('Content of it ->', empty_dict1)

# Creating an empty dictionary - using class constructor
empty_dict2 = dict()
print('Type of empty_dict2', type(empty_dict2))
print('Content of it ->', empty_dict2)

In [None]:
# Creating a non-empty dictionary - specifying pairs of key:value pattern
my_dict = {
    'name': 'Jarno',
    'color': 'red',
    'year': 2007,
    'is cool': True,
    6: 'it works',
    (2, 22): 'that is a strange key'
}

print('Content of my_dict>>>', my_dict)

In the last example, you can see that only strings, numbers, or tuples were used as keys. Dictionaries only accept hashable objects as keys, which in practice means immutable data such as strings, numbers, and tuples of immutable items:

In [None]:
# using mutable structures won't work: 

mutable_key_dict = {
    5: 'lets try',
    True: 'I hope it will run perfectly',
    6.78: 'heh',
    ['No problemo', 'right?']: False  
}

print(mutable_key_dict)
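
The error in the cell above comes from the list used as a key. A minimal sketch of one possible workaround, assuming the items of the list are themselves immutable, is to convert the list to a tuple first. The variable names below are illustrative.

```python
list_key = ['No problemo', 'right?']

# A list is mutable and therefore unhashable, so it is rejected as a key
try:
    bad_dict = {list_key: False}
except TypeError as error:
    print('TypeError:', error)

# A tuple holding the same items is hashable and works as a key
good_dict = {tuple(list_key): False}
print(good_dict)
print(good_dict[('No problemo', 'right?')])
```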

Alright, now it is time to access the data we have managed to store inside <b><code>my_dict</code></b>...

In [None]:
# for that we use keys!

print('Some random content of my_dict:')
print(my_dict['name'])
print(my_dict[(2, 22)])

In [None]:
# remember the mutable key dict? Let's make it work by omitting the list item
mutable_key_dict = {
    5: 'lets try',
    True: 'I hope it will run perfectly',
    6.78: 'heh'
}

# You can see that it doesn't give any errors but how do we access the data inside it?
# use keys!
print('Accessing weird dictionary...')
print(mutable_key_dict[True])
print(mutable_key_dict[5])
print(mutable_key_dict[6.78])

In [None]:
# Trying to access something we have and something we don't have
print('My favorite year is', my_dict['year'])
print('My favorite song is', my_dict['song'])

```{admonition} Attention
:class: danger
It is best practice to use mainly <i><b>strings</b></i> as keys; the other options are less common and can make your code harder to read.
+++
```
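
Looking up a key that does not exist, such as <code>'song'</code> in the earlier cell, raises a <code>KeyError</code>. Below is a hedged sketch of two common ways around that, using a small illustrative dictionary rather than <code>my_dict</code>.

```python
car = {'name': 'Jarno', 'year': 2007}   # illustrative data

# Option 1: check for the key before accessing it
if 'song' in car:
    print(car['song'])
else:
    print('No song stored')

# Option 2: dict.get() returns a default value instead of raising KeyError
song = car.get('song', 'unknown')
year = car.get('year', 'unknown')
print(song, year)
```

Without a second argument, <code>.get()</code> returns <code>None</code> for a missing key.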

What's next? Dictionaries are mutable, so let's go ahead and add some new data and delete an old entry.

In [None]:
print('my_dict right now', my_dict)

my_dict['new_element'] = 'magenta'
my_dict['weight'] = 27.8
del my_dict['year']

print('my_dict after some operations', my_dict)

You can also print all keys present in the dictionary using the <b><code>.keys()</code></b> method, or check whether a certain key exists in a dictionary, as shown below. More operations can be found <a href="https://docs.python.org/3/tutorial/datastructures.html">here</a>.
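
Besides <code>.keys()</code>, dictionaries also provide <code>.values()</code> and <code>.items()</code>. The sketch below uses an illustrative dictionary (not <code>my_dict</code>) and a <code>for</code> loop to walk over the key-value pairs.

```python
sample = {'rock': 'sandstone', 'porosity': 0.21, 'cored': True}

print(list(sample.keys()))
print(list(sample.values()))

# .items() yields (key, value) tuples, which can be unpacked in a for loop
for key, value in sample.items():
    print(key, '->', value)

print('porosity' in sample)   # membership tests look at the keys
print(0.21 in sample)         # ... not at the values
```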

In [None]:
print(my_dict.keys())

# check if my_dict has a name key
print("\nmy_dict has a ['name'] key:", 'name' in my_dict)

```{admonition} Real life example: Analyzing satellite metadata
:class: important
Metadata is a set of data that describes and gives information about other data. For Sentinel-1, the metadata of the satellite is acquired as an <i>.xml</i> file. Dictionaries commonly play an important role in classifying this metadata. One could write a function that reads the important information from this metadata and stores it in a dictionary. Some examples of keys for the metadata of Sentinel-1 are:
    
<i>dict_keys(['azimuthSteeringRate', 'dataDcPolynomial', 'dcAzimuthtime', 'dcT0', 'rangePixelSpacing', 'azimuthPixelSpacing', 'azimuthFmRatePolynomial', 'azimuthFmRateTime', 'azimuthFmRateT0', 'radarFrequency', 'velocity', 'velocityTime', 'linesPerBurst', 'azimuthTimeInterval', 'rangeSamplingRate', 'slantRangeTime', 'samplesPerBurst', 'no_burst'])</i>
+++
```
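
To make the idea concrete, here is a minimal sketch of such a function. The XML snippet, its tag layout, and the numeric values are invented for illustration and do not reproduce the real Sentinel-1 annotation format. The function name <code>read_metadata</code> is also hypothetical. Only the standard-library <code>xml.etree.ElementTree</code> module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified metadata; real files are structured differently
xml_text = """
<metadata>
    <radarFrequency>5.4e9</radarFrequency>
    <linesPerBurst>1500</linesPerBurst>
    <rangePixelSpacing>2.3</rangePixelSpacing>
</metadata>
"""

def read_metadata(text):
    """Store every child element of the root as a key: value pair."""
    root = ET.fromstring(text.strip())
    result = {}
    for child in root:
        # this sketch assumes every value is numeric
        result[child.tag] = float(child.text)
    return result

meta = read_metadata(xml_text)
print(meta.keys())
print('Radar frequency:', meta['radarFrequency'])
```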

## Slices

The last important topic of this Notebook is slicing. Similar to how you can slice a <b>string</b> (shown in Section 1.4, in Notebook 1), this technique allows you to select a subset of data from a list or tuple.

In [None]:
# let's make a simple list

x = [1, 2, 3, 4, 5, 6, 7]
n = len(x) # len(x) gives the length of x

# in order to select a slice of it, you have to use a : symbol,
# like this => my_list[start:end], 
# it will select all elements with indices [start, end), 
# starting from start and ending at end-1 (excluding the end).
print('The first three elements of x:', x[0:3])

# getting the same subset but using a different slice "equation"

# if we start from the beginning (at index 0) - we can omit the 0
print(x[:3])

# instead of counting elements from the beginning, we can count from end
print('The last element is', x[6], 'or', x[n - 1], 'or', x[-1])

# thus, we can apply it in slicing our list
print(x[0:-4])

# we can also specify a third argument: the step size of our selection
print(x[0:3:1])

Thus, the general slicing call is given by <b><code>iterable[start:end:step]</code></b>.
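
The same call works on tuples and strings as well. A brief sketch with illustrative names follows; it also shows that a slice gives back a new object and that an end index beyond the length is simply cut off.

```python
letters = ('a', 'b', 'c', 'd', 'e', 'f')   # illustrative tuple
word = 'sandstone'

print(letters[1:5:2])    # items at indices 1 and 3
print(word[0:4])         # strings slice the same way
print(letters[2:100])    # an end beyond the length is simply cut off

# A slice is a new object; the original stays unchanged
first_three = letters[:3]
print(first_three, letters)
```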

Here's another example:

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# similar to omitting the 0 if you start at the beginning of the list,
# you can omit the end value if you want all values until the end
print('Selecting all even numbers', numbers[::2])
# in the above case you start at index 0 (omitted), run until the end of the list (omitted)
# with a step of 2 (not omitted)

print('All odd numbers', numbers[1::2])

# Reversing the list
print('Normal order', numbers)
print('Reversed order', numbers[::-1])

# Selecting middle subset
print('Numbers from 5 to 8:', numbers[5:9])

```{admonition} Additional study material:
:class: tip

* Official Python Documentation - https://docs.python.org/3/tutorial/datastructures.html
* Think Python (2nd ed.) - Chapters 8, 10, 11, 12
+++
```