# Data structures

Data structures are the containers in which data are stored and organized in Python.  There are [many, many](https://realpython.com/python-data-structures/) data structures, but we will cover the two most commonly used: lists and dictionaries.

Note that you will also make frequent use of [arrays](https://numpy.org/doc/stable/reference/generated/numpy.array.html) in the AI/ML workshop.  Core Python has no built-in multidimensional array type; the arrays you will be using come from the library [numpy](https://numpy.org).

## Lists

You have already seen a couple of lists.  They're used to store multiple values in a single structure and are contained within square brackets `[ ]`.

Lists are:
* mutable
* iterable
* ordered
* able to contain multiple data types (arbitrary objects)
* nestable

In [1]:
a_list = ['this', 'is', 'a', 'list']
print(type(a_list))
this = 7
b_list = ['so', 'is', this]
print(type(b_list))
print(b_list)

<class 'list'>
<class 'list'>
['so', 'is', 7]


In [2]:
# empty lists are fine, and are often useful in scripts
empty_list = []
print(type(empty_list))

<class 'list'>


In [3]:
# a list within a list!
letters = ['a', 'b', ['c', 'd'], 'e']
print(letters)

['a', 'b', ['c', 'd'], 'e']
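
To reach an item inside a nested list, you can chain indexes: the first index selects the inner list and the second selects the item within it. Here is a minimal sketch that reuses the `letters` list from the cell above; the names `inner` and `first_inner` are just illustrative.

```python
letters = ['a', 'b', ['c', 'd'], 'e']

inner = letters[2]           # the nested list ['c', 'd']
first_inner = letters[2][0]  # first item of the nested list

print(inner)
print(first_inner)
print(len(letters))          # the nested list counts as a single item, so this is 4
```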


We can use functions on lists.

In [4]:
print(len(a_list))
print(len(a_list) == len(b_list))

4
False
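
`len()` is not the only built-in function that accepts a list. The sketch below tries a few others on a small list of numbers; the list `scores` is a made-up example and does not appear elsewhere in this lesson.

```python
scores = [3, 1, 2]  # hypothetical example data

print(sum(scores))     # 6
print(min(scores))     # 1
print(max(scores))     # 3

# sorted() returns a new list and leaves the original unchanged
ordered = sorted(scores)
print(ordered)         # [1, 2, 3]
print(scores)          # [3, 1, 2]
```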


Like strings, they can be indexed and sliced.

In [5]:
print(a_list)
print(a_list[0])
print(a_list[1] == b_list[1])

['this', 'is', 'a', 'list']
this
True


In [7]:
numbers = [6, 9, 18, 243, 7]
print(numbers[1:4])

[9, 18, 243]
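
Slices do not need both a start and a stop, and indexes can count from the end of the list. A short sketch using the same `numbers` list; the variable names on the left are only for illustration.

```python
numbers = [6, 9, 18, 243, 7]

last = numbers[-1]         # negative indexes count from the end
first_two = numbers[:2]    # omit the start to slice from the beginning
from_third = numbers[2:]   # omit the stop to slice through the end
every_other = numbers[::2] # a third value sets the step

print(last)         # 7
print(first_two)    # [6, 9]
print(from_third)   # [18, 243, 7]
print(every_other)  # [6, 18, 7]
```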


I said earlier that lists are **mutable**.  That means that we can add to and change the values inside them.  Let's check out a few ways to do that.

In [14]:
# replace based on position
fives = [5, 10, 15, 20, 30] # oops, I meant 25!
fives[4] = 25
print(fives) 

[5, 10, 15, 20, 25]


In [15]:
# add to the end of a list with append()
fives.append(30)
print(fives)

[5, 10, 15, 20, 25, 30]
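
Mutability also covers adding several items at once and removing items. The sketch below uses a separate, made-up list called `tens` so that it does not change `fives`.

```python
tens = [10, 20]  # hypothetical example list

# extend() adds every item from another list to the end
tens.extend([30, 40])
print(tens)      # [10, 20, 30, 40]

# pop() removes the last item and returns it
removed = tens.pop()
print(removed)   # 40

# remove() deletes the first item equal to the given value
tens.remove(20)
print(tens)      # [10, 30]
```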


In [19]:
# add to the beginning or middle of a list with insert()
fives.insert(0, 0) # insert(position, value): the position comes first
print(fives)

[0, 5, 10, 15, 20, 25, 30]


Indexing beyond the end of a list will result in an error.

In [20]:
fives[77]

IndexError: list index out of range

As a bit of a side note, you can't modify strings in place the way we just modified lists, because they are **immutable**.

In [21]:
a_str = 'string'
a_str.append('s')

AttributeError: 'str' object has no attribute 'append'
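
Assigning to a position in a string fails as well, this time with a `TypeError`. To "change" a string you build a new one instead. A minimal sketch; `b_str` is an illustrative name.

```python
a_str = 'string'

try:
    a_str[0] = 'S'   # strings do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# Concatenation creates a new string and leaves the original untouched
b_str = a_str + 's'
print(a_str)   # string
print(b_str)   # strings
```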

## Dictionaries

Python dictionaries store an arbitrary number of objects that are each attached to a unique identifier called a **key**.  In other languages, similar data structures are often called hashes, hash tables, maps, or associative arrays.

Dictionaries are surrounded by curly brackets `{ }` and contain comma-separated `key:value` pairs.

Dictionaries are:
* mutable
* iterable
* ordered (by insertion, as of Python 3.7)
* able to contain multiple data types (arbitrary objects)
* nestable

In [22]:
names_dict = {'last name':'first name', 'Harrison':'Amelia', 'Richards':'Vanessa'}
print(type(names_dict))

<class 'dict'>


Instead of retrieving values with positional indexing (like we did with lists and strings), we use keys to retrieve values.

In [23]:
# get the first names
print(names_dict['last name'])
print(names_dict['Harrison'])
print(names_dict['Richards'])

first name
Amelia
Vanessa


In [25]:
print(names_dict['Amelia'])

KeyError: 'Amelia'

In the above example, we can easily retrieve values using keys, but not keys using values.  That's not to say it's impossible to retrieve dictionary keys using the values, because [it can be done](https://www.geeksforgeeks.org/python-get-key-from-value-in-dictionary/), but you should avoid it when possible.
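
For the curious, one possible way to look up keys by value is to loop over the `key:value` pairs and keep the keys whose value matches. This is only a sketch of one approach, not necessarily the one in the linked article, and it uses a shortened copy of the dictionary. Because values do not have to be unique, the result is a list.

```python
names_dict = {'Harrison': 'Amelia', 'Richards': 'Vanessa'}

# Collect every key (last name) whose value (first name) is 'Amelia'
matches = [last for last, first in names_dict.items() if first == 'Amelia']
print(matches)   # ['Harrison']
```

This has to scan the whole dictionary, which is part of why looking up by key is preferred.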

Python dictionaries preserve insertion order as of version 3.7, but you still cannot retrieve values using positions.

In [31]:
names_dict[0]

KeyError: 0
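
If you really do need an entry by position, one workaround is to convert the keys to a list first and index into that. A small sketch, assuming the dictionary as it was originally defined; `keys` is an illustrative name.

```python
names_dict = {'last name': 'first name', 'Harrison': 'Amelia', 'Richards': 'Vanessa'}

# list() on a dictionary gives its keys in insertion order
keys = list(names_dict)
print(keys[1])               # Harrison
print(names_dict[keys[1]])   # Amelia
```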

I said that dictionaries are mutable, so let's see that in action.

In [26]:
# add an entry
names_dict['Anandakrishnan'] = 'Rene'
print(names_dict)

{'last name': 'first name', 'Harrison': 'Amelia', 'Richards': 'Vanessa', 'Anandakrishnan': 'Rene'}


Rats, it looks like I assigned Rene's first name to Manju's last name!  I'd better update the entry to fix that.

In [27]:
names_dict['Anandakrishnan'] = 'Manju'
print(names_dict)
names_dict['Hoover'] = 'Rene'
print(names_dict)

{'last name': 'first name', 'Harrison': 'Amelia', 'Richards': 'Vanessa', 'Anandakrishnan': 'Manju'}
{'last name': 'first name', 'Harrison': 'Amelia', 'Richards': 'Vanessa', 'Anandakrishnan': 'Manju', 'Hoover': 'Rene'}


Much better. While we're at it, let's go ahead and drop the first/last name entry at the beginning.

In [28]:
del names_dict['last name']
print(names_dict)

{'Harrison': 'Amelia', 'Richards': 'Vanessa', 'Anandakrishnan': 'Manju', 'Hoover': 'Rene'}


As with lists, you can mix data types in dictionaries and nest dictionaries within each other.

In [29]:
food_dict = {'breakfast':{'meal': 1, 'calories':500, 'tasty?':True}, 
             'lunch':{'meal': 2, 'calories':0, 'tasty?':False, 'notes':'sandwich was dropped :('},
             'dinner':{'meal': 3, 'calories':1500, 'tasty?':True}}
print(food_dict)

{'breakfast': {'meal': 1, 'calories': 500, 'tasty?': True}, 'lunch': {'meal': 2, 'calories': 0, 'tasty?': False, 'notes': 'sandwich was dropped :('}, 'dinner': {'meal': 3, 'calories': 1500, 'tasty?': True}}


In [33]:
# accessing the nested dictionaries
print(food_dict['breakfast'])
print(food_dict['lunch']['notes'])

{'meal': 1, 'calories': 500, 'tasty?': True}
sandwich was dropped :(
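
Since dictionaries are iterable, you can loop over a nested dictionary to work with each inner dictionary in turn. The sketch below totals the calories and uses `.get()` to supply a fallback for meals that have no `'notes'` key; the fallback text and the name `total_calories` are made up for this example.

```python
food_dict = {'breakfast': {'meal': 1, 'calories': 500, 'tasty?': True},
             'lunch': {'meal': 2, 'calories': 0, 'tasty?': False, 'notes': 'sandwich was dropped :('},
             'dinner': {'meal': 3, 'calories': 1500, 'tasty?': True}}

total_calories = 0
for meal_name, details in food_dict.items():
    total_calories += details['calories']
    # .get() returns the fallback instead of raising KeyError when the key is missing
    note = details.get('notes', 'no notes')
    print(meal_name, details['calories'], note)

print(total_calories)   # 2000
```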
