# Python Data Structures: Lists, Dictionaries, Sets, Tuples


## Data structures in Python
Built-in data structures in Python fall into two broad categories: mutable and immutable. Mutable (from Latin mutabilis, "changeable") data structures are those we can modify, for example by adding, removing, or changing their elements. Of the four structures covered here, three are mutable: lists, dictionaries, and sets. Immutable data structures, on the other hand, cannot be modified after they are created. The only immutable structure among the four is the tuple, although Python has other immutable built-in types as well, such as strings and frozensets.
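
As a minimal sketch of this distinction (the variable names are illustrative only), item assignment changes a list in place, while the same operation on a tuple raises a TypeError:

```python
# A list is mutable: item assignment changes it in place
numbers_list = [1, 2, 3]
numbers_list[0] = 10
print(numbers_list)  # [10, 2, 3]

# A tuple is immutable: the same assignment raises TypeError
numbers_tuple = (1, 2, 3)
try:
    numbers_tuple[0] = 10
except TypeError as error:
    print(f"TypeError: {error}")

# The tuple is unchanged
print(numbers_tuple)  # (1, 2, 3)
```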

### Lists
Lists in Python are implemented as dynamic mutable arrays which hold an ordered collection of items.
In Python, lists can contain heterogeneous data types and objects. For instance, integers, strings, and even functions can be stored within the same list. 
We can freely add, remove, and change elements in a list. For instance, the .append() method adds a new element to the end of a list, and the .remove() method removes the first element equal to a given value.
When creating a list, we do not have to specify in advance the number of elements it will contain.
Thus, the pros of lists are:
- They are the easiest way to store a collection of related objects.
- They are easy to modify by removing, adding, and changing elements.
- They are useful for creating nested data structures, such as a list of lists or a list of dictionaries.

However, they also have cons:
- They can be slow for arithmetic operations on their elements. (For speed, use NumPy arrays.)
- They use more memory than a compact array would, because each element is stored as a reference to a separate Python object.
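
The points about heterogeneous elements, nesting, and arithmetic can be sketched as follows. The `shout` helper and the sample data are hypothetical and exist only for this illustration:

```python
def shout(text):
    """Hypothetical helper, used only to show that a function can be a list element."""
    return text.upper() + "!"

# One list holding an integer, a string, and a function
mixed = [42, "hello", shout]
print(mixed[2](mixed[1]))  # HELLO!

# A nested structure: a list of dictionaries
customers = [
    {"name": "John", "age": 27},
    {"name": "Rebecca", "age": 31},
]
print(customers[1]["name"])  # Rebecca

# Element-wise arithmetic on a plain list needs an explicit loop or comprehension
prices = [10.0, 20.0, 30.0]
discounted = [price * 0.5 for price in prices]
print(discounted)  # [5.0, 10.0, 15.0]
```

The comprehension visits the elements one at a time in Python, which is why array libraries tend to be faster for this kind of work on large inputs.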

In [1]:
# Create an empty list using square brackets
l1 = []

# Create a four-element list using square brackets
l2 = [1, 2, "3", 4]  # Note that this list contains two different data types: integers and strings

# Create an empty list using the list() constructor
l3 = list()

# Create a three-element list from a tuple using the list() constructor
l4 = list((1, 2, 3))

# Print out lists
print(f"List l1: {l1}")
print(f"List l2: {l2}")
print(f"List l3: {l3}")
print(f"List l4: {l4}")

List l1: []
List l2: [1, 2, '3', 4]
List l3: []
List l4: [1, 2, 3]


In [2]:
# Accessing data (Indexing)
list1 = [1, 2, 2, True, 'a', 'a', 'c']
print(list1[3])  # Access at index
print(list1[-1]) # Access from back
print(list1[0:2]) # Slice

True
c
[1, 2]


In [6]:
list1 = [1, 2, 2, True, 'a', 'a', 'c']

# Append a new element to list1
list1.append(3)

# Print the modified list
print("Append 3 to the list 1: ")
print(list1)

# Remove the first element equal to 3 from list1
list1.remove(3)

# Print the modified list
print("Removed element 3 from the list 1: ")
print(list1)

# Change the value at index 1 in list1
list1[1] = 5

# Print the modified list 1
print("Modified list 1: ")
print(list1)

Append 3 to the list 1: 
[1, 2, 2, True, 'a', 'a', 'c', 3]
Removed element 3 from the list 1: 
[1, 2, 2, True, 'a', 'a', 'c']
Modified list 1: 
[1, 5, 2, True, 'a', 'a', 'c']


### Dictionaries
Dictionaries in Python are very similar to real-world dictionaries. These are mutable data structures that contain a collection of keys and, associated with them, values.
We use dictionaries when we can associate (in technical terms, map) a unique key to certain data, and we want to access that data very quickly (in constant time on average, regardless of the dictionary size). Moreover, dictionary values can be fairly complex. For example, our keys can be customer names, and their personal data (values) can be dictionaries with keys like "Age," "Hometown," etc.
Thus, the pros of dictionaries are:
- They make code much easier to read when we work with key:value pairs. We could store the same data in a list of lists (where each inner list is a "key" and "value" pair), but this looks more complex and confusing.
- We can look up a value in a dictionary very quickly. With a list, by contrast, we would have to scan the elements one by one until we hit the required one. This difference grows drastically as the number of elements increases.

However, their cons are:
- They occupy a lot of memory. If we need to handle a large amount of data, a dictionary may not be the most suitable data structure.
- Dictionaries are guaranteed to remember insertion order only in Python 3.7 and later (in CPython 3.6 this was an implementation detail). Keep that in mind to avoid compatibility issues when running the same code on different versions of Python.
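
To make the comparison with a list of pairs concrete, here is a small sketch. The `lookup_in_pairs` function is a hypothetical helper written for this example, not a built-in:

```python
# The same data as a list of [key, value] pairs and as a dictionary
pairs = [["one", 1], ["two", 2], ["three", 3]]
mapping = dict(pairs)

def lookup_in_pairs(pairs, wanted_key):
    """Scan the pairs one by one until the key is found (linear time)."""
    for key, value in pairs:
        if key == wanted_key:
            return value
    raise KeyError(wanted_key)

# Both give the same answer, but the dictionary does not scan its contents
print(lookup_in_pairs(pairs, "three"))  # 3
print(mapping["three"])                 # 3
```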


In [7]:
# Create a dictionary literal with duplicate keys
d1 = {"1": 1, "1": 2}
print(d1)

# Only one key is printed, although no error was raised.
# If we access this key, we get 2, the value assigned to the key last
print(d1["1"])

# A dictionary literal with duplicate keys is technically valid, but a dictionary
# does not support duplicates: it keeps a single key with the last value assigned


{'1': 2}
2


In [8]:
# Create an empty dictionary using curly brackets
d1 = {}

# Create a two-element dictionary using curly brackets
d2 = {"John": {"Age": 27, "Hometown": "Boston"}, "Rebecca": {"Age": 31, "Hometown": "Chicago"}}
# Note that the above dictionary has a more complex structure as its values are dictionaries themselves!

# Create an empty dictionary using the dict() constructor
d3 = dict()

# Create a two-element dictionary using the dict() constructor
d4 = dict([["one", 1], ["two", 2]])  # Note that we created the dictionary from a list of lists

# Print out dictionaries
print(f"Dictionary d1: {d1}")
print(f"Dictionary d2: {d2}")
print(f"Dictionary d3: {d3}")
print(f"Dictionary d4: {d4}")

Dictionary d1: {}
Dictionary d2: {'John': {'Age': 27, 'Hometown': 'Boston'}, 'Rebecca': {'Age': 31, 'Hometown': 'Chicago'}}
Dictionary d3: {}
Dictionary d4: {'one': 1, 'two': 2}


In [12]:
# Accessing data (Indexing)
dict1 = dict({1: "apple", 2: "cherry", 3: "strawberry"})
print(dict1[2])   # Access at key
print(dict1.keys())  # Keys
print(dict1.values())  # Values
print(dict1.items())  # Key-value pairs

# Add a new key-value pair to dict1
dict1[4] = "Violet"
print(dict1)

cherry
dict_keys([1, 2, 3])
dict_values(['apple', 'cherry', 'strawberry'])
dict_items([(1, 'apple'), (2, 'cherry'), (3, 'strawberry')])
{1: 'apple', 2: 'cherry', 3: 'strawberry', 4: 'Violet'}
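
When values are themselves dictionaries, as in the customer example, we reach a nested value by chaining keys. The sketch below reuses that sample data under the illustrative name `customers` and also shows `dict.get()`, which returns a default instead of raising KeyError when a key is missing:

```python
customers = {
    "John": {"Age": 27, "Hometown": "Boston"},
    "Rebecca": {"Age": 31, "Hometown": "Chicago"},
}

# Chain keys to reach a nested value
print(customers["Rebecca"]["Hometown"])  # Chicago

# With square brackets, a missing key raises KeyError
try:
    customers["Thomas"]
except KeyError as error:
    print(f"Missing key: {error}")  # Missing key: 'Thomas'

# get() returns a default instead (None unless we specify one)
print(customers.get("Thomas"))             # None
print(customers.get("Thomas", "unknown"))  # unknown

# The in operator checks keys, not values
print("John" in customers)  # True
```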


### Sets
Sets in Python are mutable, dynamic collections of unique elements. The elements themselves must be immutable (more precisely, hashable). Sets may seem very similar to lists, but in reality they are very different.
First, they may only contain unique elements, so no duplicates are allowed. Thus, sets can be used to remove duplicates from a list. Next, like sets in mathematics, they support special operations such as union and intersection. Finally, they are very efficient at checking whether a specific element is contained in a set.

Thus, the pros of sets are:
- We can perform set-specific operations on them, such as union, intersection, and difference.
- They are significantly faster than lists when we want to check whether a certain element is present.

But their cons are:
- Sets are intrinsically unordered. If we care about keeping the insertion order, they are not our best choice.
- We cannot access or change set elements by indexing as we can with lists.
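
A short sketch of the properties described above: removing duplicates, testing membership, and the requirement that elements be immutable. The sample values are made up for the illustration:

```python
# Remove duplicates from a list by converting it to a set
values = [1, 2, 2, 3, 3, 3]
unique_values = set(values)
print(sorted(unique_values))  # [1, 2, 3]

# If the original order matters, dict.fromkeys() keeps the first occurrence of each item
letters = ["b", "a", "b", "c", "a"]
print(list(dict.fromkeys(letters)))  # ['b', 'a', 'c']

# Membership tests
print(2 in unique_values)  # True
print(5 in unique_values)  # False

# Mutable elements such as lists are rejected with a TypeError
try:
    bad_set = {[1, 2], [3, 4]}
except TypeError as error:
    print(f"TypeError: {error}")
```

Because a set has no defined order, the example sorts it before printing so that the output is predictable.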

In [13]:
# Create a set using curly brackets
s1 = {1, 2, 3}

# Create a set using the set() constructor
s2 = set([1, 2, 3, 4])

# Print out sets
print(f"Set s1: {s1}")
print(f"Set s2: {s2}")

Set s1: {1, 2, 3}
Set s2: {1, 2, 3, 4}


In [17]:
# Create two new sets
names1 = set(["Glory", "Tony", "Joel", "Dennis"])
names2 = set(["Morgan", "Joel", "Tony", "Emmanuel", "Diego"])

# Create a union of two sets using the union() method
names_union_1 = names1.union(names2)

# Create a union of two sets using the | operator
names_union_2 = names1 | names2

# Print out the resulting union
print(names_union_1)
print(names_union_2)

# Intersection of two sets using the intersection() method
names_intersection = names1.intersection(names2)

# Intersection of two sets using the & operator
names_intersection_2 = names1 & names2

# Print out
print(names_intersection)
print(names_intersection_2)

# Create a set of all the names present in names1 but absent in names2 with the difference() method
names_difference = names1.difference(names2)

# Create a set of all the names present in names1 but absent in names2 with the - operator
names_difference = names1 - names2

# Print out the resulting difference
print(names_difference)

{'Tony', 'Emmanuel', 'Joel', 'Dennis', 'Glory', 'Diego', 'Morgan'}
{'Tony', 'Emmanuel', 'Joel', 'Dennis', 'Glory', 'Diego', 'Morgan'}
{'Joel', 'Tony'}
{'Joel', 'Tony'}
{'Dennis', 'Glory'}


### Tuples
Tuples are almost identical to lists in that they hold an ordered collection of elements, with one exception: they are immutable. We use tuples when we need a data structure that cannot be modified once it has been created. Furthermore, tuples can be used as dictionary keys if all their elements are immutable.

Other than that, tuples have the same properties as lists. To create a tuple, we can either use round brackets (()) or the tuple() constructor. We can easily transform lists into tuples and vice versa (recall that we created the list l4 from a tuple).

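The pros of tuples are:
- They are immutable, so once created, we can be sure that we won't change their contents by mistake.
- They can be used as dictionary keys if all their elements are immutable.
- They generally occupy slightly less memory than lists of the same length.

The cons of tuples are:
- We cannot use them when we have to work with modifiable objects; we have to resort to lists instead.
- They offer no in-place operations: "modifying" a tuple means building a new one. (Copying a tuple in CPython simply returns the same object, which is safe because it cannot change.)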
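
The following sketch shows conversion between lists and tuples and what immutability means in practice. The last two lines print what copying and `sys.getsizeof()` report on the interpreter at hand; the exact sizes depend on the Python version and platform, so no specific numbers are claimed here:

```python
import sys

numbers_list = [1, 2, 3, 4]
numbers_tuple = tuple(numbers_list)  # list -> tuple
back_to_list = list(numbers_tuple)   # tuple -> list

# "Changing" a tuple really builds a new one; the original stays as it was
extended = numbers_tuple + (5,)
print(extended)       # (1, 2, 3, 4, 5)
print(numbers_tuple)  # (1, 2, 3, 4)

# Copying: in CPython, calling tuple() on a tuple returns the same object
same = tuple(numbers_tuple)
print(same is numbers_tuple)

# Shallow memory footprint of the list and of the tuple, in bytes
print(sys.getsizeof(numbers_list), sys.getsizeof(numbers_tuple))
```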

In [18]:
# Create a tuple using round brackets
t1 = (1, 2, 3, 4)

# Create a tuple from a list using the tuple() constructor
t2 = tuple([1, 2, 3, 4, 5])

# Create a tuple using the tuple() constructor
t3 = tuple([1, 2, 3, 4, 5, 6])

# Print out tuples
print(f"Tuple t1: {t1}")
print(f"Tuple t2: {t2}")
print(f"Tuple t3: {t3}")

Tuple t1: (1, 2, 3, 4)
Tuple t2: (1, 2, 3, 4, 5)
Tuple t3: (1, 2, 3, 4, 5, 6)


In [19]:
# Use tuples as dictionary keys
working_hours = {("Rebecca", 1): 38, ("Thomas", 2): 40}
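
Continuing with the same dictionary, a value is retrieved by passing the whole tuple as the key. The original does not say what the second element of each key means, so the sketch treats it as an opaque identifier (`ident` is a name chosen for this example), and the `["Diego", 3]` key is made up to show the failure case:

```python
working_hours = {("Rebecca", 1): 38, ("Thomas", 2): 40}

# Look up a value with the full tuple key
print(working_hours[("Rebecca", 1)])  # 38

# Unpack the tuple keys while iterating
for (name, ident), hours in working_hours.items():
    print(f"{name} ({ident}): {hours}")

# A list cannot be used as a key because it is mutable (unhashable)
try:
    working_hours[["Diego", 3]] = 35
except TypeError as error:
    print(f"TypeError: {error}")
```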
