<h1 align="center"> Data Structures in Python </h1>

For a data scientist, it is important to handle the different ways data can be stored, since data processing constantly requires retrieving information or storing new information for later use. In this notebook we study the following built-in Python data structures:

- Lists
- Dictionaries
- Tuples
- Sets

<h2> Lists </h2>

A list is an ordered sequence of values, written as `x = [1,2,3]`. Python recognizes a variable as a list by its square brackets (`[]`). Here are some examples:

In [1]:
this_is_a_list = [2,4,5,6]
also_a_list = ['Python', 'R', 'Julia']
another_list = ['Hello', 3.1416, True, False, 4]

print(this_is_a_list)
print(also_a_list)
print(another_list)

[2, 4, 5, 6]
['Python', 'R', 'Julia']
['Hello', 3.1416, True, False, 4]


As we can see, the entries of a list can all share the same data type, but a list can also mix different data types. Now let's look at some important properties of lists.

We can access a specific entry by indexing with square brackets. Note that Python starts counting at 0 rather than 1, so the first element of the list is at index 0.

In [2]:
this_is_a_list[0] # extracting a value of the list

2
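
Indexing also works from the end of a list, and a range of positions can be extracted at once. The following is a minimal sketch; the list `languages` is a hypothetical example, not one of the variables defined in the cells above.

```python
# Hypothetical example list, used only for illustration
languages = ['Python', 'R', 'Julia', 'Scala']

first = languages[0]        # index 0 is the first element
last = languages[-1]        # negative indices count from the end of the list
first_two = languages[0:2]  # a slice starts at index 0 and stops before index 2

print(first)
print(last)
print(first_two)
```

Slicing returns a new list, so `languages` itself is left unchanged.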

By accessing an entry we can also reassign its value as follows:

In [3]:
print('The this_is_a_list list is', this_is_a_list, 'before changing the second entry')
this_is_a_list[1] = 7

print('The this_is_a_list list is', this_is_a_list, 'after changing the second entry')

The this_is_a_list list is [2, 4, 5, 6] before changing the second entry
The this_is_a_list list is [2, 7, 5, 6] after changing the second entry


In Python, most objects come with functions that operate on the object itself and its current value; in programming jargon, such functions are called methods. The first method we will use is `append`, which is defined for lists and adds an element to the end of a list, as shown in the following example:

In [4]:
this_is_a_list.append(10)

this_is_a_list

[2, 7, 5, 6, 10]

Let's see some common methods for lists:

In [5]:
this_is_a_list.remove(7) # removes the first value that is equal to 7
print(this_is_a_list)

this_is_a_list.pop(1) # it removes the element at position 1 (second position)
print(this_is_a_list)

this_is_a_list.reverse() # reverses the order of elements in the list
print(this_is_a_list)

[2, 5, 6, 10]
[2, 6, 10]
[10, 6, 2]
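
A few other list methods follow the same pattern of modifying the list in place. This is a short sketch using a hypothetical list named `scores`; it is only meant to illustrate `insert`, `extend`, and `sort`, together with the built-in `len` function.

```python
# Hypothetical example list, used only for illustration
scores = [3, 1, 2]

scores.insert(0, 10)    # inserts 10 at position 0, shifting the rest to the right
scores.extend([7, 5])   # appends every element of another list
scores.sort()           # sorts the list in place, in ascending order

print(scores)
print(len(scores))      # len() returns the number of elements
```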


To learn more about lists and their properties, see the official Python documentation, which describes all the available methods: https://docs.python.org/3/tutorial/datastructures.html#

<h2> Dictionaries </h2>

Dictionaries are another very important data structure. A dictionary consists of a set of `key: value` pairs, where each key is a unique identifier that maps to a value. Some examples are:

In [6]:
this_is_a_dictionary = {'Name': 'Jhon', 'Surname': 'Wick', 'Age': 30}
also_a_dictionary = {'numbers': [2,3,6,9,10], 'is_even': [True, False, True, False, True]}
this_dicionary_is_empty = dict()
also_empty = {}

Note that the value in each pair need not be a single value, nor does it need to be of a specific data type.

In contrast to lists, dictionary values are not accessed by index position but by the name of their key, as follows:

In [7]:
print(this_is_a_dictionary['Name'])
print(also_a_dictionary['numbers'])

Jhon
[2, 3, 6, 9, 10]


Look at what happens if we try to access a `key:value` pair by index position:

In [8]:
this_is_a_dictionary[0]

KeyError: 0

The `KeyError` message states that there is no key named `0`: a dictionary looks up entries by key, not by position, so `0` is treated as a key rather than an index.
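
When it is not known in advance whether a key exists, a `KeyError` can be avoided by testing for the key first or by using `get` with a default value. The sketch below assumes a hypothetical dictionary named `person`.

```python
# Hypothetical example dictionary, used only for illustration
person = {'Name': 'Ana', 'Age': 30}

has_surname = 'Surname' in person            # membership test on the keys
missing = person.get('Surname')              # get() returns None for a missing key
surname = person.get('Surname', 'unknown')   # or a default value, if one is given

print(has_surname)
print(missing)
print(surname)

# Square-bracket access on a missing key raises KeyError, which can be caught
try:
    person['Surname']
except KeyError as error:
    print('Missing key:', error)
```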

Once a dictionary has been created (regardless of whether it is empty), one can add a new `key:value` pair like this:

In [9]:
this_dicionary_is_empty['new_key'] = 'New value'
this_dicionary_is_empty

{'new_key': 'New value'}

`this_dicionary_is_empty` is no longer an empty dictionary.

As with lists, there are methods that apply to dictionaries. Here are some examples:

In [10]:
print(this_is_a_dictionary.keys()) # Returns all keys
print(this_is_a_dictionary.values()) # Returns all values
print(this_is_a_dictionary.get('Name')) # Returns the respective value of a given key

dict_keys(['Name', 'Surname', 'Age'])
dict_values(['Jhon', 'Wick', 30])
Jhon
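
Keys and values can also be traversed together with the `items` method, and a pair can be removed with `pop`. This is a minimal sketch with a hypothetical dictionary named `inventory`.

```python
# Hypothetical example dictionary, used only for illustration
inventory = {'apples': 3, 'pears': 5}

# items() yields (key, value) pairs, which can be unpacked in a for loop
for fruit, amount in inventory.items():
    print(fruit, amount)

total = sum(inventory.values())     # values() can be passed to built-ins such as sum()
removed = inventory.pop('apples')   # removes the key and returns its value

print(total)
print(removed)
print(inventory)
```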


You can read more about dictionaries at: https://docs.python.org/3/tutorial/datastructures.html#dictionaries

<h2> Tuples </h2>

A tuple is an ordered sequence of values, similar to a list, written as `x = (1,2,3)`. Python recognizes a variable as a tuple by its round brackets (`()`). Here are some examples:

In [11]:
this_is_a_tuple = (2,4,5,6)
also_a_tuple = ('Python', 'R', 'Julia')
another_tuple = ('Hello', 3.1416, True, False, 4)

print(this_is_a_tuple)
print(also_a_tuple)
print(another_tuple)

(2, 4, 5, 6)
('Python', 'R', 'Julia')
('Hello', 3.1416, True, False, 4)


Although a tuple looks like a list, there is a key difference: lists are mutable, and tuples are not. This means that once a tuple has been defined, you cannot modify its values. Let's try this out:

In [12]:
this_is_a_tuple[2] = 10

TypeError: 'tuple' object does not support item assignment

Unlike lists, tuples do not support item assignment; this is what immutability means. This property also implies that few methods are available for tuples, since you cannot remove, pop, or reverse their elements. Here are the two methods defined for tuples:

In [13]:
print(this_is_a_tuple.count(4)) # Counts the number of times the number 4 appears in the tuple
print(this_is_a_tuple.index(6)) # Returns the position in which the number 6 appears for the first time in the tuple

1
3
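
Immutability has practical uses. A tuple can be unpacked into separate variables, and a tuple of immutable values can serve as a dictionary key, which a list cannot. The sketch below uses hypothetical names (`point`, `distances`) chosen only for illustration.

```python
# Hypothetical example values, used only for illustration
point = (3, 4)

x, y = point   # unpacking assigns each element to its own variable
print(x, y)

# A tuple of numbers is hashable, so it can be used as a dictionary key
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[point])

# A list is mutable and therefore not hashable, so it is rejected as a key
try:
    invalid = {[0, 0]: 0.0}
except TypeError as error:
    print('Lists cannot be keys:', error)
```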


<h2> Sets </h2>

Lists and tuples have two important properties: the order of the elements matters, and a given value can appear several times within a list or a tuple. We now examine `sets`, which have neither of these properties. In fact, as the name suggests, a Python set corresponds to the definition of a set in set theory.

In Python, sets are defined using curly braces (`{}`), like `x = {3,5,6,7}`. Note that empty braces `{}` create an empty dictionary, so an empty set is written `set()`. Here are some examples:

In [14]:
this_is_a_set = {2,4,5,6}
also_a_set = {'Python', 'R', 'Julia'}
another_set = {'Hello', 3.1416, True, False, 4}

print(this_is_a_set)
print(also_a_set)
print(another_set)

{2, 4, 5, 6}
{'R', 'Python', 'Julia'}
{False, True, 3.1416, 4, 'Hello'}


Although at first glance sets seem similar to lists and tuples, let's examine the two main differences:

In [15]:
print({2,2, 4, 5, 6, 6})
print({2,3,4} == {4,3,2})

{2, 4, 5, 6}
True


The first line in the above code shows that repeated elements are collapsed into one, and the second line shows that order is irrelevant in sets, because a set is defined simply as a collection of distinct elements.

One can easily turn a list or a tuple into a set using the `set()` function, as follows:

In [16]:
print(set(this_is_a_tuple))
print(set(this_is_a_list))

{2, 4, 5, 6}
{10, 2, 6}
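
A common use of this conversion is removing duplicates from a list. The following sketch assumes a hypothetical list named `readings`; because a set has no order, `sorted` is used to get the unique values back as an ordered list.

```python
# Hypothetical example list, used only for illustration
readings = [3, 1, 3, 2, 1]

unique_readings = set(readings)          # duplicates are dropped
sorted_unique = sorted(unique_readings)  # sorted() returns a new ordered list

print(sorted_unique)
print(len(readings), len(unique_readings))

has_two = 2 in unique_readings           # membership test on the set
print(has_two)
```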


Now let's see some methods applicable to sets:

In [17]:
this_is_a_set.add(10) # Adds the element 10 to the set
print(this_is_a_set)

this_is_a_set.clear() # Remove all elements
print(this_is_a_set) # this_is_a_set is now an empty set

{2, 4, 5, 6, 10}
set()


As mentioned before, a set is a mathematical concept, and Python also defines the operations of set algebra (e.g. union, intersection, difference). To see how they work, let's define two finite sets A and B and perform several operations on them:

In [18]:
A = {2,4,6,8}
B = {1,2, 3, 4, 5,7,9}

print(A.difference(B)) # Difference between two sets
print(A - B) # Also a difference between two sets

print(A.union(B)) # Union between two sets

print(A.intersection(B)) # Intersection between two sets

{8, 6}
{8, 6}
{1, 2, 3, 4, 5, 6, 7, 8, 9}
{2, 4}
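
Just as `A - B` is shorthand for `A.difference(B)`, the other set operations have operator forms too. The sketch below redefines the same two sets so that it runs on its own, and also shows the symmetric difference and a subset test. The results are passed through `sorted` only to make the printed order predictable.

```python
# Same sets as in the cell above, redefined so the example is self-contained
A = {2, 4, 6, 8}
B = {1, 2, 3, 4, 5, 7, 9}

union = A | B                    # same as A.union(B)
intersection = A & B             # same as A.intersection(B)
symmetric = A ^ B                # elements that belong to exactly one of the sets
is_subset = {2, 4}.issubset(A)   # True when every element is also in A

print(sorted(union))
print(sorted(intersection))
print(sorted(symmetric))
print(is_subset)
```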


<h2> References </h2>

- https://docs.python.org/3/tutorial/datastructures.html#