# Data Types

## Handle data collections

Data collections are objects that can hold several values at once, and those values can be of different types. For example, a single collection can contain decimal numbers, strings, and integers. Knowing how to manipulate them is essential, because you will work with data collections constantly as a Data Scientist (and as a Web Developer too).

### Lists


#### Definition

A list is a mutable data collection, i.e. its contents can be modified after it has been created. Here is an example:

In [None]:
list_1 = [1, 2, 3, 4, 5]

This list contains 5 items with the values 1, 2, 3, 4 and 5 respectively. Each item can be accessed via its index, which corresponds to its position in the list:

In [None]:
print(list_1[3])

4


Warning: in Python, index numbering always starts at 0! That is why `list_1[3]` returns the fourth item, 4.
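
As a quick illustration of zero-based indexing, here is a minimal sketch. The `letters` list is a made-up example, not part of the lesson's data. The first item sits at index 0 and the last one at index `len(letters) - 1`:

```python
# Hypothetical example list used only for this illustration
letters = ["a", "b", "c", "d"]

first = letters[0]                 # index 0 is the first item
last = letters[len(letters) - 1]   # the last valid index is len(letters) - 1

print(first)  # a
print(last)   # d
```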

#### Change items in a list

As we said above, a list is mutable: you can replace an item that is already in the list by assigning a new value at its index. For example:

In [None]:
list_1[3] = 17
print(list_1)

[1, 2, 3, 17, 5]


Lists can therefore be useful if you need access to items in the data collection in order to modify them.
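
One caveat is worth sketching here: assignment by index only works for positions that already exist. In the snippet below (the `numbers` list and the `message` variable are illustrative names), assigning past the end of the list raises an `IndexError` instead of growing the list:

```python
numbers = [1, 2, 3]

numbers[0] = 10          # valid: index 0 exists
print(numbers)           # [10, 2, 3]

try:
    numbers[3] = 4       # invalid: the last existing index is 2
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```

To add an item at a new position, use one of the methods described in the next section rather than index assignment.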

#### Add items to a list

Items can be added to a list in different ways:


*   The operator `+`
*   `.append()`
*   `.extend()`

In [1]:
list_1 = [1, 2, 3, 4, 5]
add_1 = ["hello"]
list_1 = list_1 + add_1
print(list_1)

[1, 2, 3, 4, 5, 'hello']


In [2]:
# Add a single element with append
list_2 = [1, 2, 3, 4, 5]
add_2 = 'test'  # with append, the brackets "[]" are no longer needed

list_2.append(add_2)
print(list_2)

[1, 2, 3, 4, 5, 'test']


The `.append()` method adds exactly one item per call. If you pass it a list of several items, the whole list is added as a single item, nested inside your first list. Here is an example:

In [None]:
list_3 = [1, 2, 3, 4, 5]
add_3 = ['test1', 'test2', 'test3']
list_3.append(add_3)
print(list_3)

[1, 2, 3, 4, 5, ['test1', 'test2', 'test3']]


The `.extend()` method solves this problem by adding each item of the second list, one by one, to the first list:

In [None]:
# Add a list to the end of another (concatenate)
list_4 = [1, 2, 3, 4, 5]
add_4 = ['test1', 'test2', 'test3']
list_4.extend(add_4)
print(list_4)

[1, 2, 3, 4, 5, 'test1', 'test2', 'test3']
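
The three approaches also differ in one practical respect, sketched below with illustrative variable names (`original`, `alias`, `combined`): the `+` operator builds a new list and leaves its operands untouched, whereas `.append()` and `.extend()` modify the existing list in place.

```python
original = [1, 2, 3]
alias = original              # a second name for the same list object

combined = original + [4]     # `+` builds a new list
print(original)               # [1, 2, 3]  (unchanged)
print(combined)               # [1, 2, 3, 4]

original.extend([4, 5])       # extend modifies the list in place
original.append(6)            # so does append
print(alias)                  # [1, 2, 3, 4, 5, 6]
```

This is why the `+` example above reassigns the result (`list_1 = list_1 + add_1`), while the `.append()` and `.extend()` examples do not.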


#### Remove items from a list

The last thing you need to see is how to remove items from a list. There are three ways to do this:

*   `del`
*   `.remove()`
*   `.pop()`

Let's look at each of them through an example:

In [None]:
list_1 = [1, 2, 4]
del(list_1[1])
print(list_1)

[1, 4]


In [None]:
list_1 = [1, 2, 4]
list_1.remove(1)
print(list_1)

[2, 4]


The main difference between `del` and `.remove()` is that `del` deletes an element by its index, while `.remove()` deletes an element by its value (in this case _1_). Note that `del` is a statement rather than a function, so `del list_1[1]` works just as well as `del(list_1[1])`. Depending on your problem, you may need one rather than the other.

Warning: `.remove()` removes only one element at a time, namely the first one that matches. If you have duplicates that you want to remove, use a `while` loop as in the following example:

In [None]:
list_2 = [1, 2, 'to_remove', 4, 'to_remove', 'to_remove']

while 'to_remove' in list_2:
    list_2.remove('to_remove')
print(list_2)

[1, 2, 4]
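
As an alternative to the `while` loop, a list comprehension can build a new list without the unwanted value. This is only a sketch of another common approach, and the name `cleaned` is illustrative. Unlike the loop, it leaves the original list untouched:

```python
list_2 = [1, 2, 'to_remove', 4, 'to_remove', 'to_remove']

# Keep every item that is not equal to the unwanted value
cleaned = [item for item in list_2 if item != 'to_remove']

print(cleaned)      # [1, 2, 4]
print(len(list_2))  # 6, the original list still has all its items
```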


The problem with `del` and `.remove()` is that the deleted item is lost. Yet you will often want to take an item out of a list in order to reuse it elsewhere, or to move it to another position in the list. In that case, use the `.pop()` method, which removes the item at the given index and returns it:

In [None]:
list_1 = [1, 2, 4]
elem = list_1.pop(2)
print(elem)
print(list_1)

4
[1, 2]
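
To illustrate the "move an item" use case, here is a minimal sketch that combines `.pop()` with the list method `.insert()`, which is not covered elsewhere in this lesson. The `tasks` list is a made-up example. Called without an index, `.pop()` removes and returns the last item:

```python
tasks = ["write", "test", "deploy"]

last_task = tasks.pop()       # no index: removes and returns the last item
print(last_task)              # deploy
print(tasks)                  # ['write', 'test']

tasks.insert(0, last_task)    # put the item back at the front of the list
print(tasks)                  # ['deploy', 'write', 'test']
```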


### Slices


#### Fundamental Principle

Very often you may not need a complete list but just a part of it. This is when slices become very useful. Here is an example:

In [3]:
list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
three_last_items = list_1[7:10]
print(three_last_items)

[8, 9, 10]


The structure is therefore as follows:

```
slice[start_index : end_index]
```

As with the `range()` function, `start_index` is included in the result and `end_index` is excluded from it.
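
The parallel with `range()` can be checked directly. The sketch below (the names `sliced` and `by_range` are illustrative) takes a slice and then collects the same items by hand with `range(start_index, end_index)`:

```python
list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

start_index, end_index = 2, 5
sliced = list_1[start_index:end_index]

# The same items collected by hand with range(start_index, end_index)
by_range = [list_1[i] for i in range(start_index, end_index)]

print(sliced)        # [3, 4, 5]
print(by_range)      # [3, 4, 5]
print(len(sliced))   # 3, i.e. end_index - start_index
```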

If you leave out the start index, Python uses index 0 by default. Conversely, if you leave out the end index, the slice runs through the last item of the list:

In [3]:
three_last_items = list_1[7:]
print(three_last_items)

three_first_items = list_1[:3]
print(three_first_items)

[8, 9, 10]
[1, 2, 3]


#### Slices with negative index

Negative indices are useful when you want to count from the end of the list (-1 is the last item, -2 the one before it, and so on):

In [4]:
three_last_items = list_1[-3:]
print(three_last_items)

three_last_items_excluded = list_1[:-3]
print(three_last_items_excluded)

[8, 9, 10]
[1, 2, 3, 4, 5, 6, 7]
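
One way to read a negative index is as an offset from the length of the list. The following sketch (with `n` as an illustrative helper variable) checks that each negative slice matches its positive equivalent:

```python
list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = len(list_1)

print(list_1[-1])                      # 10, the last item
print(list_1[-3:] == list_1[n - 3:])   # True
print(list_1[:-3] == list_1[:n - 3])   # True
```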


### Dictionaries

Dictionaries store data as key-value pairs: each key is associated with one value, which can itself be a collection such as a list. You can think of them as "drawers" that contain data and on which labels are stuck. Dictionaries are a very common format, which you will encounter when using APIs for example. It is therefore important to know how to manipulate both the keys and the values associated with them.

In [None]:
# Declare a dictionary with a key 'first_name' and a value 'Antoine'
dic1 = {
    'first_name': 'Antoine'
}

A dictionary key plays the role that the index plays in a list. It is used to access the value associated with it:

In [6]:
dic1['first_name']

'Antoine'
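
Accessing a key that does not exist raises a `KeyError`. The sketch below shows this, along with two standard ways to avoid the error: the `.get()` method, which returns `None` or a default of your choice, and the `in` operator. The key `'age'` and the default `'unknown'` are chosen only for illustration:

```python
dic1 = {'first_name': 'Antoine'}

try:
    dic1['age']                      # this key does not exist
except KeyError as error:
    missing_key = error.args[0]
    print("Missing key:", missing_key)

print(dic1.get('age'))               # None
print(dic1.get('age', 'unknown'))    # unknown
print('first_name' in dic1)          # True
```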

#### Modify / Add a key

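Modifying a key and adding a key use the same syntax: assign a value to `dic1[key]`. If the key already exists, its value is replaced. Otherwise, a new key is created: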

In [7]:
# Modify the value of the existing key 'first_name'.
dic1['first_name'] = 'Charles'

# Add new elements with the keys 'name' and 'age'.
dic1['name'] = 'Dupont'
dic1['age'] = 36
print(dic1)

{'first_name': 'Charles', 'name': 'Dupont', 'age': 36}


#### Delete a key

A key can be removed with the `del` statement, in the same way as we have seen with lists:

In [8]:
del(dic1['age'])
print(dic1)

{'first_name': 'Charles', 'name': 'Dupont'}
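
Dictionaries also have a `.pop()` method, which mirrors the list method seen earlier: it removes a key and returns its value. A minimal sketch follows. The key `'city'` is a hypothetical missing key, used to show the optional default argument:

```python
dic1 = {'first_name': 'Charles', 'name': 'Dupont', 'age': 36}

age = dic1.pop('age')              # removes the key and returns its value
print(age)                         # 36
print(dic1)                        # {'first_name': 'Charles', 'name': 'Dupont'}

city = dic1.pop('city', None)      # missing key: the default is returned instead
print(city)                        # None
```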


#### Iterate on dictionaries

In a dictionary, you can iterate over the keys, the values, or both at once. Here are some useful methods to know.


##### Iterate on the keys of a dictionary

In [9]:
# iterate on keys with .keys()
for key in dic1.keys():
    print(key)

# iterating on the dictionary itself also yields its keys
for key in dic1:
    print(key)

first_name
name
first_name
name


##### Iterate on dictionary values

To iterate on values in a dictionary, the easiest way is to use the `.values()` method:

In [10]:
# iterate on values
for val in dic1.values():
    print(val)

Charles
Dupont


##### Iterate on keys and values

You can also iterate on keys and values at the same time with the `.items()` method. At each iteration, it yields a tuple containing the key and its associated value:

In [None]:
# iterate on keys and values
for item in dic1.items():
    print(item)

('first_name', 'Charles')
('name', 'Dupont')
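
Because each item is a `(key, value)` tuple, it can be split into two variables directly in the `for` statement. The sketch below assumes the same dictionary contents as above. The `lines` list is an illustrative helper that collects the formatted text:

```python
dic1 = {'first_name': 'Charles', 'name': 'Dupont'}

lines = []
for key, value in dic1.items():   # each item is unpacked into key and value
    lines.append(f"{key} -> {value}")

print(lines[0])  # first_name -> Charles
print(lines[1])  # name -> Dupont
```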


### Tuples

Unlike lists, tuples are immutable. You will often come across them when handling data that should not change during the execution of your code. Here is how a tuple is built:

In [12]:
tuple_1 = (1, 2, 3)
print(tuple_1)

(1, 2, 3)


The value of an item in a tuple can be accessed in the same way as the value of an item in a list:

In [13]:
print(tuple_1[2])

3


On the other hand, if you try to change the value of an item in a tuple, you get an error:


In [14]:
# A tuple cannot be modified: the code below produces an error
tuple_1[1] = 0

TypeError: 'tuple' object does not support item assignment
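
If you need a "modified" version of a tuple, the usual workaround is to build a new tuple and leave the original alone. Here is one possible sketch using slices and concatenation. The name `tuple_2` is illustrative:

```python
tuple_1 = (1, 2, 3)

try:
    tuple_1[1] = 0                 # not allowed on a tuple
except TypeError as error:
    print("TypeError:", error)

# Build a new tuple from slices of the old one instead of modifying it
tuple_2 = tuple_1[:1] + (0,) + tuple_1[2:]
print(tuple_2)   # (1, 0, 3)
print(tuple_1)   # (1, 2, 3), the original is unchanged
```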

#### Exchanging values with tuples

A first interesting use of tuples is swapping the values of two variables:

In [1]:
# Exchanging values using tuples
a = 100
b = 1000

(a, b) = (b, a)

In [16]:
print(b)

100


In [17]:
print(a)

1000
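
For comparison, here is a sketch of the same swap written with a temporary variable (`temp` is an illustrative name), followed by tuple unpacking with three values. The unpacking works as long as there are as many names on the left as there are items in the tuple:

```python
a = 100
b = 1000

# Swap with a temporary variable: three statements instead of one
temp = a
a = b
b = temp
print(a, b)        # 1000 100

# Tuple unpacking also works with more than two values
x, y, z = (1, 2, 3)
print(x, y, z)     # 1 2 3
```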


#### Iterate on tuples

Let's imagine that we have a list in which each element is a tuple, as in the example below:

In [None]:
liste_tuples = [("antoine", 24), ("léa", 22), ("margaux", 20)]

It is possible to iterate on each tuple and retrieve first names and ages separately in the following way:

In [20]:
for i in liste_tuples:  # iterate on each tuple
    print('first_name : ' , i[0])  # first element of the tuple
    print('age : ' , i[1])  # second element of the tuple
    print('----------------')

first_name :  antoine
age :  24
----------------
first_name :  léa
age :  22
----------------
first_name :  margaux
age :  20
----------------


There is a more practical way of doing this: unpack each tuple directly in the `for` statement, giving a name to each of its elements:

In [21]:
# Here, we give explicit names (first_name, age) to the elements of each tuple:
for first_name, age in liste_tuples:
    print('first_name : ' , first_name)
    print('age : ' , age)
    print('----------------')

first_name :  antoine
age :  24
----------------
first_name :  léa
age :  22
----------------
first_name :  margaux
age :  20
----------------
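
A list of two-element tuples such as this one can also be turned into a dictionary with the built-in `dict()`, which links this section back to the previous one. This is only a sketch, and the name `ages` is illustrative. Each tuple's first element becomes a key and its second element becomes the value:

```python
liste_tuples = [("antoine", 24), ("léa", 22), ("margaux", 20)]

ages = dict(liste_tuples)   # first element -> key, second element -> value
print(ages)                 # {'antoine': 24, 'léa': 22, 'margaux': 20}
print(ages["léa"])          # 22
```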


In [1]:
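# Insert numbers into a string with str.format()
# {:.2f} formats a number with two decimals; {0} and {1} select arguments by position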

str1 = "numbers: {:.2f} and {:.2f}".format(1.34, 2.34)
str2 = "numbers: {1:.2f} and {0:.2f}".format(1.34, 2.34)

print(str1)
print(str2)

numbers: 1.34 and 2.34
numbers: 2.34 and 1.34
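
The same result can be obtained with f-strings (available since Python 3.6), where the variable is written directly inside the braces. A minimal sketch, assuming the two numbers are stored in illustrative variables `x` and `y`:

```python
x = 1.34
y = 2.34

# The format specification (.2f) goes after the colon, as with str.format()
str1 = f"numbers: {x:.2f} and {y:.2f}"
str2 = f"numbers: {y:.2f} and {x:.2f}"

print(str1)  # numbers: 1.34 and 2.34
print(str2)  # numbers: 2.34 and 1.34
```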


In [2]:
# Merge two dictionaries by unpacking them with ** into a new one
dict1 = {"un": 1, "deux": 2}
dict2 = {"trois": 3, "quatre": 4}
merge_dict = {**dict1, **dict2}

print(merge_dict)

{'un': 1, 'deux': 2, 'trois': 3, 'quatre': 4}
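
When the two dictionaries share a key, the value from the dictionary unpacked last wins. The sketch below uses made-up dictionaries (`defaults`, `overrides`) to show this, and also that the merge creates a new dictionary without modifying its inputs:

```python
defaults = {"lang": "fr", "debug": False}
overrides = {"debug": True}

# 'debug' appears in both: the value from the last dictionary is kept
settings = {**defaults, **overrides}

print(settings)   # {'lang': 'fr', 'debug': True}
print(defaults)   # {'lang': 'fr', 'debug': False}
```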
