## Lists reminder
In Python, a list is a data type that holds an ordered collection of elements. The elements can be of any type: we could have a list of numbers, a list of strings, a list of objects, a list of lists, and so on.

In [1]:
empty_list = [] # Empty list
numbers = [1, 2, 3]  # List of numbers
ways_to_greet = ["Hello", "Hi", "Ciao", "Bonjour", "Zdravo", "Salaam", "Mirëdita"] # List of strings

You can then access items in the list using indexes: each element of the list is linked to an index, starting at 0.
In a list like ["value 1", "value 2", "value 3"], the first element "value 1" is at index 0, "value 2" is at index 1, and so on until the end of the list.

In [2]:
print(numbers[1]) # accessing item 1 of the 'numbers' list (indexes start at 0 so we're getting the 2nd element of the list)
print(ways_to_greet[3]) # accessing and printing the 4th element of the 'ways_to_greet' list

2
Bonjour


Finally, we can iterate through a list like so:

In [3]:
print("Printing ways to greet\n")

for greeting in ways_to_greet:
    print(greeting) 
    
# OR, using indexes, it could also be written as:

print("\nPrinting ways to greet using indexes\n")

number_of_items_in_list = len(ways_to_greet)  # we get the number of elements in the list 
list_of_indexes = range(number_of_items_in_list) # range(n) produces the sequence of indexes 0, 1, ..., n-1 (a range object, not a list)
for index in list_of_indexes:
    greeting = ways_to_greet[index]
    print(greeting)

Printing ways to greet

Hello
Hi
Ciao
Bonjour
Zdravo
Salaam
Mirëdita

Printing ways to greet using indexes

Hello
Hi
Ciao
Bonjour
Zdravo
Salaam
Mirëdita
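
When both the index and the element are needed, Python's built-in `enumerate()` avoids building the indexes by hand. The sketch below uses a shortened copy of the greetings list; the name `indexed_greetings` is only there for illustration.

```python
ways_to_greet = ["Hello", "Hi", "Ciao"]  # shortened copy of the list above

# enumerate() yields (index, element) pairs, so no manual indexing is needed
indexed_greetings = []
for index, greeting in enumerate(ways_to_greet):
    indexed_greetings.append(f"{index}: {greeting}")

for line in indexed_greetings:
    print(line)
```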


There are several functions and methods that you can use with lists. For example, len(my_list) gives you the number of elements in 'my_list'. Some others are:
- my_list.append('test') : adds the string element 'test' to the end of 'my_list'
- my_list.remove('test') : searches the list for a 'test' element and removes its first occurrence (raises a ValueError if there is none)
- my_list.sort() : sorts the list in place, in increasing order of values
- ...
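
The methods listed above can be tried on a small list. This is a minimal sketch with made-up values; the comments show the state of the list after each call.

```python
my_list = [3, 1, 2]

my_list.append(1)   # [3, 1, 2, 1]
my_list.remove(1)   # removes only the first 1 -> [3, 2, 1]
my_list.sort()      # sorts in place -> [1, 2, 3]
print(my_list)

# remove() raises ValueError when the value is absent, so check membership first
if 42 in my_list:
    my_list.remove(42)

# sorted() returns a new list and leaves the original untouched
descending = sorted(my_list, reverse=True)
print(descending)
```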

## Dictionaries reminder
A dictionary is another data type, which stores key : value pairs, much like a real dictionary, where the key is a word and the value is its definition. The idea is to access the items of a collection using something more meaningful than a number from 0 to the number of elements minus one.


Just like in lists, the values can be of any data type.
The keys, however, must be hashable (immutable values such as strings, numbers, or tuples of these), and will usually be strings.

In [4]:
empty_dict = {}
greeting_by_language = {"English": "Hello", "French": "Bonjour", "Farsi": "Salaam", "Albanian": "Mirëdita", "Serbian": "Zdravo"}
print(list(greeting_by_language.keys())) # dict.keys() returns a view of all the keys in the dictionary; list() turns it into a list

['English', 'French', 'Farsi', 'Albanian', 'Serbian']


We can then access the values of the dictionary using the keys.

In [5]:
print(greeting_by_language["Farsi"]) 

Salaam
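
Indexing a dictionary with a key it does not contain raises a `KeyError`. As a small sketch on a shortened copy of the dictionary above, `.get()` returns a fallback value instead, and `.items()` iterates over the key : value pairs. The `"unknown"` fallback is just an illustrative choice.

```python
greeting_by_language = {"English": "Hello", "French": "Bonjour", "Farsi": "Salaam"}

# .get() returns the second argument when the key is missing
italian = greeting_by_language.get("Italian", "unknown")
print(italian)

# .items() yields (key, value) pairs
lines = [f"{language}: {greeting}" for language, greeting in greeting_by_language.items()]
for line in lines:
    print(line)
```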


# Aim
The goal of this session is to filter some data about building materials based on certain attributes of each material. For instance, we want to get a list of the materials that belong to the "Minerals" category.
The data we are working with is a list of dictionaries. Each item in the list represents a material, and each material is a dictionary with information describing it.
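
To make the shape of the data concrete, here is a tiny hand-written list of dictionaries. The records are made up for illustration and only use three of the keys; the real data has more keys and many more rows.

```python
# Made-up sample records, not taken from the real data file
sample_materials = [
    {"name": "Sample concrete", "category": "Minerals", "type": "Concrete"},
    {"name": "Sample hardwood", "category": "Timber", "type": "Hardwood"},
]

first_material = sample_materials[0]   # each item of the list is a dict
print(first_material["name"])

# Collect one attribute across all the materials
categories = [material["category"] for material in sample_materials]
print(categories)
```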

## Importing data from csv with pandas
Before the session, I imported some data from a file and turned it into a list of dictionaries.

In [6]:
import pandas as pd

In [7]:
df = pd.read_csv("../data/materials.csv", sep=";")
materials = df.to_dict(orient="records")
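
To see what `to_dict(orient="records")` produces without needing the data file, the sketch below reads a two-row, semicolon-separated CSV from an in-memory string. The CSV content and the names `csv_text`, `sample_df` and `sample_records` are made up for illustration.

```python
import io

import pandas as pd

# Made-up CSV content, using ";" as separator like the real file
csv_text = "name;category\nSample concrete;Minerals\nSample hardwood;Timber\n"

sample_df = pd.read_csv(io.StringIO(csv_text), sep=";")

# orient="records" gives one dict per row, with the column names as keys
sample_records = sample_df.to_dict(orient="records")
print(sample_records)
```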

We can first check what the data looks like.

In [8]:
print(type(materials))
print(type(materials[0]))
print(list(materials[0].keys())) # Each material is defined by these keys

<class 'list'>
<class 'dict'>
['id', 'name', 'category', 'type', 'functional_unit', 'description', 'common_uses', 'comments', 'embodied_energy', 'embodied_water', 'embodied_carbon', 'weight']


# First algorithm
Let's first simplify the task and figure out how we could get a list of materials that belong to the category 'Minerals'.
Here is the pseudo-code for what needs to be done:

- initialize an empty list, minerals_materials
- loop through the materials list
    - check the material's category field
    - if the category is "Minerals" => add the material to minerals_materials

In [9]:
minerals_materials = []
for material in materials:
    if material['category'] == 'Minerals':
        minerals_materials.append(material) 

In [10]:
# OR, the more compact way, using a list comprehension:
minerals_materials_fast = [material for material in materials if material["category"]=="Minerals"]
minerals_materials_fast == minerals_materials  # Checking that the two methods lead to the same results

True

## Generalizing the algorithm
Now, filtering based on the category is a pretty common task, and we might sometimes be interested in a different category, like 'Timber' instead of 'Minerals'. To do that, we can build a reusable function that takes the category to filter on as an argument.

In [11]:
def filter_category(category_to_filter):
    filtered_materials = []
    for material in materials:
        if material['category'] == category_to_filter:
            filtered_materials.append(material) 
    return filtered_materials

# We can then call this function to filter based on any category we want:
print(filter_category('Minerals') == [material for material in materials if material["category"]=="Minerals"])
print(filter_category('Timber')  == [material for material in materials if material["category"]=="Timber"])

True
True


## Further generalization
We could add a parameter to our function in order to filter on attributes other than the category.

In [12]:
def filter_attribute(attribute_to_filter, attribute_value):
    filtered_materials = []
    for material in materials:
        if material[attribute_to_filter] == attribute_value:
            filtered_materials.append(material) 
    return filtered_materials

print(filter_attribute('category', 'Minerals') == [material for material in materials if material["category"]=="Minerals"])
print(filter_attribute('type', 'Glass') == [material for material in materials if material["type"]=="Glass"])

True
True
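
The functions above read the `materials` list from the surrounding notebook. One possible next step, sketched here under that observation, is to pass the list in as a parameter so the same function works on any list of dictionaries. The name `filter_by_attribute` and the sample records are hypothetical.

```python
def filter_by_attribute(records, attribute, value):
    """Return the records whose `attribute` equals `value`."""
    # Records that lack the key are skipped instead of raising KeyError
    return [
        record
        for record in records
        if attribute in record and record[attribute] == value
    ]


# Made-up sample records for illustration
sample_materials = [
    {"name": "Sample concrete", "category": "Minerals", "type": "Concrete"},
    {"name": "Sample glass", "category": "Minerals", "type": "Glass"},
    {"name": "Sample hardwood", "category": "Timber"},  # no "type" key
]

minerals = filter_by_attribute(sample_materials, "category", "Minerals")
glass = filter_by_attribute(sample_materials, "type", "Glass")
print([material["name"] for material in minerals])
print([material["name"] for material in glass])
```

Passing the data explicitly also makes the function easier to test, since it no longer depends on a variable defined elsewhere in the notebook.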


## Comparing with pandas
That's it! Now, using pandas to work with tables of data is much more convenient than relying on Python's built-in data types (e.g. lists, dictionaries, ...), because it provides this kind of filtering and much more. We could do the same filtering with the following line on a pandas dataframe:

In [13]:
filtered_materials = df[df["category"] == "Minerals"]
filtered_materials.head()

,id,name,category,type,functional_unit,description,common_uses,comments,embodied_energy,embodied_water,embodied_carbon,weight
0,1,20 MPa concrete mix (30% FA),Minerals,Concrete,m³,Concrete is a composite material combining san...,Floor slabs. suspended slabs. driveways. preca...,,2.026212,4011.163289,250.83422,2335.0
1,2,20 MPa concrete mix (30% GGBFS),Minerals,Concrete,m³,Concrete is a composite material combining san...,Floor slabs. suspended slabs. driveways. preca...,,2.185683,4033.505476,263.429388,2335.0
2,3,20 MPa concrete mix,Minerals,Concrete,m³,Concrete is a composite material combining san...,Floor slabs. suspended slabs. driveways. preca...,,2.403695,4153.722958,328.381715,2335.0
3,4,25 MPa concrete mix (30% FA),Minerals,Concrete,m³,Concrete is a composite material combining san...,Floor slabs. suspended slabs. precast wall panels,,2.24105,4028.347615,277.233297,2409.0
4,5,25 MPa concrete mix (30% GGBFS),Minerals,Concrete,m³,Concrete is a composite material combining san...,Floor slabs. suspended slabs. precast wall panels,,2.441117,4104.690027,292.537716,2409.0
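
Boolean masks like the one above can also be combined, which would take nested conditions in the pure-Python version. The sketch below uses a small made-up dataframe rather than the real data; `&` combines two conditions element-wise, and each condition needs its own parentheses.

```python
import pandas as pd

# Made-up sample rows for illustration
sample_df = pd.DataFrame(
    {
        "name": ["Sample concrete", "Sample glass", "Sample hardwood"],
        "category": ["Minerals", "Minerals", "Timber"],
        "type": ["Concrete", "Glass", "Hardwood"],
    }
)

# Keep the rows where both conditions hold
mask = (sample_df["category"] == "Minerals") & (sample_df["type"] == "Concrete")
filtered = sample_df[mask]

# Convert back to a list of dictionaries, as used earlier in the session
filtered_records = filtered.to_dict(orient="records")
print(filtered_records)
```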
