In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("../data/materials.csv", sep=";")
materials = df.to_dict(orient="records")

### Exercise 1
Go through the `materials` list of dictionaries and print the sum of the embodied energies of all materials in the 'Minerals' category.

##### Reminder: previous session's algorithm
- Initialize a list (to hold your output)
- loop through materials
    - check the object's category field
    - if the category is "Minerals" => add it to new_list (update the output)
    
In this case we are not trying to build a list. Our output should instead be a single number: the sum of the embodied energies of all materials that pass the if statement.

##### Algorithm for summing on the fly
- Initialize a variable y equal to 0
- loop through materials
    - check the material dictionary's category field 
    - if the category is 'Minerals' 
        - get the material's embodied energy
        - add the embodied energy to y

In [3]:
## SUMMING ON THE FLY
output_on_the_fly = 0  # initialize the sum of embodied energies to 0
for material in materials: #  go through each material
    if material['category'] == 'Minerals':  # check the category of each material
        embodied_energy = material['embodied_energy']  #  retrieve the material's embodied energy
        output_on_the_fly += embodied_energy  # if it's a Mineral, add its embodied energy coefficient to the sum
        #  1st iteration: this expression could look like:  output_on_the_fly = 0 + 3.23
        #  2nd iteration: this expression could look like:  output_on_the_fly = 3.23 (previous value) + 2.21
        #  ...
print(f"output_on_the_fly = {output_on_the_fly}") 
#  Once the loop is finished, we should have added the values of the embodied energy coefficient of all mineral materials

## OR

## USING A LIST
mineral_materials = [] # initialize an empty list
for material in materials: #  go through each material   
    if material['category'] == 'Minerals':  # check the category of each material
        mineral_materials.append(material['embodied_energy'])  # if it's a Mineral, add its embodied energy coefficient to the list
output_from_list = sum(mineral_materials) # Sum all the embodied energy coefficients using builtin function 'sum'
print(f"output_from_list = {output_from_list}")

output_on_the_fly = 69.13592424619038
output_from_list = 69.13592424619038
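
The same sum can also be written as a single expression by passing a generator expression to the built-in `sum`. The sketch below is only an illustration: `sample_materials` is a small hypothetical list with made-up values, not the content of materials.csv.

```python
# Hypothetical sample data with made-up values (NOT taken from materials.csv)
sample_materials = [
    {"name": "brick", "category": "Minerals", "embodied_energy": 3.0},
    {"name": "oak", "category": "Timber", "embodied_energy": 10.5},
    {"name": "glass", "category": "Minerals", "embodied_energy": 2.5},
]

# Keep only the 'Minerals' entries and sum their embodied energy in one pass
minerals_total = sum(
    material["embodied_energy"]
    for material in sample_materials
    if material["category"] == "Minerals"
)
print(f"minerals_total = {minerals_total}")
```

This does the same work as the "summing on the fly" loop: no intermediate list is built, the values are added as they are produced.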


### Exercise 2
Stepping it up a little: in this second exercise, we are interested not only in the sum of the embodied energies of Minerals, but in the sum of the embodied energies of each category.
Your output should be a dictionary looking like this:


{'Minerals': 13900, 'Timber': 25788, ... } 

#### FIRST, INEFFICIENT WAY
- initialize a dictionary z whose keys will be material categories and values will be 0 
- for each category x in all the existing categories

    - *Run previous algorithm while replacing 'Minerals' with the value of x:*
    - Initialize a variable y equal to 0
    - loop through materials
        - check the material dictionary's category field 
        - if the category is equal to x 
            - get the material's embodied energy
            - add the embodied energy to y
    - set the value of the key x of the dictionary z to y -> z\[x\] = y
            
This first requires us to have a list of all the existing categories, which we can get this way:

In [4]:
categories = list(set(mat['category'] for mat in materials))

## WHICH IS EQUIVALENT TO:

categories = set()  # A set is like a list, except that it is unordered and cannot contain the same value twice.
for item in materials:
    categories.add(item['category'])
categories

{'Ferrous metals',
 'Minerals',
 'Natural fibres',
 'Non ferrous metals',
 'Other chemicals',
 'Polymers',
 'Timber'}

From that, we can create our initial dictionary 'z' like this:

In [5]:
my_output_dict = {category: 0 for category in categories}

## WHICH IS EQUIVALENT TO:

my_output_dict = {}
for category in categories:
    my_output_dict[category] = 0
my_output_dict

{'Non ferrous metals': 0,
 'Natural fibres': 0,
 'Minerals': 0,
 'Other chemicals': 0,
 'Polymers': 0,
 'Ferrous metals': 0,
 'Timber': 0}

This is what the first algorithm looks like:

In [6]:
for category in categories: 
    #  e.g. in one iteration, category is equal to 'Minerals' (a set has no guaranteed order)
    #  If we substitute 'Minerals' for category in the code below, it is exactly the same as exercise 1
        
    output_on_the_fly = 0 
    for material in materials:       
        if material['category'] == category:  
            embodied_energy = material['embodied_energy'] 
            output_on_the_fly +=  embodied_energy
            
    # Each time we do exercise 1's algo for a category, we store the final value in the 'category' key of our dictionary
    my_output_dict[category] = output_on_the_fly  

my_output_dict

{'Non ferrous metals': 4.062554805334697,
 'Natural fibres': 1.2769074529365696,
 'Minerals': 69.13592424619038,
 'Other chemicals': 0.34575541662011133,
 'Polymers': 17.85852525272205,
 'Ferrous metals': 26.138118891679333,
 'Timber': 331.9579418760087}

Unfortunately, this is not very efficient: we go through all 131 materials once for each of the 7 categories, which makes 7\*131 = 917 iterations. Here is a better way to do it, which goes through the materials only once:

- initialize a dictionary z whose keys will be material categories and values will be 0 
- loop through materials
    - get the material's category and store it in a variable x
    - get the material's embodied energy y
    - add the value of this material's embodied energy to the sum of embodied energies for this material's category
        -> z\[x\] += y  

In [7]:
my_output_dict = {category: 0 for category in categories}

for material in materials:
    material_category = material['category']
    embodied_energy = material['embodied_energy']
    my_output_dict[material_category] += embodied_energy
    
my_output_dict

{'Non ferrous metals': 4.062554805334697,
 'Natural fibres': 1.2769074529365696,
 'Minerals': 69.13592424619038,
 'Other chemicals': 0.34575541662011133,
 'Polymers': 17.85852525272205,
 'Ferrous metals': 26.138118891679333,
 'Timber': 331.9579418760087}
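
If the list of categories is not known in advance, a possible variant is `collections.defaultdict`, which creates a missing key with a default value the first time it is accessed. This is a minimal sketch and not part of the original exercise: `sample_materials` is hypothetical data with made-up values.

```python
from collections import defaultdict

# Hypothetical sample data with made-up values (NOT taken from materials.csv)
sample_materials = [
    {"name": "brick", "category": "Minerals", "embodied_energy": 3.0},
    {"name": "oak", "category": "Timber", "embodied_energy": 10.5},
    {"name": "glass", "category": "Minerals", "embodied_energy": 2.5},
]

# defaultdict(float) gives every new key the starting value 0.0,
# so there is no need to build the set of categories beforehand
totals_by_category = defaultdict(float)
for material in sample_materials:
    totals_by_category[material["category"]] += material["embodied_energy"]

print(dict(totals_by_category))
```

The loop still visits each material exactly once, as in the efficient version above.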

## Going further: How would we do this in pandas?

In [8]:
# exercise 1:
ex1_output = sum(df[df["category"] == 'Minerals']['embodied_energy'])
print(ex1_output)

# which can be decomposed as:
mask = df["category"] == 'Minerals' #  create a boolean mask (True/False for each row) identifying which rows are Minerals
mineral_materials = df[mask]  # apply mask/filter to get a dataframe with only minerals
mineral_materials_embodied_energies = mineral_materials['embodied_energy'] # get only the 'embodied energy' column
ex1_output = sum(mineral_materials_embodied_energies) # sum all values
print(ex1_output)

69.13592424619038
69.13592424619038


In [9]:
# exercise 2:
ex2_output = df.groupby('category')['embodied_energy'].sum().to_dict()
print(ex2_output)

# which can be decomposed as:
df_by_category = df.groupby('category') # group materials by category
df_by_category_embodied_energies = df_by_category['embodied_energy']  # only get the 'embodied_energy' column
summed_embodied_energies_by_group = df_by_category_embodied_energies.sum()  #  sum the values within each group
result_as_dict = summed_embodied_energies_by_group.to_dict() # turn the resulting Series into a dictionary
print(result_as_dict)

{'Ferrous metals': 26.138118891679333, 'Minerals': 69.13592424619036, 'Natural fibres': 1.2769074529365696, 'Non ferrous metals': 4.062554805334696, 'Other chemicals': 0.34575541662011133, 'Polymers': 17.85852525272205, 'Timber': 331.9579418760087}
{'Ferrous metals': 26.138118891679333, 'Minerals': 69.13592424619036, 'Natural fibres': 1.2769074529365696, 'Non ferrous metals': 4.062554805334696, 'Other chemicals': 0.34575541662011133, 'Polymers': 17.85852525272205, 'Timber': 331.9579418760087}
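
Note that some pandas totals differ from the loop totals in the last digit (for example 'Minerals'). This is ordinary floating-point rounding, most likely because the values are not accumulated in the same order or with the same summation method. When comparing such results, it is safer to compare with a tolerance than with `==`. A minimal sketch, using the two 'Minerals' values copied from the outputs above:

```python
import math

loop_result = 69.13592424619038    # 'Minerals' total from the Python loop
pandas_result = 69.13592424619036  # 'Minerals' total from groupby(...).sum()

# Exact comparison: sensitive to the last-digit rounding difference
print(loop_result == pandas_result)

# Comparison with a relative tolerance (default rel_tol is 1e-09)
print(math.isclose(loop_result, pandas_result))
```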
