# Use of `shrecc`

This notebook guides you through the steps needed to create temporally and regionally specific electricity consumption mixes. These mixes are written directly to a dedicated `brightway` database, ready to be used in your own LCA project.

In [None]:
import bw2data as bd
import bw2io as bi
import pandas as pd

from shrecc.download import get_data
from shrecc.treatment import data_processing
from shrecc.database import create_database, filt_cutoff

%load_ext line_profiler
%load_ext autoreload
%autoreload 2

## Prepare your data

This step is only needed if you want data for a year other than the default year in the `shrecc_data` [repository](https://git.list.lu/shrecc_project/shrecc_data), or if you want to recalculate the data yourself.

First, run `get_data()`. It downloads the electricity data for all countries for the selected year.

In [None]:
data = get_data(year=2024,
                path_to_data='your_path_to_data')
# If no path is given, shrecc creates a folder inside shrecc/data and saves the data there.

Next, process the data. This step is computationally heavy because it involves matrix inversion, so we recommend running it on a server.

In [None]:
data_processing(data_df=data,
                year=2024,
                path_to_data='your_path_to_data')
# If no path is given, shrecc creates a folder inside shrecc/data and saves the data there.
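
Why a matrix inversion appears at all can be illustrated with a small flow-tracing example. The sketch below is not shrecc's implementation. It assumes a generic formulation in which the electricity consumed in a country is its own production plus imports, and imports carry the consumption mix of the exporting country. The three countries and all numbers are made up, and the linear system is solved by fixed-point iteration, which converges to the same result as the explicit inverse.

```python
# Hypothetical three-country system; all numbers are illustrative only.
countries = ["A", "B", "C"]
production = {"A": 80.0, "B": 50.0, "C": 100.0}  # MWh generated domestically
# imports[i][j] = MWh that country i imports from country j
imports = {
    "A": {"B": 20.0},
    "B": {"A": 10.0, "C": 40.0},
    "C": {},
}

# Total supply available in each country: own production plus all imports.
supply = {c: production[c] + sum(imports[c].values()) for c in countries}

# Direct shares of own production and of imports, relative to total supply.
own_share = {c: production[c] / supply[c] for c in countries}
import_share = {
    i: {j: imports[i].get(j, 0.0) / supply[i] for j in countries}
    for i in countries
}

# mix[i][k] = share of the electricity consumed in i that was generated in k.
# It satisfies mix = D + A @ mix, i.e. mix = (I - A)^-1 @ D, solved here by
# fixed-point iteration instead of an explicit inverse.
mix = {i: {k: 0.0 for k in countries} for i in countries}
for _ in range(200):
    mix = {
        i: {
            k: (own_share[i] if i == k else 0.0)
            + sum(import_share[i][j] * mix[j][k] for j in countries)
            for k in countries
        }
        for i in countries
    }

for c in countries:
    print(c, {k: round(v, 3) for k, v in mix[c].items()})
```

Each row of `mix` sums to one. With hourly data, a system of this kind has to be solved for every time step and for many more countries, which is consistent with the processing step being heavy.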

That's it! Now select the countries and time periods you are interested in and create your database.

## Get your project ready
To run the following functions, you need a `brightway` (2 or 2.5) project with the biosphere and ecoinvent databases installed. The matching of electricity sources is valid for ecoinvent 3.9.1, but should be compatible with 3.10 and 3.11. You also need a `pandas` dataframe with all the electricity data, which you either created yourself with the steps above or downloaded from [our git](https://git.list.lu/shrecc_project/shrecc_data).

In [None]:
bd.projects.set_current('SHRECCei311')

In [None]:
bd.databases

In [None]:
if 'ecoinvent-3.11-cutoff' not in bd.databases:

    bi.import_ecoinvent_release(
        version="3.11",
        system_model="cutoff",
        username="XX",
        password="YY",
    )

To download already-calculated data, you can visit the SHRECC data git repository: https://git.list.lu/shrecc_project/shrecc_data

### Example 1: all days in June around noon
In this example, we model the consumption mix of five countries, around noon on all days in June.


In [None]:
countries = ["FI", "SE", "LU", "AT", "FR"]

In [None]:
example = filt_cutoff(
    countries=countries,
    general_range=["2024-06-01 01:00:00",
                   "2024-06-30 23:00:00"],
    refined_range=[10, 14],
    freq="h",
    cutoff=1e-3,
    include_cutoff=True,
    path_to_data = 'your_path_to_data'
)
# If no path is given, shrecc looks for the data inside shrecc/data.
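
How `general_range` and `refined_range` narrow the data can be pictured with the standard library alone. The sketch below does not call shrecc. It assumes that `refined_range=[10, 14]` keeps the hours 10 to 14 inclusive on every day inside `general_range`; whether the upper bound is inclusive should be checked against the shrecc documentation.

```python
from datetime import datetime, timedelta

# Same window as Example 1.
start = datetime(2024, 6, 1, 1)
end = datetime(2024, 6, 30, 23)
first_hour, last_hour = 10, 14  # assumed to be inclusive bounds

# Build every hourly timestamp in the general range.
n_hours = int((end - start) / timedelta(hours=1)) + 1
timestamps = [start + timedelta(hours=h) for h in range(n_hours)]

# Keep only the timestamps whose hour of day falls in the refined range.
selected = [t for t in timestamps if first_hour <= t.hour <= last_hour]

print(len(timestamps), "hourly steps in the general range,", len(selected), "around noon")
```

Under this assumption, the mix in Example 1 is built from five hourly values per day rather than from the whole day.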

#### Create a database from the filtered and cut-off dataframe

In [None]:
example.head()

In [None]:
create_database(dataframe_filt = example,
                project_name   = "SHRECCei311",
                db_name        = "elec_june_2024_noon",
                eidb_name      = "ecoinvent-3.11-cutoff")

### Example 2: all days in February, in the evening
Another example, this time for February evenings.

In [None]:
example2 = filt_cutoff(
    countries=countries,
    general_range=["2024-02-01 01:00:00",
                   "2024-02-29 23:00:00"],  # 2024 is a leap year
    refined_range=[19, 23],
    freq="h",
    cutoff=1e-3,
    include_cutoff=True,
    path_to_data = 'your_path_to_data'
)
# If no path is given, shrecc looks for the data inside shrecc/data.
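
The role of `cutoff` and `include_cutoff` can be sketched on a single made-up mix. This is an assumption about the behaviour, inferred from the parameter names and from the discussion of small imports later in this notebook: sources whose share is below `cutoff` are removed, and with `include_cutoff=True` their summed share is kept as one aggregated entry. The helper `apply_cutoff` and the source labels are hypothetical.

```python
def apply_cutoff(shares, cutoff=1e-3, include_cutoff=True,
                 rest_label="rest (below cutoff)"):
    """Drop shares below `cutoff`; optionally keep their sum as one entry."""
    kept = {k: v for k, v in shares.items() if v >= cutoff}
    rest = sum(v for v in shares.values() if v < cutoff)
    if include_cutoff and rest > 0:
        kept[rest_label] = rest
    return kept

# Illustrative shares of one consumption mix (they sum to 1).
mix = {
    "LU, solar": 0.2500,
    "DE, wind": 0.4000,
    "FR, nuclear": 0.3490,
    "BE, biomass": 0.0006,
    "NL, gas": 0.0004,
}

with_rest = apply_cutoff(mix)
without_rest = apply_cutoff(mix, include_cutoff=False)
print(with_rest)
print(round(sum(without_rest.values()), 4))
```

Keeping the aggregated entry preserves the total of the mix, whereas dropping it leaves a mix that no longer sums to one.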

In [None]:
create_database(dataframe_filt = example2,
                project_name   = "SHRECCei311",
                db_name        = "elec_february_2024_evening",
                eidb_name      = "ecoinvent-3.11-cutoff")

## Check that shrecc results are similar enough to ecoinvent

We compare 2021 Energy-Charts data with ecoinvent 3.11, which uses 2021 data for most European countries (except Switzerland).
You can download our data from shrecc_project/shrecc_data/2021. Supply the path to where you saved the data.

In [None]:
data = get_data(year=2021,
                path_to_data='your_path_to_data')
# If no path is given, shrecc creates a folder inside shrecc/data, downloads the data and saves it there.

In [None]:
data_processing(data_df=data,
                year=2021,
                path_to_data='your_path_to_data')
# If no path is given, shrecc creates a folder inside shrecc/data and saves the data there.

Select all countries without missing data:

In [None]:
countries = [
        "AT",
        "BE",
        "CZ",
        "DE",
        "DK",
        "ES",
        "FI",
        "FR",
        "GR",
        "HU",
        "IE",
        "IT",
        "LU",
        "NL",
        "NO",
        "PL",
        "PT",
        "RO",
        "SE",
        "SK",
        "SI",]

In [None]:
year2021 = filt_cutoff(
    countries=countries,
    general_range=["2021-01-01 00:00:00", "2021-12-31 23:00:00"],
    freq="h",
    cutoff=1e-8,
    include_cutoff=True
)

In [None]:
create_database(dataframe_filt = year2021,
                project_name   = 'SHRECCei311',
                db_name        = "full year 2021",
                eidb_name      = "ecoinvent-3.11-cutoff")

In [None]:
import bw2calc as bc

In [None]:
ef_methods = [m for m in bd.methods if m[1] == 'EF v3.0']
config = {"impact_categories" : ef_methods}

In [None]:
demands_year = {f"{a['name']}, full year":{a['id']:1} for a in bd.Database('full year 2021')}

In [None]:
acts_ei = [bd.get_node(database="ecoinvent-3.11-cutoff",
                       location=country,
                       name='market for electricity, low voltage') for country in sorted(countries)]

In [None]:
demands_year.update({f"{a['name']}, {a['location']}":{a['id']:1} for a in acts_ei})

In [None]:
data_objs = bd.get_multilca_data_objs(functional_units=demands_year,
                                      method_config=config)

In [None]:
m_lca = bc.MultiLCA(demands=demands_year,
                    method_config=config,
                    data_objs=data_objs)

In [None]:
m_lca.lci()

In [None]:
m_lca.lcia()

In [None]:
results = []
for (method, fu), score in m_lca.scores.items():    
    demand = m_lca.demands[fu]
    a_id = list(demand).pop()
    a = bd.get_activity(a_id)
    results.append(
            {
                "activity": a['name'],
                "database": a['database'],
                "location": a['location'],
                "production amount": demand[a_id],
                "activity unit": a["unit"],
                "reference product": a["reference product"],
                "methodology": method[1],
                "category": method[2],
                "indicator": method[3],
                "score": score,
                "unit": bd.Method(method).metadata["unit"],
            },
        )
res_df = pd.DataFrame(results)
res_df = res_df.set_index([c for c in res_df.columns if c != 'score']).unstack(['category','indicator','unit'])

In [None]:
res_df.sort_index(level='location', inplace=True)

In [None]:
data = res_df[('score', 'climate change', 'global warming potential (GWP100)', 'kg CO2-Eq')]
data = data.droplevel('activity')
data = data.loc[data.index.get_level_values('location').isin(countries)]
data = data.unstack('location').T

ax = data.plot.bar(figsize=(10, 6))
ax.set_ylabel('kg CO₂ eq./kWh')

import matplotlib.pyplot as plt
plt.tight_layout()
plt.show()
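
To put a number on "similar enough", the relative difference between the two scores can be computed per country. The sketch below uses placeholder values and country codes, not results from shrecc or ecoinvent; in the notebook, the values would come from the `data` frame plotted above. The function name `relative_difference` is illustrative.

```python
def relative_difference(shrecc_score, ecoinvent_score):
    """Signed difference of the shrecc score relative to the ecoinvent score."""
    return (shrecc_score - ecoinvent_score) / ecoinvent_score

# Placeholder scores in kg CO2 eq./kWh, for illustration only.
scores = {
    "XX": {"shrecc": 0.110, "ecoinvent": 0.100},
    "YY": {"shrecc": 0.380, "ecoinvent": 0.400},
}

for country_code, s in scores.items():
    diff = relative_difference(s["shrecc"], s["ecoinvent"])
    print(f"{country_code}: {diff:+.1%}")
```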

# Test
We run an LCA for the mixes created in Examples 1 and 2, as well as their ecoinvent 3.11 equivalents. Note that the shrecc mixes use 2024 data whereas ecoinvent 3.11 uses 2021 data, so part of the difference comes from changes in the grid.

In [None]:
demands_summer_noon    = {f"{a['name']}, summer, noon":{a['id']:1} for a in bd.Database('elec_june_2024_noon')}
demands_winter_evening = {f"{a['name']}, winter, evening":{a['id']:1} for a in bd.Database('elec_february_2024_evening')}
demands = {**demands_summer_noon, **demands_winter_evening}

In [None]:
acts_ei = [bd.get_node(database='ecoinvent-3.11-cutoff',
                       location=country,
                       name='market for electricity, low voltage') for country in sorted(countries)]
acts_ei

In [None]:
demands.update({f"{a['name']}, {a['location']}":{a['id']:1} for a in acts_ei})
demands

In [None]:
data_objs = bd.get_multilca_data_objs(functional_units=demands,
                                      method_config=config)

In [None]:
m_lca = bc.MultiLCA(demands=demands,
                    method_config=config,
                    data_objs=data_objs)

In [None]:
m_lca.lci()

In [None]:
m_lca.lcia()

In [None]:
results = []
for (method, fu), score in m_lca.scores.items():    
    demand = m_lca.demands[fu]
    a_id = list(demand).pop()
    a = bd.get_activity(a_id)    
    # prod_e = list(a.production()).pop()    
    results.append(
            {
                "activity": a['name'],
                "database": a['database'],
                "location": a['location'],
                "production amount": demand[a_id],
                "activity unit": a["unit"],
                "reference product": a["reference product"],
                "methodology": method[0],
                "category": method[1],
                "indicator": method[2],
                "score": score,
                "unit": bd.Method(method).metadata["unit"],
            },
        )
res_df = pd.DataFrame(results)
res_df = res_df.set_index([c for c in res_df.columns if c != 'score']).unstack(['category','indicator','unit'])


In [None]:
res_df.sort_index(level='location', inplace=True)

In [None]:
ax = res_df[('score','EF v3.0', 'climate change', 'kg CO2-Eq')].droplevel(['activity']).unstack(['location']).T.plot.bar(figsize=(10,6))
ax.set_ylabel('kg CO2 eq./kWh')

The results differ for some countries, notably Norway and the Netherlands. For Norway, we recommend setting the cutoff to 0: small imports below the cutoff are summed together and modelled with the European mix. For most countries that is fine, but Norwegian electricity has a very low impact, so using the European average skews the result considerably. For the Netherlands, there are some issues in the reporting of solar power; this will be fixed in a future version of shrecc.
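
The effect described for Norway is simple arithmetic. The sketch below uses round, made-up emission factors (not values from shrecc or ecoinvent) and a simplified baseline in which the whole mix has the domestic factor. It shows why modelling a small share of a very clean mix with an average mix has a large relative effect, while the same share barely changes a more carbon-intensive mix.

```python
def blended_factor(domestic, average, rest_share):
    """Emission factor when `rest_share` of the mix is modelled with `average`."""
    return (1 - rest_share) * domestic + rest_share * average

# Hypothetical emission factors in kg CO2 eq./kWh.
european_average = 0.300
clean_grid = 0.010    # stands in for a very low-impact grid
typical_grid = 0.250  # stands in for a grid close to the average
rest_share = 0.02     # summed share of all imports below the cutoff

for label, domestic in [("clean grid", clean_grid), ("typical grid", typical_grid)]:
    blended = blended_factor(domestic, european_average, rest_share)
    change = blended / domestic - 1
    print(f"{label}: {domestic:.3f} -> {blended:.4f} ({change:+.1%})")
```

With these illustrative numbers, the clean grid's factor rises by more than half, while the typical grid's factor changes by well under one percent.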

# Further analysis and comparisons

In [None]:
m_lca.lci()

In [None]:
m_lca.invert_technosphere_matrix()

In [None]:
m_lca.demands

In [None]:
matrix_indices = [{k:m_lca.activity_dict[vk] for vk in v.keys()} for k,v in m_lca.demands.items()]
matrix_indices

In [None]:
m_lca.inverted_technosphere_matrix.shape

In [None]:
demands_inverted = {k:m_lca.inverted_technosphere_matrix[:,v] for d in matrix_indices for k,v in d.items()}
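
Each column of the inverted technosphere matrix gives the total output of every activity needed, directly and through the supply chain, to deliver one unit of the activity in that column. The two-activity system below is a made-up illustration of that reading; it uses a hand-written 2×2 inverse rather than Brightway, and the activities and amounts are hypothetical.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical technosphere matrix (rows: products, columns: activities).
# Column 0: a consumption mix producing 1 kWh and using 0.9 kWh of wind power.
# Column 1: wind power producing 1 kWh and using 0.01 kWh of the mix.
technosphere = [
    [1.0, -0.01],
    [-0.9, 1.0],
]

inverse = invert_2x2(technosphere)

# Column 0 of the inverse: total supply needed for 1 kWh of the mix.
supply_for_mix = [row[0] for row in inverse]
print(supply_for_mix)
```

The wind entry is slightly above the direct input of 0.9 kWh because wind power itself consumes a little of the mix, which is the kind of indirect demand the inverse captures.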

In [None]:
reversed_activity_dict = {v:k for k,v in m_lca.activity_dict.items()}

In [None]:
matrix_index = [bd.get_activity(reversed_activity_dict[i]) for i in range(m_lca.inverted_technosphere_matrix.shape[0])]

In [None]:
properties = ['name', 'unit', 'code', 'location', 'reference product', 'type', 'database', 'id']
df_index = pd.MultiIndex.from_tuples([[m[prop] for prop in properties] for m in matrix_index], names=properties)

In [None]:
demands_inverted_df = pd.DataFrame(demands_inverted, index=df_index)

In [None]:
process_contribution = demands_inverted_df.sort_values(by=demands_inverted_df.columns[0], ascending=False)

In [None]:
idx_elec = process_contribution.index.get_level_values('name').str.startswith('electricity production')

In [None]:
process_contribution_elec = process_contribution.loc[idx_elec].sort_index(axis=1).droplevel(['unit','code','reference product','type','database','id'])
process_contribution_elec

In [None]:
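country = 'LU'    # placeholder; the loop below iterates over all countries
threshold = 0.05  # activities below this contribution in every column are grouped under 'REST'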



In [None]:
for country in countries:
    process_contribution_elec_c = process_contribution_elec[[c for c in process_contribution_elec if country in c]]

    idx_filt = (process_contribution_elec_c >= threshold).any(axis=1)
    mix_to_plot = pd.concat([process_contribution_elec_c.loc[idx_filt],
            pd.DataFrame(process_contribution_elec_c[~idx_filt].sum(), columns=[('REST', 'REST')]).T],
            axis=0).T#.droplevel(['code','unit'], axis=0)

    ax = mix_to_plot.plot.bar(stacked=True)
    h,l = ax.get_legend_handles_labels()
    ax.legend(loc='center left', bbox_to_anchor=(1.05,0.5), handles=h[::-1], labels=l[::-1])
    ax.set_title(f'Comparison of background process contribution for electricity-producing activities\nbetween shrecc and ecoinvent, for location {country}')


Some remarks about the mix analysis.

1. The comparison of background processes for Sweden shows that the electricity mix produced by shrecc contains aggregated data for hydroelectricity. The Energy-Charts API does not distinguish between hydro types, so run-of-river, reservoir and pumped storage are all grouped in one category, whereas ecoinvent keeps the different types separate in its mix.
2. Not all background processes sum to one: we searched for names starting with "electricity production", and some activities with a different name may still contribute to the electricity mix.
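
The second remark can be made concrete with a few activity names. The names below follow ecoinvent naming patterns, but the list and the contribution values are made up for illustration. The point is only that a prefix filter on "electricity production" leaves out electricity from, for example, co-generation.

```python
# Illustrative contributions to 1 kWh of a mix (made-up values summing to 1).
contributions = {
    "electricity production, hydro, run-of-river": 0.40,
    "electricity production, wind, 1-3MW turbine, onshore": 0.35,
    "heat and power co-generation, natural gas": 0.20,
    "treatment of municipal solid waste, incineration": 0.05,
}

prefix = "electricity production"
matched = {k: v for k, v in contributions.items() if k.startswith(prefix)}
missed = {k: v for k, v in contributions.items() if not k.startswith(prefix)}

print(f"captured by the filter: {sum(matched.values()):.2f}")
print(f"left out: {sorted(missed)}")
```

Inspecting the names that the filter leaves out is a quick way to see which activities account for the missing share.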