# Demonstration of the `to_dataframe` functions

The `to_dataframe` functions work with both normal and IO databases. We start with a normal database (stored normally, even if the data originally comes from an IO table).

Normal databases have each edge stored in the SQLite table as a separate row.

In [1]:
import bw2data as bd
import bw2calc as bc
import bw2io as bi

In [55]:
bd.projects.set_current("to_dataframe USEEIO")

`bi.useeio11()` is a utility function that installs the US EEIO and its LCIA methods easily. This database has its own biosphere flows, so we don't run `bw2setup`.

In [3]:
bi.useeio11()

Downloading US EEIO 1.1
Downloading US_Environmental_Protection_Agency-USEEIO.zip to /var/folders/rn/ht0vvs3s7mz2h9f_xjt9x4040000gn/T/tmpa5rleiqp/US_Environmental_Protection_Agency-USEEIO.zip


13238272it [00:01, 8304141.85it/s]                                                                                                                             


Unzipping file
Importing data
Applying strategy: json_ld_allocate_datasets
Applying strategy: json_ld_get_normalized_exchange_locations
Applying strategy: json_ld_convert_unit_to_reference_unit
Applying strategy: json_ld_get_activities_list_from_rawdata
Applying strategy: json_ld_add_products_as_activities
Applying strategy: json_ld_get_normalized_exchange_units
Applying strategy: json_ld_add_activity_unit
Applying strategy: json_ld_rename_metadata_fields
Applying strategy: json_ld_location_name
Applying strategy: json_ld_remove_fields
Applying strategy: json_ld_fix_process_type
Applying strategy: json_ld_label_exchange_type
Applying strategy: json_ld_prepare_exchange_fields_for_linking
Applying strategy: add_database_name
Applying strategy: link_iterable_by_fields
Applying strategy: link_iterable_by_fields


Writing activities to SQLite3 database:


Applying strategy: normalize_units
Applied 17 strategies in 0.56 seconds
Moved 1873 biosphere flows to `self.data`
2649 datasets
162926 exchanges
0 unlinked exchanges
  
Not able to determine geocollections for all datasets. This database is not ready for regionalization.


0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Title: Writing activities to SQLite3 database:
  Started: 08/21/2022 21:34:26
  Finished: 08/21/2022 21:34:30
  Total time elapsed: 00:00:04
  CPU %: 97.70
  Memory %: 1.89
Created database: US EEIO 1.1
Applying strategy: json_ld_lcia_add_method_metadata
Applying strategy: json_ld_lcia_convert_to_list
Applying strategy: json_ld_lcia_set_method_metadata
Applying strategy: json_ld_lcia_reformat_cfs_as_exchanges
Applying strategy: normalize_units
Applied 5 strategies in 0.01 seconds
19 methods
4511 cfs
0 unlinked cfs
Wrote 19 LCIA methods with 4511 characterization factors


Pick a product and an activity node at random.

In [48]:
product = next(node for node in bd.Database("US EEIO 1.1") if node['type'] == 'product')
activity = next(node for node in bd.Database("US EEIO 1.1") if node['type'] == 'process')
product, activity

('Scientific research and development' (, United States, ('54: Professional, Scientific, and Technical Services', '5417: Scientific Research and Development Services')),
 'Other retail' (USD, United States, None))

The first dataframe lists all the nodes in the given database (processes, products, and biosphere flows):

In [49]:
df = bd.Database("US EEIO 1.1").nodes_to_dataframe()
df

Unnamed: 0,CAS number,categories,classifications,code,database,description,dqEntry,dqSystem,exchangeDqSystem,filename,id,location,modified,name,processDocumentation,type,unit,version
1175,,"(water, unspecified)",,2ee4697d-b7f4-362b-86a4-94b644699500,US EEIO 1.1,,,,,,1134,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
1745,,"(air, low population density)",,6ca23b5d-83dc-3b02-bf39-8eabf9d41151,US EEIO 1.1,,,,,,1640,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
1948,,"(soil, groundwater)",,5b98f875-8d1c-3549-a7df-28d7d90e7ccb,US EEIO 1.1,,,,,,1427,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
1174,,"(water, unspecified)",,93086e32-c013-3e34-a074-4760c72fe775,US EEIO 1.1,,,,,,1886,,,(4-CHLORO-2-METHYLPHENOXY)ACETIC ACID COMPD. W...,,emission,,
2300,,"(soil, groundwater)",,b8889098-9a89-35c6-b226-07cb66c83217,US EEIO 1.1,,,,,,2160,,,(4-CHLORO-2-METHYLPHENOXY)ACETIC ACID COMPD. W...,,emission,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1088,,"(air, unspecified)",,d9a5b786-d06c-44af-a088-b070aa605d9b,US EEIO 1.1,,,,,,2370,,,trifluralin,,emission,,
222,7440622.0,"(soil, industrial)",,1a5850a0-0069-4b73-bb91-7a61e8d45ae5,US EEIO 1.1,,,,,,971,,,vanadium,,emission,,
1146,7440622.0,"(water, unspecified)",,63e8256e-8549-11e0-9d78-0800200c9a66,US EEIO 1.1,,,,,,1577,,,vanadium,,emission,,
2373,7440622.0,"(air, unspecified)",,591b0a62-8064-4697-86ed-47bfa1f8b5e6,US EEIO 1.1,,,,,,1407,,,vanadium,,emission,,


The columns come from the data attributes stored on the nodes. If at least one node has an attribute, it is added as a column. You can control which columns are returned and how they are sorted; see the docstring.

This is a normal dataframe, so you can filter it, add or remove columns, and sort as desired.
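As a minimal sketch of that kind of post-processing, the snippet below builds a small stand-in dataframe with a few of the columns shown above and then filters, trims, and sorts it with plain pandas calls. The rows are invented for illustration and are not taken from the database; with the real data you would start from the `df` returned by `nodes_to_dataframe()` instead.

```python
import pandas as pd

# Stand-in for the result of nodes_to_dataframe(); the values are made up.
nodes = pd.DataFrame(
    [
        {"id": 1, "name": "Dairies", "type": "process", "unit": "USD", "location": "United States"},
        {"id": 2, "name": "Dairies", "type": "product", "unit": None, "location": "United States"},
        {"id": 3, "name": "vanadium", "type": "emission", "unit": None, "location": None},
        {"id": 4, "name": "Coal; at mine", "type": "process", "unit": "USD", "location": "United States"},
    ]
)

# Keep only process nodes, drop a column we don't need, and sort by name.
processes = (
    nodes[nodes["type"] == "process"]
    .drop(columns=["location"])
    .sort_values("name")
    .reset_index(drop=True)
)
print(processes)
```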

In [50]:
df.columns

Index(['CAS number', 'categories', 'classifications', 'code', 'database',
       'description', 'dqEntry', 'dqSystem', 'exchangeDqSystem', 'filename',
       'id', 'location', 'modified', 'name', 'processDocumentation', 'type',
       'unit', 'version'],
      dtype='object')

We can also list all the edges (exchanges) as a dataframe. This is normally too much information, and can take a bit of time to produce, but can be useful.

In [51]:
df = bd.Database("US EEIO 1.1").edges_to_dataframe()
df

Getting activity data


100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2649/2649 [00:00<00:00, 368893.76it/s]


Adding exchange data to activities


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 162926/162926 [00:02<00:00, 61446.10it/s]


Filling out exchange data


100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2649/2649 [00:00<00:00, 180277.97it/s]


Creating DataFrame
Compressing DataFrame


Unnamed: 0,target_id,target_database,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type
0,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,459,US EEIO 1.1,26bed504-3f97-3a2b-aa83-ffbe94f3b371,Frozen food; at manufacturer,,United States,,,1.000000e+00,production
1,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,437,US EEIO 1.1,1bafcbbb-dbe0-338d-b9c1-8c355426cbef,State and local government enterprises,,United States,,,1.338032e-03,technosphere
2,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1020,US EEIO 1.1,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,,,,1.114413e-07,biosphere
3,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1728,US EEIO 1.1,7ae398b3-8532-11e0-9d78-0800200c9a66,ethylene glycol,,,,,1.223946e-07,biosphere
4,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,408,US EEIO 1.1,0bb0108f-c486-32b3-b059-e0c6c8380571,"Tobacco, cotton, sugarcane, peanuts, sugar bee...",,United States,,,9.990870e-06,technosphere
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
162921,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1426,US EEIO 1.1,5b2a19b9-1243-44ae-b76c-c0d92159d5d6,2-METHOXYETHANOL,,,,,8.602175e-12,biosphere
162922,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,2622,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,,,2.635425e-08,biosphere
162923,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1462,US EEIO 1.1,5fd672a0-cb68-39e6-88dc-db1a9281c57b,"2,4,5,2',5'-PENTACHLOROBIPHENYL",,,,,9.791193e-19,biosphere
162924,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,589,US EEIO 1.1,7e5c6ee1-a47e-3afd-9278-dcb5e1ee50b5,"Screws, nuts, and bolts; at manufacturer",,United States,,,1.810643e-02,technosphere


Now we have standard column labels. As these are directed edges, they have a source and a target. Most of the columns should be self-explanatory. Note that we differentiate between `'target_reference_product'` and `'source_product'`, and only provide the `categories` on the `source`.

In [52]:
df.columns

Index(['target_id', 'target_database', 'target_code', 'target_name',
       'target_reference_product', 'target_location', 'target_unit',
       'target_type', 'source_id', 'source_database', 'source_code',
       'source_name', 'source_product', 'source_location', 'source_unit',
       'source_categories', 'edge_amount', 'edge_type'],
      dtype='object')

If you want to add or remove columns, you can pass in an iterable of formatting functions. These functions must satisfy the following rules:

* They take the keyword arguments `node`, `edge`, and `row`.
* They modify the dictionary `row` in place. Any return value is ignored.
* `node` and `edge` are dictionaries following the [wurst internal format](https://wurst.readthedocs.io/#internal-data-format). `node` is the target, and `edge` holds the attributes of both the edge and its source.

Here is a simple example:

In [61]:
def remove_target_database(node, edge, row):
    del row['target_database']
    
def food_sector(node, edge, row):
    row['is_food'] = 'food' in edge['name'].lower()
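Because formatters only touch plain dictionaries, they can be tried on hand-written inputs before running them over the whole database. The sketch below does that with two hypothetical formatters (`flag_biosphere` and `drop_target_code`). The dictionary keys and the calling loop are assumptions made for illustration; they are not taken from the `bw2data` source.

```python
def flag_biosphere(node, edge, row):
    """Hypothetical formatter: add a boolean column for biosphere edges."""
    row["is_biosphere"] = edge.get("type") == "biosphere"


def drop_target_code(node, edge, row):
    """Hypothetical formatter: remove a column; pop() tolerates a missing key."""
    row.pop("target_code", None)


# Hand-written stand-ins for the dictionaries a formatter receives.
node = {"name": "Frozen food; at manufacturer", "unit": "USD"}
edge = {"name": "Dinitrogen monoxide", "type": "biosphere", "amount": 1.114413e-07}
row = {
    "target_name": node["name"],
    "target_code": "01624075-b520-3826-bd73-2068f7aa24e7",
    "source_name": edge["name"],
    "edge_amount": edge["amount"],
}

# Call each formatter with keyword arguments; return values are ignored.
for formatter in (flag_biosphere, drop_target_code):
    formatter(node=node, edge=edge, row=row)

print(row)
```

Using `row.pop(key, None)` rather than `del row[key]` keeps a formatter from failing if an earlier formatter already removed the column.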

In [63]:
df = bd.Database("US EEIO 1.1").edges_to_dataframe(formatters=[remove_target_database, food_sector])
df

Getting activity data


100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2649/2649 [00:00<00:00, 268355.22it/s]


Adding exchange data to activities


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 162926/162926 [00:02<00:00, 64784.46it/s]


Filling out exchange data


100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2649/2649 [00:00<00:00, 167645.59it/s]


Creating DataFrame
Compressing DataFrame


Unnamed: 0,target_id,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type,is_food
0,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,459,US EEIO 1.1,26bed504-3f97-3a2b-aa83-ffbe94f3b371,Frozen food; at manufacturer,,United States,,,1.000000e+00,production,True
1,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,437,US EEIO 1.1,1bafcbbb-dbe0-338d-b9c1-8c355426cbef,State and local government enterprises,,United States,,,1.338032e-03,technosphere,False
2,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1020,US EEIO 1.1,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,,,,1.114413e-07,biosphere,False
3,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1728,US EEIO 1.1,7ae398b3-8532-11e0-9d78-0800200c9a66,ethylene glycol,,,,,1.223946e-07,biosphere,False
4,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,408,US EEIO 1.1,0bb0108f-c486-32b3-b059-e0c6c8380571,"Tobacco, cotton, sugarcane, peanuts, sugar bee...",,United States,,,9.990870e-06,technosphere,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
162921,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1426,US EEIO 1.1,5b2a19b9-1243-44ae-b76c-c0d92159d5d6,2-METHOXYETHANOL,,,,,8.602175e-12,biosphere,False
162922,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,2622,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,,,2.635425e-08,biosphere,False
162923,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1462,US EEIO 1.1,5fd672a0-cb68-39e6-88dc-db1a9281c57b,"2,4,5,2',5'-PENTACHLOROBIPHENYL",,,,,9.791193e-19,biosphere,False
162924,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,589,US EEIO 1.1,7e5c6ee1-a47e-3afd-9278-dcb5e1ee50b5,"Screws, nuts, and bolts; at manufacturer",,United States,,,1.810643e-02,technosphere,False


To save on memory, we turn some columns into categorical columns, where each unique value is only stored once. In the case of `target_name`, the dataframe has more than 150,000 rows, but only 388 unique values.

You can skip the conversion to categorical columns by passing `categorical=False`.
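The effect of categorical columns can be seen in isolation with plain pandas. This is a rough sketch on synthetic data (three invented labels repeated 50,000 times), not a measurement of the actual database; the exact byte counts depend on the pandas version.

```python
import pandas as pd

# 150,000 rows but only 3 distinct labels, loosely mimicking `target_name`.
labels = ["Frozen food; at manufacturer", "Dairies", "Coal; at mine"]
names = pd.Series(labels * 50_000, dtype="object")

# Categorical storage keeps each distinct label once plus small integer codes.
as_category = names.astype("category")

object_bytes = names.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(f"object: {object_bytes:,} bytes, category: {category_bytes:,} bytes")

# Converting back gives the original values if plain strings are needed.
round_trip = as_category.astype("object")
```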

In [53]:
df.dtypes

target_id                      int64
target_database             category
target_code                   object
target_name                 category
target_reference_product    category
target_location             category
target_unit                 category
target_type                 category
source_id                      int64
source_database             category
source_code                 category
source_name                 category
source_product              category
source_location             category
source_unit                 category
source_categories           category
edge_amount                  float64
edge_type                   category
dtype: object

In [54]:
df['target_name']

0                              Frozen food; at manufacturer
1                              Frozen food; at manufacturer
2                              Frozen food; at manufacturer
3                              Frozen food; at manufacturer
4                              Frozen food; at manufacturer
                                ...                        
162921    Automatic controls for HVAC and refrigeration ...
162922    Automatic controls for HVAC and refrigeration ...
162923    Automatic controls for HVAC and refrigeration ...
162924    Automatic controls for HVAC and refrigeration ...
162925    Automatic controls for HVAC and refrigeration ...
Name: target_name, Length: 162926, dtype: category
Categories (388, object): ['Abrasive products; at manufacturer', 'Accounting, tax preparation, bookkeeping, and..., 'Adhesives; at manufacturer', 'Advertising and public relations', ..., 'Wiring devices; at manufacturer', 'Wood kitchen cabinets and countertops; at man..., 'Wood pulp; at m

We can also get a dataframe of the edges for a specific node. Here we get all edges, but you can filter this further with the edge constructors `.production()`, `.technosphere()`, and `.biosphere()`.

In [13]:
df = activity.exchanges().to_dataframe()
df

Unnamed: 0,target_id,target_database,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type
0,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,640,US EEIO 1.1,acf5099d-f095-3bdc-9efe-da7e3b535d40,"Other computer related services, including fac...",,United States,,"54: Professional, Scientific, and Technical Se...",1.000000e+00,production
1,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,1832,US EEIO 1.1,88ef28f1-cfd5-44a0-ac34-01acf2db84a0,"benzene, ethyl-",,,,air::unspecified,9.456940e-09,biosphere
2,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,464,US EEIO 1.1,2b8a18a4-1b89-3625-9263-bf9e8b6ebe1b,Household goods repair,,United States,,81: Other Services (except Public Administrati...,1.622952e-03,technosphere
3,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,438,US EEIO 1.1,1c64f0cb-bbab-3d07-9e38-8aff7b6746f0,Relay and industrial controls; at manufacturer,,United States,,31-33: Manufacturing::3353: Electrical Equipme...,7.666949e-09,technosphere
4,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,426,US EEIO 1.1,169c4fbb-c518-39e0-8344-3162382887c2,Environmental and other technical consulting s...,,United States,,"54: Professional, Scientific, and Technical Se...",5.468763e-03,technosphere
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
224,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,1072,US EEIO 1.1,281e1011-f121-4d5b-ad4d-22d562b2c2cd,"Ethane, pentafluoro-, HFC-125",,,,air::unspecified,8.367100e-09,biosphere
225,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,1726,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,,air::unspecified,4.942435e-04,biosphere
226,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,1020,US EEIO 1.1,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,,,air::unspecified,2.449813e-09,biosphere
227,373,US EEIO 1.1,ee44f5fb-fc51-3071-a4b0-dc63b7e932bf,"Other computer related services, including fac...",,United States,USD,process,1181,US EEIO 1.1,37880630-22b3-3c88-b701-4abde6f0b9a8,"Water, fresh",,,,water::surface water,6.297825e-05,biosphere


The columns are the same as before.

In [14]:
df.columns

Index(['target_id', 'target_database', 'target_code', 'target_name',
       'target_reference_product', 'target_location', 'target_unit',
       'target_type', 'source_id', 'source_database', 'source_code',
       'source_name', 'source_product', 'source_location', 'source_unit',
       'source_categories', 'edge_amount', 'edge_type'],
      dtype='object')
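The edge constructors mentioned above filter before the dataframe is built. As a sketch of the alternative, a similar split can be made afterwards using the `edge_type` column. The small dataframe here is a hand-made stand-in with rounded, illustrative values.

```python
import pandas as pd

# Stand-in for a dataframe of edges; only three of the columns are included.
edges = pd.DataFrame(
    {
        "source_name": ["Other retail", "Carbon dioxide", "Household goods repair", "Water, fresh"],
        "edge_amount": [1.0, 4.9e-04, 1.6e-03, 6.3e-05],
        "edge_type": ["production", "biosphere", "technosphere", "biosphere"],
    }
)

# Select one kind of edge with a boolean mask.
biosphere_edges = edges[edges["edge_type"] == "biosphere"]

# Count how many edges of each type there are.
counts = edges["edge_type"].value_counts().to_dict()
print(counts)
```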

We can also get dataframes for LCA calculation results.

In [18]:
lca = bc.LCA({product: 1}, method=('Impact Potential', 'HRSP'))
lca.lci()
lca.lcia()

By default, this method looks at the `characterized_inventory` matrix and returns the 200 largest values (ranked by absolute value).
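To illustrate what ranking by absolute value means, here is a small standard-library sketch. The function name `largest_entries`, its `cutoff` argument, and the sample entries are illustrative assumptions; this is not the `bw2calc` implementation.

```python
import heapq

# Hypothetical (row_index, col_index, amount) entries of a sparse matrix.
entries = [
    (1043, 8, 3.78e-04),
    (257, 8, 4.41e-05),
    (55, 8, -4.65e-07),
    (1445, 8, 3.98e-06),
    (1843, 8, -6.53e-08),
]


def largest_entries(entries, cutoff):
    """Return the `cutoff` entries with the largest absolute amounts."""
    return heapq.nlargest(cutoff, entries, key=lambda entry: abs(entry[2]))


top_three = largest_entries(entries, cutoff=3)
print(top_three)
```

Ranking on the absolute value means large negative amounts are kept alongside large positive ones.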

In [23]:
df = lca.to_dataframe()
df

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,1043,8,3.779723e-04,1822,9,US EEIO 1.1,87883a4e-1e3e-4c9d-90c0-f1bea36f8014,ammonia,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
1,257,8,4.411418e-05,1035,9,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
2,55,8,4.653532e-07,832,9,US EEIO 1.1,08a91e70-3ddc-11dd-91be-0050c2490048,"particulates, < 10 um",,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
3,1843,8,6.529780e-08,2622,9,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
4,1445,8,3.981023e-06,2224,9,US EEIO 1.1,c1b91234-6f24-417b-8309-46111d09c457,nitrogen oxides,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,257,332,5.718976e-08,1035,333,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,d5b7c244-8d46-3ede-9659-45986f6c6fb5,Cutlery and handtools; at manufacturer,United States,USD,process,
196,257,119,5.238594e-08,1035,120,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,4775afeb-8e11-3ea2-92b0-4f1c41952703,Other secondary nonferrous metal products; at ...,United States,USD,process,
197,257,269,5.219094e-08,1035,270,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,ad81ce2a-e3c5-3695-a2ef-812cd8b79dd3,Other plastic products; at manufacturer,United States,USD,process,
198,257,197,4.989681e-08,1035,198,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,777544f7-e9cc-3593-8d70-9a018d2a87e2,"Metal coatings, engravings, and heat treatment...",United States,USD,process,


The column labels are a bit different, as we don't have target and source but instead matrix rows and columns. The meaning of these rows and columns changes from matrix to matrix. The same pattern with `'row_product'`, `'col_reference_product'`, and `'row_categories'` applies though.

In [24]:
df.columns

Index(['row_index', 'col_index', 'amount', 'row_id', 'col_id', 'row_database',
       'row_code', 'row_name', 'row_location', 'row_unit', 'row_type',
       'row_categories', 'row_product', 'col_database', 'col_code', 'col_name',
       'col_location', 'col_unit', 'col_type', 'col_reference_product'],
      dtype='object')

We can get dataframes for any matrix. In standard LCA, the matrices are:

* inventory
* technosphere_matrix
* biosphere_matrix
* characterization_matrix
* characterized_inventory

Regionalization adds more matrices. Note that for other matrices you will need to specify the row and column mapping dictionaries; see the docstring.

In [22]:
lca.to_dataframe(matrix_label='biosphere_matrix')

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,232,18,372.440552,1010,19,US EEIO 1.1,1ece2361-87e0-355c-a702-ff268570ca3e,Coal,,,emission,resource::in ground,,US EEIO 1.1,08f1c4b8-03f9-360c-87be-31ad6c778da5,Coal; at mine,United States,USD,process,
1,947,18,0.174231,1726,19,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,08f1c4b8-03f9-360c-87be-31ad6c778da5,Coal; at mine,United States,USD,process,
2,232,75,0.283722,1010,76,US EEIO 1.1,1ece2361-87e0-355c-a702-ff268570ca3e,Coal,,,emission,resource::in ground,,US EEIO 1.1,2bf3d179-abc9-3d6f-983f-f2de03471649,Other support activities for mining,United States,USD,process,
3,947,75,0.116804,1726,76,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,2bf3d179-abc9-3d6f-983f-f2de03471649,Other support activities for mining,United States,USD,process,
4,1388,95,96.912102,2167,96,US EEIO 1.1,b91d0527-9a01-4a86-b420-c62b70629ba4,"Occupation, forest",,,emission,resource::land,,US EEIO 1.1,392eb1e3-3cd1-34c7-948c-c177114e8d20,Timber and raw forest products; at forest,United States,USD,process,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,947,121,0.092894,1726,122,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,48b2f105-8ae0-36e2-ad0a-cf7101ab8f4c,Residential building repair and maintanence,United States,USD,process,
196,947,39,0.092228,1726,40,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,15615fdc-2456-3e6c-bd24-9ed9a13d2599,Health care buildings,United States,USD,process,
197,947,191,0.090496,1726,192,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,7452907b-74cc-3106-aaaa-5560c0645af6,Flours and malts; at manufacturer,United States,USD,process,
198,947,340,0.090194,1726,341,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,d920c723-5594-34a5-8b55-0cbe517c2f9f,Polystyrene foam products; at manufacturer,United States,USD,process,
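The `row_index`/`row_id` and `col_index`/`col_id` pairs in the table above hint at what a mapping dictionary does: it relates node ids to positions in the matrix. A minimal sketch of that translation follows, assuming the dictionaries go from node id to matrix index. The names are illustrative, the pairs are copied from the first rows of the table, and the real signature is in the docstring.

```python
# Assumed direction: node id -> matrix index (all names here are illustrative).
row_mapping = {1010: 232, 1726: 947}
col_mapping = {19: 18, 76: 75}

# Invert the mappings so matrix indices can be translated back to node ids.
row_ids = {index: node_id for node_id, index in row_mapping.items()}
col_ids = {index: node_id for node_id, index in col_mapping.items()}

# (row_index, col_index, amount) triples for three matrix entries.
entries = [(232, 18, 372.440552), (947, 18, 0.174231), (232, 75, 0.283722)]

labelled = [
    {"row_id": row_ids[row], "col_id": col_ids[col], "amount": amount}
    for row, col, amount in entries
]
for record in labelled:
    print(record)
```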


We can do the same for IO databases. IO databases are stored in a different way, with numeric values stored only as NumPy arrays, but the functions work the same. This wasn't trivial (you can see the [code in the iotable backend](https://github.com/brightway-lca/brightway2-data/tree/main/bw2data/backends/iotable)), but it's awesome that it works!

In [27]:
bd.projects.set_current("to_dataframe Exiobase 3.8.1 monetary")

In [28]:
bi.bw2setup()

Writing activities to SQLite3 database:


Creating default biosphere

Applying strategy: normalize_units
Applying strategy: drop_unspecified_subcategories
Applying strategy: ensure_categories_are_tuples
Applied 3 strategies in 0.00 seconds


0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 08/21/2022 21:44:30
  Finished: 08/21/2022 21:44:30
  Total time elapsed: 00:00:00
  CPU %: 103.10
  Memory %: 1.73
Created database: biosphere3
Creating default LCIA methods

Applying strategy: normalize_units
Applying strategy: set_biosphere_type
Applying strategy: fix_ecoinvent_38_lcia_implementation
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_iterable_by_fields
Applied 5 strategies in 0.57 seconds
Wrote 975 LCIA methods with 254388 characterization factors
Creating core data migrations



Experimental function to make EXIOBASE imports easier :)

In [29]:
bi.exiobase_monetary(year=2022)

Downloading IOT_2022_ixi.zip to /var/folders/rn/ht0vvs3s7mz2h9f_xjt9x4040000gn/T/tmp8rgdqyay/IOT_2022_ixi.zip


754188288it [02:04, 6047601.29it/s]                                                                                                                            
Writing activities to SQLite3 database:


Not able to determine geocollections for all datasets. This database is not ready for regionalization.


0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 08/21/2022 21:49:21
  Finished: 08/21/2022 21:49:21
  Total time elapsed: 00:00:00
  CPU %: 106.90
  Memory %: 2.96


Writing activities to SQLite3 database:


Created new database for EXIOBASE-specific biosphere flows: EXIOBASE 3.8.1 2022 monetary biosphere


0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 08/21/2022 21:49:23
  Finished: 08/21/2022 21:49:23
  Total time elapsed: 00:00:00
  CPU %: 100.90
  Memory %: 2.96
Created database of EXIOBASE activity metadata
Patched 74 LCIA methods with unit 'kg CO2-Eq'
Patching LCIA methods with EXIOBASE flows
Starting IO table write
Adding technosphere matrix


7988it [01:03, 125.03it/s]


Adding biosphere matrix


1114it [00:02, 408.37it/s]


Finalizing serialization
Created database EXIOBASE 3.8.1 2022 monetary. Cleaned up temporary downloads.


This is an instance of `IOTableBackend`.

In [37]:
bd.Database("EXIOBASE 3.8.1 2022 monetary")

Brightway2 IOTableBackend: EXIOBASE 3.8.1 2022 monetary

We can get all the nodes:

In [31]:
df = bd.Database("EXIOBASE 3.8.1 2022 monetary").nodes_to_dataframe()
df

Unnamed: 0,code,database,id,location,name,production volume,reference product,stam,unit
3231,Activities auxiliary to financial intermediati...,EXIOBASE 3.8.1 2022 monetary,5670,AT,Activities auxiliary to financial intermediation,4841.805237,Activities auxiliary to financial intermediation,,million €
850,Activities auxiliary to financial intermediati...,EXIOBASE 3.8.1 2022 monetary,11701,AU,Activities auxiliary to financial intermediation,57342.932946,Activities auxiliary to financial intermediation,,million €
3541,Activities auxiliary to financial intermediati...,EXIOBASE 3.8.1 2022 monetary,5833,BE,Activities auxiliary to financial intermediation,20947.253461,Activities auxiliary to financial intermediation,,million €
5659,Activities auxiliary to financial intermediati...,EXIOBASE 3.8.1 2022 monetary,5996,BG,Activities auxiliary to financial intermediation,325.803300,Activities auxiliary to financial intermediation,,million €
563,Activities auxiliary to financial intermediati...,EXIOBASE 3.8.1 2022 monetary,11049,BR,Activities auxiliary to financial intermediation,22782.277847,Activities auxiliary to financial intermediation,,million €
...,...,...,...,...,...,...,...,...,...
1564,"Wool, silk-worm cocoons|WE",EXIOBASE 3.8.1 2022 monetary,13053,WE,"Wool, silk-worm cocoons",112.305430,"Wool, silk-worm cocoons",Animals,million €
384,"Wool, silk-worm cocoons|WF",EXIOBASE 3.8.1 2022 monetary,13216,WF,"Wool, silk-worm cocoons",1439.132549,"Wool, silk-worm cocoons",Animals,million €
95,"Wool, silk-worm cocoons|WL",EXIOBASE 3.8.1 2022 monetary,12890,WL,"Wool, silk-worm cocoons",334.831769,"Wool, silk-worm cocoons",Animals,million €
1827,"Wool, silk-worm cocoons|WM",EXIOBASE 3.8.1 2022 monetary,13379,WM,"Wool, silk-worm cocoons",464.853941,"Wool, silk-worm cocoons",Animals,million €


Columns come from the attributes stored on the nodes:

In [32]:
df.columns

Index(['code', 'database', 'id', 'location', 'name', 'production volume',
       'reference product', 'stam', 'unit'],
      dtype='object')

We can also get all 39 million edges as a single dataframe, if you really want to. We use categorical columns and other tricks during construction to reduce memory usage. Same columns as before. **However**, conversion to categorical columns always happens, and you can't specify custom formatters.

In [33]:
df = bd.Database("EXIOBASE 3.8.1 2022 monetary").edges_to_dataframe()
df

Retrieving metadata
Loading datapackage
Creating metadata dataframes
Building merged dataframe
Compressing DataFrame


Unnamed: 0,target_id,source_id,edge_amount,edge_type,target_database,target_code,target_name,target_location,target_unit,target_type,target_reference_product,source_database,source_code,source_name,source_location,source_unit,source_categories,source_product
0,5541,5541,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Cultivation of paddy rice|AT,Cultivation of paddy rice,AT,million €,process,Cultivation of paddy rice,EXIOBASE 3.8.1 2022 monetary,Cultivation of paddy rice|AT,Cultivation of paddy rice,AT,million €,,
1,5542,5542,0.032928,technosphere,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,process,Cultivation of wheat,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,,
2,5542,5542,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,process,Cultivation of wheat,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,,
3,5548,5542,0.000567,technosphere,EXIOBASE 3.8.1 2022 monetary,Cultivation of crops nec|AT,Cultivation of crops nec,AT,million €,process,Cultivation of crops nec,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,,
4,5549,5542,0.005161,technosphere,EXIOBASE 3.8.1 2022 monetary,Cattle farming|AT,Cattle farming,AT,million €,process,Cattle farming,EXIOBASE 3.8.1 2022 monetary,Cultivation of wheat|AT,Cultivation of wheat,AT,million €,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39010868,13445,13445,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary copper into new cop...,Re-processing of secondary copper into new copper,WM,million €,process,Re-processing of secondary copper into new copper,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary copper into new cop...,Re-processing of secondary copper into new copper,WM,million €,,
39010869,13447,13447,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary other non-ferrous m...,Re-processing of secondary other non-ferrous m...,WM,million €,process,Re-processing of secondary other non-ferrous m...,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary other non-ferrous m...,Re-processing of secondary other non-ferrous m...,WM,million €,,
39010870,13459,13459,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Recycling of bottles by direct reuse|WM,Recycling of bottles by direct reuse,WM,million €,process,Recycling of bottles by direct reuse,EXIOBASE 3.8.1 2022 monetary,Recycling of bottles by direct reuse|WM,Recycling of bottles by direct reuse,WM,million €,,
39010871,13478,13478,1.000000,production,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary construction materi...,Re-processing of secondary construction materi...,WM,million €,process,Re-processing of secondary construction materi...,EXIOBASE 3.8.1 2022 monetary,Re-processing of secondary construction materi...,Re-processing of secondary construction materi...,WM,million €,,
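
To illustrate why categorical columns matter at this scale, here is a minimal sketch in plain pandas. It does not reproduce what `edges_to_dataframe` does internally. The column names are borrowed from the output above, while the row count `n` and the repeated values are made up for illustration.

```python
import pandas as pd

# Illustrative data only: two low-cardinality string columns, similar in
# spirit to `edge_type` and `target_unit` in the dataframe above.
n = 100_000
df = pd.DataFrame(
    {
        "edge_type": ["production", "technosphere"] * (n // 2),
        "target_unit": ["million €"] * n,
    }
)

# Memory footprint with ordinary string columns.
before = df.memory_usage(deep=True).sum()

# Categorical columns store each distinct label once, plus small integer codes.
compact = df.astype("category")
after = compact.memory_usage(deep=True).sum()

print(f"strings: {before:,} bytes, categorical: {after:,} bytes")
```

The saving grows with the number of rows, because each repeated label is replaced by an integer code that points into a short list of categories.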


And we can do the same thing for LCA matrices:

In [36]:
act = bd.Database("EXIOBASE 3.8.1 2022 monetary").random()
act

'Distribution and trade of electricity' (million €, CZ, None)

In [41]:
lca = bc.LCA({act: 1}, method=('IPCC 2013', 'climate change', 'GWP 100a'))
lca.lci()
lca.lcia()
lca.score

771164.4127744769

In [42]:
lca.to_dataframe()

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,21,1875,250227.301624,1229,6303,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Steam and hot water supply|CZ,Steam and hot water supply,CZ,million €,process,Steam and hot water supply
1,37,1875,467.484767,1737,6303,biosphere3,0795345f-c7ae-410c-ad25-1845784c75f5,"Methane, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Steam and hot water supply|CZ,Steam and hot water supply,CZ,million €,process,Steam and hot water supply
2,29,1875,5373.530425,1373,6303,biosphere3,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Steam and hot water supply|CZ,Steam and hot water supply,CZ,million €,process,Steam and hot water supply
3,23,1875,226.716553,1241,6303,biosphere3,ba2f3f82-c93a-47a5-822a-37ec97495275,"Carbon monoxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Steam and hot water supply|CZ,Steam and hot water supply,CZ,million €,process,Steam and hot water supply
4,21,1860,144437.577992,1229,6288,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Production of electricity by coal|CZ,Production of electricity by coal,CZ,million €,process,Production of electricity by coal
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,35,1773,160.954083,1691,6201,biosphere3,6c977009-5c4e-4901-a4c1-ab20389cb972,"Methane, non-fossil",,kilogram,emission,"air::low population density, long-term",,EXIOBASE 3.8.1 2022 monetary,Cattle farming|CZ,Cattle farming,CZ,million €,process,Cattle farming
196,35,1778,156.990175,1691,6206,biosphere3,6c977009-5c4e-4901-a4c1-ab20389cb972,"Methane, non-fossil",,kilogram,emission,"air::low population density, long-term",,EXIOBASE 3.8.1 2022 monetary,Raw milk|CZ,Raw milk,CZ,million €,process,Raw milk
197,35,8624,130.299857,1691,13052,biosphere3,6c977009-5c4e-4901-a4c1-ab20389cb972,"Methane, non-fossil",,kilogram,emission,"air::low population density, long-term",,EXIOBASE 3.8.1 2022 monetary,Raw milk|WE,Raw milk,WE,million €,process,Raw milk
198,35,6994,129.783952,1691,11422,biosphere3,6c977009-5c4e-4901-a4c1-ab20389cb972,"Methane, non-fossil",,kilogram,emission,"air::low population density, long-term",,EXIOBASE 3.8.1 2022 monetary,Raw milk|RU,Raw milk,RU,million €,process,Raw milk
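
Since the result is an ordinary pandas dataframe, the usual grouping tools apply to it. The following is a small sketch that uses only the first five rows copied from the output above (not the full result) and sums the `amount` column per elementary flow. The variable names `rows` and `by_flow` are illustrative.

```python
import pandas as pd

# First five rows of the output above, reduced to three columns.
rows = pd.DataFrame(
    {
        "row_name": [
            "Carbon dioxide, fossil",
            "Methane, fossil",
            "Dinitrogen monoxide",
            "Carbon monoxide, fossil",
            "Carbon dioxide, fossil",
        ],
        "col_name": [
            "Steam and hot water supply",
            "Steam and hot water supply",
            "Steam and hot water supply",
            "Steam and hot water supply",
            "Production of electricity by coal",
        ],
        "amount": [
            250227.301624,
            467.484767,
            5373.530425,
            226.716553,
            144437.577992,
        ],
    }
)

# Total amount per flow, largest first.
by_flow = rows.groupby("row_name")["amount"].sum().sort_values(ascending=False)
print(by_flow)
```

With the real dataframe, the same call would be made on the object returned by `lca.to_dataframe()`, and grouping by `col_name` instead would give totals per activity.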


By default, filtering LCA matrices returns at most a fixed number of values. It can also be configured to return all values whose absolute value is above a certain fraction of the total score; in that case, `cutoff` should be between zero and one.

In [44]:
lca.to_dataframe(cutoff=0.01, cutoff_mode="fraction")

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,21,1875,250227.301624,1229,6303,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Steam and hot water supply|CZ,Steam and hot water supply,CZ,million €,process,Steam and hot water supply
1,21,1860,144437.577992,1229,6288,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Production of electricity by coal|CZ,Production of electricity by coal,CZ,million €,process,Production of electricity by coal
2,21,7002,63153.868488,1229,11430,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Extraction of natural gas and services related...,Extraction of natural gas and services related...,RU,million €,process,Extraction of natural gas and services related...
3,37,7002,54524.151057,1737,11430,biosphere3,0795345f-c7ae-410c-ad25-1845784c75f5,"Methane, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Extraction of natural gas and services related...,Extraction of natural gas and services related...,RU,million €,process,Extraction of natural gas and services related...
4,21,1784,21887.13173,1229,6212,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Mining of coal and lignite; extraction of peat|CZ,Mining of coal and lignite; extraction of peat,CZ,million €,process,Mining of coal and lignite; extraction of peat
5,36,1784,8856.6754,1734,6212,biosphere3,70ef743b-3ed5-4a6d-b192-fb6d62378555,"Methane, fossil",,kilogram,emission,air::non-urban air or from high stacks,,EXIOBASE 3.8.1 2022 monetary,Mining of coal and lignite; extraction of peat|CZ,Mining of coal and lignite; extraction of peat,CZ,million €,process,Mining of coal and lignite; extraction of peat
6,21,1864,18390.216445,1229,6292,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Production of electricity by wind|CZ,Production of electricity by wind,CZ,million €,process,Production of electricity by wind
7,21,1885,18001.995079,1229,6313,biosphere3,349b29d1-3e58-4c66-98b9-9d1a076efd2e,"Carbon dioxide, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Other land transport|CZ,Other land transport,CZ,million €,process,Other land transport
8,37,1785,12786.322474,1737,6213,biosphere3,0795345f-c7ae-410c-ad25-1845784c75f5,"Methane, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Extraction of crude petroleum and services rel...,Extraction of crude petroleum and services rel...,CZ,million €,process,Extraction of crude petroleum and services rel...
9,37,7001,10154.086262,1737,11429,biosphere3,0795345f-c7ae-410c-ad25-1845784c75f5,"Methane, fossil",,kilogram,emission,air,,EXIOBASE 3.8.1 2022 monetary,Extraction of crude petroleum and services rel...,Extraction of crude petroleum and services rel...,RU,million €,process,Extraction of crude petroleum and services rel...
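
The two filtering modes can be sketched in plain Python as follows. `filter_contributions` is a hypothetical helper written for illustration, not part of `bw2calc`. The mode name `"number"`, and the idea that it keeps the largest values by absolute size, are assumptions; the real method may select or order values differently. The sample values and the score are taken from the outputs above.

```python
def filter_contributions(values, cutoff, cutoff_mode="number", total=None):
    """Hypothetical sketch of the two cutoff modes described above."""
    if cutoff_mode == "fraction":
        if not 0 < cutoff < 1:
            raise ValueError("In fraction mode, cutoff must be between 0 and 1")
        # Default to the sum of the given values if no total score is passed.
        if total is None:
            total = sum(values)
        threshold = cutoff * abs(total)
        # Keep every value whose absolute value is above the threshold.
        return [v for v in values if abs(v) > threshold]
    if cutoff_mode == "number":
        # Assumption: keep the `cutoff` largest values by absolute size.
        return sorted(values, key=abs, reverse=True)[: int(cutoff)]
    raise ValueError(f"Unknown cutoff_mode: {cutoff_mode!r}")


# First five amounts from the unfiltered output, and the LCA score.
values = [250227.301624, 467.484767, 5373.530425, 226.716553, 144437.577992]
score = 771164.4127744769

kept_fraction = filter_contributions(values, 0.01, "fraction", total=score)
kept_number = filter_contributions(values, 3, "number")
print(kept_fraction)
print(kept_number)
```

With a cutoff of 0.01 the threshold is about 7711.64, so of these five values only the two carbon dioxide entries survive, which agrees with the first two rows of the filtered output above.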
