# What's new in Brightway 2.5

## Backwards compatibility

Compatibility with Brightway 2 has been maintained wherever possible, but in a few cases it could not be kept. These cases are noted **in bold** where they occur, and are listed below:

* In `bw2data`, `Database.get()` and `Database().get()` are no longer supported. Use `get_node(database="something", **other_filters)` instead.
* In `bw2calc`, the `LCA` class now takes over responsibility for all types of LCA calculations, including Monte Carlo. Depending on the type of sampling strategy desired, use `LCA(use_distributions=True)` or `LCA(use_arrays=True)` instead of `MonteCarloLCA`.

Before we get started, let's install a simple database for examples:

In [1]:
import bw2data as bd
import bw2io as bi

In [2]:
bd.projects.set_current("2.5 examples in action")

In [4]:
bi.add_example_database()

Writing activities to SQLite3 database:


Extracted 4 worksheets in 0.03 seconds
Applying strategy: csv_restore_tuples
Applying strategy: csv_restore_booleans
Applying strategy: csv_numerize
Applying strategy: csv_drop_unknown
Applying strategy: csv_add_missing_exchanges_section
Applying strategy: strip_biosphere_exc_locations
Applying strategy: set_code_by_activity_hash
Applying strategy: assign_only_product_as_production
Applying strategy: drop_falsey_uncertainty_fields_but_keep_zeros
Applying strategy: convert_uncertainty_types_to_integers
Applying strategy: convert_activity_parameters_to_list
Applied 11 strategies in 0.00 seconds
Applying strategy: link_iterable_by_fields
Not able to determine geocollections for all datasets. This database is not ready for regionalization.


0% [#########] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 09/25/2022 12:57:24
  Finished: 09/25/2022 12:57:24
  Total time elapsed: 00:00:00
  CPU %: 105.50
  Memory %: 0.56
Created database: Mobility example


# `bw2data`

## `Node` and `Edge`

In addition to `process` and `activity`, we now have `node`. Before you throw things at the wall (how many names can they come up with?), let me explain. `bw2data` uses what is essentially a graph database, with two main tables: nodes and edges (the actual table names are `ActivityDataset` and `ExchangeDataset`). Nodes can serve as processes/activities, but are also elementary/biosphere flows, and anything else we want to store in the database (logical relationships, impact assessment, named parameters, etc.). This isn't to say that you store everything in the Brightway database, but `node` is clearly a better name than `process` for e.g. CO2. Here are the node accessors:

### `bw2data.get_node()`

`get_node` behaves differently than `get_activity`. `get_activity` assumes an input of a Brightway key - a combination of database and code. This won't work with `get_node`:

In [8]:
bd.get_activity(('Mobility example', 'Steel'))

'Steel' (kilogram, GLO, None)

In [9]:
bd.get_node(key=('Mobility example', 'Steel'))

UnknownObject: 

The reason why this doesn't work is that `get_node` only looks for specific attributes of the `node` itself, not composite ones like the `key`. If you need to pass a key, use `get_activity`; otherwise, rewrite your query:

In [10]:
bd.get_node(database='Mobility example', code='Steel')

'Steel' (kilogram, GLO, None)

We can also filter on other attributes, both those stored as "core" attributes (code, database, name, product, type, location):

In [11]:
bd.get_node(name='Steel')

'Steel' (kilogram, GLO, None)

But also other arbitrary attributes:

In [14]:
steel = bd.get_node(name='Steel')
steel['foo'] = 'bar'
steel.save()
bd.get_node(foo='bar')

'Steel' (kilogram, GLO, None)

`get_node` will raise `bw2data.errors.UnknownObject` if no node can be found that matches the given filters, or `bw2data.errors.MultipleResults` if more than one node matches them.
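
The matching rules can be illustrated with a toy stand-in that filters plain dictionaries. This is only a sketch of the semantics described above, not how `bw2data` implements them: `find_one`, the sample `nodes` list, and the two exception classes defined here are hypothetical, whereas the real function queries the database.

```python
class UnknownObject(Exception):
    """Stand-in for bw2data.errors.UnknownObject (hypothetical copy)."""


class MultipleResults(Exception):
    """Stand-in for bw2data.errors.MultipleResults (hypothetical copy)."""


def find_one(nodes, **filters):
    """Return the single dict in `nodes` whose items match every filter."""
    matches = [
        node
        for node in nodes
        if all(key in node and node[key] == value for key, value in filters.items())
    ]
    if not matches:
        raise UnknownObject(f"No node matches {filters}")
    if len(matches) > 1:
        raise MultipleResults(f"{len(matches)} nodes match {filters}")
    return matches[0]


# Hypothetical data, loosely modelled on the example database
nodes = [
    {"database": "Mobility example", "code": "Steel", "name": "Steel", "unit": "kilogram"},
    {"database": "Mobility example", "code": "CO2", "name": "CO2", "unit": "kilogram"},
]

# Exactly one match: the node is returned
steel = find_one(nodes, database="Mobility example", code="Steel")
```

Filtering on `unit="kilogram"` would raise `MultipleResults` here, and filtering on a name that is not present would raise `UnknownObject`.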

You can also use `Database().get_node()`; it works the same way. However, **`Database.get()`** is deprecated, as `get` is now a [core Peewee method](https://docs.peewee-orm.com/en/latest/peewee/api.html#Model.get).

## `get_id` and the removal of `mapping`

Previously, the `mapping` object was stored as a pickle file, and linked activity/process/node *keys* to *integer ids*. This was very dumb, as we already have an indexed mapping of these objects to unique integer ids in the database itself: the primary key `id` column. `mapping` still exists, but only as a compatibility layer; instead, use `node.id` and `get_id()`:

In [33]:
steel.id

4

In [34]:
bd.get_id(steel)

4

In [35]:
bd.get_id(steel.key)

4

In [36]:
bd.mapping[steel.key]

4

In [37]:
str(bd.mapping)

'Obsolete mapping dictionary.'

There is an important implication of removing `mapping`: it is **no longer possible to reference nonexistent nodes in edges**. This was previously permitted to allow for a very high degree of flexibility, but it is no longer technically possible, nor is it reasonable, as it makes it much too easy to cause unintentional errors.

## `bw2data` uses the database columns differently

The database schema for nodes (`ActivityDataset`) is:

```SQL
CREATE TABLE "activitydataset" (
    "id" INTEGER NOT NULL PRIMARY KEY, 
    "data" BLOB NOT NULL, 
    "code" TEXT NOT NULL, 
    "database" TEXT NOT NULL, 
    "location" TEXT, 
    "name" TEXT, 
    "product" TEXT, 
    "type" TEXT
)
```

Previously, when loading or saving rows to this table, all data including `code`, `database`, `location`, `name`, `product`, and `type` was serialized to the `data` blob as a [pickle](https://docs.python.org/3/library/pickle.html). This made loading the Brightway objects easy, but it effectively made the other columns read-only: changes made to the database directly were not propagated when loading Brightway objects. This has now changed. We use the values in the database columns, so you can use them directly:

In [15]:
from bw2data.backends import ActivityDataset as AD

In [16]:
AD.update(name="Wow, this is some steel!").where(AD.name == 'Steel').execute()

1

In [22]:
bd.get_node(code="Steel")
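
The idea of letting column values take precedence over the pickled blob can be sketched with the standard library alone. This is a rough illustration using the schema shown above and an in-memory SQLite database; `load_node` is a hypothetical helper, and the assumption that column values simply overwrite blob values is ours, so the real loading logic in `bw2data` may differ in detail.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    '''CREATE TABLE "activitydataset" (
        "id" INTEGER NOT NULL PRIMARY KEY,
        "data" BLOB NOT NULL,
        "code" TEXT NOT NULL,
        "database" TEXT NOT NULL,
        "location" TEXT,
        "name" TEXT,
        "product" TEXT,
        "type" TEXT
    )'''
)

# Store one row; the blob carries its own copy of some core attributes
blob = pickle.dumps({"name": "Steel", "code": "Steel", "unit": "kilogram"})
conn.execute(
    'INSERT INTO activitydataset ("data", "code", "database", "name") VALUES (?, ?, ?, ?)',
    (blob, "Steel", "Mobility example", "Steel"),
)

# Bulk update made directly on the column, bypassing the blob
conn.execute(
    'UPDATE activitydataset SET "name" = ? WHERE "name" = ?',
    ("Wow, this is some steel!", "Steel"),
)

COLUMNS = ("code", "database", "location", "name", "product", "type")


def load_node(connection, code):
    """Load one row, letting column values override what the blob says."""
    row = connection.execute(
        'SELECT "data", "code", "database", "location", "name", "product", "type" '
        'FROM activitydataset WHERE "code" = ?',
        (code,),
    ).fetchone()
    node = pickle.loads(row[0])
    node.update(zip(COLUMNS, row[1:]))
    return node


node = load_node(conn, "Steel")
```

After the update, `node["name"]` reflects the new column value, while attributes that only live in the blob (such as `unit`) are still available.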

Why should you care?

For one thing, in Brightway 3, there will no longer be an `ActivityDataset` and a separate `Activity` class; rather, they will be unified, and we will be able to use [peewee](https://docs.peewee-orm.com/en/latest/index.html) query methods natively. This won't necessarily be easier in all cases, but will expose more functionality, and will use less magic to hide the underlying database schema, which is better hygiene in the long run.

But learning to write SQL is also a good idea in itself: it is a different way of thinking about data, and things like bulk updates are always nice (unless they go wrong 😛).

## More powerful `Activity` attribute lookups

Some `Activity` objects store data for things like industry or product classifications, or properties like price or carbon content, but these can be awkward to retrieve:

In [9]:
steel = bd.get_node(name='Steel')
steel['properties'] = {'carbon content': {'amount': 0.01}}
steel['classifications'] = {'ISIC': {'code': '2410', 'system': 'ISIC Rev. 4'}}
steel.save()

In [10]:
[value for key, value in steel['properties'].items() if key == 'carbon content']

[{'amount': 0.01}]

Instead, we can now just do:

In [12]:
steel['carbon content']

{'amount': 0.01}

In [13]:
steel['ISIC']

{'code': '2410', 'system': 'ISIC Rev. 4'}

Please note the following:

* `classifications` are looked up before `properties`.
* Looking up normal attributes (even arbitrary ones) happens before traversing the `classifications` and `properties`.
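
The lookup order can be sketched with a small dictionary subclass. This is an illustration of the precedence rules listed above, not the actual `Activity` code; `LookupDict` is a hypothetical name, and the `"ISIC"` entry under `properties` is only there to show which section wins.

```python
class LookupDict(dict):
    """Dictionary that falls back to nested `classifications` and `properties`."""

    def __getitem__(self, key):
        # 1. Normal attributes, including arbitrary ones, come first
        if key in self:
            return super().__getitem__(key)
        # 2. Then `classifications`, and only after that `properties`
        for section in ("classifications", "properties"):
            nested = self.get(section, {})
            if key in nested:
                return nested[key]
        raise KeyError(key)


steel = LookupDict(
    name="Steel",
    properties={"carbon content": {"amount": 0.01}, "ISIC": "shadowed value"},
    classifications={"ISIC": {"code": "2410", "system": "ISIC Rev. 4"}},
)

carbon = steel["carbon content"]  # found in `properties`
isic = steel["ISIC"]              # `classifications` wins over `properties`
```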

## Easier access to reference products

If you have a suitably formatted activity, you can do:

In [15]:
steel.rp_exchange()

Exchange: 1 kilogram 'Steel' (kilogram, GLO, None) to 'Steel' (kilogram, GLO, None)>

This works by looking through all exchanges with the type `production`. If there is only one, it is returned; otherwise, the exchange whose input name is the same as the node's `reference product` is returned. Raises `ValueError` if no suitable exchange is found.
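
The selection rule can be written out over plain dictionaries. This is a sketch under stated assumptions: `pick_reference_product_exchange` and the `input_name` key are hypothetical, and treating several equally named candidates as "no suitable exchange" is our guess rather than documented behaviour.

```python
def pick_reference_product_exchange(node, exchanges):
    """Pick the production exchange that represents the reference product."""
    production = [exc for exc in exchanges if exc["type"] == "production"]
    # A single production exchange is unambiguous
    if len(production) == 1:
        return production[0]
    # Otherwise match on the node's `reference product`
    wanted = node.get("reference product")
    candidates = [exc for exc in production if exc["input_name"] == wanted]
    if len(candidates) == 1:
        return candidates[0]
    raise ValueError("No suitable reference product exchange found")


# Hypothetical multi-output node
node = {"name": "Refinery", "reference product": "Diesel"}
exchanges = [
    {"type": "production", "input_name": "Diesel", "amount": 1.0},
    {"type": "production", "input_name": "Petrol", "amount": 0.5},
    {"type": "biosphere", "input_name": "CO2", "amount": 2.0},
]

chosen = pick_reference_product_exchange(node, exchanges)
```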

We can also look up attributes of the reference product exchange:

In [17]:
exc = steel.rp_exchange()
exc['properties'] = {'iron content': 0.98}
exc.save()

In [18]:
steel['iron content']

0.98

This lookup occurs after the `classifications` and `properties` of the node itself.

## Easier data cleanup

If you are building inventories manually, it is easy to accidentally add an exchange too many times:

In [25]:
steel, co2 = bd.get_node(name="Steel"), bd.get_node(name="CO2")

for _ in range(10):
    steel.new_edge(input=co2, amount=1.5, type="biosphere").save()

We can now easily clean up these duplicates:

In [28]:
bd.Database('Mobility example').delete_duplicate_exchanges()

Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
Deleting exchange: Exchange: 1.5 kilogram 'CO2' (kilogr

In [29]:
for exc in steel.biosphere():
    print(exc)

Exchange: 1.5 kilogram 'CO2' (kilogram, None, None) to 'Steel' (kilogram, GLO, None)>
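
Conceptually, the cleanup keeps one copy of each repeated exchange. What counts as a duplicate is not spelled out here, so the sketch below assumes that two exchanges are duplicates when their input, output, type, and amount are all equal; `drop_duplicates` is a hypothetical helper operating on plain dictionaries, not the database method itself.

```python
def drop_duplicates(exchanges, fields=("input", "output", "type", "amount")):
    """Return exchanges with exact repeats removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for exc in exchanges:
        signature = tuple(exc.get(field) for field in fields)
        if signature not in seen:
            seen.add(signature)
            unique.append(exc)
    return unique


# Hypothetical data: the same biosphere exchange added ten times, plus one other
co2 = {
    "input": ("Mobility example", "CO2"),
    "output": ("Mobility example", "Steel"),
    "type": "biosphere",
    "amount": 1.5,
}
production = {
    "input": ("Mobility example", "Steel"),
    "output": ("Mobility example", "Steel"),
    "type": "production",
    "amount": 1.0,
}
exchanges = [dict(co2) for _ in range(10)] + [production]

cleaned = drop_duplicates(exchanges)
```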


As always, best practice is to have a completely reproducible workflow, so that if things get messed up you can delete everything and regenerate the data from scratch!

## Filepaths are instances of `pathlib.Path`

`Path` objects are [pretty great](https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/); you should [use them](https://docs.python.org/3/library/pathlib.html).

In [31]:
type(bd.projects.dir), type(bd.projects.logs_dir)

(pathlib.PosixPath, pathlib.PosixPath)

In [32]:
type(bd.Database('Mobility example').dirpath_processed()), type(bd.Database('Mobility example').filepath_processed())

(pathlib.PosixPath, pathlib.PosixPath)
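
For readers new to `pathlib`, here is a short sketch of the operations that tend to replace string manipulation. The directory below is made up (real values come from attributes such as `bd.projects.dir`), and `PurePosixPath` is used only so the example behaves the same on every operating system.

```python
import os
from pathlib import PurePosixPath

# Hypothetical project directory
project_dir = PurePosixPath("/home/user/brightway/projects/example")

# Join path components with `/` instead of string concatenation
log_file = project_dir / "logs" / "import.log"

# Inspect parts of the path without manual string splitting
file_name = log_file.name            # 'import.log'
extension = log_file.suffix          # '.log'
parent_name = log_file.parent.name   # 'logs'

# Convert explicitly for APIs that only accept strings
as_text = os.fspath(log_file)
```

The explicit conversion in the last line matters whenever a library expects a plain string rather than a `Path`.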

## Easier access to `Datapackages`

Brightway `Datapackages` are a replacement for the previous processed arrays. They bring a range of new functionality, including the ability to be stored on many different kinds of physical or virtual filesystems using [PyFilesystem](https://docs.pyfilesystem.org/en/latest/). But this means that loading them requires the specification of a filesystem, which can be extra code, and a pain if you don't remember the exact command. Instead, you can use:

In [20]:
bd.Database('Mobility example').datapackage()

<bw_processing.datapackage.Datapackage at 0x14bf9c5e0>

Note that, in combination with the above, local PyFilesystem `OSFS` objects [**need a string, not a `Path`**](https://github.com/PyFilesystem/pyfilesystem2/issues/238).

## IOTable improvements

This is being rewritten in the `file-remover-progressive` branch, so will be demonstrated instead of provided as an example.

## Brightway ❤️ Pandas

The previous mix of functions for importing some data into dataframes has been replaced with a common set of methods which work with both `bw2data` and `bw2calc`.

They also work with IOTables.

In [4]:
import bw2io as bi

In [5]:
bd.projects.set_current("USEEIO")

`useeio11` is a utility function to get the US EEIO and its LCIA methods easily. This database has its own biosphere flows, so we don't run `bw2setup`.

In [6]:
bi.useeio11()

Downloading US EEIO 1.1
Unzipping file
Importing data
Applying strategy: json_ld_allocate_datasets
Applying strategy: json_ld_get_normalized_exchange_locations
Applying strategy: json_ld_convert_unit_to_reference_unit
Applying strategy: json_ld_get_activities_list_from_rawdata
Applying strategy: json_ld_add_products_as_activities
Applying strategy: json_ld_get_normalized_exchange_units
Applying strategy: json_ld_add_activity_unit
Applying strategy: json_ld_rename_metadata_fields
Applying strategy: json_ld_location_name
Applying strategy: json_ld_remove_fields
Applying strategy: json_ld_fix_process_type
Applying strategy: json_ld_label_exchange_type
Applying strategy: json_ld_prepare_exchange_fields_for_linking
Applying strategy: add_database_name
Applying strategy: link_iterable_by_fields
Applying strategy: link_iterable_by_fields
Applying strategy: normalize_units
Applied 17 strategies in 0.57 seconds
Moved 1873 biosphere flows to `self.data`
2649 datasets
162926 exchanges
0 unlinked 

Writing activities to SQLite3 database:


Not able to determine geocollections for all datasets. This database is not ready for regionalization.


0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Title: Writing activities to SQLite3 database:
  Started: 09/26/2022 00:11:56
  Finished: 09/26/2022 00:12:00
  Total time elapsed: 00:00:04
  CPU %: 97.70
  Memory %: 1.89
Created database: US EEIO 1.1
Applying strategy: json_ld_lcia_add_method_metadata
Applying strategy: json_ld_lcia_convert_to_list
Applying strategy: json_ld_lcia_set_method_metadata
Applying strategy: json_ld_lcia_reformat_cfs_as_exchanges
Applying strategy: normalize_units
Applied 5 strategies in 0.00 seconds
19 methods
4511 cfs
0 unlinked cfs
Wrote 19 LCIA methods with 4511 characterization factors


Pick a product and an activity node at random.

In [7]:
product = next(node for node in bd.Database("US EEIO 1.1") if node['type'] == 'product')
activity = next(node for node in bd.Database("US EEIO 1.1") if node['type'] == 'process')
product, activity

('Vehicle seating and interior trim (upholstery); at manufacturer' (, United States, ('31-33: Manufacturing', '3363: Motor Vehicle Parts Manufacturing')),
 'Ice cream and frozen desserts; at manufacturer' (USD, United States, None))

The first dataframe is all the nodes (processes or activities) in the given database:

In [8]:
df = bd.Database("US EEIO 1.1").nodes_to_dataframe()
df

Unnamed: 0,CAS number,categories,classifications,code,database,description,dqEntry,dqSystem,exchangeDqSystem,filename,id,location,modified,name,processDocumentation,type,unit,version
91,,"(water, unspecified)",,2ee4697d-b7f4-362b-86a4-94b644699500,US EEIO 1.1,,,,,,1134,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
2165,,"(air, low population density)",,6ca23b5d-83dc-3b02-bf39-8eabf9d41151,US EEIO 1.1,,,,,,1640,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
2336,,"(soil, groundwater)",,5b98f875-8d1c-3549-a7df-28d7d90e7ccb,US EEIO 1.1,,,,,,1427,,,"(2,4-DICHLOROPHENOXY)ACETIC ACID COMPD. WITH 2...",,emission,,
72,,"(water, unspecified)",,93086e32-c013-3e34-a074-4760c72fe775,US EEIO 1.1,,,,,,1886,,,(4-CHLORO-2-METHYLPHENOXY)ACETIC ACID COMPD. W...,,emission,,
892,,"(air, low population density)",,3404c9d4-8d41-36cb-8a95-c8b428518cfa,US EEIO 1.1,,,,,,1166,,,(4-CHLORO-2-METHYLPHENOXY)ACETIC ACID COMPD. W...,,emission,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2460,,"(air, unspecified)",,d9a5b786-d06c-44af-a088-b070aa605d9b,US EEIO 1.1,,,,,,2370,,,trifluralin,,emission,,
1251,7440622.0,"(water, unspecified)",,63e8256e-8549-11e0-9d78-0800200c9a66,US EEIO 1.1,,,,,,1577,,,vanadium,,emission,,
1785,7440622.0,"(soil, industrial)",,1a5850a0-0069-4b73-bb91-7a61e8d45ae5,US EEIO 1.1,,,,,,971,,,vanadium,,emission,,
2343,7440622.0,"(air, unspecified)",,591b0a62-8064-4697-86ed-47bfa1f8b5e6,US EEIO 1.1,,,,,,1407,,,vanadium,,emission,,


The columns come from the data attributes stored on the nodes. If at least one node has an attribute, it is added as a column. You can control which columns get returned, and how they are sorted; see the docstring.

This is a normal dataframe, so you can filter it, add or remove columns, and sort as desired.

In [9]:
df.columns

Index(['CAS number', 'categories', 'classifications', 'code', 'database',
       'description', 'dqEntry', 'dqSystem', 'exchangeDqSystem', 'filename',
       'id', 'location', 'modified', 'name', 'processDocumentation', 'type',
       'unit', 'version'],
      dtype='object')

We can also list all the edges (exchanges) as a dataframe. This is normally too much information, and can take a bit of time to produce, but can be useful.

In [10]:
df = bd.Database("US EEIO 1.1").edges_to_dataframe()
df

Getting activity data


100%|███████████████████████████████████| 2649/2649 [00:00<00:00, 265825.57it/s]


Adding exchange data to activities


100%|████████████████████████████████| 162926/162926 [00:02<00:00, 68147.52it/s]


Filling out exchange data


100%|███████████████████████████████████| 2649/2649 [00:00<00:00, 154092.86it/s]


Creating DataFrame
Compressing DataFrame


Unnamed: 0,target_id,target_database,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type
0,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,459,US EEIO 1.1,26bed504-3f97-3a2b-aa83-ffbe94f3b371,Frozen food; at manufacturer,,United States,,,1.000000e+00,production
1,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,437,US EEIO 1.1,1bafcbbb-dbe0-338d-b9c1-8c355426cbef,State and local government enterprises,,United States,,,1.338032e-03,technosphere
2,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1020,US EEIO 1.1,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,,,,1.114413e-07,biosphere
3,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1728,US EEIO 1.1,7ae398b3-8532-11e0-9d78-0800200c9a66,ethylene glycol,,,,,1.223946e-07,biosphere
4,1,US EEIO 1.1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,408,US EEIO 1.1,0bb0108f-c486-32b3-b059-e0c6c8380571,"Tobacco, cotton, sugarcane, peanuts, sugar bee...",,United States,,,9.990870e-06,technosphere
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
162921,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1426,US EEIO 1.1,5b2a19b9-1243-44ae-b76c-c0d92159d5d6,2-METHOXYETHANOL,,,,,8.602175e-12,biosphere
162922,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,2622,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,,,2.635425e-08,biosphere
162923,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1462,US EEIO 1.1,5fd672a0-cb68-39e6-88dc-db1a9281c57b,"2,4,5,2',5'-PENTACHLOROBIPHENYL",,,,,9.791193e-19,biosphere
162924,388,US EEIO 1.1,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,589,US EEIO 1.1,7e5c6ee1-a47e-3afd-9278-dcb5e1ee50b5,"Screws, nuts, and bolts; at manufacturer",,United States,,,1.810643e-02,technosphere


Now we have standard column labels. As these are directed edges, they have a source and a target. Most of the columns should be self-explanatory. Note that we differentiate between `'target_reference_product'` and `'source_product'`, and only provide the `categories` on the `source`.

In [11]:
df.columns

Index(['target_id', 'target_database', 'target_code', 'target_name',
       'target_reference_product', 'target_location', 'target_unit',
       'target_type', 'source_id', 'source_database', 'source_code',
       'source_name', 'source_product', 'source_location', 'source_unit',
       'source_categories', 'edge_amount', 'edge_type'],
      dtype='object')

If you want to add or remove columns, you can pass in an iterable of formatting functions. These functions must satisfy the following rules:

* They take the keyword arguments `node`, `edge`, and `row`.
* They modify the dictionary `row` in place. Any return value is ignored.
* `node` and `edge` are dictionaries following the [wurst internal format](https://wurst.readthedocs.io/#internal-data-format). `node` is the target, and `edge` contains attributes of both the edge and the source.

Here is a simple example:

In [12]:
def remove_target_database(node, edge, row):
    del row['target_database']
    
def food_sector(node, edge, row):
    row['is_food'] = 'food' in edge['name'].lower()
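
To see what these two formatters do without building the whole dataframe, they can be applied by hand to a single row. The `node`, `edge`, and `row` dictionaries below are hypothetical and heavily trimmed; the loop only mimics the calling convention described above (keyword arguments, in-place changes, ignored return values).

```python
def remove_target_database(node, edge, row):
    del row["target_database"]


def food_sector(node, edge, row):
    row["is_food"] = "food" in edge["name"].lower()


# Hypothetical inputs for a single edge
node = {"name": "Frozen food; at manufacturer", "database": "US EEIO 1.1"}
edge = {"name": "Dinitrogen monoxide", "amount": 1.114413e-07, "type": "biosphere"}
row = {
    "target_name": node["name"],
    "target_database": node["database"],
    "source_name": edge["name"],
    "edge_amount": edge["amount"],
}

for formatter in (remove_target_database, food_sector):
    # Called with keyword arguments; whatever the function returns is discarded
    formatter(node=node, edge=edge, row=row)
```

Afterwards `row` no longer has a `target_database` key and has gained an `is_food` flag based on the source name.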

In [13]:
df = bd.Database("US EEIO 1.1").edges_to_dataframe(formatters=[remove_target_database, food_sector])
df

Getting activity data


100%|███████████████████████████████████| 2649/2649 [00:00<00:00, 352117.36it/s]


Adding exchange data to activities


100%|████████████████████████████████| 162926/162926 [00:02<00:00, 67688.72it/s]


Filling out exchange data


100%|███████████████████████████████████| 2649/2649 [00:00<00:00, 167108.52it/s]


Creating DataFrame
Compressing DataFrame


Unnamed: 0,target_id,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type,is_food
0,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,459,US EEIO 1.1,26bed504-3f97-3a2b-aa83-ffbe94f3b371,Frozen food; at manufacturer,,United States,,,1.000000e+00,production,True
1,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,437,US EEIO 1.1,1bafcbbb-dbe0-338d-b9c1-8c355426cbef,State and local government enterprises,,United States,,,1.338032e-03,technosphere,False
2,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1020,US EEIO 1.1,20185046-64bb-4c09-a8e7-e8a9e144ca98,Dinitrogen monoxide,,,,,1.114413e-07,biosphere,False
3,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,1728,US EEIO 1.1,7ae398b3-8532-11e0-9d78-0800200c9a66,ethylene glycol,,,,,1.223946e-07,biosphere,False
4,1,01624075-b520-3826-bd73-2068f7aa24e7,Frozen food; at manufacturer,,United States,USD,process,408,US EEIO 1.1,0bb0108f-c486-32b3-b059-e0c6c8380571,"Tobacco, cotton, sugarcane, peanuts, sugar bee...",,United States,,,9.990870e-06,technosphere,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
162921,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1426,US EEIO 1.1,5b2a19b9-1243-44ae-b76c-c0d92159d5d6,2-METHOXYETHANOL,,,,,8.602175e-12,biosphere,False
162922,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,2622,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,,,2.635425e-08,biosphere,False
162923,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,1462,US EEIO 1.1,5fd672a0-cb68-39e6-88dc-db1a9281c57b,"2,4,5,2',5'-PENTACHLOROBIPHENYL",,,,,9.791193e-19,biosphere,False
162924,388,fe5971a7-2610-32ca-8193-a94873de130c,Automatic controls for HVAC and refrigeration ...,,United States,USD,process,589,US EEIO 1.1,7e5c6ee1-a47e-3afd-9278-dcb5e1ee50b5,"Screws, nuts, and bolts; at manufacturer",,United States,,,1.810643e-02,technosphere,False


To save on memory, we turn some columns into categorical columns, where each unique value is only stored once.

In the case of `target_name`, the dataframe has more than 150,000 rows, but only 388 unique values.

You can skip the conversion to categorical columns by passing `categorical=False`.
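
The idea behind a categorical column can be shown in a few lines of plain Python. This is a conceptual sketch, not how pandas implements it: `encode_categorical` is a hypothetical helper that stores each distinct string once and represents every row by a small integer code.

```python
def encode_categorical(values):
    """Split `values` into sorted unique categories and one integer code per row."""
    categories = sorted(set(values))
    position = {category: index for index, category in enumerate(categories)}
    codes = [position[value] for value in values]
    return categories, codes


# Hypothetical column with many repeats of a few long strings
names = ["Frozen food; at manufacturer"] * 4 + ["Dairies"] * 2

categories, codes = encode_categorical(names)

# Each long string is stored once; the rows only hold integers
decoded = [categories[code] for code in codes]
```

With only a few hundred distinct names across well over a hundred thousand rows, replacing repeated strings by integer codes is where the memory saving comes from.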

In [14]:
df.dtypes

target_id                      int64
target_code                   object
target_name                 category
target_reference_product    category
target_location             category
target_unit                 category
target_type                 category
source_id                      int64
source_database             category
source_code                 category
source_name                 category
source_product              category
source_location             category
source_unit                 category
source_categories           category
edge_amount                  float64
edge_type                   category
is_food                         bool
dtype: object

In [15]:
df['target_name']

0                              Frozen food; at manufacturer
1                              Frozen food; at manufacturer
2                              Frozen food; at manufacturer
3                              Frozen food; at manufacturer
4                              Frozen food; at manufacturer
                                ...                        
162921    Automatic controls for HVAC and refrigeration ...
162922    Automatic controls for HVAC and refrigeration ...
162923    Automatic controls for HVAC and refrigeration ...
162924    Automatic controls for HVAC and refrigeration ...
162925    Automatic controls for HVAC and refrigeration ...
Name: target_name, Length: 162926, dtype: category
Categories (388, object): ['Abrasive products; at manufacturer', 'Accounting, tax preparation, bookkeeping, and..., 'Adhesives; at manufacturer', 'Advertising and public relations', ..., 'Wiring devices; at manufacturer', 'Wood kitchen cabinets and countertops; at man..., 'Wood pulp; at m

We can also get a dataframe of the edges for a specific node. Here we get all edges, but you can filter this further with the edge constructors `.production()`, `.technosphere()`, and `.biosphere()`.

In [16]:
df = activity.exchanges().to_dataframe()
df

Unnamed: 0,target_id,target_database,target_code,target_name,target_reference_product,target_location,target_unit,target_type,source_id,source_database,source_code,source_name,source_product,source_location,source_unit,source_categories,edge_amount,edge_type
0,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,555,US EEIO 1.1,60b7a10b-eec3-3690-80a1-e9a18d107ee9,Ice cream and frozen desserts; at manufacturer,,United States,,31-33: Manufacturing::3115: Dairy Product Manu...,1.000000e+00,production
1,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,1077,US EEIO 1.1,28999907-a8a7-45b3-857e-836495ca2aa0,benzene,,,,air::unspecified,1.310432e-08,biosphere
2,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,1694,US EEIO 1.1,770c88e4-cd71-315c-b0b1-ea502618eb04,trichloroethylene,,,,air::unspecified,6.402135e-11,biosphere
3,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,501,US EEIO 1.1,3eca5c38-99e8-3373-b9ec-67cce9e286e6,Other retail,,United States,,Technosphere Flows::44-45: Retail Trade,1.406090e-03,technosphere
4,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,1661,US EEIO 1.1,71234253-b3a7-4dfe-b166-a484ad15bee7,mercury,,,,air::unspecified,4.069678e-12,biosphere
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
232,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,653,US EEIO 1.1,b5a870b7-779a-318b-b9df-bd29c37ea11e,Other support services,,United States,,56: Administrative and Support and Waste Manag...,7.256240e-06,technosphere
233,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,1814,US EEIO 1.1,8648c983-3acb-4c4d-8607-bb070188d654,pyrene,,,,air::unspecified,6.731819e-11,biosphere
234,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,471,US EEIO 1.1,31b35ef3-888b-396d-a607-1dcce0d8e728,"Investment advice, portfolio management, and o...",,United States,,52: Finance and Insurance::5239: Other Financi...,1.180070e-04,technosphere
235,6,US EEIO 1.1,03fe82dd-2ea7-3c66-adfa-e714b1a88fe5,Ice cream and frozen desserts; at manufacturer,,United States,USD,process,457,US EEIO 1.1,259903d4-a4df-3640-868c-694f4d057f0c,Computer systems design,,United States,,"54: Professional, Scientific, and Technical Se...",3.612774e-04,technosphere


Same columns as before.

In [17]:
df.columns

Index(['target_id', 'target_database', 'target_code', 'target_name',
       'target_reference_product', 'target_location', 'target_unit',
       'target_type', 'source_id', 'source_database', 'source_code',
       'source_name', 'source_product', 'source_location', 'source_unit',
       'source_categories', 'edge_amount', 'edge_type'],
      dtype='object')

We can also get dataframes for LCA calculation results.

In [18]:
import bw2calc as bc

lca = bc.LCA({product: 1}, method=('Impact Potential', 'HRSP'))
lca.lci()
lca.lcia()

By default, this method looks at the `characterized_inventory` matrix, and selects the top 200 values (by absolute value).

In [23]:
df = lca.to_dataframe()
df

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,1043,8,3.779723e-04,1822,9,US EEIO 1.1,87883a4e-1e3e-4c9d-90c0-f1bea36f8014,ammonia,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
1,257,8,4.411418e-05,1035,9,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
2,55,8,4.653532e-07,832,9,US EEIO 1.1,08a91e70-3ddc-11dd-91be-0050c2490048,"particulates, < 10 um",,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
3,1843,8,6.529780e-08,2622,9,US EEIO 1.1,fd7aa71c-508c-480d-81a6-8052aad92646,sulfur dioxide,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
4,1445,8,3.981023e-06,2224,9,US EEIO 1.1,c1b91234-6f24-417b-8309-46111d09c457,nitrogen oxides,,,emission,air::unspecified,,US EEIO 1.1,04ee2e71-af3b-39f3-8e69-bcae6a2d70d8,Dairies,United States,USD,process,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,257,332,5.718976e-08,1035,333,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,d5b7c244-8d46-3ede-9659-45986f6c6fb5,Cutlery and handtools; at manufacturer,United States,USD,process,
196,257,119,5.238594e-08,1035,120,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,4775afeb-8e11-3ea2-92b0-4f1c41952703,Other secondary nonferrous metal products; at ...,United States,USD,process,
197,257,269,5.219094e-08,1035,270,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,ad81ce2a-e3c5-3695-a2ef-812cd8b79dd3,Other plastic products; at manufacturer,United States,USD,process,
198,257,197,4.989681e-08,1035,198,US EEIO 1.1,21e46cb8-6233-4c99-bac3-c41d2ab99498,"particulates, < 2.5 um",,,emission,air::unspecified,,US EEIO 1.1,777544f7-e9cc-3593-8d70-9a018d2a87e2,"Metal coatings, engravings, and heat treatment...",United States,USD,process,


The column labels are a bit different, as we don't have target and source but instead matrix rows and columns. The meaning of these rows and columns changes from matrix to matrix. The same pattern with `'row_product'`, `'col_reference_product'`, and `'row_categories'` applies though.

In [24]:
df.columns

Index(['row_index', 'col_index', 'amount', 'row_id', 'col_id', 'row_database',
       'row_code', 'row_name', 'row_location', 'row_unit', 'row_type',
       'row_categories', 'row_product', 'col_database', 'col_code', 'col_name',
       'col_location', 'col_unit', 'col_type', 'col_reference_product'],
      dtype='object')

We can get dataframes for any matrix. In standard LCA, the matrices are:

* inventory
* technosphere_matrix
* biosphere_matrix
* characterization_matrix
* characterized_inventory

Regionalization adds more matrices. Note that for other matrices you will need to specify the row and column mapping dictionaries; see the docstring.

In [22]:
lca.to_dataframe(matrix_label='biosphere_matrix')

Unnamed: 0,row_index,col_index,amount,row_id,col_id,row_database,row_code,row_name,row_location,row_unit,row_type,row_categories,row_product,col_database,col_code,col_name,col_location,col_unit,col_type,col_reference_product
0,232,18,372.440552,1010,19,US EEIO 1.1,1ece2361-87e0-355c-a702-ff268570ca3e,Coal,,,emission,resource::in ground,,US EEIO 1.1,08f1c4b8-03f9-360c-87be-31ad6c778da5,Coal; at mine,United States,USD,process,
1,947,18,0.174231,1726,19,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,08f1c4b8-03f9-360c-87be-31ad6c778da5,Coal; at mine,United States,USD,process,
2,232,75,0.283722,1010,76,US EEIO 1.1,1ece2361-87e0-355c-a702-ff268570ca3e,Coal,,,emission,resource::in ground,,US EEIO 1.1,2bf3d179-abc9-3d6f-983f-f2de03471649,Other support activities for mining,United States,USD,process,
3,947,75,0.116804,1726,76,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,2bf3d179-abc9-3d6f-983f-f2de03471649,Other support activities for mining,United States,USD,process,
4,1388,95,96.912102,2167,96,US EEIO 1.1,b91d0527-9a01-4a86-b420-c62b70629ba4,"Occupation, forest",,,emission,resource::land,,US EEIO 1.1,392eb1e3-3cd1-34c7-948c-c177114e8d20,Timber and raw forest products; at forest,United States,USD,process,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,947,121,0.092894,1726,122,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,48b2f105-8ae0-36e2-ad0a-cf7101ab8f4c,Residential building repair and maintanence,United States,USD,process,
196,947,39,0.092228,1726,40,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,15615fdc-2456-3e6c-bd24-9ed9a13d2599,Health care buildings,United States,USD,process,
197,947,191,0.090496,1726,192,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,7452907b-74cc-3106-aaaa-5560c0645af6,Flours and malts; at manufacturer,United States,USD,process,
198,947,340,0.090194,1726,341,US EEIO 1.1,7ae371aa-8532-11e0-9d78-0800200c9a66,Carbon dioxide,,,emission,air::unspecified,,US EEIO 1.1,d920c723-5594-34a5-8b55-0cbe517c2f9f,Polystyrene foam products; at manufacturer,United States,USD,process,


# `bw2calc`

## `LCA` object can now do Monte Carlo

## `.redo_lci` ➡️ `.lci`, `.redo_lcia` ➡️ `.lcia`

## Specify `data_objs` and new functional unit

## New `.dicts` accessor

## `bw2data.prepare_lca_inputs`

## No automatic remapping