# Indexing in Brightway 2.5

Brightway 2.5 keeps backwards compatibility for most things, but starts a shift in emphasis from the use of activity keys (e.g. `("some database", "some string")`) to integer IDs.

In Brightway 2, each activity, product, biosphere flow, and location in the processed data arrays (the Numpy arrays stored as files) needs to be associated with a unique integer ID, as the corresponding columns in the Numpy arrays have integer data types. This stays the same in 2.5, but the *origin* of the integer IDs changes for activities, products, and biosphere flows.

Previously, these integer IDs were stored in a separate dictionary, the `mapping` dictionary. As the number of activities, etc., grew larger, keeping such a dictionary in memory, and synced to a serialized form on disk whenever there were changes, became unwieldy. Moreover, *we were already storing* a unique integer ID for activities, etc., as the primary key in our database. We could make life much easier by just using those database values, so that is what 2.5 does.
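The idea can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table layout and the helper `get_id_from_key` are hypothetical and are not Brightway's actual schema or API; the point is only that an automatically assigned primary key already gives every activity a unique integer, so no separate mapping file has to be kept in sync.

```python
import sqlite3

# Hypothetical, simplified table; not Brightway's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity ("
    "id INTEGER PRIMARY KEY, "  # SQLite assigns a unique integer automatically
    "database TEXT NOT NULL, "
    "code TEXT NOT NULL, "
    "UNIQUE (database, code))"
)
conn.executemany(
    "INSERT INTO activity (database, code) VALUES (?, ?)",
    [("db", "a"), ("db", "b"), ("other", "a")],
)


def get_id_from_key(key):
    """Look up the integer primary key for a (database, code) tuple."""
    row = conn.execute(
        "SELECT id FROM activity WHERE database = ? AND code = ?", key
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return row[0]


ids = [get_id_from_key(k) for k in [("db", "a"), ("db", "b"), ("other", "a")]]
```

Each key resolves to a distinct integer, and an unknown key raises `KeyError`, much as a dictionary lookup would.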

Locations were stored in a similar file, called `geomapping`, and we use the same system in 2.5. Locations will get a separate database table in Brightway 3.

## Backwards compatibility

All the methods you know and ... love? still work fine.

In [1]:
import bw2data as bd

In [2]:
bd.projects.set_current("ecoinvent 3.7.1")

In [3]:
something = bd.Database("ecoinvent 3.7.1").random()

In [4]:
something.key

('ecoinvent 3.7.1', 'dc864e0dbc39c4e57e892d3a950542d9')

In [5]:
something['code']

'dc864e0dbc39c4e57e892d3a950542d9'

However, we also now have the `.id` attribute:

In [6]:
something.id

11196

And we can use this ID in `get_activity`:

In [7]:
bd.get_activity(something.id) == something

True

In [8]:
bd.Database("ecoinvent 3.7.1").get(something.id) == something

True

We also now have a `get_id` function, and this will be the preferred form in the future.

Normal `Activity` objects already have the `.id` attribute, but `get_id` also takes activity keys.

In [9]:
bd.get_id(something)

11196

In [10]:
bd.get_id(something.key)

11196

The `mapping` object still exists, but it is only a shim for `get_id`.

In [11]:
bd.mapping[something.key]

11196

In [12]:
str(bd.mapping)

'Obsolete mapping dictionary.'
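A shim of this kind can be quite small. The sketch below is a guess at the general shape, not the `bw2data` source: `ObsoleteMapping`, the stand-in `get_id`, and the toy `_IDS` table are hypothetical names used only for illustration.

```python
# Toy stand-in for the database lookup; the real get_id queries the database.
_IDS = {("db", "a"): 1, ("db", "b"): 2}


def get_id(key):
    """Hypothetical stand-in for a key -> integer ID lookup."""
    return _IDS[key]


class ObsoleteMapping:
    """Dictionary-like object that forwards every lookup to get_id."""

    def __getitem__(self, key):
        return get_id(key)

    def __str__(self):
        return "Obsolete mapping dictionary."


mapping = ObsoleteMapping()
```

Old code that does `mapping[key]` keeps working, but nothing is stored in the object itself.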

## Indexing matrices in `LCA` objects

`LCA` objects need a dictionary that maps matrix row and column indices to the identifiers for activities, etc. In Brightway 2, these mappings were `activity_dict`, `product_dict`, and `biosphere_dict`. In 2.5, we create one instance of a [DictionaryManager](https://github.com/brightway-lca/brightway2-calc/blob/master/bw2calc/dictionary_manager.py), which can have many `ReversibleRemappableDictionary` objects. As some LCA objects can have *many* mapping dictionaries, I considered it cleaner to put everything together in one managing object. Moreover, we can make things a bit more efficient by building the parts of the dictionaries *only* on demand. In fact, in normal calculations we don't need the dictionary at all, as we use a different algorithm with Numpy arrays to map values when constructing the matrices themselves.
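To make the role of these dictionaries concrete, here is a rough sketch of one plausible way to map database IDs to consecutive matrix indices, with the dictionary form built only when requested. It uses only the standard library and made-up IDs; the names `col_ids`, `to_matrix_index`, and `activity_dict` are illustrative assumptions, and the actual Brightway code works on Numpy arrays and differs in detail.

```python
from bisect import bisect_left

# Made-up database IDs as they might appear in the column-index field
# of a processed array (duplicates are expected).
col_ids = [11196, 42, 977, 42, 11196, 5]

# Sorted unique IDs define the matrix ordering: position == matrix index.
unique_ids = sorted(set(col_ids))


def to_matrix_index(db_id):
    """Map one database ID to its matrix index by binary search."""
    pos = bisect_left(unique_ids, db_id)
    if pos == len(unique_ids) or unique_ids[pos] != db_id:
        raise KeyError(db_id)
    return pos


# Array-style mapping: no dictionary is needed to build the matrix.
col_indices = [to_matrix_index(i) for i in col_ids]

# The dictionary form is only needed for human-facing lookups,
# so it can be created later and only if someone asks for it.
activity_dict = {db_id: idx for idx, db_id in enumerate(unique_ids)}
```

Both routes give the same index for a given ID; the difference is that the dictionary costs memory and construction time that a plain calculation never needs to pay.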

We can get behaviour compatible with Brightway 2 by calling the method `.remap_inventory()` after the LCI calculation:

In [13]:
import bw2calc as bc

In [14]:
lca = bc.LCA({something: 1})

In [15]:
lca.lci()

In [16]:
lca.remap_inventory()

In [17]:
lca.dicts.activity

<bw2calc.dictionary_manager.ReversibleRemappableDictionary at 0x144b42a30>

In [18]:
some_int = lca.dicts.activity[something.key]
some_int

6874

Each `ReversibleRemappableDictionary` can be reversed:

In [19]:
lca.dicts.activity.reversed[some_int]

('ecoinvent 3.7.1', 'dc864e0dbc39c4e57e892d3a950542d9')

Calling `reversed` will create a reversed dictionary, and keep it in memory.
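The build-once-and-cache behaviour can be sketched with a toy class. `LazyReversibleDict` is a hypothetical name and not the `bw2calc` implementation; it only shows the pattern of deferring the reversed dictionary until it is first requested.

```python
class LazyReversibleDict(dict):
    """Toy dict whose reversed form is built on first access and cached."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._reversed = None

    @property
    def reversed(self):
        if self._reversed is None:
            # Built once; later accesses reuse the cached dictionary.
            self._reversed = {v: k for k, v in self.items()}
        return self._reversed


d = LazyReversibleDict({("db", "a"): 0, ("db", "b"): 1})
first = d.reversed
second = d.reversed
```

Here `first` and `second` are the same object, so the reversal cost is paid only once. This toy version does not refresh the cache if the dictionary is modified afterwards.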

## Indexing matrices the 2.5 way

In 2.5, we can call the convenience function `prepare_lca_inputs` to get more control and save some system resources. Actually, this is what happens anyway when you instantiate `LCA` using the Brightway 2 input arguments.

In [20]:
demand, data_objs, remapping_dicts = bd.prepare_lca_inputs({something: 1})

In [21]:
lca2 = bc.LCA(demand=demand, data_objs=data_objs, remapping_dicts=remapping_dicts)

Note that here we need to label the input keyword arguments explicitly.

In [22]:
lca2.lci()

In [23]:
lca2.supply_array.sum() == lca.supply_array.sum()

True

The `remapping_dicts` go from the integer IDs *in the database* to the activity, etc., keys. If we are comfortable just using the database IDs, we can skip the whole remapping step entirely, saving time and memory.
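Conceptually, remapping swaps the keys of an index dictionary from database IDs to activity keys. The values below are made up and the variable names are illustrative assumptions, not Brightway internals:

```python
# Made-up values: database ID -> matrix index, as held before remapping.
activity_index = {11196: 0, 42: 1}

# Made-up remapping dictionary: database ID -> activity key.
remapping = {11196: ("db", "a"), 42: ("db", "b")}

# After remapping, the same matrix indices are reachable via activity keys.
remapped = {remapping[db_id]: idx for db_id, idx in activity_index.items()}
```

Skipping this step means every lookup uses the integer ID directly, and the ID-to-key dictionary never has to be built.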

In [25]:
demand, data_objs, _ = bd.prepare_lca_inputs({something: 1}, remapping=False)

In this world, the `demand` dictionary is keyed by the `.id` of the activity instead of the activity itself:

In [35]:
demand, {something.id: 1}

({11196: 1}, {11196: 1})

The `data_objs` are the processed arrays for inventory and impact assessment (not used in this example), prepared in the correct form:

In [36]:
data_objs

[ReadZipFS(PosixPath('/Users/cmutel/Library/Application Support/Brightway3/ecoinvent-371.040a8b7bfd29ab08dd0a24a6d8383a3d/processed/biosphere3.5d405d71.zip')),
 ReadZipFS(PosixPath('/Users/cmutel/Library/Application Support/Brightway3/ecoinvent-371.040a8b7bfd29ab08dd0a24a6d8383a3d/processed/ecoinvent-371.040a8b7b.zip'))]

You can read more about these data packages in the [bw_processing](https://github.com/brightway-lca/bw_processing) package, and in forthcoming documentation notebooks.

In [26]:
lca3 = bc.LCA(demand=demand, data_objs=data_objs)

As before, we need to label the input keyword arguments explicitly.

In [27]:
lca3.lci()

In [28]:
lca3.supply_array.sum() == lca.supply_array.sum()

True

Now, we look things up in `.dicts.activity`, etc., using the integer ID values rather than activity keys.

In [29]:
lca3.dicts.activity[something.id]

6874

In [30]:
lca3.dicts.product[something.id]

6874