# Importing data

Importing data means taking data in some form and preparing it so that we can express it as nodes and edges. On its own, this is not too challenging, as it mostly means converting data formats. The harder part is harmonizing the data, so that the fields used across imported databases are consistent enough that we can link consumers and suppliers.

Let's make this more concrete with an example. In the file `lci-carbon-fiber.xlsx` we have data from the publication [Ecological assessment of fuel cell electric vehicles with special focus on type IV carbon fiber hydrogen tank](https://www.sciencedirect.com/science/article/abs/pii/S0959652620333229). As this data is from Excel, it is tabular, and so on the surface it looks different from the graph:

<img src="images/spreadsheet.png">

However, this difference is mostly cosmetic. Both the _document_ and _graph_ perspectives show the same information, but with a different emphasis and organizing structure. In the graph perspective, edges are independent objects with their own metadata, and their sources and targets are given as [pointers](https://en.wikipedia.org/wiki/Pointer_(computer_programming)) to the node objects. In the document perspective, edges are subsumed in the definition of the nodes, and because most input data formats don't have pointers, references to input or output flows are defined by the attributes of those flows.
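
To make the two perspectives concrete, here is a minimal sketch in plain Python. The dataset names, amounts, and field layout are made up for illustration; they are not taken from the spreadsheet or from Brightway's internal schema.

```python
# Document perspective: each node carries its own list of exchanges, and each
# exchange describes its input flow by attributes (hypothetical example data).
documents = [
    {
        "name": "carbon fiber production",
        "unit": "kilogram",
        "exchanges": [
            {"name": "electricity", "unit": "kilowatt hour", "amount": 2.5},
        ],
    },
    {"name": "electricity", "unit": "kilowatt hour", "exchanges": []},
]

# Graph perspective: nodes and edges are separate objects. Each edge refers to
# its source and target by node id, which plays the role of a pointer.
nodes = {i: {"name": d["name"], "unit": d["unit"]} for i, d in enumerate(documents)}
id_by_attributes = {(n["name"], n["unit"]): i for i, n in nodes.items()}

edges = []
for target_id, doc in enumerate(documents):
    for exc in doc["exchanges"]:
        # Resolve the attribute-based reference into a node id
        source_id = id_by_attributes[(exc["name"], exc["unit"])]
        edges.append({"source": source_id, "target": target_id, "amount": exc["amount"]})

print(edges)
```

The lookup from attributes to node id only works here because both sides use exactly the same labels.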

Because we only have flow attributes, we need a way to associate those attributes with nodes in our existing databases. This is trickier than you might think, as there is no guarantee that two data providers will use the same labels for things like locations or units; indeed, sometimes we even find different labels for the attributes themselves.
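
As a small illustration of the problem, the sketch below harmonizes two descriptions of the same flow from two imaginary data providers. The alias tables and the `harmonize` helper are hypothetical and exist only for this example.

```python
# Hypothetical lookup tables mapping provider-specific labels to one convention.
UNIT_ALIASES = {"kg": "kilogram", "kWh": "kilowatt hour"}
LOCATION_ALIASES = {"Germany": "DE", "Europe": "RER"}
# Providers may also name the attribute itself differently.
FIELD_ALIASES = {"geography": "location", "units": "unit"}


def harmonize(flow):
    """Return a copy of `flow` with normalized field names and label values."""
    out = {FIELD_ALIASES.get(key, key): value for key, value in flow.items()}
    if "unit" in out:
        out["unit"] = UNIT_ALIASES.get(out["unit"], out["unit"])
    if "location" in out:
        out["location"] = LOCATION_ALIASES.get(out["location"], out["location"])
    return out


provider_a = {"name": "electricity", "units": "kWh", "geography": "Germany"}
provider_b = {"name": "electricity", "unit": "kilowatt hour", "location": "DE"}

# Before harmonization the two dictionaries differ; afterwards they are equal.
print(provider_a == provider_b)
print(harmonize(provider_a) == harmonize(provider_b))
```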

Therefore, Brightway treats IO as a classic [ETL pipeline](https://en.wikipedia.org/wiki/Extract,_transform,_load), and applies a series of transformation functions to prepare the data and find the correct flows. Let's look at our real-world example:

In [1]:
import bw2data as bd
import bw2io as bi

The example data is built on top of ecoinvent. You should update the project name to a project with ecoinvent 3.10 already installed.

In [None]:
bd.projects.set_current("<project name>")

In [None]:
# Register bw2io's built-in migrations in the current project
bi.create_core_migrations()

In [None]:
# Extract: read the workbook described above into the importer
xl_importer = bi.ExcelImporter("lci-carbon-fiber.xlsx")

In [None]:
# Transform: run the importer's transformation functions over the data
xl_importer.apply_strategies()
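
Conceptually, applying strategies means passing the imported data through each transformation function in turn. The sketch below illustrates that idea with two made-up strategies; the function names and data layout are assumptions for illustration, not the actual bw2io strategies.

```python
def strip_whitespace(data):
    """Hypothetical strategy: trim stray whitespace from dataset names."""
    for ds in data:
        ds["name"] = ds["name"].strip()
    return data


def normalize_units(data):
    """Hypothetical strategy: replace unit abbreviations with full names."""
    aliases = {"kg": "kilogram"}
    for ds in data:
        ds["unit"] = aliases.get(ds["unit"], ds["unit"])
    return data


def apply_all(data, strategies):
    """Apply each strategy in order, feeding its output to the next one."""
    for strategy in strategies:
        data = strategy(data)
    return data


raw = [{"name": " carbon fiber ", "unit": "kg"}]
clean = apply_all(raw, [strip_whitespace, normalize_units])
print(clean)
```

Keeping each transformation as a small function that takes and returns the whole dataset list makes it easy to reorder steps or add a custom one.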

In [None]:
# Report the number of datasets, exchanges, and unlinked exchanges
xl_importer.statistics()

In [None]:
# Show each exchange that could not be linked to a node
for obj in xl_importer.unlinked:
    print(obj)

In [None]:
# Try to link the remaining exchanges by matching on `name` only,
# then check the counts again
xl_importer.match_database(fields=['name'])
xl_importer.statistics()
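
The `fields` argument controls which attributes must agree for an exchange to be linked. As a rough sketch of the idea (plain Python with made-up data and a hypothetical `link_by_fields` helper, not the bw2io implementation), matching builds a lookup keyed on the chosen fields and links an exchange only when its key is found:

```python
def link_by_fields(exchanges, candidates, fields):
    """Set `input` on each unlinked exchange whose `fields` match a candidate."""
    lookup = {tuple(c.get(f) for f in fields): c["id"] for c in candidates}
    for exc in exchanges:
        if "input" not in exc:
            key = tuple(exc.get(f) for f in fields)
            if key in lookup:
                exc["input"] = lookup[key]
    return exchanges


# Hypothetical node that already exists in a database
candidates = [{"id": 1, "name": "electricity", "unit": "kilowatt hour"}]


def unlinked_count(fields):
    """Link a fresh copy of the example exchange and count what stays unlinked."""
    exchanges = [{"name": "electricity", "unit": "kWh"}]
    linked = link_by_fields(exchanges, candidates, fields)
    return sum(1 for exc in linked if "input" not in exc)


print(unlinked_count(["name", "unit"]))  # unit labels differ, so no link is made
print(unlinked_count(["name"]))  # the name alone is enough to link
```

Matching on fewer fields links more exchanges, but it also raises the risk of linking to the wrong node when several candidates share a name.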

In [None]:
# Load: save the imported datasets as a database in the current project
xl_importer.write_database()

In [None]:
# The newly written database should now be listed
bd.databases