# Template matching notebook

This notebook provides basic functionality for matching flows and saving the results in the expected format.

## Basic setup

Import libraries and utilities, set up directories, and read the input data.

In [1]:
import pandas as pd
from pathlib import Path
from datetime import datetime, timezone
from notebook_utils import finish_notebook

Get paths of input and output directories

In [None]:
input_data_dir = (Path.cwd().parent / "Mapping" / "Input" / "Flowlists").resolve()
existing_matches_dir = (Path.cwd().parent / "Mapping" / "Output" / "Mapped_files").resolve()
output_dir = (Path.cwd().parent / "Contribute").resolve()
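The paths above are built relative to the current working directory, so they only point at the right place when the notebook is run from its own folder. A minimal sketch of an early check follows; the helper name `missing_dirs` is made up for this example and is not part of `notebook_utils`.

```python
from pathlib import Path
import tempfile


def missing_dirs(*dirs):
    """Return the given paths that are not existing directories."""
    return [Path(d) for d in dirs if not Path(d).is_dir()]


# Demonstration with a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp)
    absent = present / "does_not_exist"
    result = missing_dirs(present, absent)
    print(result)  # only the path that does not exist is reported
```

In this notebook the call would be `missing_dirs(input_data_dir, existing_matches_dir, output_dir)`; an empty list means all three directories were found.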

Read input dataframes

In [None]:
sp = pd.read_csv(input_data_dir / 'SimaProv9.4.csv')

In [None]:
ei = pd.read_csv(input_data_dir / 'ecoinventEFv3.7.csv')

## Merge some SimaPro and ecoinvent flows

This is an example of how to combine dataframes using [merge](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html). We already have these matches, so the cell below is for illustration only. In actual use, matching will be more complicated; see the other `Merge` notebooks.

In [None]:
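# Inner join: keep only flows whose "Flowable" name appears in both lists.
# The key column has the same name on both sides, so `on` is enough.
df = sp.merge(ei, how="inner", on="Flowable")
# Number of matched rows
len(df)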
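When experimenting with a merge like the one above, it can help to see which rows did not match. The following is a minimal sketch on made-up data: the toy flow names and units are invented for illustration and are not taken from the real flow lists. It uses the `indicator` parameter of `DataFrame.merge`, which adds a `_merge` column recording where each row came from.

```python
import pandas as pd

# Toy stand-ins for the SimaPro and ecoinvent flow lists (invented values).
sp_toy = pd.DataFrame(
    {"Flowable": ["Carbon dioxide", "Methane", "Water"], "Unit": ["kg", "kg", "m3"]}
)
ei_toy = pd.DataFrame(
    {"Flowable": ["Carbon dioxide", "Methane", "Ammonia"], "Unit": ["kg", "kg", "kg"]}
)

# An outer merge keeps every row from both sides; indicator=True adds a
# `_merge` column with the values "both", "left_only", or "right_only".
check = sp_toy.merge(ei_toy, how="outer", on="Flowable", indicator=True)

matched = check[check["_merge"] == "both"]
unmatched_sp = check[check["_merge"] == "left_only"]
unmatched_ei = check[check["_merge"] == "right_only"]

print(len(matched), len(unmatched_sp), len(unmatched_ei))  # 2 1 1
```

The rows marked `both` are exactly what the inner merge would return, so this is only a diagnostic step; the dataframe passed on for export would still come from the inner merge.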

## Finishing up

The function `finish_notebook`, defined in the file `notebook_utils.py`, does the following:

1. Rename columns after merging to match the expected format (function `fix_names_after_merge`).
2. Add common but missing columns to match expected format (function `add_common_columns`).
3. Check that required columns are present (function `check_required_columns`).
4. Export the dataframe to the `Contribute` directory (function `export_dataframe`).
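To give a feel for what steps 2 and 3 involve, here is a rough sketch of filling in a missing column and then checking for required ones. This is not the code in `notebook_utils.py`: the column names and both helper functions are hypothetical, chosen only for illustration, and the real expected format may differ.

```python
import pandas as pd

# Hypothetical column names; the real expected format may differ.
REQUIRED_COLUMNS = ["SourceFlowName", "TargetFlowName", "MatchCondition"]


def add_default_condition(df, default="="):
    """Return a copy of df whose MatchCondition column is filled with `default` where absent."""
    out = df.copy()
    if "MatchCondition" not in out.columns:
        out["MatchCondition"] = default
    else:
        out["MatchCondition"] = out["MatchCondition"].fillna(default)
    return out


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


toy = pd.DataFrame({"SourceFlowName": ["Methane"], "TargetFlowName": ["Methane"]})

before = missing_columns(toy)
after = missing_columns(add_default_condition(toy))
print(before, after)  # ['MatchCondition'] []
```

Working on a copy leaves the merged dataframe untouched, which makes it easier to re-run a cell while experimenting.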

The function `finish_notebook` takes the following input arguments:

* `df`: The merged dataframe (`pd.DataFrame`).
* `author`: Your name (`str`).
* `notebook_name`: Name of this notebook (`str`); we can't figure this out automatically. It should normally start with `Match -`.
* `filename`: Name of CSV file to create; please make it meaningful (`str`).
* `output_dir`: Directory to write the exported CSV file to; default is `../Contribute` (`pathlib.Path`).
* `default_match_condition`: Condition to add when not already given in matching dataframe; one of `=`, `~`, `<`, or `>` (`str`).

In [None]:
finish_notebook(
    df=df,
    author="Someone",
    notebook_name="Match - Something",
    filename="Something",
)