# Helper Notebook
This notebook declares and sets up the variables that the diagrams and charts use for data visualization. Adjust the constants in the code cell below to match your data.

-----

In [1]:
INPUT = 'example.csv' # The .csv file generated by the post processor
INDEX = 'id' # Which column to use for the pandas DataFrame index
# PYLIST denotes which columns from the .csv file should be stored as Python list types
PYLIST = ['citations', 'citation name', 'anchor text',
          'referring record id', 'tags']

---

## Helper Functions

The rest of this notebook contains helper functions that check for and read the data from the *.csv* file. Modify with caution.

---

In [2]:
from __future__ import annotations
from errno import ENOENT
from os import strerror
from os.path import exists
import pandas as pd

In [3]:
def check_input() -> None:
    '''
    Check that the input .csv file exists at the path given by INPUT.
    Raise a FileNotFoundError if the file is not found.
    '''
    if not exists(INPUT):
        raise FileNotFoundError(ENOENT, strerror(ENOENT), INPUT)
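
The check above raises `FileNotFoundError` with three arguments (error number, message, filename), which populates the exception's `errno` and `filename` attributes. A minimal sketch of what a caller would see, assuming a hypothetical variant `check_path` that takes the path as a parameter instead of reading `INPUT`, and a deliberately nonexistent path:

```python
from errno import ENOENT
from os import strerror
from os.path import exists


def check_path(path: str) -> None:
    '''Hypothetical variant of check_input() that takes the path as an argument.'''
    if not exists(path):
        raise FileNotFoundError(ENOENT, strerror(ENOENT), path)


caught = None
try:
    # This path is assumed not to exist; it is only for illustration.
    check_path('missing_dir/no_such_file.csv')
except FileNotFoundError as error:
    caught = error
    # The three-argument form fills in errno and filename on the exception.
    print(error.errno == ENOENT, error.filename)
```

Passing the filename as the third argument means the path appears in the traceback, which makes a misconfigured `INPUT` easier to spot.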

In [4]:
def create() -> pd.DataFrame:
    '''
    Return the postprocessed data file (.csv) as a pandas DataFrame.
    '''
    with open(INPUT, encoding='utf-8', newline='') as file:
        # Read from the opened handle so the encoding and newline settings apply.
        df = pd.read_csv(file)
    df.set_index(INDEX, inplace=True)
    return df
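
`PYLIST` names the columns that should hold Python lists, but `create()` as written leaves them as plain strings. One possible way to convert them is the `converters` argument of `pd.read_csv`. The sketch below assumes the cells contain Python list literals such as `['a', 'b']`; the sample data, `LIST_COLUMNS`, and `parse_list` are hypothetical and stand in for the real file and `PYLIST`.

```python
import ast
import io

import pandas as pd

# Hypothetical sample standing in for the post-processor output.
SAMPLE_CSV = '''id,title,tags
1,First record,"['a', 'b']"
2,Second record,"[]"
'''

LIST_COLUMNS = ['tags']  # plays the role of PYLIST in this sketch


def parse_list(cell: str) -> list:
    '''Convert a cell holding a Python list literal into a list.'''
    if not cell:
        return []  # treat an empty cell as an empty list
    value = ast.literal_eval(cell)
    return list(value) if isinstance(value, (list, tuple)) else [value]


# Map each list column to the parser; other columns are inferred as usual.
converters = {column: parse_list for column in LIST_COLUMNS}
sample_df = pd.read_csv(io.StringIO(SAMPLE_CSV), index_col='id',
                        converters=converters)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for data read from a file. If the post processor writes lists in another format (for example, a delimiter-separated string), `parse_list` would need to split on that delimiter instead.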

---

## Verification

If the cell below runs without errors, the data is ready for visualization.

In [None]:
check_input()
dataframe = create()
dataframe
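
Running without errors confirms that the file exists and that the index column is present, but not that every `PYLIST` column is. A minimal sketch of an extra check, assuming a hypothetical helper `missing_columns` and a small stand-in DataFrame:

```python
from __future__ import annotations

import pandas as pd


def missing_columns(df: pd.DataFrame, expected: list[str]) -> list[str]:
    '''Hypothetical helper: return the expected columns that df lacks.'''
    return [column for column in expected if column not in df.columns]


# Stand-in data; the real notebook would pass `dataframe` and `PYLIST`.
demo = pd.DataFrame({'id': [1], 'citations': ['[]'], 'tags': ['[]']})
demo = demo.set_index('id')
missing = missing_columns(demo, ['citations', 'tags', 'anchor text'])
print(missing)
```

An empty result would indicate that all expected columns are available to the charts.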