First, the script imports the Python packages it needs to run: pandas, json, argparse, and datetime.

In [1]:
import pandas as pd
import json
import argparse
from datetime import datetime

Then, the script uses argparse to let us enter a filename in our terminal when we run the script. If we leave out the `-f` flag, the script prompts us for the filename instead.

```
python makeArchivalObjects.py -f filename.csv
```

In [3]:
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file')
args = parser.parse_args()

if args.file:
    filename = args.file
else:
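    filename = input("Enter filename (including '.csv'): ")
# Overrides the value chosen above with the example sheet (presumably for this
# walkthrough). Remove this line to use the -f argument or the prompt.
filename = 'exampleSheets_archivalObjects.csv'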
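To see how the `-f` flag maps onto `args.file` without opening a terminal, note that `parse_args` also accepts an explicit list of strings. The sketch below wraps the same parser in a hypothetical helper, `get_filename` (not part of the original script), and swaps the interactive `input` fallback for a default value so the example runs unattended.

```python
import argparse


def get_filename(argv, default='exampleSheets_archivalObjects.csv'):
    """Hypothetical helper: return the -f/--file value, or a default if absent."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', '--file')
    # Passing a list makes parse_args read it instead of sys.argv.
    args = parser.parse_args(argv)
    return args.file if args.file else default


print(get_filename(['-f', 'filename.csv']))  # filename.csv
print(get_filename([]))                      # exampleSheets_archivalObjects.csv
```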

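Then, we see two functions: `add_to_dict` and `add_with_ref`. Both copy a value from the current CSV row into a dictionary and skip empty cells; `add_with_ref` additionally wraps each value in a `{'ref': ...}` dictionary. We can worry about the details later.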

In [17]:
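def add_to_dict(dict_name, key, value):
    # Relies on the global `row` set by the iterrows() loop further down.
    # `value` is the name of the CSV column to read.
    # row.get() returns None for a missing column, so no KeyError can occur.
    value = row.get(value)
    # Skip missing columns (None) and empty cells (NaN).
    if pd.notna(value):
        dict_name[key] = value.strip()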


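def add_with_ref(dict_name, key, value, repeat):
    # Relies on the global `row` set by the iterrows() loop further down.
    # `repeat` is 'single' for one ref; any other value means a '|'-separated list.
    try:
        value = row[value]
        if pd.notna(value):
            if repeat == 'single':
                dict_name[key] = {'ref': value.strip()}
            else:
                # Split on '|' and strip stray whitespace around each ref.
                dict_name[key] = [{'ref': item.strip()} for item in value.split('|')]
    except KeyError:
        # Column not present in this CSV; leave the key out.
        pass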
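As a rough illustration of what these two helpers produce, the sketch below mirrors their logic in two hypothetical functions, `plain_field` and `ref_field`, that return the value instead of writing it into a dictionary. It makes some simplifying assumptions: a plain dict stands in for the pandas row, `None` plays the role of an empty cell, and the URIs are made-up sample values.

```python
# Made-up row standing in for one line of the CSV.
row = {
    'position': ' 3 ',
    'parent': '/repositories/2/archival_objects/10',
    'subjects': '/subjects/1|/subjects/2',
    'other_level': None,  # stands in for an empty cell
}


def plain_field(row, column):
    """Illustrative: the stripped value add_to_dict would store, or None if skipped."""
    value = row.get(column)
    return value.strip() if value is not None else None


def ref_field(row, column, repeat):
    """Illustrative: the 'ref' structure add_with_ref would store, or None if skipped."""
    value = row.get(column)
    if value is None:
        return None
    if repeat == 'single':
        return {'ref': value.strip()}
    return [{'ref': item} for item in value.split('|')]


print(plain_field(row, 'position'))         # 3
print(plain_field(row, 'other_level'))      # None
print(ref_field(row, 'parent', 'single'))   # {'ref': '/repositories/2/archival_objects/10'}
print(ref_field(row, 'subjects', 'multi'))  # [{'ref': '/subjects/1'}, {'ref': '/subjects/2'}]
print(ref_field(row, 'missing', 'multi'))   # None
```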
       

This next bit of code opens the CSV as a `DataFrame` called `df` and loops through its rows.

As the script loops through each row, it extracts data based on CSV column names and adds the data to a dictionary called `json_file`. At the end of each pass through the loop, this dictionary is saved as its own JSON file, so the script produces one file per CSV row.

In [18]:
df = pd.read_csv(filename, dtype={'position': str, 'parent': str})
for index, row in df.iterrows():
    
    # Create empty dictionary to store data.
    json_file = {}
    json_file['jsonmodel_type'] = 'archival_object'
    json_file['suppressed'] = False
    
    # For required fields, add directly to json_file.
    identifier = row['local_id']
    json_file['title'] = row['title']
    json_file['resource'] = {'ref': row['resource']}
    json_file['level'] = row['level']
    json_file['publish'] = row['publish']
    json_file['restrictions_apply'] = row['restrictions_apply']
    
    # For optional fields, try to find value and add to json_file if found.
    add_to_dict(json_file, 'repository_processing_note', 'repository_processing_note')
    add_to_dict(json_file, 'position', 'position')
    add_to_dict(json_file, 'other_level', 'other_level')
    
    # For optional fields with 'ref' key, use function to add.
    add_with_ref(json_file, 'parent', 'parent', 'single')
    add_with_ref(json_file, 'repository', 'repository', 'single')
    add_with_ref(json_file, 'linked_events', 'linked_events', 'multi')
    add_with_ref(json_file, 'subjects', 'subjects', 'multi')
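The `dtype` argument in `read_csv` matters because the helper functions call `.strip()`, which only exists on strings. The following sketch, using a made-up two-row CSV held in memory, shows that with `dtype` set to `str` a filled `position` cell arrives as a string while an empty cell arrives as NaN.

```python
import io

import pandas as pd

# Made-up two-row CSV; the second row leaves 'position' and 'parent' empty.
csv_text = (
    'title,position,parent\n'
    'Letters,1,/repositories/2/archival_objects/10\n'
    'Photographs,,\n'
)

df = pd.read_csv(io.StringIO(csv_text), dtype={'position': str, 'parent': str})

for index, row in df.iterrows():
    position = row['position']
    # Filled cells arrive as str, so .strip() is safe; empty cells are NaN.
    if pd.notna(position):
        print(index, row['title'], repr(position.strip()))
    else:
        print(index, row['title'], 'no position')
```

Without the `dtype` setting, pandas would infer a numeric type for `position`, and calling `.strip()` on that value would raise an `AttributeError`.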
    

This section, which still runs inside the loop, generates a filename (`ao_filename`) from the `identifier` variable and a date stamp, and then uses `json.dump` to write our dictionary to a JSON file with that name. Because the stamp only records the date, rerunning the script on the same day overwrites the earlier files.

In [19]:
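    # Still inside the for loop: one file is written per row.
    dt = datetime.now().strftime('%Y-%m-%d')
    # An f-string also works if local_id was read as a number.
    ao_filename = f'{identifier}_{dt}.json'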
    directory = ''
    with open(directory+ao_filename, 'w') as fp:
        json.dump(json_file, fp)
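To check that a file written this way can be read back, here is a minimal round-trip sketch. The record, the identifier `ao_001`, and the use of a temporary directory are assumptions made for the example; the script itself writes to the current directory.

```python
import json
import os
import tempfile
from datetime import datetime

# Made-up record and identifier standing in for one row's json_file.
json_file = {'jsonmodel_type': 'archival_object', 'title': 'Letters', 'suppressed': False}
identifier = 'ao_001'

dt = datetime.now().strftime('%Y-%m-%d')
ao_filename = f'{identifier}_{dt}.json'

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, ao_filename)
    with open(path, 'w') as fp:
        json.dump(json_file, fp)
    with open(path) as fp:
        loaded = json.load(fp)

print(ao_filename)          # ao_001_ followed by today's date and .json
print(loaded == json_file)  # True
```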