# Update the Options and Dependencies in the Housing Characteristics JSON File

**Author:** Anthony Fontanini
    
**Date:** January 10, 2019

This notebook updates the `<project_folder>/util/housing characteristics info.json` file. It checks that each housing characteristic in the `<project_folder>/housing_characteristics` directory is documented in that file, then updates the options and the dependency list of each documented characteristic from the header of its TSV file.

**Warning:** If a housing characteristic TSV has no matching entry in the `housing characteristics info.json` file, or has more than one, a warning is printed and the script continues to execute. In both cases the JSON entries are left unchanged.

## Import Modules

In [1]:
import json
import numpy as np
from os import listdir
from os.path import isfile, join

## Function Definition

In [2]:
def getDependenciesAndOptions(filepath):
    # Read only the header (first line) of the TSV file
    with open(filepath) as f:
        header = f.readline()

    # Sort each header column into the dependency or option list
    dependencies = list()
    options = list()
    for column in header.rstrip('\r\n').split('\t'):

        # Skip empty columns and columns without a '<type>=<name>' form,
        # which would otherwise raise an IndexError
        if '=' not in column:
            continue

        # Split on the first '=' only so names containing '=' stay intact
        name = column.split('=', 1)[1]

        # Dependency columns start with 'D', option columns with 'O'
        if column.startswith('D'):
            dependencies.append(name)
        elif column.startswith('O'):
            options.append(name)

    # If a list is empty, assign ['None']
    if len(dependencies) == 0:
        dependencies = ['None']
    if len(options) == 0:
        options = ['None']

    return dependencies,options
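
To make the header convention concrete, here is a minimal, self-contained sketch of the same parsing idea applied to an in-memory string. The helper name `split_header` and the sample header (including the `Dependency=` and `Option=` spellings and the column names) are assumptions for illustration; the notebook itself only relies on the first letter of each column and the text after `=`.

```python
# Hypothetical header line; real TSV files may use different names
header = "Dependency=Location\tDependency=Vintage\tOption=Gas\tOption=Electric\n"

def split_header(header):
    """Sort tab-separated header columns into dependency and option names."""
    dependencies, options = [], []
    for column in header.rstrip("\n").split("\t"):
        kind, sep, name = column.partition("=")
        if not sep:
            continue  # no '=' in this column, nothing to record
        if kind.startswith("D"):
            dependencies.append(name)
        elif kind.startswith("O"):
            options.append(name)
    # Mirror the notebook's convention of ['None'] for an empty list
    return dependencies or ["None"], options or ["None"]

print(split_header(header))
# (['Location', 'Vintage'], ['Gas', 'Electric'])
```
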

## Get the housing characteristics

In [3]:
hc_path = join('..','..','housing_characteristics')
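# Keep only TSV files; strip the extension to get the characteristic name
hc_names = sorted(f[:-len('.tsv')] for f in listdir(hc_path) if f.endswith('.tsv') and isfile(join(hc_path,f)))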

## Load the JSON file

In [4]:
with open(join('..','housing characteristics info.json')) as f:
    data = json.load(f)
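
The update step reads `data['characteristics'][i]['name']` and writes to `['features']['options']` and `['features']['dependencies']`, so the loaded file must have at least the shape sketched below. The entry name and values here are made up for illustration, and the real file may contain additional keys.

```python
import json

# Hypothetical entry; only the keys the notebook accesses are shown
sample = {
    "characteristics": [
        {
            "name": "Example Characteristic",
            "features": {
                "options": ["None"],
                "dependencies": ["None"],
            },
        }
    ]
}

# The same lookup the notebook uses to collect documented names
names = [entry["name"] for entry in sample["characteristics"]]
print(names)
print(json.dumps(sample, indent=4))
```
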

## Replace the dependencies and options in the data structure

In [5]:
# Get the housing characteristic names from the JSON file
json_hc_names = np.array([ c['name'] for c in data['characteristics'] ])

# For each housing characteristic in the TSV directory
for i in range(len(hc_names)):
    
    # Get Dependencies and Options for TSV file
    dependencies,options = getDependenciesAndOptions( join(hc_path,hc_names[i] + '.tsv') )

    # Check to see if the housing characteristic is present in JSON File
    idx = np.where( json_hc_names == hc_names[i])[0]
    if len(idx) == 0:
        print("WARNING: %s not found in json file." % hc_names[i] )
    elif (len(idx) > 1):
        print("WARNING: %s has multiple entries in json file." % hc_names[i])
    else:
        # Set the options and the dependencies
        data['characteristics'][idx[0]]['features']['options'] = options
        data['characteristics'][idx[0]]['features']['dependencies'] = dependencies


## Write JSON file

In [6]:
with open(join('..','housing characteristics info.json'), 'w') as outfile:
    json.dump(data, outfile, indent=4, separators=(',', ': '))
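
As a quick check of the output format, the sketch below serializes a small made-up dictionary with the same `indent` and `separators` arguments. In Python 3.4 and later, setting `indent` already implies `(',', ': ')`, so the explicit `separators` argument mainly matters on older interpreters, where the default left trailing whitespace at the end of lines.

```python
import json

# Made-up data, used only to show the formatting
sample = {"name": "Example", "options": ["A", "B"]}

explicit = json.dumps(sample, indent=4, separators=(',', ': '))
default = json.dumps(sample, indent=4)

print(explicit)

# No line of the output ends with a space
trailing = [line for line in explicit.splitlines() if line.endswith(' ')]
print(trailing)
```
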