# Combine Area Analysis Files
When we call the TomTom Area Analysis API, we have to request each day separately: the API limits each request to 24 time slots, and we want hourly granularity. A longer time span therefore produces a whole collection of files. This is cumbersome and wastes space, since the files all share some identical data (for example, the shape of the road segments). This notebook combines a collection of separate GeoJSON files into a single file.

## Set your filenames here

In [6]:
import os
import sys

def filename_path(filename):
    return os.path.join(os.environ['ATHENA_DATA_PATH'], filename)

def create_directory():
    # Make sure the AreaAnalysis directory exists (no error if it already does)
    directory = os.path.join(os.environ['ATHENA_DATA_PATH'], "AreaAnalysis")
    os.makedirs(directory, exist_ok=True)
        

create_directory()
files_to_combine = [filename_path("AreaAnalysis/AreaAnalysis_2018092{}.geojson".format(i)) for i in range(3,9)] # A list of filenames
combined_filename = filename_path("AreaAnalysis/testCombined.geojson") # geojson format

for file in files_to_combine:
    if not os.path.exists(file):
        print("WARNING: file <" + file + "> does not exist. Please double check your files_to_combine list.", file=sys.stderr)
if os.path.exists(combined_filename):
    print("WARNING: combined_filename <" + combined_filename + "> already exists: It will be overwritten", file=sys.stderr)
print("Files to combine: \n" + "\n".join(files_to_combine))
print("Will be saved to: \n" + combined_filename)

Files to combine: 
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180923.geojson
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180924.geojson
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180925.geojson
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180926.geojson
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180927.geojson
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180928.geojson
Will be saved to: 
/Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/testCombined.geojson
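
The filename pattern in the cell above fills in a single digit, so it only covers days within one ten-day block (here the 23rd to the 28th). For a range that crosses into another ten-day block or another month, the names could instead be generated from real dates. The following is a minimal sketch, not part of the original notebook: the helper `daily_filenames` and its `pattern` parameter are hypothetical, and it assumes the files follow the same `AreaAnalysis_YYYYMMDD.geojson` naming shown above.

```python
from datetime import date, timedelta

def daily_filenames(start, end, pattern="AreaAnalysis/AreaAnalysis_{:%Y%m%d}.geojson"):
    """Return one relative filename per day from start to end, inclusive.

    Hypothetical helper: assumes the AreaAnalysis_YYYYMMDD.geojson naming
    used in the cell above.
    """
    n_days = (end - start).days + 1
    # date objects accept strftime codes in a format spec, so the
    # pattern can embed the date directly
    return [pattern.format(start + timedelta(days=i)) for i in range(n_days)]

# Same six days as the cell above
names = daily_filenames(date(2018, 9, 23), date(2018, 9, 28))
print("\n".join(names))
```

Each relative name can then be passed through `filename_path` exactly as before.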


## Run this cell to actually combine the results.

In [10]:
import json

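def perFileProcessing(data):
    # The first feature holds the timeSets lookup table (entries with "@id" and "name")
    # rather than segment results, so keep the table and drop that feature.
    timeSetNames = data["features"][0]["properties"]["timeSets"]
    data["features"] = data["features"][1:]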
    for feature in data["features"]:
        for segment in feature["properties"]["segmentTimeResults"]:
            # In each time segment results, remove timeSet and dateRange and replace with the actual time start and end
            timeSetId = segment.pop("timeSet", None)
            segment.pop("dateRange", None)
            segment["timeRange"] = next(x for x in timeSetNames if x["@id"] == timeSetId)["name"]
    return data
    
combinedData = {}
print("Processing " + files_to_combine[0])
with open(files_to_combine[0], 'r') as fp:
    combinedData = perFileProcessing(json.load(fp))
    
# Now we have the data from the first file. Open the others and add
# the time segment data to combinedData
for file in files_to_combine[1:]:
    print("Processing " + file)
    with open(file, 'r') as fp:
        data = perFileProcessing(json.load(fp))
        for feature in data["features"]:
            # Find the corresponding feature in combinedData
            segmentId = feature["properties"]["segmentId"]
            # next() needs a default here; without one it raises StopIteration instead of returning None
            combinedFeature = next((x for x in combinedData["features"] if x["properties"]["segmentId"] == segmentId), None)
            if combinedFeature is None:
                print("WARNING: Couldn't find segmentId " + str(segmentId) + " from file " + file + " in the combined data", file=sys.stderr)
            else:
                # And then add these time segments to that feature
                timeSegments = feature["properties"]["segmentTimeResults"]
                combinedFeature["properties"]["segmentTimeResults"].extend(timeSegments)

print("Finished processing. Writing results to " + combined_filename)
with open(combined_filename, 'w') as fp:
    json.dump(combinedData, fp)

Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180923.geojson
Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180924.geojson
Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180925.geojson
Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180926.geojson
Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180927.geojson
Processing /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/AreaAnalysis_20180928.geojson
Finished processing. Writing results to /Users/mlunacek/nrel/athena/ATHENA-twin-internal/src/athena/.data/AreaAnalysis/testCombined.geojson
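
The merge loop in the cell above scans every combined feature for each incoming feature, which may become slow for areas with many road segments. One possible alternative is to build a dictionary keyed by `segmentId` once per merge. The following is a rough sketch under stated assumptions, not the notebook's own method: `merge_time_results` is a hypothetical helper, the toy `day1` and `day2` dictionaries are invented for illustration, and they only mimic the fields the cell above reads (`segmentId`, `segmentTimeResults`, `timeRange`) after `perFileProcessing` has run.

```python
def merge_time_results(combined, other):
    """Append other's segmentTimeResults onto the matching features in combined.

    Hypothetical helper. Modifies `combined` in place and returns the
    segmentIds from `other` that had no match.
    """
    # Build the lookup once: segmentId -> feature in the combined data
    by_id = {f["properties"]["segmentId"]: f for f in combined["features"]}
    missing = []
    for feature in other["features"]:
        seg_id = feature["properties"]["segmentId"]
        target = by_id.get(seg_id)
        if target is None:
            missing.append(seg_id)
            continue
        target["properties"]["segmentTimeResults"].extend(
            feature["properties"]["segmentTimeResults"])
    return missing

# Toy data (invented values), shaped like the output of perFileProcessing
day1 = {"features": [
    {"properties": {"segmentId": 1, "segmentTimeResults": [{"timeRange": "day1 slot A"}]}},
    {"properties": {"segmentId": 2, "segmentTimeResults": [{"timeRange": "day1 slot A"}]}},
]}
day2 = {"features": [
    {"properties": {"segmentId": 1, "segmentTimeResults": [{"timeRange": "day2 slot A"}]}},
    {"properties": {"segmentId": 3, "segmentTimeResults": [{"timeRange": "day2 slot A"}]}},
]}

missing = merge_time_results(day1, day2)
print("Unmatched segmentIds:", missing)
```

Returning the unmatched ids rather than printing inside the loop leaves it to the caller to decide whether a missing segment is a warning or an error.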
