# Table of Contents
1. [Imports](#imports)
2. [Import Events File](#importFile)
    1. [Example Contents](#exampleContents)
3. [Full Dataset](#fullDF)
4. [Save to File](#CSV)

# Imports <a name="imports"></a>

In [1]:
# json reads the events file; pandas builds the DataFrame
import json
import pandas as pd

# Import Events file <a name="importFile"></a>
Load the open-data events file (`Evenementen.json`) into a list of event dictionaries.

In [22]:
with open("../Data/Original/Evenementen.json") as event_data:
    events = json.load(event_data)

## Example contents <a name="exampleContents"></a>
The cell below prints selected fields of the first event in the file.

In [23]:
# Print the event name
print("Event: ", events[0]["title"])

# Print the location of the event in human-readable terms
print("City: ", events[0]["location"]["city"])
print("Adress: ", events[0]["location"]["adress"])
print("Zipcode: ", events[0]["location"]["zipcode"])

# Print the coordinates of the event
print("Coordinates: ", events[0]["location"]["latitude"], events[0]["location"]["longitude"])

# Print the date(s) of the event
print("Date: ", events[0]["dates"])

Event:  Springsnow Festival
City:  AMSTERDAM
Adress:  Diverse locaties / Various locations
Zipcode:  1012 JS
Coordinates:  52,3726380 4,8941060
Date:  {'startdate': '20-04-2018', 'enddate': '20-05-2018'}
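
The coordinates printed above are strings with a decimal comma (`52,3726380`), so they cannot be used as numbers directly. A minimal sketch of one way to convert them is shown here; `parse_decimal_comma` is a hypothetical helper, not part of the original notebook, and it assumes every coordinate follows the same comma format as the sample.

```python
def parse_decimal_comma(value: str) -> float:
    """Convert a decimal-comma string such as '52,3726380' to a float."""
    # Assumption: the comma is a decimal separator, never a thousands separator
    return float(value.replace(",", "."))


# Sample values copied from the example output above
latitude = parse_decimal_comma("52,3726380")
longitude = parse_decimal_comma("4,8941060")
print(latitude, longitude)
```

The same helper could be applied to the latitude and longitude of every event before the values are stored.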


# Full Dataset <a name="fullDF"></a>
Not every field in the file is usable, so we keep only the following subset:
- *Event Name*: the title of the event (column `Event`)
- *Coordinates*: the latitude and longitude of the event (columns `Latitude` and `Longtitude`)
- *Data*: the date(s) of the event (column `Data`)

In [24]:
# Collect every event in a dict keyed by its position in the file
events_dict = {}

# Loop over all the events
for j, event in enumerate(events):
    # Dates of the current event
    dates = []

    # Dates are stored either as a start/end pair or as a list of single dates
    if "startdate" in event["dates"]:
        dates.append(event["dates"]["startdate"])
        dates.append(event["dates"]["enddate"])
    elif "singles" in event["dates"]:
        dates.extend(event["dates"]["singles"])

    # Keep only the selected fields of this event
    events_dict[j] = {
        "Event": event["title"],
        "Latitude": event["location"]["latitude"],
        "Longtitude": event["location"]["longitude"],
        "Data": dates,
    }
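
The two date layouts handled in the loop above can also be isolated in a small function that returns real `date` objects instead of strings. This is only a sketch: `collect_dates` and `DATE_FORMAT` are hypothetical names, and it assumes every date in the file uses the `dd-mm-yyyy` layout seen in the example output.

```python
from datetime import date, datetime
from typing import List

# Assumption: all dates look like '20-04-2018' (day-month-year)
DATE_FORMAT = "%d-%m-%Y"


def collect_dates(dates_field: dict) -> List[date]:
    """Return the dates of one event, whichever of the two layouts is used."""
    if "startdate" in dates_field:
        # Layout one: a start/end pair
        raw = [dates_field["startdate"], dates_field["enddate"]]
    else:
        # Layout two: a list of single dates (empty if the key is missing)
        raw = dates_field.get("singles", [])
    return [datetime.strptime(d, DATE_FORMAT).date() for d in raw]


ranged = collect_dates({"startdate": "20-04-2018", "enddate": "20-05-2018"})
singles = collect_dates({"singles": ["17-06-2018"]})
print(ranged, singles)
```

Parsing the strings at this stage would make later date comparisons and sorting behave chronologically rather than alphabetically.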

In [25]:
df = pd.DataFrame.from_dict(events_dict, orient="index")

In [26]:
df.head()

| | Event | Latitude | Longtitude | Data |
|---|---|---|---|---|
| 0 | Springsnow Festival | 523726380 | 48941060 | [20-04-2018, 20-05-2018] |
| 1 | Vurige Tongen | 524103320 | 47490690 | [20-05-2018, 21-05-2018] |
| 2 | Sneakerness | 523828340 | 49204560 | [03-06-2018, 04-06-2018] |
| 3 | Dutch Raw Food Festival | 524362550 | 48167080 | [17-06-2018] |
| 4 | Holland Festival | 523615820 | 48854790 | [02-06-2018, 03-06-2018, 04-06-2018, 05-06-201... |


# Save to file <a name="CSV"></a>
Save the DataFrame with the selected event fields to a CSV file.

In [27]:
df.to_csv("../../../Data_thesis/Full_Datasets/Events.csv", index=False)
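
One thing to keep in mind when reading this file back: a CSV cell can only hold text, so the lists in the `Data` column are written as their string representation. The sketch below shows one possible way to restore them with `ast.literal_eval`; it uses a small stand-in DataFrame (`df_demo`, a hypothetical name) and an in-memory buffer instead of the real file.

```python
import ast
import io

import pandas as pd

# Stand-in for the real DataFrame: one event with a list of dates
df_demo = pd.DataFrame(
    {"Event": ["Dutch Raw Food Festival"], "Data": [["17-06-2018"]]}
)

# Write to an in-memory buffer rather than to disk
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)

# Without a converter, the list comes back as a plain string
as_text = pd.read_csv(io.StringIO(buffer.getvalue()))

# With a converter, the string is parsed back into a Python list
restored = pd.read_csv(
    io.StringIO(buffer.getvalue()), converters={"Data": ast.literal_eval}
)
print(type(as_text.loc[0, "Data"]), type(restored.loc[0, "Data"]))
```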