# Question 2
## Part 1
* Read all the JSON files in the folder called Data.
* There are three categories of JSON files in this folder. They are identified by the key called “term” in each JSON file.
* Read all these JSON files and store them in separate folders, one per category. You are expected to create a hierarchy of folders.
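
The grouping step described above can be sketched on in-memory data before touching the file system. This is a minimal illustration, not the assignment's required method: `group_by_term` is a hypothetical helper, and the file names and term values in `sample` are invented for the example.

```python
from collections import defaultdict


def group_by_term(records):
    """Map each distinct 'term' value to the file names that carry it.

    `records` is a list of (filename, parsed_json) pairs.
    """
    groups = defaultdict(list)
    for filename, record in records:
        groups[record["term"]].append(filename)
    return dict(groups)


# Hypothetical sample data; real files would be loaded with json.load.
sample = [
    ("a.json", {"term": "restaurants"}),
    ("b.json", {"term": "hotels"}),
    ("c.json", {"term": "restaurants"}),
]
print(group_by_term(sample))
```

Each key of the resulting dictionary corresponds to one top-level folder under the output directory.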

##### Output Format:
* Create a folder (Name: Data Processed)
* In this folder you should have a hierarchy of folder structures
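
One way to picture the hierarchy is as a destination path built from each record. The sketch below assumes a term/country/state/city nesting, which is only one possible reading of "hierarchy"; `target_dir` is a hypothetical helper, and the `location` keys are the ones the code cells in this notebook read.

```python
import os


def target_dir(record, root="Data_Processed"):
    """Build the destination folder for one parsed JSON record.

    Assumed layout: root/term/country/state/city, skipping missing levels.
    """
    location = record.get("location", {})
    parts = [record["term"]]
    for key in ("country", "state", "city"):
        value = location.get(key)
        if value:
            parts.append(str(value))
    return os.path.join(root, *parts)


# Hypothetical record for illustration only.
sample = {
    "term": "restaurants",
    "location": {"country": "US", "state": "MA", "city": "Boston"},
}
print(target_dir(sample))
```

Passing the result to `os.makedirs(path, exist_ok=True)` would create every level of the hierarchy in one call.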


In [1]:
from glob import glob
import os
import json
from shutil import copy

In [2]:
#create a list of all json files in the Data folder
datafiles = glob('./Data/*.json')

In [3]:
#creates a folder at that path if none exists, otherwise does nothing
def addFolder(folderpath):
    os.makedirs(folderpath, exist_ok=True)

In [4]:
addFolder('Data_Processed')

In [5]:
#read all data from the json files to determine inner data
jsondata=[]
for file in datafiles:
    #the with block closes the file even if json.load raises an error
    with open(file,'r',encoding = 'utf-8') as f:
        jsondata.append([file , json.load(f)])

In [6]:
#pulls the 'term' key from each file (same order as datafiles)
term = [data['term'] for _, data in jsondata]

In [7]:
#create one folder for each term type
for t in set(term):
    addFolder('Data_Processed/' + t)

In [8]:
#create list of [country, state, city] for each json file (same order as datafiles)
locations = []
for _, data in jsondata:
    locations.append([data['location']['country'],
                      data['location']['state'],
                      data['location']['city']])

In [9]:
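#create one folder for each country location
#(locations[0] is only the first file's [country, state, city], so collect index 0 of every entry)
for l in set(loc[0] for loc in locations):
    for t in set(term):
        addFolder('Data_Processed/' + t + '/' + l)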

In [10]:
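#sort data into folders by type (i.e. restaurant, hotel, or attraction) and then by country
for i in range(len(term)):
    #os.path.basename strips the './Data/' prefix without relying on its length
    copy(datafiles[i],'Data_Processed/' + term[i] + '/' + locations[i][0] +'/'+ os.path.basename(datafiles[i]))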

## Part 2
* Read all the json files in the folder called Data.
* Read only the JSON files whose “term” key has the value “restaurants”.
* Each (or most) of the JSON files contains a key called “open”, which holds the operating hours of the restaurant. For each JSON file, read these timings.
* The operating hours are listed for each day of the week. Extract the timings for every day and write them to an Excel sheet.
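
The extraction described above amounts to flattening one restaurant record into one row per opening slot. The sketch below assumes the record layout that the code cells in this notebook read (`hours[0]['open']` holding `day`, `start`, and `end`); `opening_rows` is a hypothetical helper, and the mapping of day 0 to Monday is an assumption that should be checked against the data source.

```python
# Assumption: day indices run 0..6 starting on Monday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def opening_rows(record):
    """Return (name, day name, start, end) tuples for one restaurant record."""
    rows = []
    # Only the first 'hours' block is read, mirroring the notebook code.
    for block in record.get("hours", [])[:1]:
        for slot in block.get("open", []):
            rows.append((record["name"], DAY_NAMES[slot["day"]],
                         slot["start"], slot["end"]))
    return rows


# Hypothetical record for illustration only.
sample = {
    "name": "Example Diner",
    "hours": [{"open": [
        {"day": 0, "start": "1100", "end": "2200"},
        {"day": 5, "start": "0900", "end": "2300"},
    ]}],
}
for row in opening_rows(sample):
    print(row)
```

Records without an `hours` key simply produce no rows, so they can be passed through the same function without a separate check.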

In [11]:
#build output: one entry per restaurant that has opening hours
excel = []
for i in range(len(datafiles)):
    data = jsondata[i][1]
    if term[i] == 'restaurants' and 'hours' in data: #only read restaurant data
        temp = [data['name'],      #restaurant name
                locations[i][2],   #city
                locations[i][0]]   #country

        # pull in open and close times for each day of the week
        for slot in data['hours'][0]['open']:
            temp.append([slot['day'], slot['start'], slot['end']])
        excel.append(temp)

In [12]:
#this function takes in a 4 digit number or string (e.g. 1730 or '0930') and formats it as a time.
#It will also convert from a 24-hour scale to a 12-hour scale and add the
#'am' or 'pm' note.

def formatTime(time):
    time = str(time).zfill(4) #restore leading zeros that are lost when the value is an int
    hour = int(time[0:2])
    mm = 'am' if hour < 12 else 'pm'
    hour = hour % 12
    if hour == 0: #midnight (00xx) and noon (12xx) both display as 12
        hour = 12

    return (str(hour) + ':' + time[2:4] + mm)


#NOTE FOR TA: Please give me the bonus credit for this. It is the same intent as the bonus
#(same action), but a lot prettier in the output. Also it outputs in a format
# that Excel can interpret
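
As a cross-check on the conversion above, the same formatting can be sketched with `datetime.strptime`, which also rejects impossible values such as `2561` by raising `ValueError`. This is an alternative illustration, not a replacement for `formatTime`; the name `format_time_checked` is hypothetical.

```python
from datetime import datetime


def format_time_checked(value):
    """Convert a 24-hour 'HHMM' value such as '1730' or 930 to '5:30pm'."""
    text = str(value).zfill(4)                # restore leading zeros dropped by ints
    parsed = datetime.strptime(text, "%H%M")  # raises ValueError on invalid times
    suffix = "am" if parsed.hour < 12 else "pm"
    hour = parsed.hour % 12 or 12             # hours 0 and 12 both display as 12
    return f"{hour}:{parsed.minute:02d}{suffix}"


for raw in ("0000", 930, "1200", "1730"):
    print(raw, "->", format_time_checked(raw))
```

The am/pm suffix is computed by hand rather than with `%p`, so the output does not depend on the machine's locale settings.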

In [13]:
import csv

#open file and use UTF encoding to support foreign letters
#(the with block closes the file; newline='' lets the csv module control line endings)
with open("Data_Processed/Q2output.csv",'w',encoding = 'utf-8',newline='') as c:
    writer = csv.writer(c) #quotes any field that contains a comma, e.g. a restaurant name

    #print headers
    writer.writerow(['Name of Restaurant','City','Country Code','Day of Week','Start Time','End Time'])

    #write excel list to file, one row per restaurant per day
    for row in excel:
        for day, start, end in row[3:]:
            #write restaurant name, city, and country, then day, start time, and end time
            writer.writerow([row[0], row[1], row[2], day, formatTime(start), formatTime(end)])