# Question 2 - Part 1
Read all the JSON files in the folder called Data.

- There are three categories of JSON files in this folder. They are identified by the key called “term” in each JSON file.
- Create a folder structure to read all these JSON files and store them in separate folders. You are expected to create a hierarchy of folders.
- Example: You can place all restaurant JSON files for a particular country (say, Australia) in the same folder. How you group the JSON files and create the folder structure is your choice. Your task is to identify criteria by which you can group all these JSON files and store them. (You could use these keys to create the hierarchy and store the JSON files: country, city, categories.)
- Output format:
  - Create a folder (name: Data Processed).
  - In this folder you should have a hierarchy of folders (example: Data Processed/Australia(AU)/……..).
  - A good idea is to classify the JSON files by country name or code (you can create a hierarchy of folders to sort and store the files effectively).
  - The original JSON files in the folder “Data” are named after the value of the “id” key in the file. You can also rename the JSON files when you read and store them.
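
As a rough illustration of the grouping idea above, the sketch below maps a single record to the folders it could be stored under, one per category. The function name `destination_paths` and the sample record are hypothetical; only the keys `term`, `location.country`, `location.city` and `categories[].title` are assumed to exist in the data.

```python
import os

def destination_paths(record, root="Data Processed"):
    """Return one target folder per category: root/country/city/term/category."""
    location = record["location"]
    # Collapse repeated whitespace in the city name
    city = " ".join(location["city"].split())
    return [
        os.path.join(root, location["country"], city, record["term"], cat["title"])
        for cat in record["categories"]
    ]

# Hypothetical record, not taken from the Data folder
sample = {
    "id": "example-cafe-sydney",
    "term": "restaurants",
    "location": {"country": "AU", "city": "Sydney"},
    "categories": [{"title": "Cafes"}, {"title": "Breakfast"}],
}

paths = destination_paths(sample)
for path in paths:
    print(path)
```

A record with several categories yields several folders, so the same file ends up stored once per category under this scheme.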




In [3]:
# Import the required modules
import json
import glob
import os
import re
import csv

In [4]:
# Create the parent folder "Data Processed" in the current working directory (if it does not already exist) and return its path
def create_parent_directory():
    # A notebook has no __file__, so use the current working directory
    current_dir = os.getcwd()
    home_folder = os.path.join(current_dir, 'Data Processed')
    os.makedirs(home_folder, exist_ok=True)
    return home_folder

In [5]:
# Build the folder path home_folder/country/city/term/category, create it if it does not exist, and return it
def make_directory_with_country(home_folder, country_name, city_name, term, category):
    directory = os.path.join(home_folder, country_name, city_name, term, category)
    # exist_ok=True makes this a no-op when the folder already exists
    os.makedirs(directory, exist_ok=True)
    return directory


In [6]:
# Write data to a JSON file at the given path
def write_to_json_file(file_path, json_data):
    with open(file_path, 'w') as json_out:
        json.dump(json_data, json_out)

# Lambda that removes all digits from a string and trims surrounding whitespace
remove_numbers_lam = lambda value: re.sub(r'\d+', '', value).strip()
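
A small sketch of how the digit-removal helper behaves may be useful here. The helper is repeated under the hypothetical name `remove_numbers` so the example runs on its own, and the city strings are made-up inputs rather than values from the data set.

```python
import re

# Same regular expression as the lambda above, repeated so this runs standalone
remove_numbers = lambda value: re.sub(r'\d+', '', value).strip()

# Hypothetical city strings
trailing = remove_numbers("Sydney 2000")       # digits at the end
leading = remove_numbers("2010 Surry Hills")   # digits at the start
inner = remove_numbers("North 12 Shore")       # digits in the middle

# Digits in the middle leave two spaces behind, so collapse whitespace afterwards
inner_clean = " ".join(inner.split())

print(repr(trailing), repr(leading), repr(inner), repr(inner_clean))
```

Because `strip()` only trims the ends, collapsing whitespace after removing digits (rather than before) is the safer order.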

In [9]:
home_folder = create_parent_directory()

# Read all the JSON files at the location using glob
for filename in glob.glob(r'C:\Users\rohini\Downloads\DataAnalysis4Python_Spring17-master\Assignment 2\Data\*.json'):

    # Name of the file currently being read
    file_name = os.path.basename(filename)
    with open(filename) as f:
        # Load the data from the JSON file
        data_from_file = json.load(f)

    # Remove digits from the city name, then collapse any extra whitespace
    city_name = ' '.join(remove_numbers_lam(data_from_file["location"]["city"]).split())

    # Get all the category titles
    categories = [category["title"] for category in data_from_file["categories"]]
    for category in categories:

        # Collapse extra whitespace in the category title
        category = ' '.join(category.split())

        # Get (and create if needed) the folder for this country/city/term/category
        data_dir = make_directory_with_country(home_folder, data_from_file["location"]["country"], city_name, data_from_file["term"], category)

        # Build the output path from the folder and the original file name
        file_path = os.path.join(data_dir, file_name)

        # Write the record to the JSON file
        write_to_json_file(file_path, data_from_file)
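
One way to sanity-check the resulting hierarchy is to count the JSON files under each first-level (country) folder. The sketch below is not part of the original solution; the function name `count_json_by_top_folder` is hypothetical, and it only assumes the folder layout produced by the loop above.

```python
import os
from collections import Counter

def count_json_by_top_folder(root):
    """Count the .json files found under each first-level folder of root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            continue  # files directly in root do not belong to any country folder
        top = rel.split(os.sep)[0]
        counts[top] += sum(1 for name in filenames if name.endswith(".json"))
    return counts

if os.path.isdir("Data Processed"):
    for country, total in sorted(count_json_by_top_folder("Data Processed").items()):
        print(country, total)
```

Since a file is written once per category, these totals count stored copies rather than distinct restaurants.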

# Question 2 - Part 2
Read all the JSON files in the folder called Data.

- Read only the JSON files whose “term” key has the value “restaurants”.
- Each (or most) of these JSON files contains a key called “open”, which holds the opening hours of the restaurant. For each JSON file, read the restaurant's opening hours.
- The opening hours are given for each day of the week. Extract this data and write it to an Excel sheet.
- Example: For a particular restaurant named “The Coffee Grounds”, the Excel sheet should look like this:



In [14]:
# Build one row per opening-hours entry of a restaurant: name, city, country, day, and the start/end times split into hours and minutes
def add_rows(file, details):
    # Values shared by every row of this restaurant
    name = file["name"]
    city = ' '.join(file["location"]["city"].split())
    country = file["location"]["country"]
    rows = []
    for detail in details:
        # Times are "HHMM" strings: the first two characters are the hour, the rest the minutes
        start, end = detail['start'], detail['end']
        rows.append([name, city, country, detail['day'], start[:2], start[2:], end[:2], end[2:]])
    return rows
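
The slicing used above can be illustrated on a single entry. The sketch below also turns the day number into a name, under the assumption (not stated in the assignment) that days are numbered 0 to 6 starting on Monday. The names `DAY_NAMES` and `describe_opening` and the sample entry are hypothetical.

```python
# Assumption: day numbers run from 0 (Monday) to 6 (Sunday)
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def describe_opening(entry):
    """Turn one opening-hours entry into (day name, 'HH:MM', 'HH:MM')."""
    def fmt(hhmm):
        # First two characters are the hour, the remainder the minutes
        return f"{hhmm[:2]}:{hhmm[2:]}"

    day = entry["day"]
    # Leave anything that is not a valid day number (such as a placeholder) unchanged
    day_name = DAY_NAMES[day] if isinstance(day, int) and 0 <= day < 7 else str(day)
    return day_name, fmt(entry["start"]), fmt(entry["end"])

sample_entry = {"day": 0, "start": "0800", "end": "1730"}  # hypothetical values
described = describe_opening(sample_entry)
print(described)
```

If the data uses a different day numbering, only `DAY_NAMES` needs to change.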

In [15]:
# Write the restaurant timings to a CSV file
def write_to_csv(file_name, restaurant_timings):
    # Use UTF-8 because some restaurant names are not in English;
    # newline='' lets the csv module control the line endings
    with open(file_name, 'w', encoding='utf-8', newline='') as csv_output:
        writer = csv.writer(csv_output, delimiter=',', quoting=csv.QUOTE_NONE, lineterminator='\n', escapechar='\\')
        writer.writerow(("Restaurant Name", 'City', 'Country Code', 'Day of Week', 'Start Time Hour', 'Start Time Minutes', 'End Time Hour', 'End Time Minutes'))
        writer.writerows(restaurant_timings)
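
The writer above uses `csv.QUOTE_NONE` with a backslash escape character, so a comma inside a restaurant name is written as `\,`. The sketch below compares that with the default `csv.QUOTE_MINIMAL`, which wraps such fields in double quotes instead. The helper `render` and the sample row are hypothetical, and whether a given spreadsheet application understands backslash escapes is not something the source confirms.

```python
import csv
import io

# Hypothetical row whose first field contains a comma
row = ["Fish, Chips and Co", "Sydney", "AU"]

def render(values, **writer_options):
    """Write one row to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **writer_options).writerow(values)
    return buffer.getvalue()

escaped = render(row, quoting=csv.QUOTE_NONE, escapechar="\\")
quoted = render(row, quoting=csv.QUOTE_MINIMAL)

print(escaped, end="")  # comma escaped with a backslash
print(quoted, end="")   # field wrapped in double quotes

# The quoted form reads back with a default csv.reader
round_trip = next(csv.reader(io.StringIO(quoted)))
```

If the file is meant to be opened in a spreadsheet, the quoted form may be the safer choice, since quoting is the convention most CSV readers follow.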

In [16]:
# Read all the JSON files at the location using glob
restaurant_timings = []
for filename in glob.glob(r'C:\Users\infer\Desktop\Spring17\Python\lectures\DataAnalysis4Python_Spring17\Assignment 2\Data\*.json'):
    with open(filename) as f:
        # Load the data
        data_from_file = json.load(f)

    # Keep only the files whose term is "restaurants"
    if data_from_file["term"] == 'restaurants':
        # Use the first "hours" entry if the restaurant lists any
        try:
            open_hours = data_from_file["hours"][0]["open"]
        # Otherwise fall back to a placeholder entry so the restaurant still gets a row
        except (KeyError, IndexError):
            open_hours = [{"day": "NA", "start": "NANA", "end": "NANA"}]

        # Collect the rows for this restaurant
        restaurant_timings.extend(add_rows(data_from_file, open_hours))

write_to_csv('restaurant_timings.csv', restaurant_timings)