## Part 1: Pizza Menu

The "CMP Pizza Shop" stores its menu options in an XML data file called **pizza.xml**. You will find this file in the **data/** directory provided with this project.

The **pizza.xml** file includes the following information:

- the name of the pizza shop
- the sizes of the pizzas
- the toppings available
- the different crusts

Here is an example menu in XML, for illustrative purposes (note that the pizza.xml file provided in the data.zip archive might differ from this example):

    <pizza>
    	<shopname>CMP Pizza Shop</shopname>
    	<sizes>
    		<size code="L">Large</size>
    		<size code="XL">Extra Large</size>
    	</sizes>
    	<toppings>
    		<topping code="x">Extra Cheese</topping>
    		<topping code="m">Mushrooms</topping>
    	</toppings>
    	<crusts>
    		<crust code="thick">Thick Crust</crust>
    	</crusts>
    </pizza>

### Question 1 (a)

Write Python code in the cell indicated below that reads in the **pizza.xml** file (located in data/) using the **xml.etree.ElementTree** module.

### Question 1 (b)

Using the ElementTree structure, print a human-readable menu of pizza options for the pizza shop. This must include the pizza shop name, the pizza sizes, toppings, and crust options. For example, a human-readable menu might look like this:

        CMP Pizza Shop
        Sizes
        - Large
        - Extra Large

        Toppings
        - Extra Cheese
        - Mushrooms

        Crusts
        - Thick Crust
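
One way to produce a menu in this layout is to treat `shopname` as the title and every other child of the root as a section. The sketch below is a minimal illustration, not the required answer: it parses the example menu from an inline string rather than from data/pizza.xml, and the helper name `format_menu` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Example menu from above, inlined so the sketch runs without data/pizza.xml
EXAMPLE_XML = """
<pizza>
    <shopname>CMP Pizza Shop</shopname>
    <sizes>
        <size code="L">Large</size>
        <size code="XL">Extra Large</size>
    </sizes>
    <toppings>
        <topping code="x">Extra Cheese</topping>
        <topping code="m">Mushrooms</topping>
    </toppings>
    <crusts>
        <crust code="thick">Thick Crust</crust>
    </crusts>
</pizza>
"""

def format_menu(root):
    """Return the menu as text (hypothetical helper, not part of the project)."""
    lines = [root.findtext('shopname').strip()]
    for section in root:
        if len(section) == 0:  # shopname has no child options
            continue
        lines.append(section.tag.capitalize())  # 'sizes' -> 'Sizes'
        lines.extend('- ' + option.text.strip() for option in section)
        lines.append('')  # blank line between sections
    return '\n'.join(lines).rstrip()

menu_text = format_menu(ET.fromstring(EXAMPLE_XML))
print(menu_text)
```

Building the text first and printing it once keeps the formatting logic separate from the output, which makes it easy to reuse for a file as well as the notebook.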

In [2]:
# 1(a)
import xml.etree.ElementTree as ET  # Import Element Tree module
try:
    tree = ET.parse('data/pizza.xml')  # Parse the xml file
    root = tree.getroot()  # Get the root element of the ElementTree
except Exception as e: # Report any error raised while reading or parsing the file
    print('error %s' % e)

In [3]:
#1(b)
import xml.etree.ElementTree as ET  # Import Element Tree module
try:
    tree = ET.parse('data/pizza.xml')  # Parse the xml file
    root = tree.getroot()  # Get the root element of the ElementTree
    for child in root:  # Loop the ElementTree
        print('%s \n%s' %(child.tag,child.text.strip())) # Print 'shopname','size',etc.
        for grchild in child:
            print('- %s' %(grchild.text.strip())) # Print the content of 'shopname','size',etc.
except Exception as e:
    print ('error %s' %e)

shopname 
CMP Pizza Shop
sizes 

- Extra Small
- Small
- Medium
- Large
- Extra Large
toppings 

- Chilli
- Mushrooms
- Extra Cheese
crusts 

- Thin Crust
- Thick Crust
- Cheesy Crust
- Tomato Crust


## Part 2: Pizza Report

Using the data structure created in **Part 1** of this project, write a program in the space indicated below that calculates the following:

### Question 2 (a)

1. The number of pizza sizes that are available from our pizza menu
2. The number of toppings that are available in the pizza menu
3. The number of crust options available in the pizza menu
4. The total number of combinations of **different** pizzas that can be assembled from this menu. 

Assume the following about each pizza:
	
- a pizza can only be one size (one pizza cannot be large and extra large at the same time)
- a pizza can have any combinations of toppings, including none, one or all
- a pizza can only have one type of crust (i.e. either thick or thin crust)
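
Under these assumptions the total follows from the multiplication rule: one choice among the sizes, one among the crusts, and an independent in-or-out choice for each topping, giving sizes × 2^toppings × crusts. The sketch below is an illustration with hypothetical function names (`count_pizzas`, `count_topping_subsets`); it also cross-checks the power-of-two term by enumerating topping subsets with `itertools`.

```python
from itertools import combinations

def count_pizzas(n_sizes, n_toppings, n_crusts):
    """Number of distinct pizzas: one size, any subset of toppings, one crust."""
    return n_sizes * (2 ** n_toppings) * n_crusts

def count_topping_subsets(toppings):
    """Brute-force check: count subsets of every size from 0 to len(toppings)."""
    return sum(1 for k in range(len(toppings) + 1)
               for _ in combinations(toppings, k))

# The example menu has 2 sizes, 2 toppings and 1 crust
example_total = count_pizzas(2, 2, 1)
subset_count = count_topping_subsets(['x', 'm'])
print(example_total, subset_count)
```

For the example menu this gives 2 × 4 × 1 = 8 pizzas, since the two toppings admit four subsets: none, Extra Cheese only, Mushrooms only, or both.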

Print the output of each calculation, along with a string denoting the value. For example:

    Number of pizza sizes = 2
    Number of toppings = ...
    ...

### Question 2 (b)

After printing these calculations, use the Python csv module to output your calculations as a single row of data to a CSV file named **pizza_report.csv**. Assume the following header for your CSV file, followed by a row containing the calculated numbers:

    sizes,toppings,crusts,total_combo  

### Question 2 (c)

After writing the report, re-open the file **pizza_report.csv**, and print each line of the file to the notebook.
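
A minimal sketch of the write-then-read round trip with the `csv` module is shown below. The counts are taken from the example menu (2 sizes, 2 toppings, 1 crust, so 8 combinations), and the file is written to a temporary directory rather than the project folder; both choices are assumptions made so the sketch is self-contained.

```python
import csv
import os
import tempfile

header = ['sizes', 'toppings', 'crusts', 'total_combo']
counts = [2, 2, 1, 8]  # example-menu values, not the real pizza.xml counts

report_path = os.path.join(tempfile.mkdtemp(), 'pizza_report.csv')

# newline='' lets the csv module control line endings itself
with open(report_path, 'w', newline='') as report:
    writer = csv.writer(report)
    writer.writerow(header)
    writer.writerow(counts)

# Re-open the report and collect each row as a comma-joined line
with open(report_path, newline='') as report:
    report_lines = [','.join(row) for row in csv.reader(report)]

for line in report_lines:
    print(line)
```

Using `writerow` for both the header and the data keeps quoting and delimiters consistent, which writing hand-built strings does not guarantee.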

In [4]:
# 2(a)
choice = 1 # Running product: total number of distinct pizzas
number_list = [] # Number of sizes, toppings and crusts, in document order
for child in root:
    number = len(child) # Number of options listed under this element
    if number == 0: # Skip shopname, which has no child options
        continue
    if child.tag == 'toppings':
        count = 2 ** number # Any subset of toppings is allowed, including none
    else:
        count = number # Exactly one size and exactly one crust per pizza
    choice *= count # Multiply the independent choices together
    number_list.append(number)
    print('Number of %s : %s' % (child.tag, number)) # Print the number of sizes, toppings and crusts
print('The total number of choices: %s' % choice) # Print the number of distinct pizzas
number_list.append(choice) # Store the total alongside the three counts

Number of sizes : 5
Number of toppings : 3
Number of crusts : 4
The total number of choices: 160


In [5]:
#2(b)
import csv
try:
    with open('data/pizza_report.csv', 'w', newline='') as pizza: # Open the file for writing
        writer = csv.writer(pizza)
        writer.writerow(['sizes', 'toppings', 'crusts', 'total_combo']) # Write the header row
        writer.writerow(number_list[0:4]) # Write the four calculated values as a single row
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)

In [7]:
#2(c)
try:
    with open('data/pizza_report.csv', 'r', newline='') as file: # Re-open the file
        for row in csv.reader(file): # Print each line of the file
            print(','.join(row))
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)

sizes,toppings,crusts,total_combo
5,3,4,160


## Part 3: Pizza Specials


In addition to the standard menu, the CMP pizza shop also stores **Pizza Specials** in a file **pizza_specials.csv**, with the following format:

    name,size,toppings,crust  
    Supreme,XL,xm,thick  
    Simple Cheese,L,x,thick


Every pizza special has a name, a size, a combination of toppings, and a crust. The size, toppings, and crust are referenced by a code attribute, which is specified in the pizza.xml XML file. For example, the topping code for Mushrooms is "m", and the topping code for Extra Cheese is "x". The Supreme pizza has both Mushrooms and Extra Cheese as designated by the string "xm". 

### Part 3 (a)

Write code in the answer cell below that reads in the pizza.xml file (again using the xml.etree.ElementTree module) and stores the sizes, toppings, and crusts in dictionaries, with the code for each element as the key and the text of the element as the value.

### Part 3 (b)

Using these dictionaries, read in the Pizza Specials from the pizza_specials.csv file and convert them to a menu text description. 

### Part 3 (c)

Write code to output each special's menu description as a human readable line in a text file called **pizza_specials.txt**. For example, the Supreme special would be output in the txt file as:

        Supreme: Extra Large Pizza with Extra Cheese and Mushrooms and Thick Crust 
   
### Part 3 (d)

After writing the pizza Specials, re-open the file **pizza_specials.txt**, and print each line of the file to the notebook.
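
The conversion from codes to a description can be sketched as follows. This is an illustration only: the dictionaries are typed in by hand from the example menu instead of being read from pizza.xml, the specials come from an inline string, the helper name `describe` is hypothetical, and each topping code is assumed to be a single character, as in the examples.

```python
import csv
import io

# Code -> text dictionaries, filled in by hand from the example menu
sizes = {'L': 'Large', 'XL': 'Extra Large'}
toppings = {'x': 'Extra Cheese', 'm': 'Mushrooms'}
crusts = {'thick': 'Thick Crust'}

SPECIALS_CSV = """name,size,toppings,crust
Supreme,XL,xm,thick
Simple Cheese,L,x,thick
"""

def describe(special):
    """Build one menu line from a row of the specials file (hypothetical helper)."""
    # Assumes one character per topping code
    topping_text = ' and '.join(toppings[code] for code in special['toppings'])
    return '%s: %s Pizza with %s and %s' % (
        special['name'], sizes[special['size']],
        topping_text, crusts[special['crust']])

descriptions = [describe(row) for row in csv.DictReader(io.StringIO(SPECIALS_CSV))]
for line in descriptions:
    print(line)
```

Keeping sizes, toppings, and crusts in separate dictionaries avoids a clash if two sections ever share a code, for example a size and a topping that differ only in how they are abbreviated.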

In [8]:
#3(a)
import xml.etree.ElementTree as et
import csv
sizes, toppings, crusts = {}, {}, {} # One code -> text dictionary per menu section
try:
    root = et.parse('data/pizza.xml').getroot() # Read the file
    for section, table in (('sizes', sizes), ('toppings', toppings), ('crusts', crusts)):
        for item in root.find(section):
            table[item.get('code')] = item.text.strip() # Use the code as the key and the text as the value
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
#3(b)
pizza_list = []
try:
    with open('data/pizza_specials.csv') as file: # Read the file
        for row in csv.reader(file):
            pizza_list.append(row) # Store each row of the file in a list
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
for row in pizza_list[1:]: # Loop over the specials, skipping the header row
    row[1] = sizes.get(row[1], row[1]) # Convert the size code to its full name
    # Each character of the toppings field is one topping code; join the full names with ' and '
    row[2] = ' and '.join(toppings[code] for code in row[2] if code in toppings)
    row[3] = crusts.get(row[3], row[3]) # Convert the crust code to its full name
print(pizza_list) # Print the list
#3(c)
try:
    with open('data/pizza_specials.txt','w') as f:
        for row in pizza_list[1:]: # Write the list into 'pizza_specials.txt' file
            f.write('%s: %s with %s and %s\n' %(row[0],row[1],row[2],row[3]))
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
#3(d)
try:
    with open('data/pizza_specials.txt','r') as f: # Print the file
        print(f.read())
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)

[['name', 'size', 'toppings', 'crust'], ['Supreme', 'Extra Large', 'Extra Cheese and Mushrooms', 'Thick Crust'], ['Simple Cheese', 'Large', 'Extra Cheese', 'Thick Crust']]
Supreme: Extra Large with Extra Cheese and Mushrooms and Thick Crust
Simple Cheese: Large with Extra Cheese and Thick Crust



## Part 4: Pizza Decider


**Who should get a pizza?** 

This part of the project will use a data set collected from the Reddit group **"Random Acts of Pizza"** (https://www.reddit.com/r/Random_Acts_Of_Pizza/). Random Acts of Pizza is a community on the website Reddit.com that facilitates the sending and receiving of pizzas between strangers. People write a request for a pizza on the Reddit group and someone may accept their request and order them a pizza!

(a version of this data is also available on Kaggle)

A data set has been collated for the textual requests to this Reddit Group. A simplified excerpt of the requests to /r/Random_Acts_of_Pizza has been provided in the text file **random_acts_pizza.csv**. There are 4 columns in this CSV file:

- requester_username - the name of the user requesting pizza
- request_text - the text of the pizza request written by the user
- requester_received_pizza - a Boolean indicating whether or not the pizza request was accepted
- requester_account_age_in_days_at_request - the age of the reddit user account, measured at the time the user made the request

In this question, you will write code that uses the random_acts_pizza.csv file to automate the decision as to whether a new request for pizza should be accepted.

### Part 4 (a)

Using the Python csv module, read in the "Random Acts of Pizza" request history that is contained in the **random_acts_pizza.csv** file. Add a class called PizzaDecider to your program, which uses the request history to output a Boolean value deciding whether or not a new request for pizza should be accepted. The decision is based on the following criteria:

1) The user requesting a new pizza (identified by their username) has **not previously received** a pizza from the "Random Acts of Pizza" group.
2) The text of the user's pizza request is longer than **400 characters** in length.
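
The two criteria can be sketched as a small class. This is one possible reading, not the required design: the history is a hand-made list of dictionaries rather than the real CSV, the method name `should_accept` is hypothetical, and the received flag is assumed to have already been converted to a Python Boolean.

```python
class PizzaDecider:
    """Accept a request from a user with no previous pizza whose text exceeds 400 characters."""

    MIN_LENGTH = 400

    def __init__(self, history):
        # Usernames with at least one fulfilled request in the history
        self.received = {row['requester_username'] for row in history
                         if row['requester_received_pizza']}

    def should_accept(self, request):
        new_user = request['requester_username'] not in self.received
        long_enough = len(request['request_text']) > self.MIN_LENGTH
        return new_user and long_enough

# Hand-made history for illustration; the usernames are invented
history = [
    {'requester_username': 'alice', 'requester_received_pizza': True},
    {'requester_username': 'bob', 'requester_received_pizza': False},
]
decider = PizzaDecider(history)

accepted = decider.should_accept({'requester_username': 'bob', 'request_text': 'p' * 401})
too_short = decider.should_accept({'requester_username': 'bob', 'request_text': 'p' * 400})
repeat = decider.should_accept({'requester_username': 'alice', 'request_text': 'p' * 500})
print(accepted, too_short, repeat)
```

Collecting the usernames into a set once, in the constructor, means each later decision is a constant-time membership test rather than a scan of the whole history. Note that under this reading a user who does not appear in the history at all counts as not having received a pizza.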

### Part 4 (b)

After you have created the PizzaDecider class, use the Python json module to have your program read in a file called pizza_request.json, which contains one request for pizza. The request file is a dictionary which has two keys:
- requester_username - the name of the user requesting pizza
- request_text - the text of the pizza request written by the user

For an example pizza request, see the file **pizza_request.json** in data.zip.

Your program will pass the pizza request read in from the pizza_request.json file to a method in the PizzaDecider class that returns a Boolean indicating whether or not the user should get a pizza according to the criteria above.

### Part 4 (c)

Your program should now write out the decision from the PizzaDecider to a new JSON file called pizza_decision.json which is a dictionary with the following keys: 
- requester_username - the name of the user requesting pizza
- request_text - the text of the pizza request written by the user
- receive_pizza - a Boolean, stating whether or not the pizza request should be accepted according to the criteria

For an example of the expected output, see the file **pizza_decision.json** in data.zip, which holds the pizza decision for the **pizza_request.json** example mentioned earlier.
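
Because `receive_pizza` is a Boolean, it can be stored as a Python `bool` and left to the `json` module, which serialises `True` and `False` as `true` and `false`. The sketch below uses an invented request and a hard-coded decision, and encodes to an in-memory string rather than writing pizza_decision.json.

```python
import json

request = {'requester_username': 'example_user',  # invented for illustration
           'request_text': 'Could someone spare a pizza?'}
receive_pizza = False  # stands in for the PizzaDecider result

# Copy the request and add the decision under the key named in the task
decision = dict(request, receive_pizza=receive_pizza)
encoded = json.dumps(decision)
decoded = json.loads(encoded)
print(encoded)
```

Storing the strings `'true'` or `'false'` instead would produce quoted values in the file, and any non-empty string is truthy in Python, so a later `if decision['receive_pizza']` check would always succeed.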

### Part 4 (d)
Print to the notebook the decision for the user given in **pizza_decision.json**. For example,
        
        User: "spez" request for pizza should not be accepted.

In [10]:
# 4(a)
import json
import csv
rap = []
try:
    with open('data/random_acts_pizza.csv') as file: # Open the file
        for row in csv.reader(file):
            rap.append(row) # Store each row of the request history in the list rap
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
def find_request(name): # Return the history row for a requester name, or None if there is none
    for row in rap:
        if name == row[3]: # Column 3 holds requester_username
            return row
    return None
class PizzaDecider(): # Create the PizzaDecider class
    def __init__(self, request):
        if isinstance(request, dict): # If the input is a dictionary, look up the requester's row in the history
            request = find_request(request['requester_username'])
        self.list = request
    def decision(self): # Decide if a requester should get a pizza
        if self.list is None: # The requester does not appear in the history
            return False
        length = len(self.list[0]) # Column 0 holds request_text
        received_pizza = self.list[2] # Column 2 holds requester_received_pizza
        # Accept if the text is longer than 400 characters and no pizza has been received before
        return length > 400 and received_pizza == 'FALSE'
#4(b)
try:
    with open('data/pizza_request.json') as f: # Load the request
        request = json.load(f)
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
my_p2 = PizzaDecider(request)
print(my_p2.decision()) # Print whether the requester should get a pizza
#4(c)
decision = {'requester_username': request['requester_username'],
            'request_text': request['request_text'],
            'receive_pizza': my_p2.decision()} # A Boolean, written by json as true or false
try:
    with open('data/pizza_decision.json','w') as f:
        json.dump(decision, f) # Serialise the dictionary and write it to pizza_decision.json
except IOError as ioe:
    print('An I/O Error occurred writing this file: %s' % ioe)
#4(d)
try:
    with open('data/pizza_decision.json') as f: # Re-read the decision that was just written
        decision = json.load(f)
except IOError as ioe:
    print('An I/O Error occurred opening this file: %s' %ioe)
a = '' if decision['receive_pizza'] else 'not ' # Choose the wording from the Boolean
print('%s: request for pizza should %sbe accepted' %(decision['requester_username'], a)) # Print the result

False
anyquestions: request for pizza should not be accepted


## Part 5: Visualising Pizza data

For this final part of the project, you will use statistical visualisation techniques you have covered in the module, to help explore the data. This question will use the **random_acts_pizza.csv** file from the previous question.


Using the **random_acts_pizza.csv** file, create a figure containing a grid of subplots with two rows and two columns. Plot the following diagrams from the **seaborn** library in the subplots.
1. A distplot of the account age at the time of request (**requester_account_age_in_days_at_request**) 
2. A boxplot, where the x-axis (the category) is whether the request was fulfilled or not (**requester_received_pizza**), and the y-axis is the length of the request string (the length of **request_text**)
3. A violin plot, where the x-axis (the category) is whether the request was fulfilled or not (**requester_received_pizza**), and the y-axis is the number of times the string **pizza** occurs in **request_text**
4. A 2D scatterplot, where the x-axis is the account age, the y-axis is the length of the request, and the marker color differs depending on whether the request was fulfilled (**requester_received_pizza**)

Remember to label axes, and choose appropriate informative titles for the plots.
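
Plots 2 to 4 rely on two quantities derived from **request_text**: its length and the number of occurrences of the string pizza. A plain-Python sketch with invented request texts is shown below; it assumes the count should be case-insensitive, which the task does not state, and the helper name `text_features` is hypothetical. With pandas, `Series.str.len()` and `Series.str.count()` compute the same values for a whole column.

```python
requests = [  # invented texts for illustration
    'Pizza please! I would love a pizza tonight.',
    'Rough week, anything helps.',
]

def text_features(text):
    """Return (length of the text, number of case-insensitive 'pizza' matches)."""
    return len(text), text.lower().count('pizza')

features = [text_features(text) for text in requests]
print(features)
```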

In [12]:
# Enter your code for Part 5 here
#%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Read the file and derive the plotted quantities
df = pd.read_csv('data/random_acts_pizza.csv')
age = df['requester_account_age_in_days_at_request']
received = df['requester_received_pizza']
text_length = df['request_text'].str.len() # Length of each request text
pizza_count = df['request_text'].str.lower().str.count('pizza') # Times 'pizza' occurs in each request

# Create a 2 x 2 grid of subplots, sized so that titles and labels have room
figure, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(10, 10))

# distplot is deprecated in seaborn 0.11 and later; histplot is its replacement
sns.distplot(age, ax=ax1) # Distribution of account age
sns.boxplot(x=received, y=text_length, ax=ax2) # Request length against whether the request was fulfilled
sns.violinplot(x=received, y=pizza_count, ax=ax3) # Mentions of 'pizza' against whether the request was fulfilled
sns.scatterplot(x=age, y=text_length, hue=received, s=20, ax=ax4) # Account age against request length, coloured by outcome

# Set the titles and the labels of the x-axis and y-axis
ax1.set(title='Account age at request', xlabel='account age (days)', ylabel='density')
ax2.set(title='Request length by outcome', xlabel='request fulfilled', ylabel='length of request text')
ax3.set(title="Mentions of 'pizza' by outcome", xlabel='request fulfilled', ylabel="occurrences of 'pizza'")
ax4.set(title='Request length against account age', xlabel='account age (days)', ylabel='length of request text')

# Adjust spacing so titles do not overlap neighbouring plots, then show the figure
figure.tight_layout()
plt.show()

ImportError: No module named seaborn