# How To Improve: An Analysis of Restaurant Orders With The Aim of Improving Customer Experience and Company Performance

## Plan
- Set the stage (background) (what each restaurant is, meaning of column names)
- Flow into what you would like to know to make the restaurants more successful
- Factors to consider:
  - Trends (leading to menu selection, facility layout)
  - Seasonality (leading to staffing, facility layout)
  - Substitutes and Complements (leading to pricing)
  - Inventory Management (leading to stock rules)
  - Revenue from products (leading to branding considerations, menu selection)
- Clean data if needed
- Tabulate the above in chunks, visualise each one, and draw a sub-conclusion at each stage (one stage per factor listed above)
- Summary of findings

The following is an analysis of the order history of two restaurants, with the aim of improving their service level and financial performance. To this end, the following themes will be explored:

- Trends
- Seasonality
- Sales Performance
- Pricing
- Inventory Management
- Facility Management

## Data Import

In [0]:
# Creates an object used to load the data
sqlContext

Out[147]: <pyspark.sql.context.SQLContext at 0x7f120c3ac220>

In [0]:
# Loading order data from both restaurants
restaurant1_orders = sqlContext.read.load('/FileStore/tables/restaurant_1_orders.csv', format='csv', header=True)
restaurant2_orders = sqlContext.read.load('/FileStore/tables/restaurant_2_orders.csv', format='csv', header=True)
# Loading product prices
restaurant1_product_prices = sqlContext.read.load('/FileStore/tables/restaurant_1_products_price.csv', format='csv', header=True)
restaurant2_product_prices = sqlContext.read.load('/FileStore/tables/restaurant_2_products_price.csv', format='csv', header=True)


## Data Cleaning

In [0]:
restaurant1_orders.show()

+------------+----------------+--------------------+--------+-------------+--------------+
|Order Number|      Order Date|           Item Name|Quantity|Product Price|Total products|
+------------+----------------+--------------------+--------+-------------+--------------+
|       16118|03/08/2019 20:25|       Plain Papadum|       2|          0.8|             6|
|       16118|03/08/2019 20:25|    King Prawn Balti|       1|        12.95|             6|
|       16118|03/08/2019 20:25|         Garlic Naan|       1|         2.95|             6|
|       16118|03/08/2019 20:25|       Mushroom Rice|       1|         3.95|             6|
|       16118|03/08/2019 20:25| Paneer Tikka Masala|       1|         8.95|             6|
|       16118|03/08/2019 20:25|       Mango Chutney|       1|          0.5|             6|
|       16117|03/08/2019 20:17|          Plain Naan|       1|          2.6|             7|
|       16117|03/08/2019 20:17|       Mushroom Rice|       1|         3.95|             7|

In [0]:
restaurant2_orders.show()

+--------+----------------+--------------------+--------+-------------+--------------+
|Order ID|      Order Date|           Item Name|Quantity|Product Price|Total products|
+--------+----------------+--------------------+--------+-------------+--------------+
|   25583|03/08/2019 21:58|Tandoori Mixed Grill|       1|        11.95|            12|
|   25583|03/08/2019 21:58|        Madras Sauce|       1|         3.95|            12|
|   25583|03/08/2019 21:58|       Mushroom Rice|       2|         3.95|            12|
|   25583|03/08/2019 21:58|         Garlic Naan|       1|         2.95|            12|
|   25583|03/08/2019 21:58|             Paratha|       1|         2.95|            12|
|   25583|03/08/2019 21:58|          Plain Rice|       1|         2.95|            12|
|   25583|03/08/2019 21:58|         Prawn Puree|       1|         4.95|            12|
|   25583|03/08/2019 21:58|       Plain Papadum|       1|          0.8|            12|
|   25583|03/08/2019 21:58|       Mango Chu

In [0]:
restaurant1_orders.printSchema()

root
 |-- Order Number: string (nullable = true)
 |-- Order Date: string (nullable = true)
 |-- Item Name: string (nullable = true)
 |-- Quantity: string (nullable = true)
 |-- Product Price: string (nullable = true)
 |-- Total products: string (nullable = true)



In [0]:
restaurant2_orders.printSchema()

root
 |-- Order ID: string (nullable = true)
 |-- Order Date: string (nullable = true)
 |-- Item Name: string (nullable = true)
 |-- Quantity: string (nullable = true)
 |-- Product Price: string (nullable = true)
 |-- Total products: string (nullable = true)



From the above we see that the two tables hold the same columns in the same formats, except that the order key is named 'Order Number' in restaurant1_orders and 'Order ID' in restaurant2_orders. However, every column in both tables was read in as a string. We will therefore rename 'Order Number' to 'Order ID' in restaurant1_orders, standardise the capitalisation of 'Total products' in both tables, and cast each column to an appropriate data type.

In [0]:
restaurant1_orders = restaurant1_orders.withColumnRenamed("Order Number", "Order ID")
restaurant1_orders = restaurant1_orders.withColumnRenamed("Total products", "Total Products")
restaurant2_orders = restaurant2_orders.withColumnRenamed("Total products", "Total Products")
restaurant1_orders.show()

+--------+----------------+--------------------+--------+-------------+--------------+
|Order ID|      Order Date|           Item Name|Quantity|Product Price|Total Products|
+--------+----------------+--------------------+--------+-------------+--------------+
|   16118|03/08/2019 20:25|       Plain Papadum|       2|          0.8|             6|
|   16118|03/08/2019 20:25|    King Prawn Balti|       1|        12.95|             6|
|   16118|03/08/2019 20:25|         Garlic Naan|       1|         2.95|             6|
|   16118|03/08/2019 20:25|       Mushroom Rice|       1|         3.95|             6|
|   16118|03/08/2019 20:25| Paneer Tikka Masala|       1|         8.95|             6|
|   16118|03/08/2019 20:25|       Mango Chutney|       1|          0.5|             6|
|   16117|03/08/2019 20:17|          Plain Naan|       1|          2.6|             7|
|   16117|03/08/2019 20:17|       Mushroom Rice|       1|         3.95|             7|
|   16117|03/08/2019 20:17|Tandoori Chicken

In [0]:
from pyspark.sql.functions import col, rtrim, ltrim

# Strip leading and trailing whitespace from the order key
restaurant1_orders = restaurant1_orders.withColumn("Order ID", rtrim(ltrim(col("Order ID"))))
restaurant2_orders = restaurant2_orders.withColumn("Order ID", rtrim(ltrim(col("Order ID"))))

# Strip leading and trailing whitespace from the item names
restaurant1_orders = restaurant1_orders.withColumn("Item Name", rtrim(ltrim(col("Item Name"))))
restaurant2_orders = restaurant2_orders.withColumn("Item Name", rtrim(ltrim(col("Item Name"))))

In [0]:
from pyspark.sql.functions import to_timestamp

# Parse the day-first 'Order Date' strings into a new 'timestamp' column
restaurant1_orders = restaurant1_orders.withColumn("timestamp", to_timestamp(restaurant1_orders["Order Date"], "dd/MM/yyyy HH:mm"))
restaurant2_orders = restaurant2_orders.withColumn("timestamp", to_timestamp(restaurant2_orders["Order Date"], "dd/MM/yyyy HH:mm"))
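The pattern `dd/MM/yyyy HH:mm` reads the dates day-first, so `03/08/2019 20:25` is 3 August 2019 rather than 8 March. As a quick sanity check outside Spark, the same interpretation can be sketched with Python's standard library. The `%d/%m/%Y %H:%M` format string is assumed here to be the `strptime` counterpart of the Spark pattern, and `parse_order_date` is a hypothetical helper name, not part of the notebook.

```python
from datetime import datetime

# Assumed strptime counterpart of the Spark pattern "dd/MM/yyyy HH:mm"
ORDER_DATE_FORMAT = "%d/%m/%Y %H:%M"


def parse_order_date(raw: str) -> datetime:
    """Parse a day-first 'Order Date' string such as '03/08/2019 20:25'."""
    return datetime.strptime(raw.strip(), ORDER_DATE_FORMAT)


parsed = parse_order_date("03/08/2019 20:25")
print(parsed.isoformat())  # 2019-08-03T20:25:00
```

The result agrees with the `timestamp` values shown later in the notebook, where the same order appears as `2019-08-03T20:25:00`.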

In [0]:
# Check for non-whole numbers in 'Quantity' and 'Total Products'.
# filter() keeps only the rows with a fractional part, so True means every value is whole.
restaurant1_orders.filter(restaurant1_orders['Quantity'] % 1 != 0).count() == 0

Out[156]: True

In [0]:
restaurant2_orders.filter(restaurant2_orders["Quantity"] % 1 != 0).count() == 0

Out[157]: True

In [0]:
restaurant1_orders.filter(restaurant1_orders["Total Products"] % 1 != 0).count() == 0

Out[158]: True

In [0]:
restaurant2_orders.filter(restaurant2_orders["Total Products"] % 1 != 0).count() == 0

Out[159]: True

In [0]:
# Cast the count columns from string to integer
restaurant1_orders = restaurant1_orders.withColumn("Quantity", restaurant1_orders['Quantity'].cast('int'))
restaurant2_orders = restaurant2_orders.withColumn("Quantity", restaurant2_orders['Quantity'].cast('int'))
restaurant1_orders = restaurant1_orders.withColumn("Total Products", restaurant1_orders['Total Products'].cast('int'))
restaurant2_orders = restaurant2_orders.withColumn("Total Products", restaurant2_orders['Total Products'].cast('int'))

In [0]:
# Cast the price column from string to a floating-point number
restaurant1_orders = restaurant1_orders.withColumn("Product Price", restaurant1_orders['Product Price'].cast('float'))
restaurant2_orders = restaurant2_orders.withColumn("Product Price", restaurant2_orders['Product Price'].cast('float'))
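Spark's `float` type is 32-bit, so a price such as 12.95 is stored as a close approximation rather than exactly. The plain-Python sketch below illustrates the size of that error by round-tripping a value through 32-bit storage; `as_float32` is a hypothetical helper introduced only for this illustration. If exact currency arithmetic turns out to matter for the revenue figures, casting to a decimal type such as `decimal(10,2)` may be worth considering instead; the precision and scale there are an assumption, not something taken from the data.

```python
import struct
from decimal import Decimal


def as_float32(value: float) -> float:
    """Round-trip a Python float through 32-bit storage (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", value))[0]


price = "12.95"  # a 'Product Price' value from the sample rows

stored = as_float32(float(price))  # what a 32-bit float would hold
exact = Decimal(price)             # exact decimal representation

print(stored == 12.95)             # False: 12.95 is not exactly representable in 32 bits
print(abs(stored - 12.95) < 1e-6)  # True: the error is small but non-zero
print(exact * 3)                   # 38.85
```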

In [0]:
restaurant1_orders.printSchema()
restaurant2_orders.printSchema()

root
 |-- Order ID: string (nullable = true)
 |-- Order Date: string (nullable = true)
 |-- Item Name: string (nullable = true)
 |-- Quantity: integer (nullable = true)
 |-- Product Price: float (nullable = true)
 |-- Total Products: integer (nullable = true)
 |-- timestamp: timestamp (nullable = true)

root
 |-- Order ID: string (nullable = true)
 |-- Order Date: string (nullable = true)
 |-- Item Name: string (nullable = true)
 |-- Quantity: integer (nullable = true)
 |-- Product Price: float (nullable = true)
 |-- Total Products: integer (nullable = true)
 |-- timestamp: timestamp (nullable = true)



## Data Analysis

### Trends

In [0]:
display(restaurant1_orders)

Order ID,Order Date,Item Name,Quantity,Product Price,Total Products,timestamp
16118,03/08/2019 20:25,Plain Papadum,2,0.8,6,2019-08-03T20:25:00.000+0000
16118,03/08/2019 20:25,King Prawn Balti,1,12.95,6,2019-08-03T20:25:00.000+0000
16118,03/08/2019 20:25,Garlic Naan,1,2.95,6,2019-08-03T20:25:00.000+0000
16118,03/08/2019 20:25,Mushroom Rice,1,3.95,6,2019-08-03T20:25:00.000+0000
16118,03/08/2019 20:25,Paneer Tikka Masala,1,8.95,6,2019-08-03T20:25:00.000+0000
16118,03/08/2019 20:25,Mango Chutney,1,0.5,6,2019-08-03T20:25:00.000+0000
16117,03/08/2019 20:17,Plain Naan,1,2.6,7,2019-08-03T20:17:00.000+0000
16117,03/08/2019 20:17,Mushroom Rice,1,3.95,7,2019-08-03T20:17:00.000+0000
16117,03/08/2019 20:17,Tandoori Chicken (1/4),1,4.95,7,2019-08-03T20:17:00.000+0000
16117,03/08/2019 20:17,Vindaloo - Lamb,1,7.95,7,2019-08-03T20:17:00.000+0000
