## Steps performed to generate sales data for a restaurant from the given sample data

### I have dropped some columns from the provided sample data because they are out of scope for the approach I am using.

### Columns dropped: StoreCode, DTS, Month, Date, Year, Time, TicketCode, PartySize, ItemPrice.

### NOTE: Instead of generating a Time value for Lunch and Dinner, I generate rows that carry only a Lunch or Dinner label, which serves the same purpose.

### As per the requirements, I generate randomized data for a six-month span from January 1, 2019 to June 30, 2019, and at the end I replace the data for January 1 and 2 with the provided sample data.

## Various imports and utility functions

In [None]:
import datetime
import pandas as pd
import numpy as np
import calendar

def findDay(date):
    '''Return the day-of-week name for a date.'''
    day = date.weekday()
    return calendar.day_name[day]


def findDayType(day):
    '''Return 'Weekend' for Saturday/Sunday, otherwise 'Weekday'.'''
    if day in ['Sunday','Saturday']:
        return 'Weekend'
    else:
        return 'Weekday'

### Creating the initial data for 6 months with columns: Date, Day, Day Type

In [None]:
start = datetime.date(2019, 1, 1)

end = datetime.date(2019, 6, 30)

data_df = pd.DataFrame(columns=['Date'])

data_df['Date'] = pd.date_range(start,end)

data_df['Date'] = data_df.Date.apply(lambda x: x.date())

data_df['Day'] = data_df.Date.apply(lambda x : findDay(x))

data_df['Day Type'] = data_df.Day.apply(lambda x: findDayType(x))
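
The same three columns can also be built without `apply`, using the pandas datetime accessor. This is a minimal sketch over the same date range; the name `alt_df` is hypothetical and used only for illustration. Unlike the cell above, it keeps `Date` as a pandas datetime column rather than converting it to `datetime.date`.

```python
import pandas as pd

# Hypothetical frame name; same date range as the cell above
alt_df = pd.DataFrame({'Date': pd.date_range('2019-01-01', '2019-06-30')})

# Series.dt.day_name() returns the weekday name for every date at once
alt_df['Day'] = alt_df['Date'].dt.day_name()

# dayofweek is 0 for Monday ... 6 for Sunday, so 5 and 6 are the weekend
alt_df['Day Type'] = alt_df['Date'].dt.dayofweek.map(
    lambda d: 'Weekend' if d >= 5 else 'Weekday')

print(alt_df.head(3))
```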

### Creating two dataframes, one for Lunch and one for Dinner, in which the row for each date is repeated with NumPy's `np.repeat()` function so that the number of rows per date meets the requirements in the problem statement.

### As per the requirements, the number of customers, and therefore of order rows, is higher for Friday dinner and for Saturday or Sunday lunch than for the same shift on other days.

### Windows used with NumPy's `np.random.randint()` function to draw the number of orders for each shift (the upper bound of `randint` is exclusive, so the largest value actually drawn is one below the upper number shown):
        Friday's Dinner : 65-85
        Other Days Dinner : 35-55
        Sunday/Saturday Lunch : 85-105
        Other Days Lunch : 55-75
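
The windows above can be captured in a small lookup. This is a minimal sketch, not the notebook's own code: `ORDER_WINDOWS` and `order_count` are hypothetical names, and passing `high + 1` assumes the upper end of each window is meant to be inclusive, since `np.random.randint` excludes its upper bound.

```python
import numpy as np

# Hypothetical lookup: (shift, is_busy_day) -> (low, high) window
ORDER_WINDOWS = {
    ('Dinner', True): (65, 85),    # Friday dinner
    ('Dinner', False): (35, 55),   # dinner on other days
    ('Lunch', True): (85, 105),    # Saturday/Sunday lunch
    ('Lunch', False): (55, 75),    # lunch on other days
}

def order_count(day, shift):
    """Draw a random number of orders for the given day name and shift."""
    if shift == 'Dinner':
        busy = day == 'Friday'
    else:
        busy = day in ('Saturday', 'Sunday')
    low, high = ORDER_WINDOWS[(shift, busy)]
    # randint(low, high) draws from low..high-1, so add 1 to include `high`
    return int(np.random.randint(low, high + 1))

print(order_count('Friday', 'Dinner'))
```

Keeping the windows in one dictionary makes it easy to see that the busy shifts never overlap with the quieter window for the same shift.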

In [None]:
# Convert the dataframe values into a NumPy array
data_vals = data_df.values

# Collect the repeated rows in a list; DataFrame.append was removed in pandas 2.0
lunch_rows = []

# Repeat each date's row; Saturday and Sunday lunch use the larger window
for i in data_vals:
    if i[1] in ['Sunday','Saturday']:
        n_orders = np.random.randint(85,105)
    else:
        n_orders = np.random.randint(55,75)
    lunch_rows.extend(np.repeat([i],n_orders,axis=0).tolist())

new_df_lunch = pd.DataFrame(lunch_rows, columns=data_df.columns)

# Create the Shift column with all values = Lunch
new_df_lunch['Shift'] = 'Lunch'

# Collect the repeated rows in a list; DataFrame.append was removed in pandas 2.0
dinner_rows = []

# Repeat each date's row; Friday dinner uses the larger window
for i in data_vals:
    if i[1] == 'Friday':
        n_orders = np.random.randint(65,85)
    else:
        n_orders = np.random.randint(35,55)
    dinner_rows.extend(np.repeat([i],n_orders,axis=0).tolist())

new_df_dinner = pd.DataFrame(dinner_rows, columns=data_df.columns)

# Create the Shift column with all values = Dinner
new_df_dinner['Shift'] = 'Dinner'
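
The row repetition can also be done without a Python loop. The sketch below is one possible alternative, shown on a three-day stand-in frame (`days`, `counts`, and `lunch` are hypothetical names): it draws one order count per date and then repeats the row labels with `Index.repeat`.

```python
import numpy as np
import pandas as pd

# Small stand-in for data_df: Friday, Saturday, Sunday
days = pd.DataFrame({'Date': pd.date_range('2019-01-04', periods=3)})
days['Day'] = days['Date'].dt.day_name()

# One random order count per date: larger window on Saturday/Sunday
is_weekend = days['Day'].isin(['Saturday', 'Sunday']).to_numpy()
counts = np.where(is_weekend,
                  np.random.randint(85, 105, size=len(days)),
                  np.random.randint(55, 75, size=len(days)))

# Repeat each row label counts[i] times, then select those rows
lunch = days.loc[days.index.repeat(counts)].reset_index(drop=True)
lunch['Shift'] = 'Lunch'

print(lunch['Date'].value_counts().sort_index())
```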

### Creating a list of menu categories from the sample data set, and a dictionary that maps each menu category to its menu items, so that the data can be randomized consistently.

In [None]:
category = ['Starter', 'VEGETABLE SPECIALS' ,'BREADS', 'CHICKEN SPECIALS' ,'RICE SPECIALS', 'DESSERTS', 'LAMB SPECIALTIES', 'SEAFOOD SPECIALTIES']

mapping = {'Starter':['GOBI MANCHURIAN','TASTY FLATBREAD','MASALA CHICKEN WINGS','COCKTAIL CHICKEN SAMOSAS','VEGETABLE SAMOSA','CHAAT PAPRI','VEGETABLE PAKORA','HARA BHARA KABOB','FISH PAKORA','SHRIMP STRIPS','TASTY SLIDERS : CHICKEN PANEER','SPICY CHICKEN BITES'] ,
           'VEGETABLE SPECIALS':['SARSON DA SAAG','PANEER VINDALOO','BAINGAN BARTHA','MALAI KOFTA','KADAHI PANEER','SHAHI PANEER','YELLOW DAL FRY','BHINDI DO PIAZZA'],
           'BREADS' :['GARLIC NAAN','NAAN','ONION KULCHA','TANDOORI ROTI','ALOO PARATHA','LACHA PARATHA','SPINACH NAAN'],
           'CHICKEN SPECIALS':['CHICKEN KORMA','CHICKEN TIKKA MASALA','CHICKEN SAAG','COCONUT CHICKEN CURRY','BUTTER CHICKEN'],
           'RICE SPECIALS':['CHICKEN BIRYANI','RICE','TIKKA RICE BOWL : PANEER | CHICKEN'],
           'DESSERTS':['GULABJAMUN','MALPURA','CARROT HALWA','RASMALAI','KHEER'],
           'LAMB SPECIALTIES':['KADAHI LAMB'],
           'SEAFOOD SPECIALTIES':['FISH CURRY','FISH KORMA']}

### Creating the 'MenuCategory' and 'MenuItem' columns in both dataframes using NumPy's `np.random.choice()` function, with a weight specified for each category.

### The weights were decided after examining the value counts of each category in the sample data.

### By using weights, we make sure that some menu categories are drawn more often than others when the randomized data is populated.

#### Weights for each category: 'Starter': 0.17, 'VEGETABLE SPECIALS': 0.14, 'BREADS': 0.25, 'CHICKEN SPECIALS': 0.13, 'RICE SPECIALS': 0.10, 'DESSERTS': 0.15, 'LAMB SPECIALTIES': 0.02, 'SEAFOOD SPECIALTIES': 0.04

### After generating the 'MenuCategory' column, we iterate over its values and use the mapping dictionary to pick each 'MenuItem' only from the items that belong to that category, so that items and categories do not get mixed up.

In [None]:
new_df_lunch['MenuCategory'] = 'Empty'

new_df_lunch['MenuItem'] = 'Empty'

new_df_lunch['MenuCategory'] = new_df_lunch['MenuCategory'].apply(lambda x : np.random.choice(category,1,p=[0.17,0.14,0.25,0.13,0.10,0.15,0.02,0.04])[0])

new_df_lunch['MenuItem'] = new_df_lunch['MenuCategory'].apply(lambda x : np.random.choice(mapping.get(x),1)[0])


new_df_dinner['MenuCategory'] = 'Empty'

new_df_dinner['MenuItem'] = 'Empty'

new_df_dinner['MenuCategory'] = new_df_dinner['MenuCategory'].apply(lambda x : np.random.choice(category,1,p=[0.17,0.14,0.25,0.13,0.10,0.15,0.02,0.04])[0])

new_df_dinner['MenuItem'] = new_df_dinner['MenuCategory'].apply(lambda x : np.random.choice(mapping.get(x),1)[0])
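
The weighted draw can also be made in a single call by passing `size` to `np.random.choice`. This is a minimal sketch on a reduced menu with illustrative weights (`menu`, `weights`, and `orders` are hypothetical names and values, not the ones used in the cells above).

```python
import numpy as np
import pandas as pd

# Reduced menu for illustration; the notebook uses the full mapping
menu = {'BREADS': ['NAAN', 'GARLIC NAAN'],
        'DESSERTS': ['KHEER', 'RASMALAI'],
        'LAMB SPECIALTIES': ['KADAHI LAMB']}
weights = [0.6, 0.3, 0.1]   # illustrative weights only

# np.random.choice requires the probabilities to sum to 1
assert abs(sum(weights) - 1.0) < 1e-9

n_rows = 1000
orders = pd.DataFrame({
    # size=n_rows draws every category in one call
    'MenuCategory': np.random.choice(list(menu), size=n_rows, p=weights)
})

# Pick each item only from the list that belongs to the drawn category
orders['MenuItem'] = orders['MenuCategory'].map(
    lambda c: np.random.choice(menu[c]))

print(orders['MenuCategory'].value_counts(normalize=True))
```

The printed shares should sit close to the weights, which is a quick way to confirm that the weights are applied to the intended categories.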


### We now concatenate both dataframes row-wise into a final dataframe and create the last column, 'ItemQty', using NumPy's `np.random.choice()` function, again with weights, so that smaller quantities are drawn more often than larger ones.

### Weights for each order quantity: 1: 0.55, 2: 0.25, 3: 0.15, 4: 0.05

### We then sort the data by the 'Date' column in ascending order and by the 'Shift' column in descending order (so Lunch comes before Dinner on each date), after which the data can be exported to an Excel file.
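
As a quick sanity check on the quantity weights, the expected quantity per row can be computed directly and compared with a large random sample. This is a sketch for illustration only; `quantities`, `weights`, `expected_qty`, and `sample` are hypothetical names.

```python
import numpy as np

quantities = np.array([1, 2, 3, 4])
weights = np.array([0.55, 0.25, 0.15, 0.05])

# Expected quantity per row: 1*0.55 + 2*0.25 + 3*0.15 + 4*0.05 = 1.7
expected_qty = float(quantities @ weights)

# Draw many quantities in one call and compare the empirical mean
sample = np.random.choice(quantities, size=100_000, p=weights)

print(round(expected_qty, 2), round(float(sample.mean()), 2))
```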

In [None]:
final_data = pd.concat([new_df_lunch,new_df_dinner],axis =0)

final_data['ItemQty'] = ''

final_data['ItemQty'] = final_data['ItemQty'].apply(lambda x : np.random.choice([1,2,3,4],1,p=[0.55,0.25,0.15,0.05])[0])

final_data = final_data.sort_values(['Date', 'Shift'], ascending=[True, False])
print(final_data.shape)
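
The replacement of the January 1 and 2 rows with the sample data, mentioned at the top, is not shown in the cells above. One way it could be done is sketched below on tiny stand-in frames. `generated`, `sample_df`, and `combined` are hypothetical names, and the sketch assumes the sample rows have already been reduced to the same columns as the generated data.

```python
import datetime
import pandas as pd

# Stand-in for the generated data (dates stored as datetime.date)
generated = pd.DataFrame({
    'Date': [datetime.date(2019, 1, 1), datetime.date(2019, 1, 2),
             datetime.date(2019, 1, 3)],
    'Shift': ['Lunch', 'Dinner', 'Lunch'],
    'ItemQty': [1, 2, 1],
})

# Hypothetical sample rows; in practice these come from the provided sample file
sample_df = pd.DataFrame({
    'Date': [datetime.date(2019, 1, 1), datetime.date(2019, 1, 2)],
    'Shift': ['Lunch', 'Lunch'],
    'ItemQty': [3, 4],
})

sample_dates = [datetime.date(2019, 1, 1), datetime.date(2019, 1, 2)]

# Drop the generated rows for the sample dates, then add the sample rows
kept = generated[~generated['Date'].isin(sample_dates)]
combined = pd.concat([sample_df, kept], ignore_index=True)
combined = combined.sort_values(['Date', 'Shift'],
                                ascending=[True, False]).reset_index(drop=True)

print(combined)
```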

In [None]:
''' Export data into excel file '''
# final_data.to_excel('prepared_data.xlsx',index=False)

### Total rows generated in this run: 22038 (the exact count varies between runs because the order counts are random)

### Total columns generated: 7