My goals in this Kernel are as follows:

* I will use the awesome kernels published by great Kagglers for this competition. I will try to explain why each step is necessary, especially for data preparation. I have used the amazing work of the following users:  
-[the1owl](https://www.kaggle.com/the1owl/surprise-me)  
-[Bojan Tunguz](https://www.kaggle.com/tunguz/surprise-me-2/code)  
* I will test several algorithms and take advantage of the `RandomizedSearchCV` class of `Scikit-learn` for hyperparameter optimization. Randomized search usually finds parameter settings of similar quality to grid search in drastically less run time. Please see [here](http://scikit-learn.org/stable/auto_examples/model_selection/plot_randomized_search.html).

In [53]:
import glob,re, os
import numpy as np
import pandas as pd
from sklearn import *
from xgboost import XGBRegressor
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline

#import h2o
#from h2o.automl import H2OAutoML

import warnings
def fxn():
    warnings.warn("deprecated", DeprecationWarning)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    fxn()
    
#address = '../input/'
address= 'C:\\Users\\noori\\Dropbox (MIT)\\DS\\Kaggle\\RecruitRestaurent' #windows
#address='/Users/MehD/Dropbox (MIT)/DS/Kaggle/RecruitRestaurent' #mac
os.chdir(address)

In [54]:
#Assigning a name to each data frame
data={
    'tra':pd.read_csv('air_visit_data.csv'),
    'as':pd.read_csv('air_store_info.csv'),
    'hs':pd.read_csv('hpg_store_info.csv'),
    'ar': pd.read_csv('air_reserve.csv'),
    'hr': pd.read_csv('hpg_reserve.csv'),
    'id': pd.read_csv('store_id_relation.csv'),
    'tes': pd.read_csv('sample_submission.csv'),
    'hol':pd.read_csv('date_info.csv').rename(columns={'calendar_date':'visitor_date'})
}

In [55]:
# Now let's add to the hpg_reserve the ids from id dataset
data['hr']=data['hr'].merge(data['id'],on=['hpg_store_id'],how='inner')

In [56]:
#let's transform the date columns to datetime objects. Please note, to_datetime also keeps the time of day. Using .dt.date we keep only the date.
for df in ['ar','hr']:
    data[df]['visit_datetime'] = pd.to_datetime(data[df]['visit_datetime']).dt.date
    data[df]['reserve_datetime'] = pd.to_datetime(data[df]['reserve_datetime']).dt.date
    
    #here, we are actually engineering a new feature that captures the difference between visit and reserve times
    data[df]['reserve_datetime_diff'] = data[df].apply(lambda r: (r['visit_datetime'] - r['reserve_datetime']).days, axis=1)
    
    #let's group each dataset by store id and visit date, then get the sum and mean of the reserve difference and the reserved visitors, then rename the columns
    temp1 = data[df].groupby(['air_store_id','visit_datetime'], as_index=False)[['reserve_datetime_diff', 'reserve_visitors']].sum().rename(columns={'visit_datetime':'visit_date', 'reserve_datetime_diff': 'rs1', 'reserve_visitors':'rv1'})
    
    temp2 = data[df].groupby(['air_store_id','visit_datetime'], as_index=False)[['reserve_datetime_diff', 'reserve_visitors']].mean().rename(columns={'visit_datetime':'visit_date', 'reserve_datetime_diff': 'rs2', 'reserve_visitors':'rv2'})
    #now let's merge these two new temp dataframes.
    data[df]=temp1.merge(temp2,how='inner',on=['air_store_id','visit_date'])
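
The row-wise `apply` used for `reserve_datetime_diff` can be slow on the large `hpg_reserve` table. As a minimal sketch (not taken from the original kernels), the same calendar-day difference can be computed column-wise, assuming the two columns still hold full timestamps. The toy frame below is made up for illustration.

```python
import pandas as pd

# Toy reservations (made-up values, same column names as air_reserve.csv)
toy = pd.DataFrame({
    'visit_datetime': ['2016-01-03 19:00:00', '2016-01-05 12:00:00'],
    'reserve_datetime': ['2016-01-01 23:00:00', '2016-01-05 09:00:00'],
})

# normalize() resets the time of day to midnight but keeps the datetime dtype
visit = pd.to_datetime(toy['visit_datetime']).dt.normalize()
reserve = pd.to_datetime(toy['reserve_datetime']).dt.normalize()

# Whole calendar days between reservation and visit, computed column-wise
toy['reserve_datetime_diff'] = (visit - reserve).dt.days
print(toy['reserve_datetime_diff'].tolist())
```

Keeping the datetime dtype (instead of converting to Python `date` objects first) is what allows the subtraction to stay vectorized.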

In [57]:
data['tra']['visit_date']=pd.to_datetime(data['tra']['visit_date'])
data['tra']['dow']=data['tra']['visit_date'].dt.dayofweek
data['tra']['year']=data['tra']['visit_date'].dt.year
data['tra']['month']=data['tra']['visit_date'].dt.month
data['tra']['visit_date']=data['tra']['visit_date'].dt.date

We do the same thing for the test set. Please note, we should first split the test ids to get the store ids and the dates separately.
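
To make the split concrete, here is a small sketch of how one submission id breaks apart. The helper name `split_submission_id` and the example date are hypothetical. The only assumption is that an id has the form `<air_store_id>_<visit_date>`.

```python
def split_submission_id(sub_id):
    """Split '<air_store_id>_<visit_date>' at the last underscore."""
    store_id, visit_date = sub_id.rsplit('_', 1)
    return store_id, visit_date

# Hypothetical id in the sample_submission.csv layout
store_id, visit_date = split_submission_id('air_00a91d42b08b08d9_2017-04-23')
print(store_id, visit_date)
```

Splitting at the last underscore gives the same result as joining the first two pieces and taking the third, because the store id itself contains exactly one underscore.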

In [58]:
data['tes']['visit_date'] = data['tes']['id'].map(lambda x: str(x).split('_')[2])
data['tes']['air_store_id'] = data['tes']['id'].map(lambda x: '_'.join(x.split('_')[:2]))
data['tes']['visit_date'] = pd.to_datetime(data['tes']['visit_date'])
data['tes']['dow'] = data['tes']['visit_date'].dt.dayofweek
data['tes']['year'] = data['tes']['visit_date'].dt.year
data['tes']['month'] = data['tes']['visit_date'].dt.month
data['tes']['visit_date'] = data['tes']['visit_date'].dt.date

In [59]:
unique_stores=data['tes']['air_store_id'].unique()
print('The number of unique stores is:', unique_stores.shape[0])
print('total number of data records in test set is',data['tes'].shape[0])

The number of unique stores is: 821
total number of data records in test set is 32019


Now, we'd like to create a new dataframe that has 7\*821 rows: one row per store for each of the 7 days of the week. Later, we will add values to this new dataframe.

In [60]:
stores=pd.concat([pd.DataFrame({'air_store_id': unique_stores, 'dow': [i]*len(unique_stores)}) for i in range(7)],
            axis=0,ignore_index=True).reset_index(drop=True)
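
The store-by-weekday table is a Cartesian product, so it can also be sketched with `pd.MultiIndex.from_product`. This is only an illustrative alternative to the `concat` approach, shown with two store ids standing in for the 821 unique test stores.

```python
import pandas as pd

# Two store ids stand in for the 821 unique test stores
toy_stores = ['air_00a91d42b08b08d9', 'air_0164b9927d20bcc3']

# Cartesian product of day of week (0-6) and store id, one row per pair
grid = pd.MultiIndex.from_product(
    [range(7), toy_stores], names=['dow', 'air_store_id']
).to_frame(index=False)

print(grid.shape)
```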

Now, we will add a new feature to the train set. This feature is taken from the competition discussions. Please see [here](https://www.kaggle.com/c/recruit-restaurant-visitor-forecasting/discussion/46179).
This feature is an exponentially weighted moving average of the number of visitors, computed per store and day of week from earlier visits only (hence the shift in the code).

We should find a way to replace the missing values. For this purpose, I first calculate the mean of the ewm values for each store and day of week, then create a new id (store id plus day of week) for both the mean_ewm and data['tra'] datasets, set the index to the new id, and finally replace the null values with the mean of ewm.

**Note:** please let me know if you can think of a better way to do this.
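
One possible alternative, offered only as a sketch on made-up data, is to use `groupby(...).transform` for both steps. `transform` returns values aligned with the original rows, so no helper id or re-indexing is needed. The toy store ids `'a'` and `'b'` are hypothetical.

```python
import pandas as pd

# Made-up visits: one group with three rows and one group with a single row
toy = pd.DataFrame({
    'air_store_id': ['a', 'a', 'a', 'b'],
    'dow': [0, 0, 0, 0],
    'visitors': [10, 20, 30, 5],
})
keys = ['air_store_id', 'dow']

# Shift by one so each row only sees earlier visits, then take the EWM
toy['ewm'] = toy.groupby(keys)['visitors'].transform(
    lambda s: s.shift().ewm(alpha=0.1, adjust=True).mean())

# Fill the leading NaN of each group with that group's mean EWM;
# a group with a single row has no history at all and stays NaN
toy['ewm'] = toy['ewm'].fillna(toy.groupby(keys)['ewm'].transform('mean'))
print(toy)
```

Rows that remain NaN after this step would still need a fallback, for example the `fillna(-1)` applied later in this notebook.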

In [61]:
data['test_ewm']=pd.read_csv('test_ewm.csv')
mean_ewm={}

def calc_shifted_ewm(series, alpha, adjust=True):
    return series.shift().ewm(alpha=alpha, adjust=adjust).mean()

for df in ['tra','test_ewm']:
    data[df]['ewm'] = data[df].groupby(['air_store_id', 'dow'])\
                  .apply(lambda g: calc_shifted_ewm(g['visitors'], 0.1)).sort_index(level=['air_store_id','dow']).values
    
    #finding the mean of ewm per store and day of week (only the ewm column is needed; averaging every column fails on non-numeric columns in recent pandas)
    mean_ewm[df]=data[df].groupby(['air_store_id','dow'])['ewm'].mean().reset_index()
    
    #setting new index for mean_ewm
    mean_ewm[df]['id_dow']=mean_ewm[df].apply(lambda x: '_'.join([str(x['air_store_id']),str(x['dow'])]),axis=1)
    mean_ewm[df]=mean_ewm[df].set_index('id_dow')
    
    #setting new index for data[df]
    data[df]['id_dow']=data[df].apply(lambda x: '_'.join([str(x['air_store_id']),str(x['dow'])]),axis=1)
    data[df]=data[df].set_index('id_dow')
    
    #filling na
    data[df]['ewm']=data[df]['ewm'].fillna(mean_ewm[df]['ewm'])

In [62]:
#merging new ewm with test set.
data['tes']=data['tes'].merge(data['test_ewm'],on=['id'],how='left')
data['tes']=data['tes'][['id','visitors_x','visit_date','air_store_id_x','dow_x','year','month','ewm']]
data['tes']=data['tes'].rename(columns={'visitors_x':'visitors','air_store_id_x':'air_store_id','dow_x':'dow'})

Here, we calculate the min, max, median, mean, and number of observations of the visitor counts for each store on each day of the week. The following code might look complicated, but we are simply using the aggregate method of groupby.

In [63]:
temp=data['tra'].groupby(['air_store_id','dow']).agg({'visitors':[np.min, np.mean, np.median, np.max, np.size]}).reset_index()

temp.columns = ['air_store_id', 'dow', 'min_visitors', 'mean_visitors', 'median_visitors','max_visitors','count_observations']

stores=stores.merge(temp, on=['air_store_id','dow'],how='left')
# let's add the store information to this dataframe, including genre, area name, latitude, and longitude.
stores = pd.merge(stores, data['as'], how='left', on=['air_store_id'])
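
As a side note, pandas (0.25 and later) also supports named aggregation, which sets the output column names in the same call instead of assigning `temp.columns` afterwards. The sketch below uses made-up data and is not how the original kernels do it.

```python
import pandas as pd

# Made-up visits for two hypothetical stores
toy = pd.DataFrame({
    'air_store_id': ['a', 'a', 'a', 'b'],
    'dow': [0, 0, 1, 0],
    'visitors': [10, 20, 7, 5],
})

# Named aggregation: output_column=(input_column, function)
summary = toy.groupby(['air_store_id', 'dow']).agg(
    min_visitors=('visitors', 'min'),
    mean_visitors=('visitors', 'mean'),
    median_visitors=('visitors', 'median'),
    max_visitors=('visitors', 'max'),
    count_observations=('visitors', 'size'),
).reset_index()
print(summary)
```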

Let's create new features based on the genre name and the area name. We use the [`LabelEncoder`](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) class from scikit-learn's `preprocessing` module.

In [66]:
stores['prefecture'] = stores['air_area_name'].map(lambda x: str(x).split(' ')[0]) 

stores['air_genre_name'] = stores['air_genre_name'].map(lambda x: str(str(x).replace('/',' ')))
stores['air_area_name'] = stores['air_area_name'].map(lambda x: str(str(x).replace('-',' ')))

lbl = preprocessing.LabelEncoder()
for i in range(10):
    stores['air_genre_name'+str(i)] = lbl.fit_transform(stores['air_genre_name'].map(lambda x: str(str(x).split(' ')[i]) if len(str(x).split(' '))>i else ''))
    stores['air_area_name'+str(i)] = lbl.fit_transform(stores['air_area_name'].map(lambda x: str(str(x).split(' ')[i]) if len(str(x).split(' '))>i else ''))
stores['air_genre_name'] = lbl.fit_transform(stores['air_genre_name'])
stores['air_area_name'] = lbl.fit_transform(stores['air_area_name'])
stores['prefecture']=lbl.fit_transform(stores['prefecture'])
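
For readers new to `LabelEncoder`, here is a tiny sketch of what it does to a column of strings. The prefecture names are made up for illustration.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical prefecture names for illustration
prefectures = ['Tokyo-to', 'Osaka-fu', 'Tokyo-to', 'Fukuoka-ken']

encoder = LabelEncoder()
codes = encoder.fit_transform(prefectures)

# classes_ holds the sorted unique labels; each code is an index into it
print(list(encoder.classes_))
print(codes.tolist())
```

Note that calling `fit_transform` again on another column refits the encoder, so a single shared encoder (like `lbl` above) only remembers the classes of the last column it was fitted on.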

In [67]:
stores.head()

Unnamed: 0,air_store_id,dow,min_visitors,mean_visitors,median_visitors,max_visitors,count_observations,air_genre_name,air_area_name,latitude,longitude,prefecture,air_genre_name0,air_area_name0,air_genre_name1,air_area_name1,air_genre_name2,air_area_name2,air_genre_name3,air_area_name3,air_genre_name4,air_area_name4,air_genre_name5,air_area_name5,air_genre_name6,air_area_name6,air_genre_name7,air_area_name7,air_genre_name8,air_area_name8,air_genre_name9,air_area_name9
0,air_00a91d42b08b08d9,0,1.0,22.457143,19.0,47.0,35.0,10,42,35.694003,139.753595,42,10,42,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,air_0164b9927d20bcc3,0,2.0,7.5,6.0,19.0,20.0,10,62,35.658068,139.751599,62,10,62,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,air_0241aa3964b7f861,0,2.0,8.920635,8.0,23.0,63.0,11,84,35.712607,139.779996,84,11,84,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,air_0328696196e46f18,0,2.0,6.416667,4.0,27.0,12.0,8,101,34.701279,135.52809,101,8,101,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,air_034a3d5b40d5b1b1,0,1.0,11.864865,10.0,66.0,37.0,6,5,34.692337,135.472229,5,6,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Successfully label encoded. Let's also label holidays. 

In [68]:
data['hol']['visit_date']=pd.to_datetime(data['hol']['visitor_date'])
data['hol']['day_of_week']=lbl.fit_transform(data['hol']['day_of_week'])
data['hol']['visit_date']=data['hol']['visit_date'].dt.date
data['hol']=data['hol'].drop('visitor_date',axis=1)

#merge the holiday flags to train and test sets.
train=data['tra'].merge(data['hol'],on=['visit_date'],how='left')
test=data['tes'].merge(data['hol'],on=['visit_date'],how='left')

In [69]:
train=train.merge(stores,how='left',on=['air_store_id','dow'])
test=test.merge(stores,how='left',on=['air_store_id','dow'])

In [70]:
for df in ['ar','hr']:
    train = pd.merge(train, data[df], how='left', on=['air_store_id','visit_date']) 
    test = pd.merge(test, data[df], how='left', on=['air_store_id','visit_date'])

In [71]:
train['id'] = train.apply(lambda r: '_'.join([str(r['air_store_id']), str(r['visit_date'])]), axis=1)

In [72]:
#engineering new features

train['total_reserv_sum'] = train['rv1_x'] + train['rv1_y']
train['total_reserv_mean'] = (train['rv2_x'] + train['rv2_y']) / 2
train['total_reserv_dt_diff_mean'] = (train['rs2_x'] + train['rs2_y']) / 2

test['total_reserv_sum'] = test['rv1_x'] + test['rv1_y']
test['total_reserv_mean'] = (test['rv2_x'] + test['rv2_y']) / 2
test['total_reserv_dt_diff_mean'] = (test['rs2_x'] + test['rs2_y']) / 2

In [73]:
# engineering new features. Please refer to the original kernels mentioned in the introduction for more info.

train['date_int'] = train['visit_date'].apply(lambda x: x.strftime('%Y%m%d')).astype(int)
test['date_int'] = test['visit_date'].apply(lambda x: x.strftime('%Y%m%d')).astype(int)
train['var_max_lat'] = train['latitude'].max() - train['latitude']
train['var_max_long'] = train['longitude'].max() - train['longitude']
test['var_max_lat'] = test['latitude'].max() - test['latitude']
test['var_max_long'] = test['longitude'].max() - test['longitude']

In [74]:
train['lon_plus_lat'] = train['longitude'] + train['latitude'] 
test['lon_plus_lat'] = test['longitude'] + test['latitude']

In [75]:
lbl = preprocessing.LabelEncoder()
train['air_store_id2'] = lbl.fit_transform(train['air_store_id'])
test['air_store_id2'] = lbl.transform(test['air_store_id'])

In [76]:
col = [c for c in train if c not in ['id', 'air_store_id', 'visit_date','visitors','air_genre_name','']]
train = train.fillna(-1)
test = test.fillna(-1)

#let's see how many features we are training on
print('number of features are: ', len(col))

number of features are:  51


In [77]:
# XGB starter template borrowed from @anokas: https://www.kaggle.com/anokas/simple-xgboost-starter-0-0655

for c, dtype in zip(train.columns, train.dtypes):
    if dtype == np.float64:
        train[c] = train[c].astype(np.float32)

for c, dtype in zip(test.columns, test.dtypes):
    if dtype == np.float64:
        test[c] = test[c].astype(np.float32)

In [78]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
train[col].head()

Unnamed: 0,dow,year,month,ewm,day_of_week,holiday_flg,min_visitors,mean_visitors,median_visitors,max_visitors,count_observations,air_area_name,latitude,longitude,prefecture,air_genre_name0,air_area_name0,air_genre_name1,air_area_name1,air_genre_name2,air_area_name2,air_genre_name3,air_area_name3,air_genre_name4,air_area_name4,air_genre_name5,air_area_name5,air_genre_name6,air_area_name6,air_genre_name7,air_area_name7,air_genre_name8,air_area_name8,air_genre_name9,air_area_name9,rs1_x,rv1_x,rs2_x,rv2_x,rs1_y,rv1_y,rs2_y,rv2_y,total_reserv_sum,total_reserv_mean,total_reserv_dt_diff_mean,date_int,var_max_lat,var_max_long,lon_plus_lat,air_store_id2
0,2,2016,1,19.026867,6,0,7.0,23.84375,25.0,57.0,64.0,62.0,35.65807,139.751602,62.0,8.0,62.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,20160113,8.362564,4.521799,175.409668,603
1,3,2016,1,20.0,4,0,2.0,20.292307,21.0,54.0,65.0,62.0,35.65807,139.751602,62.0,8.0,62.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,20160114,8.362564,4.521799,175.409668,603
2,4,2016,1,22.631578,0,0,4.0,34.738461,35.0,61.0,65.0,62.0,35.65807,139.751602,62.0,8.0,62.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,20160115,8.362564,4.521799,175.409668,603
3,5,2016,1,20.184502,2,0,6.0,27.651516,27.0,53.0,66.0,62.0,35.65807,139.751602,62.0,8.0,62.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,20160116,8.362564,4.521799,175.409668,603
4,0,2016,1,18.967724,1,0,2.0,13.754386,12.0,34.0,57.0,62.0,35.65807,139.751602,62.0,8.0,62.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,20160118,8.362564,4.521799,175.409668,603


Okay! Our data preparation is done. Let's move on to model fitting.

## Model fitting

In [79]:
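#error metric: the targets are log1p-transformed below, so the RMSE of the logs is the RMSLE of the visitor counts
def RMSLE(y, pred):
    return metrics.mean_squared_error(y, pred)**0.5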

#from sklearn.model_selection import cross_val_score
#from sklearn.model_selection import train_test_split

X_train=train[col]
y_train=np.log1p(train['visitors'])

#X_train_grid,X_valid,y_train_grid,y_valid=train_test_split(X_train,y_train,random_state=1,train_size=0.75,shuffle=True)
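
The metric function in this notebook computes a plain RMSE, and it only becomes an RMSLE because the target is passed through `np.log1p` first. The sketch below checks that equivalence on made-up numbers; the function names `rmsle` and `rmse` are mine, not from the original kernels.

```python
import numpy as np

def rmsle(actual, predicted):
    """RMSLE computed directly on raw visitor counts."""
    return np.sqrt(np.mean((np.log1p(predicted) - np.log1p(actual)) ** 2))

def rmse(y, pred):
    """Plain RMSE, the same formula as the notebook's metric function."""
    return np.sqrt(np.mean((pred - y) ** 2))

# Made-up visitor counts
actual = np.array([10.0, 25.0, 3.0])
predicted = np.array([12.0, 20.0, 4.0])

# RMSE on log1p-transformed values is the same number as RMSLE on raw counts
print(rmsle(actual, predicted), rmse(np.log1p(actual), np.log1p(predicted)))
```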

Per scikit-learn [documentation](http://scikit-learn.org/stable/auto_examples/model_selection/plot_randomized_search.html):
> The `randomized search` and the `grid search` explore exactly the same space of parameters. The result in parameter settings is quite similar, while the run time for randomized search is drastically lower.
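
To see where the run-time saving comes from, the sketch below counts the combinations in the XGBoost search space used later in this notebook and compares that with a fixed random budget. It uses only the standard library and mimics the idea of randomized search, not scikit-learn's implementation. The budget `n_iter = 10` is a hypothetical value.

```python
import itertools
import random

# Same candidate values as the XGBoost search space in this notebook
param_grid = {
    'learning_rate': [0.05, 0.1, 0.2],
    'n_estimators': [200, 400],
    'max_depth': [5, 10, 20],
    'subsample': [0.7, 0.8],
    'colsample_bytree': [0.7, 0.8],
}

# Grid search: every combination of values
names = list(param_grid)
all_combos = [dict(zip(names, values))
              for values in itertools.product(*param_grid.values())]

# Randomized search: a fixed budget of combinations drawn without replacement
n_iter = 10  # hypothetical budget, not the notebook's n_iter_search
rng = random.Random(0)
sampled = rng.sample(all_combos, n_iter)

print(len(all_combos), 'grid candidates vs', len(sampled), 'randomized candidates')
```

Each candidate is fitted once per cross-validation fold, so the cost scales with the number of candidates rather than with the size of the full grid.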

In [83]:
best_params={}
best_score={}
n_iter_search = 1

3.489665458467073e-05 seconds


In [None]:
import time

start_time = time.perf_counter()  # time.clock() was removed in Python 3.8

from xgboost import XGBRegressor
from sklearn.model_selection import RandomizedSearchCV

rand_grid_xgb={'learning_rate':[0.05,0.1,0.2],
               'n_estimators':[200,400],
               'max_depth':[5,10,20],
               'subsample':[0.7,0.8],
              'colsample_bytree':[0.7,0.8]}

model_xgb=XGBRegressor()
grid_xgb=RandomizedSearchCV(model_xgb,param_distributions=rand_grid_xgb,scoring='neg_mean_squared_error',n_iter=n_iter_search)
grid_xgb.fit(X_train, y_train)

best_params['xgb']=grid_xgb.best_params_
best_score['xgb']=grid_xgb.best_score_

#these are predictions on the training data, so this is a training error, not a cross-validated one
preds_xgb=grid_xgb.predict(X_train)

print('Training RMSLE XGBRegressor: ', RMSLE(y_train, preds_xgb))

print (time.perf_counter() - start_time, "seconds")

In [None]:
model_gbr=ensemble.GradientBoostingRegressor()
#note: in scikit-learn 1.0 and later, 'ls' and 'lad' are named 'squared_error' and 'absolute_error'
rand_grid_gbr={'learning_rate':[0.05,0.1,0.2],
               'loss':['ls','lad','huber'],
                'n_estimators':[200,400],
               'max_depth':[5,10,20],
               'subsample':[0.7,0.8],
              'alpha':[0.9,0.7]}
grid_gbr=RandomizedSearchCV(model_gbr,param_distributions=rand_grid_gbr,scoring='neg_mean_squared_error',n_iter=n_iter_search)
grid_gbr.fit(X_train, y_train)

best_params['gbr']=grid_gbr.best_params_
best_score['gbr']=grid_gbr.best_score_

#no hold-out set is defined above (the train_test_split call is commented out), so we report the training error
preds_gbr=grid_gbr.predict(X_train)

print('Training RMSLE GradientBoostingRegressor: ', RMSLE(y_train, preds_gbr))

In [None]:
#n_jobs is a fixed setting rather than a search dimension, so it goes on the estimator
model_kn=neighbors.KNeighborsRegressor(n_jobs=-1)
rand_grid_kn={'n_neighbors':[3,4,5,8],
               'weights':['uniform','distance']}

grid_kn=RandomizedSearchCV(model_kn,param_distributions=rand_grid_kn,scoring='neg_mean_squared_error',n_iter=n_iter_search)
grid_kn.fit(X_train, y_train)

best_params['kn']=grid_kn.best_params_
best_score['kn']=grid_kn.best_score_

#no hold-out set is defined above (the train_test_split call is commented out), so we report the training error
preds_kn=grid_kn.predict(X_train)

print('Training RMSLE KNeighborsRegressor: ', RMSLE(y_train, preds_kn))

In [None]:
from sklearn.ensemble import ExtraTreesRegressor

#n_jobs is a fixed setting rather than a search dimension, so it goes on the estimator
model_et=ExtraTreesRegressor(n_jobs=-1)
rand_grid_et={
    'n_estimators':[200,1000,10000],
    'max_depth':[5,10,20],
    'min_samples_leaf':[100,150,300],
    'max_features':[0.7,0.8]}

grid_et=RandomizedSearchCV(model_et,param_distributions=rand_grid_et,scoring='neg_mean_squared_error',n_iter=n_iter_search)
grid_et.fit(X_train, y_train)

best_params['et']=grid_et.best_params_
best_score['et']=grid_et.best_score_

#no hold-out set is defined above (the train_test_split call is commented out), so we report the training error
preds_et=grid_et.predict(X_train)

print('Training RMSLE ExtraTreesRegressor: ', RMSLE(y_train, preds_et))

#### Predicting on all rows of the test set
Now, we use each best model to predict on all rows of the test set.

In [None]:
print(best_params)
print(best_score)

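#best_params and best_score are plain dicts, so wrap them in pandas objects before saving
pd.DataFrame.from_dict(best_params, orient='index').to_csv('best_params.csv')
pd.Series(best_score).to_csv('best_score.csv')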

In [None]:
preds_xgb_test=grid_xgb.predict(test[col])
preds_gbr_test=grid_gbr.predict(test[col])
preds_kn_test=grid_kn.predict(test[col])

In [None]:
test['visitors'] = 0.3*preds_kn_test+0.3*preds_gbr_test+0.4*preds_xgb_test
test['visitors'] = np.expm1(test['visitors']).clip(lower=0.)
sub1 = test[['id','visitors']].copy()
#del train; del data;
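
The blend is a weighted average taken on the log1p scale and only then converted back to visitor counts. The sketch below repeats those two steps on made-up predictions from three hypothetical models, using the same 0.3/0.3/0.4 weights.

```python
import numpy as np

# Made-up log1p-scale predictions from three models for two test rows
preds_a = np.log1p(np.array([9.0, 20.0]))
preds_b = np.log1p(np.array([9.0, 30.0]))
preds_c = np.log1p(np.array([9.0, 25.0]))

# Weighted average on the log scale (the weights sum to 1) ...
blend_log = 0.3 * preds_a + 0.3 * preds_b + 0.4 * preds_c
# ... then back to visitor counts, clipped at zero
blend = np.clip(np.expm1(blend_log), 0.0, None)
print(blend)
```

Averaging on the log scale acts like a weighted geometric mean of the counts, so when the models disagree the blend sits slightly below the plain weighted average of the counts.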

In [None]:
# from hklee
# https://www.kaggle.com/zeemeen/weighted-mean-comparisons-lb-0-497-1st/code
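# key each dataframe by its file name without the extension (works with both / and \ path separators)
dfs = { os.path.splitext(os.path.basename(fn))[0]:
    pd.read_csv(fn) for fn in glob.glob('../input/*.csv')}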

for k, v in dfs.items(): locals()[k] = v

wkend_holidays = date_info.apply(
    (lambda x:(x.day_of_week=='Sunday' or x.day_of_week=='Saturday') and x.holiday_flg==1), axis=1)
date_info.loc[wkend_holidays, 'holiday_flg'] = 0
date_info['weight'] = ((date_info.index + 1) / len(date_info)) ** 5  

visit_data = air_visit_data.merge(date_info, left_on='visit_date', right_on='calendar_date', how='left')
visit_data.drop('calendar_date', axis=1, inplace=True)
visit_data['visitors'] = visit_data.visitors.map(np.log1p)  # pd.np was removed in pandas 2.0

wmean = lambda x:( (x.weight * x.visitors).sum() / x.weight.sum() )
visitors = visit_data.groupby(['air_store_id', 'day_of_week', 'holiday_flg']).apply(wmean).reset_index()
visitors.rename(columns={0:'visitors'}, inplace=True) # cumbersome, should be better ways.

sample_submission['air_store_id'] = sample_submission.id.map(lambda x: '_'.join(x.split('_')[:-1]))
sample_submission['calendar_date'] = sample_submission.id.map(lambda x: x.split('_')[2])
sample_submission.drop('visitors', axis=1, inplace=True)
sample_submission = sample_submission.merge(date_info, on='calendar_date', how='left')
sample_submission = sample_submission.merge(visitors, on=[
    'air_store_id', 'day_of_week', 'holiday_flg'], how='left')

missings = sample_submission.visitors.isnull()
sample_submission.loc[missings, 'visitors'] = sample_submission[missings].merge(
    visitors[visitors.holiday_flg==0], on=('air_store_id', 'day_of_week'), 
    how='left')['visitors_y'].values

missings = sample_submission.visitors.isnull()
sample_submission.loc[missings, 'visitors'] = sample_submission[missings].merge(
    visitors[['air_store_id', 'visitors']].groupby('air_store_id').mean().reset_index(), 
    on='air_store_id', how='left')['visitors_y'].values

sample_submission['visitors'] = sample_submission.visitors.map(np.expm1)  # pd.np was removed in pandas 2.0
sub2 = sample_submission[['id', 'visitors']].copy()
sub_merge = pd.merge(sub1, sub2, on='id', how='inner')

sub_merge['visitors'] = 0.7*sub_merge['visitors_x'] + 0.3*sub_merge['visitors_y']* 1.2
sub_merge[['id', 'visitors']].to_csv('submission.csv', index=False)
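
The `weight` column above grows with the fifth power of a date's position in `date_info`, so recent dates dominate the weighted mean. The sketch below shows the effect on two made-up values chosen to keep the arithmetic easy. For simplicity it uses the position within the array, whereas the cell above uses the position within the full calendar.

```python
import numpy as np

# Two made-up log1p visitor values, oldest first
log_visitors = np.array([0.0, 33.0])
n = len(log_visitors)

# Same form as date_info['weight']: ((position + 1) / n) ** 5
weights = ((np.arange(n) + 1) / n) ** 5   # [0.03125, 1.0]

# Weighted mean, as in the wmean lambda above
wmean = (weights * log_visitors).sum() / weights.sum()
print(wmean)
```

The older value carries a weight of only 0.03125 here, so the weighted mean lands much closer to the most recent value than the plain mean of 16.5 would.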