# Should we continue with our decision to invest in McDonald's? And if so, should that decision depend on location?
In a previous analysis we determined that it would be better to invest in **McDonald's** than in **Subway**. With that in mind, what if we add an additional threshold for deciding whether the investment is a good idea? For example, a rate of review count change (tbd) greater than 5% over a period of 30 days for each business location.

In [2]:
from typing import Union
from datetime import date, datetime
import pandas as pd
import numpy as np
from scipy import stats
import joblib
pd.options.display.float_format = '{:,.4f}'.format

To begin, we'll first separate out all the McDonald's locations, making sure to account for variations in spelling.


In [3]:
bh_df: pd.DataFrame = pd.read_parquet('bus_holdings', engine='pyarrow')
mcdonalds = bh_df.loc[bh_df.ChainName.str.lower().str.contains(r'mcdonald[\']?s'), :].reset_index(drop=True)
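
As a quick check on the pattern above, the sketch below runs the same regular expression over a few lowercased chain names with the standard `re` module (`str.contains` treats its pattern as a regular expression by default). The sample names are made up for illustration and are not taken from `bus_holdings`.

```python
import re

# Same pattern as in the filter above: "mcdonald", an optional straight apostrophe, then "s".
pattern = re.compile(r"mcdonald[\']?s")

# Hypothetical chain names, lowercased the way `.str.lower()` would leave them.
samples = ["mcdonald's", "mcdonalds", "mcdonald\u2019s", "mcdonald", "subway"]

matches = {name: bool(pattern.search(name)) for name in samples}
for name, matched in matches.items():
    print(f"{name!r}: {matched}")
```

Note that a typographic (curly) apostrophe is not covered by the pattern, so such a spelling would be left out unless the character class is widened.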

To find businesses with 30 days of data, we first convert the `CloseDate` column to plain dates.

In [4]:

mcdonalds.info()
mcdonalds['CloseDate'] = pd.to_datetime(mcdonalds['CloseDate']).dt.date


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21494 entries, 0 to 21493
Data columns (total 11 columns):
 #   Column                  Non-Null Count  Dtype   
---  ------                  --------------  -----   
 0   BusinessName            21494 non-null  object  
 1   ChainName               21494 non-null  object  
 2   BusinessRating          21494 non-null  object  
 3   ReviewCount             21494 non-null  int64   
 4   previous_review_cnt     20807 non-null  float64 
 5   previous_rating         20807 non-null  object  
 6   abs_review_diff         20807 non-null  float64 
 7   abs_rating_diff         20807 non-null  object  
 8   total_review_cnt_delta  21489 non-null  float64 
 9   total_bus_rating_delta  21489 non-null  object  
 10  CloseDate               21494 non-null  category
dtypes: category(1), float64(3), int64(1), object(6)
memory usage: 1.7+ MB


To decide the best interval for our data we can take a look at the range of days covered in the data.

In [6]:
print(mcdonalds['CloseDate'].min())
print('\n')
print(mcdonalds['CloseDate'].max())

2021-12-29


2022-02-09
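
To make the choice of interval concrete, here is a minimal sketch using only the two dates printed above. It counts the days the data spans and the number of distinct dates in a half-open one-month window ending at the last date. The window boundaries match the filter used later in this notebook; treating the window as half-open mirrors `inclusive='left'`.

```python
from datetime import date, timedelta

first_day = date(2021, 12, 29)   # earliest CloseDate printed above
last_day = date(2022, 2, 9)      # latest CloseDate printed above
span_days = (last_day - first_day).days

# Half-open window [2022-01-09, 2022-02-09): the left edge is included, the right edge is not.
window_start = date(2022, 1, 9)
window_end = date(2022, 2, 9)
window_dates = [
    window_start + timedelta(days=i)
    for i in range((window_end - window_start).days)
]

print(span_days, len(window_dates), window_dates[0], window_dates[-1])
```

Under these assumptions the window holds 31 distinct dates, so requiring at least 30 observations per location tolerates at most one missing day.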


Having decided on the interval, we filter the observations to it and keep the locations with at least 30 observations.

In [23]:
mcdonalds_one_month = mcdonalds.loc[mcdonalds['CloseDate'].between(left=date(2022, 1,9), right=date(2022, 2,9), inclusive='left')]
bus_instance_counts = mcdonalds_one_month.groupby(by=['BusinessName'], as_index=False)['CloseDate'].count() 
bus_instance_counts_gte_30 = bus_instance_counts.loc[bus_instance_counts.CloseDate >= 30]
bus_instance_counts_gte_30.head()

Unnamed: 0,BusinessName,CloseDate
15,mcdonalds-andalusia,30
22,mcdonalds-ashburn,30
28,mcdonalds-attalla,30
41,mcdonalds-beaver-dam-3,30
58,mcdonalds-blakely,30


In [24]:
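# note: this selects from the full `mcdonalds` frame, so rows dated before the one-month window (e.g. 2021-12-29) are kept too
mcd_bus_instances_gte_30 = mcdonalds[mcdonalds['BusinessName'].isin(bus_instance_counts_gte_30['BusinessName'])]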
mcd_bus_instances_gte_30.head()

Unnamed: 0,BusinessName,ChainName,BusinessRating,ReviewCount,previous_review_cnt,previous_rating,abs_review_diff,abs_rating_diff,total_review_cnt_delta,total_bus_rating_delta,CloseDate
15,mcdonalds-andalusia,McDonald's,2.5,18,,,,,1.0,0.0,2021-12-29
22,mcdonalds-ashburn,McDonald's,1.5,22,,,,,1.0,0.0,2021-12-29
28,mcdonalds-attalla,McDonald's,3.5,7,,,,,0.0,0.0,2021-12-29
41,mcdonalds-beaver-dam-3,McDonald's,2.5,10,,,,,2.0,0.0,2021-12-29
59,mcdonalds-blakely,McDonald's,3.0,4,,,,,0.0,0.0,2021-12-29


In [25]:
mcd_bus_instances_gte_30.shape

(1426, 11)

In [26]:
mcd_bus_instances_gte_30_sorted = mcd_bus_instances_gte_30.sort_values(by=['BusinessName', 'CloseDate'], ascending=True)
mcd_bus_instances_gte_30_sorted

Unnamed: 0,BusinessName,ChainName,BusinessRating,ReviewCount,previous_review_cnt,previous_rating,abs_review_diff,abs_rating_diff,total_review_cnt_delta,total_bus_rating_delta,CloseDate
15,mcdonalds-andalusia,McDonald's,2.500000000,18,,,,,1.0000,0E-9,2021-12-29
687,mcdonalds-andalusia,McDonald's,2.500000000,18,18.0000,2.500000000,0.0000,0E-9,1.0000,0E-9,2022-01-04
706,mcdonalds-andalusia,McDonald's,2.500000000,18,18.0000,2.500000000,0.0000,0E-9,1.0000,0E-9,2022-01-06
780,mcdonalds-andalusia,McDonald's,2.500000000,18,18.0000,2.500000000,0.0000,0E-9,1.0000,0E-9,2022-01-07
1459,mcdonalds-andalusia,McDonald's,2.500000000,18,18.0000,2.500000000,0.0000,0E-9,1.0000,0E-9,2022-01-08
...,...,...,...,...,...,...,...,...,...,...,...
19346,mcdonalds-washington-10,McDonald's,2.000000000,4,4.0000,2.000000000,0.0000,0E-9,0.0000,0E-9,2022-02-05
20026,mcdonalds-washington-10,McDonald's,2.000000000,4,4.0000,2.000000000,0.0000,0E-9,0.0000,0E-9,2022-02-06
20706,mcdonalds-washington-10,McDonald's,2.000000000,4,4.0000,2.000000000,0.0000,0E-9,0.0000,0E-9,2022-02-07
21010,mcdonalds-washington-10,McDonald's,2.000000000,4,4.0000,2.000000000,0.0000,0E-9,0.0000,0E-9,2022-02-08


In [27]:
# take the first and last review counts for each location and compute the relative change between them
# let's say we want at least 5% of businesses to have a review count change of 15 percent or higher

first_last_review_cnts = mcd_bus_instances_gte_30_sorted.groupby(['BusinessName'], as_index=False).agg({'CloseDate': ['first','last'], 'ReviewCount': ['first','last']})
first_last_review_cnts['relative_change'] = ((first_last_review_cnts['ReviewCount']['last'] - first_last_review_cnts['ReviewCount']['first']) / first_last_review_cnts['ReviewCount']['first']) * 100
first_last_review_cnts['relative_change'][:5]

0    5.5556
1    4.5455
2    0.0000
3   20.0000
4    0.0000
Name: relative_change, dtype: float64
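
The relative change computed above is just `(last - first) / first * 100`. The helper below is a small illustrative sketch, not part of the notebook; the name `relative_change_pct` and the input pairs are assumptions, chosen so that the results line up with the first and fourth values printed above.

```python
def relative_change_pct(first: float, last: float) -> float:
    """Percentage change from `first` to `last`; NaN when `first` is 0 (undefined)."""
    if first == 0:
        return float("nan")
    return (last - first) / first * 100


# Hypothetical first/last review counts for two locations.
change_a = round(relative_change_pct(18, 19), 4)
change_b = round(relative_change_pct(10, 12), 4)
print(change_a, change_b)
```

A location that starts with zero reviews has no defined relative change, which is worth keeping in mind before thresholding on this column.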

In [28]:
first_last_review_cnts['relative_change_gte_15'] = np.where(first_last_review_cnts['relative_change'] >= 15, 1, 0)
relative_change_gte_15_stat = first_last_review_cnts['relative_change_gte_15'].mean() * 100
relative_change_gte_15_stat

14.634146341463413

In [29]:
# our goal is to have relative_change_gte_15_stat - 5% > 0, so we can set up the following hypothesis test
# H0: relative_change_gte_15_stat - 5% <= 0
# Ha: relative_change_gte_15_stat - 5% > 0

# we can run a two-sided test and then halve the p-value, since we are only concerned with whether the population mean
# would be positive after subtracting our threshold, and the observed statistic is already above that threshold
# had the observed value been below the threshold from the beginning, there would have been no need to continue
test_result = stats.ttest_1samp(a=first_last_review_cnts['relative_change_gte_15'], popmean=.05, nan_policy="omit", alternative='two-sided')
test_result

Ttest_1sampResult(statistic=1.7239213328643397, pvalue=0.0924440815593107)

In [30]:
test_result.pvalue / 2

0.04622204077965535

In [31]:
# Here is the equivalent in scipy, using a one-sided test directly
test_result2 = stats.ttest_1samp(a=first_last_review_cnts['relative_change_gte_15'], popmean=.05, nan_policy="omit", alternative='greater')
test_result2

Ttest_1sampResult(statistic=1.7239213328643397, pvalue=0.04622204077965535)
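
The t-statistic reported above can be recomputed by hand from a 0/1 indicator. The sketch below assumes 41 locations of which 6 cross the 15% threshold; these counts are an inference from the 14.634% figure and the reported statistic, not values read from the data.

```python
import math

n_locations = 41   # assumption: inferred from 6 / 41 = 14.634...%
n_hits = 6         # assumption: locations with a relative change of at least 15%
threshold = 0.05   # the 5% popmean used in the test

indicators = [1] * n_hits + [0] * (n_locations - n_hits)
mean = sum(indicators) / n_locations

# Sample variance (divide by n - 1), which is what a one-sample t-test uses.
sample_var = sum((v - mean) ** 2 for v in indicators) / (n_locations - 1)
std_err = math.sqrt(sample_var / n_locations)
t_stat = (mean - threshold) / std_err

print(round(mean * 100, 4), round(t_stat, 4))
```

With so few locations, a single extra location crossing the threshold moves the proportion by about 2.4 percentage points, which is part of why the result is fragile.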

In [32]:
# so in this case one could say that we could invest in McDonald's, since we can expect that more than 5% of its locations
# see their review counts increase by 15% or more
# however, our sample size is pretty small, which argues for a stricter level of significance;
# at 1% we would end up not rejecting the null hypothesis
# alternatively, we could conduct further analysis

# Another question we could ask is whether this is something we would observe across businesses in general: is McDonald's even
# on a higher playing field? Perhaps the standard we gave it is too low?
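
The dependence of the conclusion on the significance level can be spelled out in a few lines. This is a minimal sketch that reuses the two-sided p-value printed earlier; the two alpha values are the ones discussed in the comments above.

```python
p_two_sided = 0.0924440815593107   # two-sided p-value printed earlier
p_one_sided = p_two_sided / 2      # halving is valid here because the t-statistic is positive

decisions = {}
for alpha in (0.05, 0.01):
    decisions[alpha] = "reject H0" if p_one_sided < alpha else "fail to reject H0"
    print(f"alpha={alpha}: {decisions[alpha]}")
```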

In [34]:
bh_df['CloseDate'] = pd.to_datetime(bh_df['CloseDate']).dt.date
bh_df_one_month = bh_df.loc[bh_df['CloseDate'].between(left=date(2022, 1,9), right=date(2022, 2,9), inclusive='left')]

bh_df_instance_counts = bh_df_one_month.groupby(by=['BusinessName'], as_index=False)['CloseDate'].count() 
bh_df_instance_counts_gte_30 = bh_df_instance_counts.loc[bh_df_instance_counts.CloseDate >= 30]
bh_df_instance_counts_gte_30

Unnamed: 0,BusinessName,CloseDate
26,108-ale-house-rincon,30
55,13-gypsies-jacksonville-2,30
65,15th-street-pizza-and-pub-mcdonough,30
67,16-bit-bar-arcade-columbus-5,30
68,16-east-cordele,30
...,...,...
62686,zoners-pizza-wings-and-waffles-waycross,30
62688,zoo-miami-miami,30
62690,zoological-wildlife-foundation-miami,30
62721,zyka-the-taste-indian-restaurant-decatur-decat...,30


In [35]:
bh_df_bus_instances_gte_30 = bh_df[bh_df['BusinessName'].isin(bh_df_instance_counts_gte_30['BusinessName'])]
bh_df_bus_instances_gte_30.head()

Unnamed: 0,BusinessName,ChainName,BusinessRating,ReviewCount,previous_review_cnt,previous_rating,abs_review_diff,abs_rating_diff,total_review_cnt_delta,total_bus_rating_delta,CloseDate
27,108-ale-house-rincon,108 Ale House,4.5,77,,,,,2.0,0.0,2021-12-29
56,13-gypsies-jacksonville-2,13 Gypsies,4.0,290,,,,,3.0,0.0,2021-12-29
66,15th-street-pizza-and-pub-mcdonough,15th Street Pizza & Pub,4.0,143,,,,,0.0,0.0,2021-12-29
68,16-bit-bar-arcade-columbus-5,16-Bit Bar+Arcade,4.5,319,,,,,4.0,0.0,2021-12-29
69,16-east-cordele,16 East,3.5,132,,,,,4.0,0.0,2021-12-29


In [36]:
bh_df_bus_instances_gte_30.shape

(132706, 11)

In [37]:
bh_df_bus_instances_gte_30_sorted = bh_df_bus_instances_gte_30.sort_values(by=['BusinessName', 'CloseDate'], ascending=True)
bh_df_bus_instances_gte_30_sorted

Unnamed: 0,BusinessName,ChainName,BusinessRating,ReviewCount,previous_review_cnt,previous_rating,abs_review_diff,abs_rating_diff,total_review_cnt_delta,total_bus_rating_delta,CloseDate
27,108-ale-house-rincon,108 Ale House,4.500000000,77,,,,,2.0000,0E-9,2021-12-29
73403,108-ale-house-rincon,108 Ale House,4.500000000,78,77.0000,4.500000000,1.0000,0E-9,2.0000,0E-9,2022-01-07
135954,108-ale-house-rincon,108 Ale House,4.500000000,79,78.0000,4.500000000,1.0000,0E-9,2.0000,0E-9,2022-01-08
198499,108-ale-house-rincon,108 Ale House,4.500000000,79,79.0000,4.500000000,0.0000,0E-9,2.0000,0E-9,2022-01-09
261037,108-ale-house-rincon,108 Ale House,4.500000000,79,79.0000,4.500000000,0.0000,0E-9,2.0000,0E-9,2022-01-10
...,...,...,...,...,...,...,...,...,...,...,...
1787236,àlavita-boise-2,ÀLAVITA,4.000000000,307,307.0000,4.000000000,0.0000,0E-9,3.0000,0E-9,2022-02-05
1849303,àlavita-boise-2,ÀLAVITA,4.000000000,307,307.0000,4.000000000,0.0000,0E-9,3.0000,0E-9,2022-02-06
1911354,àlavita-boise-2,ÀLAVITA,4.000000000,307,307.0000,4.000000000,0.0000,0E-9,3.0000,0E-9,2022-02-07
1939704,àlavita-boise-2,ÀLAVITA,4.000000000,307,307.0000,4.000000000,0.0000,0E-9,3.0000,0E-9,2022-02-08


In [38]:
first_last_review_cnts_bh_df = bh_df_bus_instances_gte_30_sorted.groupby(['BusinessName'], as_index=False).agg({'CloseDate': ['first','last'], 'ReviewCount': ['first','last']})
first_last_review_cnts_bh_df['relative_change'] = ((first_last_review_cnts_bh_df['ReviewCount']['last'] - first_last_review_cnts_bh_df['ReviewCount']['first']) / first_last_review_cnts_bh_df['ReviewCount']['first']) * 100
first_last_review_cnts_bh_df['relative_change']

0      2.5974
1      1.0345
2      0.0000
3      1.2539
4      3.0303
        ...  
3815   7.1429
3816   1.2594
3817   0.4310
3818   0.2717
3819   0.9868
Name: relative_change, Length: 3820, dtype: float64

In [39]:
first_last_review_cnts_bh_df['relative_change_gte_15'] = np.where(first_last_review_cnts_bh_df['relative_change'] >= 15, 1, 0)
relative_change_gte_15_stat_bh_df = first_last_review_cnts_bh_df['relative_change_gte_15'].mean() * 100
relative_change_gte_15_stat_bh_df

4.319371727748691

In [40]:
# it's a bit below 5%, so the sample already falls on the null-hypothesis side; the t-statistic shows this, and there's not much reason to conduct the hypothesis test
# nonetheless we can carry it out and see what happens
test_result3_t_stat = (first_last_review_cnts_bh_df['relative_change_gte_15'].mean() - 0.05) / (first_last_review_cnts_bh_df['relative_change_gte_15'].std() / np.sqrt(first_last_review_cnts_bh_df['relative_change_gte_15'].shape[0]))
test_result3_t_stat

-2.069009641207847

In [41]:
test_result3 = stats.ttest_1samp(a=first_last_review_cnts_bh_df['relative_change_gte_15'], popmean=.05, nan_policy="omit", alternative='greater')
test_result3

Ttest_1sampResult(statistic=-2.069009641207847, pvalue=0.9806938450843626)
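
A negative statistic under `alternative='greater'` necessarily gives a p-value above 0.5. The sketch below illustrates this with a normal approximation to the t distribution, which is an assumption that should be reasonable here given the thousands of businesses in the sample; it is not how scipy computes the value.

```python
import math

t_stat = -2.069009641207847   # statistic printed above

# Normal approximation (assumption): P(Z > t) via the complementary error function.
p_greater = 0.5 * math.erfc(t_stat / math.sqrt(2))
# The opposite one-sided alternative gets the remaining probability mass.
p_less = 1 - p_greater

print(round(p_greater, 3), round(p_less, 3))
```

Read this way, the data would support the opposite claim, that fewer than 5% of businesses see a 15% increase, far more readily than the one being tested.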

In [44]:
# having tested one hypothesis at a time, we can also test several at once while keeping the caveats in mind
# here we are going to see how this applies to the 10 most populous states and the 10 least populous states
cg_df = pd.read_parquet('cg_est', engine='pyarrow')

state_counts = cg_df.groupby(['StateName'], as_index=False)['EstimatedPopulation'].sum()
state_counts_sorted = state_counts.sort_values(by='EstimatedPopulation', ascending=True)
state_counts_sorted.head()

Unnamed: 0,StateName,EstimatedPopulation
51,Wyoming,578759
46,Vermont,623989
8,District of Columbia,705749
1,Alaska,731545
34,North Dakota,762062


In [45]:
state_counts_sorted_wo_dc = state_counts_sorted.loc[state_counts_sorted.StateName != 'District of Columbia'].reset_index(drop=True)
state_counts_sorted_wo_dc.head()

Unnamed: 0,StateName,EstimatedPopulation
0,Wyoming,578759
1,Vermont,623989
2,Alaska,731545
3,North Dakota,762062
4,South Dakota,884659


In [46]:
top10_low10 = state_counts_sorted_wo_dc[(state_counts_sorted_wo_dc.index < 10)  | (state_counts_sorted_wo_dc.index > 40)]
top10_low10.shape

(20, 2)
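
The selection above relies on hard-coded index bounds, which only pick exactly ten rows from each end when the table has the expected number of rows. A sketch of a length-independent alternative is shown below in plain Python with made-up populations; in pandas the same idea could be expressed by combining `head(10)` and `tail(10)` of the sorted frame.

```python
# Hypothetical (state, population) pairs; not taken from `cg_est`.
populations = {f"State {i:02d}": 500_000 + 100_000 * i for i in range(50)}

# Sort ascending by population, then slice from both ends.
ranked = sorted(populations.items(), key=lambda item: item[1])
k = 10
lowest = ranked[:k]     # the k least populous
highest = ranked[-k:]   # the k most populous
selected = lowest + highest

print(len(selected), lowest[0][0], highest[-1][0])
```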

In [47]:
bus_cats_df: pd.DataFrame = pd.read_parquet('bus_cats', engine='pyarrow')

bus_cats_df_uq_bus = bus_cats_df.groupby(['BusinessName']).first()

bus_cats_df_uq_bus_states = bus_cats_df_uq_bus.loc[bus_cats_df_uq_bus['StateName'].isin(values=top10_low10.StateName.unique())] 
bus_cats_df_uq_bus_states.shape

(29552, 10)

In [48]:
bh_df_loc_states = pd.merge(left=bh_df, right=bus_cats_df_uq_bus_states, on='BusinessName', how='inner', suffixes=(None, '_right'))
bh_df_loc_states.shape

(929550, 21)

In [50]:
bh_df_loc_states.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 929550 entries, 0 to 929549
Data columns (total 21 columns):
 #   Column                  Non-Null Count   Dtype   
---  ------                  --------------   -----   
 0   BusinessName            929550 non-null  object  
 1   ChainName               929550 non-null  object  
 2   BusinessRating          929550 non-null  object  
 3   ReviewCount             929550 non-null  int64   
 4   previous_review_cnt     899998 non-null  float64 
 5   previous_rating         899998 non-null  object  
 6   abs_review_diff         899998 non-null  float64 
 7   abs_rating_diff         899998 non-null  object  
 8   total_review_cnt_delta  929394 non-null  float64 
 9   total_bus_rating_delta  929394 non-null  object  
 10  CloseDate               929550 non-null  object  
 11  BusinessKey             929550 non-null  int64   
 12  ChainName_right         929550 non-null  object  
 13  PaymentLevelName        929550 non-null  object  
 14  Long

In [51]:
bh_df_loc_states_cut =  bh_df_loc_states.loc[bh_df_loc_states['CloseDate'].between(left=date(2022, 1,9), right=date(2022, 2,9), inclusive='left')]


bh_df_loc_states_cut_sorted = bh_df_loc_states_cut.sort_values(by=['BusinessName', 'CloseDate'], ascending=True)
bh_df_loc_states_cut_sorted

Unnamed: 0,BusinessName,ChainName,BusinessRating,ReviewCount,previous_review_cnt,previous_rating,abs_review_diff,abs_rating_diff,total_review_cnt_delta,total_bus_rating_delta,...,BusinessKey,ChainName_right,PaymentLevelName,Longitude,Latitude,BusinessCategoryName,CityName,CountyName,CountryName,StateName
3,1-chinese-restaurant-coinjock,1 Chinese Restaurant,3.000000000,2,2.0000,3.000000000,0.0000,0E-9,0.0000,0E-9,...,36777,1 Chinese Restaurant,Unknown,-75.946890000,36.333440000,Chinese,Coinjock,Currituck County,US,North Carolina
4,1-chinese-restaurant-coinjock,1 Chinese Restaurant,3.000000000,2,2.0000,3.000000000,0.0000,0E-9,0.0000,0E-9,...,36777,1 Chinese Restaurant,Unknown,-75.946890000,36.333440000,Chinese,Coinjock,Currituck County,US,North Carolina
5,1-chinese-restaurant-coinjock,1 Chinese Restaurant,3.000000000,2,2.0000,3.000000000,0.0000,0E-9,0.0000,0E-9,...,36777,1 Chinese Restaurant,Unknown,-75.946890000,36.333440000,Chinese,Coinjock,Currituck County,US,North Carolina
6,1-chinese-restaurant-coinjock,1 Chinese Restaurant,3.000000000,2,2.0000,3.000000000,0.0000,0E-9,0.0000,0E-9,...,36777,1 Chinese Restaurant,Unknown,-75.946890000,36.333440000,Chinese,Coinjock,Currituck County,US,North Carolina
7,1-chinese-restaurant-coinjock,1 Chinese Restaurant,3.000000000,2,2.0000,3.000000000,0.0000,0E-9,0.0000,0E-9,...,36777,1 Chinese Restaurant,Unknown,-75.946890000,36.333440000,Chinese,Coinjock,Currituck County,US,North Carolina
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
929544,él-torito-restaurant-georgetown,Él Torito Restaurant,4.000000000,2,2.0000,4.000000000,0.0000,0E-9,0.0000,0E-9,...,6882,Él Torito Restaurant,Unknown,-75.388630000,38.693780000,Spanish,Georgetown,Sussex County,US,Delaware
929545,él-torito-restaurant-georgetown,Él Torito Restaurant,4.000000000,2,2.0000,4.000000000,0.0000,0E-9,0.0000,0E-9,...,6882,Él Torito Restaurant,Unknown,-75.388630000,38.693780000,Spanish,Georgetown,Sussex County,US,Delaware
929546,él-torito-restaurant-georgetown,Él Torito Restaurant,4.000000000,2,2.0000,4.000000000,0.0000,0E-9,0.0000,0E-9,...,6882,Él Torito Restaurant,Unknown,-75.388630000,38.693780000,Spanish,Georgetown,Sussex County,US,Delaware
929547,él-torito-restaurant-georgetown,Él Torito Restaurant,4.000000000,2,2.0000,4.000000000,0.0000,0E-9,0.0000,0E-9,...,6882,Él Torito Restaurant,Unknown,-75.388630000,38.693780000,Spanish,Georgetown,Sussex County,US,Delaware


In [52]:
first_last_review_cnts_loc_states = bh_df_loc_states_cut_sorted.groupby(['BusinessName'], as_index=False).agg({'CloseDate': ['first','last'], 'ReviewCount': ['first','last'], 'StateName': ['first']})

first_last_review_cnts_loc_states['relative_change'] = ((first_last_review_cnts_loc_states['ReviewCount']['last'] - first_last_review_cnts_loc_states['ReviewCount']['first']) / first_last_review_cnts_loc_states['ReviewCount']['first']) * 100
first_last_review_cnts_loc_states['relative_change']

0        0.0000
1        0.0000
2        0.0000
3       -1.0309
4        4.6296
          ...  
29379    0.0000
29380    0.1357
29381    4.5455
29382    0.0000
29383    0.0000
Name: relative_change, Length: 29384, dtype: float64

In [53]:
first_last_review_cnts_loc_states['relative_change_gte_15'] = np.where(first_last_review_cnts_loc_states['relative_change'] >= 15, 1, 0)


In [54]:
first_last_review_cnts_loc_states['StateName1'] = first_last_review_cnts_loc_states['StateName']['first']
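def group_t_stat_and_mean(x: pd.Series, a_number: Union[float, int], the_alternative: str = 'two-sided') -> pd.Series:
    """Summarise one group: a hand-computed one-sample t-statistic against `a_number`, plus scipy's test result."""
    stats_dict = {}
    # np.std defaults to ddof=0 (population standard deviation), whereas stats.ttest_1samp uses the sample
    # standard deviation (ddof=1), so array_tstat differs slightly from the statistic stored under 'ttest'
    stats_dict['stn_dev'] = np.std(x)
    stats_dict['array_mean'] = x.mean()
    stats_dict['array_count'] = x.shape[0]
    stats_dict['stn_err'] = stats_dict['stn_dev'] / np.sqrt(stats_dict['array_count'])
    stats_dict['array_tstat'] = (stats_dict['array_mean'] - a_number) / stats_dict['stn_err']
    # scipy's result (statistic, p-value) for the same group and the requested alternative
    stats_dict['ttest'] = stats.ttest_1samp(a=x, popmean=a_number, nan_policy="omit", alternative=the_alternative)
    return pd.Series(stats_dict)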
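
The function above divides by `n` when it computes the standard deviation, while a one-sample t-test divides by `n - 1`. The sketch below, using the standard `statistics` module and a made-up 0/1 indicator, shows that the two standard deviations differ by a factor of `sqrt(n / (n - 1))`, so the hand-computed statistic is larger in magnitude by the same factor.

```python
import math
import statistics

# Hypothetical 0/1 indicator for one group: 1 hit among 268 businesses.
indicators = [1] + [0] * 267
n = len(indicators)

pop_sd = statistics.pstdev(indicators)     # divides by n, like ddof=0
sample_sd = statistics.stdev(indicators)   # divides by n - 1, like ddof=1

ratio = sample_sd / pop_sd
expected_ratio = math.sqrt(n / (n - 1))
print(round(ratio, 6), round(expected_ratio, 6))
```

For groups with hundreds of rows the gap is small, but it grows as groups shrink, so matching the `ddof` matters most for the least populous states.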

In [55]:
multiple_hypothesis_test_results = first_last_review_cnts_loc_states.groupby(by=['StateName1'], as_index=False, observed=True)['relative_change_gte_15'].apply(group_t_stat_and_mean, .05, 'greater')
multiple_hypothesis_test_results

Unnamed: 0,StateName1,stn_dev,array_mean,array_count,stn_err,array_tstat,ttest
0,North Carolina,0.1504,0.0232,3066,0.0027,-9.8823,"(-9.880694848404932, 1.0)"
1,Wyoming,0.1223,0.0152,461,0.0057,-6.1129,"(-6.106284097152835, 0.9999999989144855)"
2,New York,0.1029,0.0107,1867,0.0024,-16.4901,"(-16.48570379086796, 1.0)"
3,North Dakota,0.0974,0.0096,626,0.0039,-10.3785,"(-10.370246028333757, 1.0)"
4,Vermont,0.2023,0.0428,304,0.0116,-0.6237,"(-0.6226239529891576, 0.7330000208351626)"
5,Maine,0.061,0.0037,268,0.0037,-12.4232,"(-12.4, 1.0)"
6,South Dakota,0.1306,0.0174,749,0.0048,-6.8408,"(-6.836267987113936, 0.9999999999915585)"
7,Pennsylvania,0.128,0.0167,1741,0.0031,-10.8705,"(-10.867407920315781, 1.0)"
8,California,0.0989,0.0099,2328,0.002,-19.5722,"(-19.56796599595625, 1.0)"
9,Georgia,0.1584,0.0258,3260,0.0028,-8.7329,"(-8.73151052044111, 1.0)"
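
One caveat of running ten tests at once is that the chance of at least one false rejection grows with the number of tests. A common way to account for this is a Bonferroni correction, sketched below; choosing Bonferroni is an assumption here, since the notebook does not say which adjustment it has in mind. The p-values are copied from the `ttest` column above and rounded to four decimals.

```python
# One-sided p-values from the table above, rounded to 4 decimals.
p_values = {
    "North Carolina": 1.0, "Wyoming": 1.0, "New York": 1.0, "North Dakota": 1.0,
    "Vermont": 0.7330, "Maine": 1.0, "South Dakota": 1.0, "Pennsylvania": 1.0,
    "California": 1.0, "Georgia": 1.0,
}

alpha = 0.05
adjusted_alpha = alpha / len(p_values)   # Bonferroni: split alpha evenly across the tests

rejected = [state for state, p in p_values.items() if p < adjusted_alpha]
print(adjusted_alpha, rejected)
```

None of the states comes close to rejection even before the adjustment, so the correction does not change the conclusion in this case.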


In [None]:
# joblib.dump(multiple_hypothesis_test_results, 'multiple_hypothesis_test_results_dump')
# joblib.dump(top10_low10, 'top10_low10_states_dump')