In [28]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Import the raw order history from Shopee and then run some basic EDA:

In [3]:
shopee_df = pd.read_csv('order_brush_order.csv')

In [142]:
len(shopee_df['shopid'].unique())

18770

In [132]:
thing2.to_csv('shopee_submissions23.csv', index=False)

In [409]:
len(shopee_df)

222750

In [410]:
len(shopee_df.groupby(['shopid']))

18770

In [137]:
shopee_df.head()

Unnamed: 0,orderid,shopid,userid,event_time
0,31076582227611,93950878,30530270,2019-12-27 00:23:03
1,31118059853484,156423439,46057927,2019-12-27 11:54:20
2,31123355095755,173699291,67341739,2019-12-27 13:22:35
3,31122059872723,63674025,149380322,2019-12-27 13:01:00
4,31117075665123,127249066,149493217,2019-12-27 11:37:55


Since the submission format is a two column table, a dictionary with shopid as key is an appropriate data-type choice. All values in the dict are initially set to 0 to reflect the initial assumption that all shops have not conducted order brushing. Shops that have been flagged will have their values updated accordingly.
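
A minimal sketch of the same initialisation on a toy frame, assuming only the `shopid` column matters at this point. The `toy_df` and `toy_submissions` names and all values are made up for illustration. Building the dict directly from the unique shop ids gives one `[0]` entry per shop without transposing the whole frame:

```python
import pandas as pd

# Toy stand-in for shopee_df; the values are illustrative only.
toy_df = pd.DataFrame({
    "orderid": [1, 2, 3, 4],
    "shopid": [10, 20, 10, 30],
    "userid": [100, 200, 100, 300],
})

# One key per distinct shop, each starting at [0] ("no order brushing found").
toy_submissions = {int(shopid): [0] for shopid in toy_df["shopid"].unique()}
print(toy_submissions)
```

Each key gets its own list object here, which matters later because flagged users are added to these lists in place.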

In [595]:
submissions_dict = shopee_df.set_index('shopid').T.to_dict('list')

In [596]:
submissions_dict.update({key:[0] for key in submissions_dict})
submissions_dict

{93950878: [0],
 156423439: [0],
 173699291: [0],
 63674025: [0],
 127249066: [0],
 173811070: [0],
 107921853: [0],
 178400128: [0],
 147941492: [0],
 164933170: [0],
 9374147: [0],
 145694343: [0],
 96464079: [0],
 30988921: [0],
 199867753: [0],
 67162407: [0],
 65883234: [0],
 33242381: [0],
 3285661: [0],
 95138572: [0],
 286003: [0],
 12662873: [0],
 152569117: [0],
 8051258: [0],
 12480907: [0],
 60239193: [0],
 106779896: [0],
 111250776: [0],
 51526935: [0],
 195394274: [0],
 52637837: [0],
 43719124: [0],
 60489360: [0],
 161269907: [0],
 1175477: [0],
 213154942: [0],
 39938958: [0],
 168748997: [0],
 93363430: [0],
 96757689: [0],
 90339629: [0],
 193010376: [0],
 137754804: [0],
 152871252: [0],
 67960532: [0],
 64625969: [0],
 4669871: [0],
 62713846: [0],
 39554718: [0],
 84706933: [0],
 200296452: [0],
 25958852: [0],
 47415942: [0],
 13451191: [0],
 33236404: [0],
 173718481: [0],
 112904482: [0],
 24759976: [0],
 10009: [0],
 116628441: [0],
 54615708: [0],
 178718312

In [597]:
len(submissions_dict)

18770

## Initial Attempt

Shops with fewer than 3 orders cannot have conducted order brushing under the given rule: with a Concentrate Rate threshold of 3, a shop needs at least 3 orders in a window (all from a single buyer) to reach it. This section filters those shops out.

In [149]:
less_3_df = shopee_df.groupby(['shopid']).filter(lambda x: len(x) <3)
len(less_3_df)

12393

Eliminating these from consideration removes ~12k rows, a modest saving given that the dataset starts with ~220k rows.
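
The same split can also be expressed with a per-row order count, which tends to be faster than `groupby().filter()` with a Python lambda. This is a sketch on made-up data, not the notebook's own method; `toy_df`, `orders_per_shop`, `less_3` and `more_3` are hypothetical names:

```python
import pandas as pd

toy_df = pd.DataFrame({
    "orderid": range(1, 8),
    "shopid": [10, 10, 10, 20, 20, 30, 10],
})

# Number of orders of the shop each row belongs to, aligned with toy_df's index.
orders_per_shop = toy_df.groupby("shopid")["orderid"].transform("size")

less_3 = toy_df[orders_per_shop < 3]   # shops that can never reach C.R. = 3
more_3 = toy_df[orders_per_shop >= 3]  # shops that still need checking
print(len(less_3), len(more_3))
```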

In [27]:
less_3_df.groupby(['shopid']).count()

Unnamed: 0_level_0,orderid,userid,event_time
shopid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
10009,1,1,1
10051,2,2,2
10107,1,1,1
10108,2,2,2
10110,2,2,2
...,...,...,...
214662358,1,1,1
214949521,2,2,2
214964814,1,1,1
215175775,2,2,2


Regardless, these 9,739 shops can be considered clean and can be marked as such in the final submission.

In [123]:
less_3_df.to_csv('shopee_submissions.csv', index=False)

Looking at the shops that have 3 or more orders:

In [411]:
more_3_df = shopee_df.groupby(['shopid']).filter(lambda x: len(x)>=3)

In [598]:
more_3_df.head()

Unnamed: 0,orderid,shopid,userid,event_time
0,31076582227611,93950878,30530270,2019-12-27 00:23:03
1,31118059853484,156423439,46057927,2019-12-27 11:54:20
2,31123355095755,173699291,67341739,2019-12-27 13:22:35
3,31122059872723,63674025,149380322,2019-12-27 13:01:00
4,31117075665123,127249066,149493217,2019-12-27 11:37:55


In [599]:
len(more_3_df)

210357

Convert the event_time column from str to pandas datetime format:

In [154]:
more_3_df.event_time = pd.to_datetime(more_3_df.event_time)

The conversion above can run faster when the format of the datetime column is stated explicitly:

In [465]:
more_3_df.event_time = pd.to_datetime(more_3_df.event_time, format='%Y-%m-%d %H:%M:%S')

In [600]:
more_3_df['event_time']

0        2019-12-27 00:23:03
1        2019-12-27 11:54:20
2        2019-12-27 13:22:35
3        2019-12-27 13:01:00
4        2019-12-27 11:37:55
                 ...        
222745   2019-12-28 23:17:59
222746   2019-12-28 19:07:20
222747   2019-12-28 08:17:52
222748   2019-12-28 10:14:31
222749   2019-12-28 00:45:56
Name: event_time, Length: 210357, dtype: datetime64[ns]

In [107]:
more_3_df.groupby(['shopid']).count()

Unnamed: 0_level_0,orderid,userid,event_time
shopid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
10061,4,4,4
10084,55,55,55
10100,42,42,42
10132,10,10,10
10133,12,12,12
...,...,...,...
213532450,6,6,6
213675545,3,3,3
213784505,11,11,11
213900783,7,7,7


## Post-competition Attempt

Ok, so a total of ~9k shops with potential order brushing. The general procedure is tentatively as follows:
1. Iterate through each shop:
    - Sort orders by event_time
2. Iterate through each row:
    - Create a time interval of event_time + 1 hour
    - Count number of orders in that interval
    - Count number of unique buyers in that interval

`Concentrate Rate (C.R.) = Number of orders in 1 hour / Number of Unique Buyers in 1 hour`

3. Calculate the C.R. 
4. If the C.R. is at least 3 (C.R. >= 3), execute secondary iteration:

    *Since the Concentrate Rate threshold is 3 and suspicious buyers are the ones with highest proportion of orders:*
    - Identify buyers with highest order count (max)
    - Select userid and insert as value into submissions_dict under corresponding shopid key.
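
The window and C.R. steps above can be sketched on a toy shop as follows. The function name `concentrate_rate` and the `toy_shop` data are assumptions made for illustration, and the window is taken to be inclusive at both ends:

```python
import pandas as pd

def concentrate_rate(shop_orders: pd.DataFrame, start: pd.Timestamp) -> float:
    """C.R. of the 1-hour window that opens at `start` (both ends inclusive)."""
    end = start + pd.Timedelta(hours=1)
    window = shop_orders[shop_orders["event_time"].between(start, end)]
    return len(window) / window["userid"].nunique()

toy_shop = pd.DataFrame({
    "userid": [1, 1, 1, 2],
    "event_time": pd.to_datetime([
        "2019-12-27 10:00:00",
        "2019-12-27 10:10:00",
        "2019-12-27 10:20:00",
        "2019-12-27 12:00:00",  # outside the first window
    ]),
})

# Window opened by the first order: 3 orders from 1 buyer.
print(concentrate_rate(toy_shop, toy_shop["event_time"].iloc[0]))
```

Because `start` is always the time of an existing order, the window contains at least that order and the division is safe.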




#### Calculating C.R. for 1 Example Shop

First, a shop with a large number of orders is extracted to be used as an example. Then steps 2-4 from above will be implemented:

In [161]:
shop_example = more_3_df.query('shopid == 10084').sort_values(by='event_time')
shop_example

Unnamed: 0,orderid,shopid,userid,event_time
167859,31075686185309,10084,4401933,2019-12-27 00:08:06
178051,31077155357404,10084,13837190,2019-12-27 00:32:36
91837,31079024994425,10084,39828049,2019-12-27 01:03:44
20401,31079688206563,10084,73993513,2019-12-27 01:14:49
199320,31103178638264,10084,80643747,2019-12-27 07:46:18
219313,31108766989338,10084,11753447,2019-12-27 09:19:26
67315,31122489886365,10084,102616150,2019-12-27 13:08:10
111298,31122994584099,10084,162847440,2019-12-27 13:16:34
34650,31123641739732,10084,8457753,2019-12-27 13:27:21
190869,31134354630457,10084,96570515,2019-12-27 16:25:54


Use `.values` to get the datetime as a NumPy array; comparing the full column against a one-row Series directly raises *ValueError: Can only compare identically-labeled Series objects*.
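
A small reproduction of that error, with two ways around it. The `times` Series below is a made-up two-row stand-in for the `event_time` column:

```python
import pandas as pd

times = pd.Series(
    pd.to_datetime(["2019-12-27 00:08:06", "2019-12-27 00:32:36"]),
    index=[167859, 178051],
)
first = times.iloc[:1]  # one-row Series, still carrying its index label

try:
    times >= first  # the two indexes differ, so pandas refuses to compare
    raised = False
except ValueError:
    raised = True

# Either of these sidesteps the label check:
mask_from_array = times.values >= first.values  # NumPy broadcasting on raw arrays
mask_from_scalar = times >= first.iloc[0]       # compare against a single Timestamp
print(raised, mask_from_array.tolist(), mask_from_scalar.tolist())
```

Comparing against the scalar keeps the result as a labelled Series, while the `.values` route returns a plain boolean array; both can be used as a row mask.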

In [509]:
(shop_example[:1].event_time)

167859   2019-12-27 00:08:06
Name: event_time, dtype: datetime64[ns]

In [486]:
start_time = shop_example[:1].event_time.values
start_time

array(['2019-12-27T00:08:06.000000000'], dtype='datetime64[ns]')

In [202]:
concentrate_period = (shop_example[:1].event_time + pd.Timedelta(hours=1)).values
concentrate_period

array(['2019-12-27T01:08:06.000000000'], dtype='datetime64[ns]')

In [205]:
mask = ((shop_example['event_time'].values >= start_time) & (shop_example['event_time'].values <= concentrate_period))

In [212]:
interval = shop_example.loc[mask]
interval

Unnamed: 0,orderid,shopid,userid,event_time
167859,31075686185309,10084,4401933,2019-12-27 00:08:06
178051,31077155357404,10084,13837190,2019-12-27 00:32:36
91837,31079024994425,10084,39828049,2019-12-27 01:03:44


In [216]:
cr_rate = len(interval) / len(interval.userid.unique())
cr_rate

1.0

The building blocks for establishing the C.R. and creating the intervals via masks are in place.<br> Next, we use an example given by Shopee of a confirmed case of order brushing to flesh out **Step 4**:
- Identify buyers with highest order count (max)
- Select userid and insert as value into submissions_dict under corresponding shopid key.


In [218]:
shop_example_2 = more_3_df.query('shopid == 8996761').sort_values(by='event_time')
shop_example_2

Unnamed: 0,orderid,shopid,userid,event_time
85639,31197009072133,8996761,2136861,2019-12-28 09:50:10
64988,31197099132601,8996761,2136861,2019-12-28 09:51:40
1436,31221433435774,8996761,162508227,2019-12-28 16:37:13
57411,31221501326851,8996761,162508227,2019-12-28 16:38:22
142422,31221615158739,8996761,162508227,2019-12-28 16:40:15
110220,31289016357484,8996761,13135622,2019-12-29 11:23:36
180859,31289143347095,8996761,13135622,2019-12-29 11:25:43
165247,31289198789997,8996761,13135622,2019-12-29 11:26:39
141887,31295266255667,8996761,151327544,2019-12-29 13:07:46
217111,31295328244372,8996761,151327544,2019-12-29 13:08:49


In [231]:
shop_example_row = shop_example_2[shop_example_2['orderid'] == 31463329902935]
shop_example_row

Unnamed: 0,orderid,shopid,userid,event_time
17116,31463329902935,8996761,215382704,2019-12-31 11:48:49


Abstracting away the previous steps into an intermediary function (to be refined later):

In [594]:
def calc_cr(shop_example_row):
    start_time = shop_example_row.event_time.values
    concentrate_period = (shop_example_row.event_time + pd.Timedelta(hours=1)).values
    mask = ((shop_example_2['event_time'].values >= start_time) & (shop_example_2['event_time'].values <= concentrate_period))
    
    interval = shop_example_2.loc[mask]
    cr_rate = len(interval) / len(interval.userid.unique())
#     if cr_rate >= 3:
    return(interval) # all rows within the 1 hour mark of the examined row

We pause development of the function at the C.R. check to flesh out the order-proportion steps. <br> In the following cell, **sample** holds a df with all rows within 1 hour after order *'31463329902935'*:

In [252]:
sample = calc_cr(shop_example_row)
sample

Unnamed: 0,orderid,shopid,userid,event_time
17116,31463329902935,8996761,215382704,2019-12-31 11:48:49
197220,31463516755431,8996761,215382704,2019-12-31 11:51:56
166235,31463618079296,8996761,215382704,2019-12-31 11:53:38
64862,31463701425020,8996761,215382704,2019-12-31 11:55:01
26346,31463906062704,8996761,2136861,2019-12-31 11:58:26
83426,31463960795761,8996761,2136861,2019-12-31 11:59:20


In [275]:
grouped_sample = sample.groupby('userid').count().sort_values(by='orderid', ascending=False)
grouped_sample

Unnamed: 0_level_0,orderid,shopid,event_time
userid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
215382704,4,4,4
2136861,2,2,2


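Next, the user(s) with the highest order proportion need to be determined, where:

`order_proportion = number of orders by individual user / total orders in 1 hour` 

Since the denominator is the same for every user in the interval, the user with the highest proportion is also the user with the highest order count. As such, we can simply flag all users with the highest number of orders in the hour.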
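
A quick check of that equivalence on toy data, using the same 4-versus-2 split as the sample above. The variable names are illustrative only:

```python
import pandas as pd

# Six orders inside one window: four by one user, two by another.
window_users = pd.Series([215382704] * 4 + [2136861] * 2, name="userid")

order_counts = window_users.value_counts()
order_proportion = order_counts / len(window_users)

# Dividing every count by the same total keeps the ranking unchanged,
# so both criteria select the same user(s).
top_by_count = set(order_counts[order_counts == order_counts.max()].index)
top_by_proportion = set(
    order_proportion[order_proportion == order_proportion.max()].index
)
print(top_by_count == top_by_proportion)
```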

In [271]:
highest_order = sample.groupby('userid')['orderid'].count().agg('max')
highest_order

4

The following code identifies the userid with the highest number of orders in the time period:

In [282]:
grouped_sample[grouped_sample['orderid'] == highest_order] 

Unnamed: 0_level_0,orderid,shopid,event_time
userid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
215382704,4,4,4


Taking the index of the above gives us the userid. This user has conducted order brushing, and their userid needs to be appended to the list stored in submissions_dict.

In [331]:
user_index = grouped_sample[grouped_sample['orderid'] == highest_order].index[0]
user_index

215382704

The shopid is to be used as the key for searching the submissions_dict.

In [307]:
key_search = int(shop_example_row.shopid.iloc[0])  # take the scalar out of the one-row Series before casting
key_search

8996761

A before and after of appending the user to the submissions_dict:

In [341]:
submissions_dict.get(key_search)

[]

In [342]:
submissions_dict[(key_search)].append(user_index)

In [343]:
submissions_dict.get(key_search)

[215382704]

#### Multiple suspicious users with highest proportion of orders

The above code works for evaluated periods with only 1 suspicious user, but does not handle several users tied for the highest number of orders. <br>
To handle this, we first create a fictional example in which two users have the same (highest) number of orders:

In [359]:
sample_multiple = pd.concat([sample, sample[4:]])  # DataFrame.append was removed in pandas 2.0

In [360]:
sample_multiple

Unnamed: 0,orderid,shopid,userid,event_time
17116,31463329902935,8996761,215382704,2019-12-31 11:48:49
197220,31463516755431,8996761,215382704,2019-12-31 11:51:56
166235,31463618079296,8996761,215382704,2019-12-31 11:53:38
64862,31463701425020,8996761,215382704,2019-12-31 11:55:01
26346,31463906062704,8996761,2136861,2019-12-31 11:58:26
83426,31463960795761,8996761,2136861,2019-12-31 11:59:20
26346,31463906062704,8996761,2136861,2019-12-31 11:58:26
83426,31463960795761,8996761,2136861,2019-12-31 11:59:20


In [361]:
highest_order = sample_multiple.groupby('userid')['orderid'].count().agg('max')
highest_order

4

In [366]:
grouped_sample = sample_multiple.groupby('userid').count().sort_values(by='orderid', ascending=False)
grouped_sample[grouped_sample['orderid'] == highest_order] 

In [368]:
grouped_sample[grouped_sample['orderid'] == highest_order] 

Unnamed: 0_level_0,orderid,shopid,event_time
userid,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2136861,4,4,4
215382704,4,4,4


The code now outputs both userids that match the highest number of orders. The next step is to store these userids together in the list under the corresponding shopid key.

In [379]:
user_index = grouped_sample[grouped_sample['orderid'] == highest_order].index.tolist()
user_index

[2136861, 215382704]

In [382]:
key_search = int(shop_example_row.shopid.iloc[0])  # take the scalar out of the one-row Series before casting
key_search

8996761

Remove the value from the previous working example:

In [400]:
submissions_dict[key_search] = []
submissions_dict.get(key_search)

[]

In [401]:
submissions_dict[(key_search)].extend(user_index)
submissions_dict.get(key_search)

[2136861, 215382704]

Alright! All the different working components have been fleshed out. The next step is to combine the steps into a single function and apply it across the given df. <br>

*Note: this would probably be a good place to include unit tests.*
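
As a sketch of what such tests could look like, the tie-handling step can be pulled into a small pure function and checked with plain `assert` statements. The helper name `top_buyers` and the test data are hypothetical and not part of the notebook:

```python
import pandas as pd

def top_buyers(interval: pd.DataFrame) -> list:
    """Return the userid(s) tied for the most orders in `interval`, ascending."""
    counts = interval.groupby("userid")["orderid"].count()
    return sorted(int(u) for u in counts[counts == counts.max()].index)

def test_single_top_buyer():
    interval = pd.DataFrame({"orderid": [1, 2, 3], "userid": [7, 7, 9]})
    assert top_buyers(interval) == [7]

def test_tied_top_buyers():
    interval = pd.DataFrame({"orderid": [1, 2, 3, 4], "userid": [7, 7, 9, 9]})
    assert top_buyers(interval) == [7, 9]

test_single_top_buyer()
test_tied_top_buyers()
```

Keeping the helper free of any reference to the global dictionary is what makes it easy to test in isolation.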

### Putting it all together

In order to combine all the different steps into a single function, a different test sample will need to be created. <br> This sample df will contain two different shops, with one of them being a confirmed order-brushing culprit as per the example. Recall that the function was previously in this state:

In [None]:
def calc_cr(df_row):
    # Use the row's event_time as the start_time and +1 hour for end_time
    start_time = df_row.event_time.values
    end_time = (df_row.event_time + pd.Timedelta(hours=1)).values
    # The mask below only works on a single shop's (grouped) rows.
    mask = ((shop_example_2['event_time'].values >= start_time) & (shop_example_2['event_time'].values <= end_time))
    
    interval = shop_example_2.loc[mask]
    cr_rate = len(interval) / len(interval.userid.unique())
#     if cr_rate >= 3:
    return(interval) # all rows within the 1 hour mark of the examined row
        

We expand on the scope of the function to include:
1. Determine the highest order number in the interval
2. Determine which user(s) correspond to the highest number
3. Get the user's id
4. Include user's id into the submission dictionary

In [467]:
more_3_df_wip = more_3_df.groupby('shopid')  # scalar key, so get_group() and iteration use plain shopid values

In [455]:
more_3_df_wip.event_time = pd.to_datetime(more_3_df_wip.event_time, format='%Y-%m-%d %H:%M:%S')

TypeError: 'Series' objects are mutable, thus they cannot be hashed

We ensure that **more_3_df** has the correct datetime format for *event_time* before it is passed through groupby(); attempting the conversion on the grouped object raises the TypeError above. <br>
Next, a working sample is created from two selected shops:

In [555]:
sample = more_3_df_wip.get_group(145777302)
sample2 = more_3_df_wip.get_group(91799978)
sample_group = pd.concat([sample, sample2])  # DataFrame.append was removed in pandas 2.0

In [564]:
sample_grouped = sample_group.groupby('shopid')

In [551]:
sample_grouped.groups.keys()

dict_keys([91799978, 145777302])

In [557]:
len(sample_group)

780

In [601]:
len(sample_grouped)

2

In [565]:
(sample_grouped.head(10))

Unnamed: 0,orderid,shopid,userid,event_time
50784,31507130623005,91799978,3201499,2019-12-31 23:58:50
48504,31507006986155,91799978,2291648,2019-12-31 23:56:46
16061,31506853884014,91799978,651222,2019-12-31 23:54:13
42924,31506735191545,91799978,101185112,2019-12-31 23:52:16
26861,31506176050152,91799978,144093538,2019-12-31 23:42:57
58001,31505229109816,91799978,15220103,2019-12-31 23:27:09
161465,31505024174135,91799978,80788978,2019-12-31 23:23:44
191122,31504837894066,91799978,3538134,2019-12-31 23:20:38
183212,31504293923099,91799978,29253423,2019-12-31 23:11:33
161500,31503712740580,91799978,3436527,2019-12-31 23:01:52


#### Attempting the Function

The consolidated function is as follows:

In [605]:
def calc_cr(shop_group_row, shop_group):
    # Use the row's event_time as the start_time and +1 hour for end_time
    start_time = pd.Timestamp.to_datetime64(shop_group_row.event_time)
    end_time = pd.Timestamp.to_datetime64(shop_group_row.event_time + pd.Timedelta(hours=1))

    # Use a location mask to find rows that fit within the 1 hour window, saved to the variable interval:
    mask = ((shop_group['event_time'].values >= start_time) & (shop_group['event_time'].values <= end_time))
    interval = shop_group.loc[mask] 
    cr_rate = len(interval) / len(interval.userid.unique()) # as per the given formula
    
    if cr_rate >= 3:
        # find the highest number of orders placed by a single user in the interval
        highest_order = interval.groupby('userid')['orderid'].count().agg('max')
        # group interval rows by user id to find which user(s) correspond to the highest order number
        interval_grouped = interval.groupby('userid').count().sort_values(by='orderid', ascending=False)
        user_index = interval_grouped[interval_grouped['orderid'] == highest_order].index.tolist()
        
        # find the corresponding shopid key in the submissions_dict and add the user(s) to it:
        key_search = int(shop_group_row.shopid)
        # since all values were initially set to [0], reset the value the first time the shop is flagged
        if submissions_dict.get(key_search) == [0]:
            submissions_dict[key_search] = []
        submissions_dict[key_search].extend(user_index)
        print(submissions_dict.get(key_search))

For this trial, we run the function through the sample group defined previously, which contains two shops.

In [592]:
for shopid, shop_group in sample_grouped:
    for row_index, row in shop_group.iterrows():
        calc_cr(row, shop_group)


[201343856]


The output refers to the value of the key-value pair in the **submissions_dict** dictionary defined at the beginning of the notebook. The next step will be to reset all values in the dictionary (to clear out any test data) and run the function through driver code which will include the entirety of **more_3_df**.
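
The reset can be sketched on a toy dictionary as follows; the names `toy_dict` and `shared` are illustrative. Because flagged users are later added with `extend`, every shop needs its own list object, which is also shown here through a common pitfall:

```python
toy_dict = {10: [7, 7], 20: [0], 30: [5]}

# Give every key a fresh list of its own.
toy_dict = {key: [0] for key in toy_dict}

# Pitfall: dict.fromkeys reuses ONE list object for all keys,
# so an in-place change for one shop shows up under every shop.
shared = dict.fromkeys([10, 20], [0])
shared[10].append(99)
print(toy_dict, shared)
```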

In [606]:
for shopid, shop_group in more_3_df_wip:
    for row_index, row in shop_group.iterrows():
        calc_cr(row, shop_group)

[77819]
[672345]
[672345, 672345]
[740844]
[170385453]
[170385453, 170385453]
[170385453, 170385453, 170385453]
[190449497]
[190449497, 190449497]
[214992524]
[72914921]
[264511]
[181682008]
[7670129]
[75558350]
[75558350, 75558350]
[62618064]
[62618064, 62618064]
[188942105]
[122277324]
[181408876]
[15053804]
[15053804, 15053804]
[15053804, 15053804, 15053804]
[123959597]
[214568881]
[80690628]
[212325226]
[143847348]
[556867]
[9753706]
[162508227]
[162508227, 215382704]
[162508227, 215382704, 13135622]
[162508227, 215382704, 13135622, 137245836]
[139795934]
[210920501]
[8405753]
[95058664]
[95058664, 95058664]
[199416406]
[152292010]
[148215831]
[214546342]
[136680607]
[148215831]
[156614746]
[214588488]
[48412388]
[215424202]
[32594]
[200925208]
[211907762]
[205729485]
[205729485, 205729485]
[205729485, 205729485, 205729485]
[205729485, 205729485, 205729485, 205729485]
[205729485, 205729485, 205729485, 205729485, 205729485]
[205729485, 205729485, 205729485, 205729485, 205729485, 205

[199416406]
[6059093]
[148215831]
[193338089]
[50198835]
[211296094]
[215115251]
[1866916]
[1866916, 1866916]
[96046105]
[215301243]
[12597591]
[81928284]
[132704747]
[132704747, 132704747]
[187697407]
[187697407, 215009429]
[98709440]
[98709440, 98709440]
[98709440, 98709440, 98709440]
[71152760]
[101832161]
[101832161, 214208720]
[199382229]
[210932914]
[210932914, 210932914]
[158048102]
[158048102, 158048102]
[158048102, 158048102, 158048102]
[144902703]
[174783274]
[31215088]
[31916119]
[31916119, 31916119]
[27456547]
[799445]
[214925963]
[191211430]
[191211430, 191211430]
[179171579]
[213646699]
[4624716]
[1762129]
[105935455]
[129799840]
[34132265]
[89254393]
[214605778]
[189834273]
[73308605]
[114282846]
[114282846, 114282846]
[198662175]
[198662175, 198662175]
[198662175, 198662175, 198662175]
[214111334]
[52867898]
[52867898, 52867898]
[213646699]


Some jostling around with the output to see what we got:

In [607]:
submissions_dict.values()

dict_values([[0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [122277324], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [213502289], [0], [9753706], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [61893096], [76102350, 188025647], [0], [0], [181408876], [0], [0], [0], [0], [0], [0], [174145893, 174145893, 174145893], [0], [0], [0], [0], [0], [0], [152352709], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [123158564, 1231585

In [611]:
submissions_dict.items()

dict_items([(93950878, [0]), (156423439, [0]), (173699291, [0]), (63674025, [0]), (127249066, [0]), (173811070, [0]), (107921853, [0]), (178400128, [0]), (147941492, [0]), (164933170, [0]), (9374147, [0]), (145694343, [0]), (96464079, [0]), (30988921, [0]), (199867753, [0]), (67162407, [0]), (65883234, [0]), (33242381, [0]), (3285661, [0]), (95138572, [0]), (286003, [0]), (12662873, [0]), (152569117, [0]), (8051258, [0]), (12480907, [0]), (60239193, [0]), (106779896, [0]), (111250776, [0]), (51526935, [0]), (195394274, [0]), (52637837, [0]), (43719124, [0]), (60489360, [0]), (161269907, [0]), (1175477, [122277324]), (213154942, [0]), (39938958, [0]), (168748997, [0]), (93363430, [0]), (96757689, [0]), (90339629, [0]), (193010376, [0]), (137754804, [0]), (152871252, [0]), (67960532, [0]), (64625969, [0]), (4669871, [0]), (62713846, [0]), (39554718, [0]), (84706933, [0]), (200296452, [0]), (25958852, [0]), (47415942, [0]), (13451191, [0]), (33236404, [0]), (173718481, [0]), (112904482, [

Doing a sanity check on one of the identified values to see if the output is right:

In [614]:
sample_check = more_3_df_wip.get_group(1175477).sort_values(by='event_time')
sample_check
# 122277324

Unnamed: 0,orderid,shopid,userid,event_time
182227,31077738308029,1175477,133347149,2019-12-27 00:42:18
166971,31079037951785,1175477,72245454,2019-12-27 01:03:57
112516,31089161941864,1175477,83327927,2019-12-27 03:52:41
138009,31102146730897,1175477,207094814,2019-12-27 07:29:06
175221,31108088459314,1175477,9759791,2019-12-27 09:08:08
...,...,...,...,...
167148,31478694578633,1175477,49626167,2019-12-31 16:04:54
2935,31479933288654,1175477,97526779,2019-12-31 16:25:34
130548,31493293134509,1175477,34472559,2019-12-31 20:08:13
82104,31495946114847,1175477,96929547,2019-12-31 20:52:27


In [617]:
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
   print(sample_check)

               orderid   shopid     userid          event_time
182227  31077738308029  1175477  133347149 2019-12-27 00:42:18
166971  31079037951785  1175477   72245454 2019-12-27 01:03:57
112516  31089161941864  1175477   83327927 2019-12-27 03:52:41
138009  31102146730897  1175477  207094814 2019-12-27 07:29:06
175221  31108088459314  1175477    9759791 2019-12-27 09:08:08
108664  31109768493579  1175477   11041634 2019-12-27 09:36:08
162734  31114666396316  1175477   18338916 2019-12-27 10:57:46
19750   31121598490534  1175477    2153215 2019-12-27 12:53:18
125147  31123749663881  1175477  207210281 2019-12-27 13:29:09
35      31124045990881  1175477  150633728 2019-12-27 13:34:05
55592   31124144566028  1175477     866118 2019-12-27 13:35:44
175155  31130125339780  1175477    6070806 2019-12-27 15:15:25
47040   31131581699169  1175477   58987755 2019-12-27 15:39:42
89019   31136275704211  1175477  105951627 2019-12-27 16:57:55
30079   31143477277973  1175477   74453011 2019-12-27 1

The user 122277324 had placed 3 consecutive orders starting at 22:50:39, 2019-12-28. So far the output checks out. <br> Let's go through all non-zero values in the dictionary to see the rest of the order-brushers:

In [632]:
for (key,value) in submissions_dict.items():
    if value != [0]:
        print(key,value)

1175477 [122277324]
66861410 [213502289]
8715449 [9753706]
58543771 [61893096]
156883302 [76102350, 188025647]
1532569 [181408876]
27476241 [174145893, 174145893, 174145893]
58835561 [152352709]
80049863 [123158564, 123158564]
210197928 [52867898, 52867898]
104590058 [81928284]
188546697 [31916119, 31916119]
29979455 [107641182]
43412276 [50672876]
68862371 [67554410]
27121667 [183926374]
52377417 [213919641, 213919641]
131387639 [192251866]
96757644 [215243653]
87621695 [87846708]
130100254 [18688337]
50713918 [172106152]
169902791 [50198835]
42818 [170385453, 170385453, 170385453]
161160594 [74027394]
123548863 [131515076, 131515076]
66337054 [122507717]
22800308 [99997899]
11896733 [156614746]
126639670 [101296643]
86368642 [144612139]
27015534 [188025647]
145777302 [201343856]
168046193 [6059093]
14646941 [200925208]
189308408 [27456547]
93358941 [79419297]
89126206 [3295689]
118949192 [211604080]
14184981 [32594]
64394533 [194833170, 194833170, 194833170, 194833170, 194833170, 194

In [637]:
# create a new dictionary by running the original dict values through a set to remove duplicates;
# sorted() turns each set back into a list in ascending order
submissions_dict_nodups = {k: sorted(set(v)) for k, v in submissions_dict.items()}

In [639]:
for (key,value) in submissions_dict_nodups.items():
    if value != [0]:
        print(key,value)

1175477 [122277324]
66861410 [213502289]
8715449 [9753706]
58543771 [61893096]
156883302 [76102350, 188025647]
1532569 [181408876]
27476241 [174145893]
58835561 [152352709]
80049863 [123158564]
210197928 [52867898]
104590058 [81928284]
188546697 [31916119]
29979455 [107641182]
43412276 [50672876]
68862371 [67554410]
27121667 [183926374]
52377417 [213919641]
131387639 [192251866]
96757644 [215243653]
87621695 [87846708]
130100254 [18688337]
50713918 [172106152]
169902791 [50198835]
42818 [170385453]
161160594 [74027394]
123548863 [131515076]
66337054 [122507717]
22800308 [99997899]
11896733 [156614746]
126639670 [101296643]
86368642 [144612139]
27015534 [188025647]
145777302 [201343856]
168046193 [6059093]
14646941 [200925208]
189308408 [27456547]
93358941 [79419297]
89126206 [3295689]
118949192 [211604080]
14184981 [32594]
64394533 [194833170]
134968430 [59725263]
171407673 [211296094]
16001939 [205729485]
63001696 [81928284]
178273138 [71152760]
51134277 [29857724, 212200633]
15052673

The following code can be used to manually check if the above output is accurate. Replace the value in *more_3_df_wip.get_group(VALUE)* with any shopid and see if the corresponding userid displays order brushing in the output.

In [625]:
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    sample_check_2 = more_3_df_wip.get_group(156883302).sort_values(by='event_time')
    print(sample_check_2) # 76102350, 188025647

               orderid     shopid     userid          event_time
215218  31100672373467  156883302   68977756 2019-12-27 07:04:32
68439   31100977597782  156883302  155379241 2019-12-27 07:09:37
162531  31101488337409  156883302  192062938 2019-12-27 07:18:08
72319   31103141463757  156883302  187119675 2019-12-27 07:45:41
152477  31104942645315  156883302  160218790 2019-12-27 08:15:42
87402   31105140135135  156883302  136087303 2019-12-27 08:19:01
43778   31106329531411  156883302  122736767 2019-12-27 08:38:50
22494   31106974728041  156883302   45050041 2019-12-27 08:49:34
7106    31110087220831  156883302  127299122 2019-12-27 09:41:27
7070    31111060412201  156883302   38945250 2019-12-27 09:57:40
196483  31112368306131  156883302     198978 2019-12-27 10:19:29
16679   31113317704632  156883302   80022509 2019-12-27 10:35:17
8139    31113387412632  156883302  117615979 2019-12-27 10:36:27
191942  31114283969578  156883302  187119675 2019-12-27 10:51:23
148253  31117602042023  1

### Cleaning up the Output and export to csv

A few last things to do:
1. Remove all duplicate values in the dictionary.
2. Ensure that the values are sorted numerically.
3. Concatenate values with a '&'.

Here we create our final dict (**submissions_dict_final**) from the de-duplicated dict, joining each shop's values with the '&' character as required by the submission format.

In [647]:
# join each shop's userids into a single '&'-separated string ('0' for clean shops)
submissions_dict_final = {k: "&".join(map(str, v)) for k, v in submissions_dict_nodups.items()}
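
A minimal sketch of the export step on toy data, assuming the submission expects two columns named `shopid` and `userid`. The column names and the `toy_final` values are assumptions made for illustration:

```python
import pandas as pd

# Toy stand-in for submissions_dict_final; values are made up.
toy_final = {10: "0", 20: "7&9", 30: "5"}

submission_df = pd.DataFrame(
    {"shopid": list(toy_final.keys()), "userid": list(toy_final.values())}
)

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = submission_df.to_csv(index=False)
print(csv_text)
```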

Iterate through the final dict to check output:

In [648]:
for (key,value) in submissions_dict_final.items():
    if value != [0]:  # values are strings at this point, so this check is always True and every shop is printed
        print(key,value)

93950878 0
156423439 0
173699291 0
63674025 0
127249066 0
173811070 0
107921853 0
178400128 0
147941492 0
164933170 0
9374147 0
145694343 0
96464079 0
30988921 0
199867753 0
67162407 0
65883234 0
33242381 0
3285661 0
95138572 0
286003 0
12662873 0
152569117 0
8051258 0
12480907 0
60239193 0
106779896 0
111250776 0
51526935 0
195394274 0
52637837 0
43719124 0
60489360 0
161269907 0
1175477 122277324
213154942 0
39938958 0
168748997 0
93363430 0
96757689 0
90339629 0
193010376 0
137754804 0
152871252 0
67960532 0
64625969 0
4669871 0
62713846 0
39554718 0
84706933 0
200296452 0
25958852 0
47415942 0
13451191 0
33236404 0
173718481 0
112904482 0
24759976 0
10009 0
116628441 0
54615708 0
178718312 0
155145027 0
192895429 0
171916860 0
165030850 0
64492135 0
175116620 0
641249 0
172185419 0
66861410 213502289
140937896 0
8715449 9753706
157026776 0
63663065 0
129257169 0
174032222 0
99787848 0
158448460 0
141604059 0
70506399 0
115294353 0
84547411 0
8121841 0
61556313 0
137762642 0
1473984

134968588 0
168917434 0
50968349 0
83669215 0
60317321 0
23076960 0
166676440 0
156292400 0
202407295 0
16391635 0
45329853 0
110868046 0
164031738 0
141603619 0
183462157 0
10380 0
66375973 0
1808698 0
89122449 0
193459562 0
197191551 0
83764530 0
95137178 0
115725491 0
163743165 0
84172725 0
143899629 0
114932479 0
186729040 0
801940 0
64112995 0
190557713 0
190512141 0
165045834 0
73834330 0
15336163 0
119337470 0
28619747 0
16629965 0
75715238 0
130117591 0
164768716 0
76357257 0
3711886 0
163102991 0
178104812 0
90904399 0
54566649 0
798888 0
69034419 0
130787776 0
126264419 0
195280699 0
19811463 0
121804353 0
168577334 0
85459183 0
74652209 0
168652240 0
76473021 0
12530609 0
134946458 0
44673604 0
171916911 0
40263807 0
19052512 0
193422616 0
1335777 0
130119029 0
13511 0
170480649 0
97004319 0
178496438 0
145824102 0
89116953 0
137251611 0
131669808 0
15618985 0
97083776 0
54619920 0
129926387 0
111164466 0
201118334 0
83440644 0
26167337 0
167134527 0
29238238 15861876
710187

163134195 0
191000229 0
145496441 0
130099326 0
88030821 0
98815227 0
141003071 0
123188926 0
11474437 148215831
29238486 0
57327530 0
47432062 0
68266167 0
27430595 0
173798766 0
140130382 0
78686907 0
182888446 0
195860965 0
12852945 0
54593721 0
1005535 0
60315503 0
161672415 0
119595082 0
173552065 0
30101998 0
129877943 0
143897805 0
190793215 0
209270473 0
20660177 0
135041816 0
162291936 0
184639455 0
125073665 0
211994177 0
75424355 0
151938290 0
153763955 0
29194590 0
156315311 0
139614313 0
190648689 0
72376784 0
164763766 0
163304907 0
76357052 0
153206786 0
92937409 0
165420424 0
45349944 0
104278513 0
132964122 142710562
147116324 0
137490898 0
54799439 0
81190174 77374158
95446709 0
12990524 0
139241014 0
63546470 0
96573864 0
175755100 0
185229671 0
156292340 0
63636957 0
96755478 0
50968872 0
60863338 0
130946507 0
68478964 29466045
72568210 0
115309793 0
66630848 0
76357110 0
177825912 0
46703854 0
131387121 0
173696254 0
68973439 0
196535720 0
60863422 0
80776569 0
23

39153592 0
98481320 124597967
10063876 0
32729279 205785599
60863536 0
98548360 0
135041672 0
40204979 0
657962 0
119196780 0
119312623 0
19785711 0
109630912 0
66862105 0
143169445 0
172444943 0
81087553 0
88685658 0
50968294 0
152911385 0
43230556 0
209278116 0
28051451 0
134972689 0
160376238 0
139759739 0
128268206 0
186756393 0
193690760 0
173471584 0
163968746 0
173697248 0
145806234 0
49071150 0
104802096 0
188690274 0
201232115 0
187110939 0
157633413 202607600
103741407 0
138762857 0
182622636 0
52636778 0
50115844 0
143899548 0
61936730 0
155382137 0
121729278 0
107932403 0
56539845 0
193419316 0
30870624 0
186658712 0
32811676 0
160281005 0
129916340 0
154878985 0
155282586 0
161502570 0
77529949 0
129018264 0
58530468 0
10199219 8405753
70609295 0
161521441 0
139259486 0
167906285 0
135041876 0
156491018 0
138638581 0
73654673 0
69891777 0
134391409 0
126788643 0
139241412 0
158067558 0
173468477 0
63002245 0
139581402 0
89117437 0
175673079 0
129926373 0
35583557 0
2036664

160406037 0
100458655 0
130159345 0
117921438 0
201603821 0
86371836 0
188116771 0
98758982 0
164948974 0
91088086 0
100262732 0
203588935 0
27121470 0
26097555 0
85965597 0
184869037 0
208793476 0
130099434 0
134937818 0
60866927 0
159621692 0
40263031 0
169105108 0
80763275 0
164933259 0
189157401 0
201287310 0
193421886 0
6259078 0
12216137 0
190626947 0
184570008 0
163089921 0
180957311 0
6360366 0
199939750 0
147425929 0
9886045 0
162153111 0
186970124 0
173822220 0
119362011 0
199316284 0
66861680 0
164786255 0
21222256 0
191065421 0
169489186 0
170608964 0
118964457 0
10342387 0
160393970 0
77548457 0
201539499 0
49802667 0
141602437 0
804140 0
201537882 0
128278570 0
2746342 0
202960637 0
126018672 0
9490500 0
143598794 0
57775788 0
176433157 0
186977059 0
133288339 0
106935669 0
194058116 0
166392632 0
201202473 0
63841244 0
182852854 0
90187350 0
146752779 0
38143850 0
6275128 0
52868403 0
159556422 0
48702311 0
29652480 0
74286996 0
162536489 0
130912802 0
90336797 0
8376189

144606574 0
159780195 0
127263408 0
85818 0
169037270 0
203552493 0
203008277 0
203091622 0
148043541 0
138761416 0
98793048 0
163045382 0
130118802 0
201533544 0
80006509 0
104756497 0
183618498 0
190951548 0
3682508 0
22884785 0
126599833 0
24762993 0
140626250 0
140236032 0
118120145 0
163786659 0
169856158 0
115725006 0
186721073 0
203121757 0
173317538 0
203574907 0
160139209 0
159879078 0
105064487 0
163145687 0
190957583 0
176361250 0
12695177 0
201792532 0
53486257 0
130168017 0
109869442 0
20659 0
171196974 0
3160808 0
53486430 0
3575737 0
153498238 0
115119563 0
37534128 0
140206635 0
104732132 0
44669658 0
25146625 0
145961495 0
98546882 0
187206508 0
180939333 0
190372020 0
129406129 0
114054349 0
118981067 0
57776355 0
104860134 0
180222243 0
160709200 0
3300935 0
195823006 0
186723145 0
193411197 0
193418948 0
97868707 0
16282102 0
94264180 0
96890153 0
17032156 0
173822694 0
169903775 0
208011984 0
211107379 0
146856854 0
168242609 0
156562815 0
21222675 0
199250128 0
15

208803655 0
28084506 0
79840464 0
195855021 0
163318918 0
94154189 15826134
146072997 0
209490927 0
186678201 0
163830685 0
10074104 0
173554019 0
203643788 0
7870870 0
43271089 0
131974582 0
199998660 0
43230749 0
100030014 0
133345670 0
98793058 0
51552866 0
190674726 0
117076586 0
120618263 0
95192543 0
153711069 0
75620273 0
195864654 0
169878610 0
190928881 0
32737848 0
40891593 0
141604496 0
202343389 0
118956088 0
203579751 0
163106145 0
166418546 0
19711519 0
121992041 0
58528257 0
203724906 0
58722602 0
4867543 0
100866136 0
186763043 0
220988 0
201522295 0
99779965 0
192098643 0
155962526 0
200255504 0
83080084 0
163300885 0
162549683 0
19190 0
133520049 0
188360696 0
66868548 0
127605095 0
12513859 0
140380048 0
173176517 0
87828592 0
26490116 0
186667395 0
155277790 0
203555348 0
173472823 0
66783 0
65883103 0
173485453 0
124119741 0
187002886 0
62666720 0
212875338 0
30337244 0
161176938 0
147115674 0
88673187 0
163339333 0
137762508 0
163098931 0
65761398 0
145282112 0
13

187598248 0
134311407 0
143899660 0
209020723 0
163146069 0
162556150 0
20131677 0
131670005 0
27122110 0
154639436 0
143899651 0
188285813 0
201573045 0
208718102 0
136793067 0
68465751 0
89747333 0
188768 0
161775818 0
164932887 0
203736067 0
189607124 0
61361053 0
195858222 0
58559967 0
152249818 0
163525191 0
41056024 0
65644802 0
153671410 0
196583745 0
135041740 0
209249386 0
57605508 0
119690650 0
58685380 0
164763969 0
3959595 0
148856653 199382229
86453400 0
181308113 0
162256432 0
123391395 0
134630074 0
168062274 0
162279623 0
193420427 0
95507169 0
3318265 0
173318849 0
186752166 0
13390110 0
195870043 0
102371504 0
190692719 0
30235452 0
134968668 0
190757565 0
162840914 0
87579227 0
207416040 0
174047624 0
17079611 0
191964403 0
203547348 0
171795817 0
9916762 0
207952808 0
177601307 0
19753598 0
98524089 0
63840564 0
79113204 0
52280537 0
98567418 0
160386998 0
102634861 0
8223757 0
66363843 0
155278251 0
60867136 0
195840355 0
111677993 0
160403686 0
169446714 0
8525855

192001449 0
162182618 0
72385961 0
162531526 0
186660875 0
33542933 0
203603386 0
134320586 0
210104194 0
177701952 0
142362028 0
126599799 0
200667339 0
77216634 0
163138478 0
19051363 0
51525093 0
85256495 0
127641976 0
136876516 0
173548859 0
153674755 0
99825399 0
163318262 0
96523677 0
173261912 0
132658366 0
937944 0
101773184 0
165667942 0
133315693 0
203610986 0
182260888 0
86814930 0
76621276 0
163484867 0
163139694 0
173998553 0
163531532 0
201524578 0
54551673 0
114540284 0
173187603 0
133341346 0
199923322 0
762717 0
182888780 0
169105354 0
47442958 0
22254816 0
91680371 0
163160271 0
158091517 0
163143219 0
187091261 0
114154490 0
161675251 0
156674533 0
157012995 0
164811257 0
139431099 0
191957699 0
96197811 0
199249615 0
84533430 0
188926513 0
174910936 0
187017353 0
170271314 0
108382512 0
986631 0
140653710 0
91711044 0
162265700 0
201515860 0
71231490 0
172441973 0
91588901 0
26008488 0
875520 0
134968809 0
173817123 0
175745494 0
199248444 0
65740806 0
59545466 0
19

130789896 0
52362883 0
89765240 0
173204994 0
203746738 0
173471593 0
47745581 0
53202838 0
118945471 0
84533813 0
13782202 0
133303179 0
38323641 0
201486123 0
68020156 0
27517901 0
162270614 0
173259476 0
163534032 0
31643542 0
4878652 0
163481817 0
53630765 0
37146548 0
134252682 0
203580445 0
84805338 0
162846146 0
77586555 0
99618930 0
147718794 0
203725644 0
202529043 0
209183709 0
144177344 0
173315987 0
201570013 0
77216247 0
145263678 0
86637817 0
208835033 0
182888875 0
203534830 0
153675864 0
2230180 0
186789133 0
9885746 0
8840787 0
191939507 0
158569173 0
193459844 0
199099924 0
133499640 0
160170601 0
173474837 0
90338972 0
72365990 0
58241322 0
64655454 0
203551584 0
109890128 0
117564218 0
193422225 0
114805071 0
127379068 0
14636830 0
195824123 0
82368015 0
71007634 0
8019790 0
162286973 0
173816870 0
213145842 0
195795180 0
54570418 0
55249242 0
209041504 0
164932816 0
163537392 0
203573352 0
84119334 0
163298494 0
130119922 0
168655684 0
140150085 0
173454039 0
16352

134972692 0
163105400 0
14847256 0
179883869 0
119177082 0
66207514 0
205858628 0
102457988 0
206109283 0
210384319 0
21003104 0
88229967 0
501932 0
161209949 0
185650462 0
149133988 0
133530721 0
163126103 0
173458480 0
34356347 0
165471989 0
182853436 0
130099774 0
139926994 0
163124119 0
161732089 0
162181118 0
106943584 0
209040904 0
116628819 0
193422665 0
84673305 0
132688963 0
199977165 0
1189349 0
203572647 0
162819223 0
212060630 0
203447489 0
173471318 0
127227575 0
120757264 0
187211517 0
162287726 0
151958490 0
76229005 0
175484452 0
208797142 0
173311951 0
160400782 0
96833874 0
187091464 0
193231538 0
203309094 0
203089147 0
191962526 0
190761507 0
201783462 0
161969714 0
163097022 0
191523325 0
196071152 0
193459138 0
209086781 0
160301182 0
180234391 0
162537691 0
201524518 0
180216381 0
203587649 0
207636840 0
76226570 0
137323224 0
193423455 0
122043534 0
95302417 0
46915331 0
32722065 0
203515350 0
19966156 0
225459 0
202065523 0
27573813 0
76818425 0
173700039 0
190

138269099 0
16007057 0
199370435 0
104493048 0
42351877 0
98100354 0
163132364 0
173199839 0
11509276 0
161224200 0
163091108 0
25199934 0
58685176 0
163967082 0
173323084 0
101552369 0
131672003 0
6523899 0
203605890 0
207627077 0
162572250 0
133918731 0
19175905 0
162830713 0
160441307 0
170347637 0
2767461 0
130117669 0
203407048 0
201175339 0
163328804 0
173315329 0
160373846 0
173319118 0
118898248 0
201539626 0
175755633 0
173328304 0
209445527 0
158157681 0
173818957 0
199312992 0
202962209 0
209022061 0
159622335 0
204245272 0
195855747 0
199332938 0
106984516 0
8510293 0
73662914 0
199596523 0
91679236 0
141605927 0
199337048 0
55314237 0
10596997 0
191088368 0
162838262 0
190956518 0
162551155 0
10432 0
114058911 0
159363477 0
139259937 0
182299984 0
163317135 0
154589565 0
40390924 0
91292495 0
162297463 0
30109246 0
115712334 0
170284822 0
211926693 0
61359929 0
158076043 0
38000833 0
211994008 0
199960720 0
160382749 0
153530155 0
213334384 0
11436014 0
134968861 0
4378283

68466427 0
19686714 0
92549088 0
186734975 0
133311817 0
209198236 0
162568416 0
163124388 0
160168780 0
191986467 0
137774747 0
139732198 0
144980919 0
61441376 0
180509028 0
102115038 0
203600398 0
173488919 0
131501868 0
163726871 0
77211680 0
135623130 0
203554083 0
133509863 0
187084754 0
112886124 0
86526477 0
209331368 0
185624689 0
161163130 0
160409352 0
119590825 0
98477239 0
39277699 0
80006607 0
203436781 0
146100412 0
201312226 0
193417657 0
130112873 0
159619973 0
121084369 0
182895710 0
160405892 0
80006843 0
51667143 0
163153609 0
186780706 0
163143281 0
163531604 0
163354336 0
133335833 0
192142250 0
173487204 0
18381281 0
182268152 0
88457296 0
80007026 0
150008257 0
98799636 0
173317512 0
441961 0
151732743 0
188300933 0
140393071 0
203608263 0
182342783 0
162832454 0
213507904 0
187182457 0
61899879 0
163355060 0
208710038 0
203599554 0
11518 0
203547767 0
98818920 0
4478926 0
19508558 0
163140518 0
29709529 0
39854919 0
129183537 0
192002738 0
44541692 0
162833133 

195866698 0
183378257 0
47911735 0
28794357 0
162835334 0
189466944 0
209025783 0
163139667 0
203596828 0
29394216 0
201528493 0
208958588 0
165501673 0
152857791 0
162301196 0
203655888 0
45387853 0
199026968 0
170137904 0
174020580 0
199085867 0
149193726 0
173317802 0
4377585 0
162280461 0
201539978 0
188718920 0
2489934 0
186676913 0
169213423 0
162525663 0
71007668 0
196677500 0
56011607 0
119337282 0
205523582 0
50391709 0
161167639 0
166372279 0
18723020 0
166687976 0
151322504 0
743313 0
87829666 0
39983965 0
76750563 0
190959051 0
193456037 0
101702742 0
199337881 0
193414880 0
191024794 0
207655344 0
95891080 0
202943576 0
57781183 0
39401486 0
76668317 0
173553245 0
95251808 0
170110652 0
88484947 0
149193451 0
162540971 0
12409211 0
199374311 0
130117283 0
170112826 0
3407349 0
102290430 0
133519199 0
203644158 0
208860824 0
190982276 0
132929721 0
30534082 0
162292615 0
64756412 0
163329581 0
191673591 0
160318139 0
171278903 0
186830644 0
173476335 0
173321255 0
205820716

208687125 0
203663691 0
163535679 0
193454677 0
154420667 0
173491273 0
173301338 0
126273751 0
119705031 0
2728464 0
199320905 0
161249142 0
193457213 0
175063393 0
166822460 0
193421047 0
143598009 0
51813683 0
170340488 0
201235024 0
173462239 0
201527989 0
133302798 0
163097760 0
10888295 0
173760568 0
156990063 0
53011212 0
199383638 0
203401828 0
162285767 0
173321102 0
145235500 0
13126620 0
189485658 0
199261020 0
13840154 0
203646847 0
149775347 0
105377652 0
178317441 0
104292134 0
190967596 0
203666583 0
72901132 0
186763495 0
20591056 0
10546 0
199250033 0
155134587 0
173822192 0
88345846 0
110350857 0
191267944 0
211917617 0
191978715 0
226837 0
126609551 0
173472469 0
6750463 0
3589923 0
163153702 0
100767487 0
87219518 0
203552007 0
62375196 0
202524225 0
203079213 0
209964799 0
208690815 0
173475503 0
197358 0
2622447 0
173324404 0
182857715 0
84533500 0
176120653 0
162827714 0
121726929 0
173322771 0
209290954 0
201600201 0
172110274 0
173312023 0
201570699 0
14175352 

173318465 0
161171850 0
203588723 0
128299620 0
162520984 0
195860316 0
203572166 0
163090545 0
201486146 0
163100087 0
209079960 0
162304669 0
191837158 0
90350598 0
77564713 0
88634465 0
86371875 0
158066512 0
201600821 0
114866093 0
208750208 0
132832482 0
51973413 0
27416923 0
144926789 0
170797965 0
182892685 0
173476415 0
162288430 0
191126530 0
202946928 0
208850956 0
195846790 0
205676293 0
168826152 0
191596225 0
203804105 0
203717962 0
156991954 0
160290040 0
178465897 0
191167365 0
186790276 0
156479521 0
199383978 0
1007013 0
156796353 0
119589985 0
163330147 0
45339649 0
191403980 0
195870462 0
193417619 0
208689621 0
83411597 0
173320098 0
194924047 0
201237676 0
198848816 0
29562478 0
162554676 0
162568190 0
133122078 0
203608129 0
151133727 0
153711585 0
198897466 0
203659231 0
205884371 0
117731782 0
147115161 0
198847478 0
173817685 0
183936318 0
165760599 0
191590567 0
159798457 0
11323894 0
1340138 0
95122605 0
110870578 0
161171701 0
173323917 0
211104964 0
1634955

Yahoo! Everything looks great.
One final check to confirm that the solution has the required dimensions (one entry per unique shopid, 18770 in total):

In [649]:
len(submissions_dict_final)

18770
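
The length check above confirms only the number of entries. A slightly stricter check, sketched here on toy data, also confirms that the keys are exactly the set of unique shop IDs. The names `toy_orders` and `toy_submissions` are hypothetical stand-ins for `shopee_df` and `submissions_dict_final`, and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for shopee_df and submissions_dict_final.
toy_orders = pd.DataFrame({
    "shopid": [10, 10, 20, 30],
    "userid": [1, 2, 3, 4],
})
toy_submissions = {10: 0, 20: 3, 30: 0}

# Unique shop IDs in the order history, as plain Python ints.
expected_shops = set(toy_orders["shopid"].unique().tolist())

# Same number of entries as unique shops ...
same_length = len(toy_submissions) == toy_orders["shopid"].nunique()
# ... and exactly the same shop IDs, with none missing and none extra.
same_keys = set(toy_submissions) == expected_shops

print(same_length, same_keys)
```

If either flag is `False`, the difference `expected_shops - set(toy_submissions)` shows which shops are missing.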

#### Output the solution to CSV:

In [652]:
import csv

In [650]:
# newline="" lets the csv module control line endings (avoids blank rows on Windows)
order_brushing_submissions_file = open("order_brushing_submissions_file.csv", "w", newline="")

In [654]:
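# Write one row per shop: shopid in the first column, its value in the second
writer = csv.writer(order_brushing_submissions_file)

for key, value in submissions_dict_final.items():
    writer.writerow([key, value])

# Close the file so that all buffered rows are flushed to disk
order_brushing_submissions_file.close()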
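
The same export can also be sketched with a `with` block, so the file is closed even if a write fails, and then read back to confirm its shape. This is only an illustration under stated assumptions: the header names `shopid` and `userid`, the toy dictionary, and the temporary output path are not taken from the notebook, so check the required submission format before adding a header row.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for submissions_dict_final (shopid -> 0 or a userid).
toy_submissions = {10: 0, 20: 3, 30: 0}

# Write to a temporary directory so the sketch leaves no files behind in the project.
out_path = os.path.join(tempfile.mkdtemp(), "toy_submissions.csv")

with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["shopid", "userid"])  # assumed header names
    for shopid, userid in toy_submissions.items():
        writer.writerow([shopid, userid])

# Read the file back: one header row plus one row per shop is expected.
with open(out_path, newline="") as f:
    rows = list(csv.reader(f))

print(len(rows), rows[0])
```

Reading the file back is a cheap way to catch a missing header or a row-count mismatch before submitting.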