<h1><font color='black'>Concatenating tables</font></h1>

<br>

<ul>
    <li><b><code>pd.concat([df_1, df_2], ignore_index=True)</code></b> - Stacks tables on top of one another, producing a longer table. The <code>ignore_index=True</code> argument discards the original row labels and renumbers the result from 0.</li>
    <br>
    <li><b><code>ignore_index=False, keys=['key_1', 'key_2']</code></b> - Specifying <code>keys</code> adds an outer index level to the result, one key per input table. Leave <code>ignore_index</code> at <code>False</code>, otherwise the keys are discarded. Below, the keys <code>'table_1'</code> and <code>'table_2'</code> identify each row's source table.</li>
    <br>
    <li><b><code>diff_cols</code></b> - Tables with different sets of columns can still be concatenated with <code>pd.concat</code>. Columns missing from one table are filled with <code>NaN</code> for that table's rows.</li>
    <br>
    <li><b><code>diff_cols_inner</code></b> - Specifying <code>join='inner'</code> keeps only the columns present in every concatenated table.</li>
    <br>
    <li><b><code>desk.append(mob)</code></b> - <code>.append()</code> works much like <code>pd.concat</code> and accepts many of the same arguments. Note that <code>DataFrame.append</code> was deprecated in pandas 1.4 and removed in pandas 2.0, so it only runs on older versions.</li>
</ul>

<br>

In [14]:
# libraries
import pandas as pd
import numpy as np

# import
def csv_to_df(file_path):
    df = pd.read_csv(file_path)
    return df

# data (forward slashes avoid the invalid '\d' escape and work on any OS)
dev = csv_to_df('Data/data_device.csv')
jrny = csv_to_df('Data/data_journey.csv')
desk = dev[dev['device_type']=='Desktop']
mob = dev[dev['device_type']=='Mobile']
mob_jrny = mob.merge(jrny, on='session_id', how='left')

# preview the device table
dev.head()

Unnamed: 0,session_id,device_type
0,ad22f37f-3090,Desktop
1,5c503911-6193,App iOS
2,a36043e6-3259,App Android
3,fbe343ca-4075,App Android
4,efaee988-3573,Mobile


<br>

<ul>
    <li><b><code>pd.concat([df_1, df_2], ignore_index=True)</code></b> - Stacks tables on top of one another, producing a longer table. The <code>ignore_index=True</code> argument discards the original row labels and renumbers the result from 0.</li>
</ul>

<br>

In [11]:
# basic concat
conc = pd.concat([desk, mob], ignore_index=True)
conc

Unnamed: 0,session_id,device_type
0,ad22f37f-3090,Desktop
1,fca2239f-5308,Desktop
2,60cf4a7f-1066,Desktop
3,91d5cdde-4196,Desktop
4,447fefda-3199,Desktop
...,...,...
4004,4f196e78-6730,Mobile
4005,8b0de634-382,Mobile
4006,66a2ced8-4037,Mobile
4007,92167cc7-9704,Mobile
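
The effect of <code>ignore_index</code> is easier to see on a tiny example. The sketch below uses two hypothetical stand-in tables (<code>desk_demo</code> and <code>mob_demo</code>, not the notebook's data) whose row labels are deliberately non-contiguous, as they would be after filtering.

```python
import pandas as pd

# Hypothetical stand-ins for the filtered desk and mob tables
desk_demo = pd.DataFrame(
    {"session_id": ["a1", "a2"], "device_type": ["Desktop", "Desktop"]},
    index=[0, 7],
)
mob_demo = pd.DataFrame(
    {"session_id": ["b1", "b2"], "device_type": ["Mobile", "Mobile"]},
    index=[4, 9],
)

# Default behaviour: the original row labels are carried over
kept = pd.concat([desk_demo, mob_demo])

# ignore_index=True: labels are dropped and rows renumbered from 0
renumbered = pd.concat([desk_demo, mob_demo], ignore_index=True)

print(kept.index.tolist())        # [0, 7, 4, 9]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Keeping the original labels is useful when they still point back to rows in the source table; renumbering is usually tidier when they do not.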


<br>

<ul>
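    <li><b><code>ignore_index=False, keys=['key_1', 'key_2']</code></b> - Specifying <code>keys</code> adds an outer index level to the result, one key per input table. Leave <code>ignore_index</code> at <code>False</code>, otherwise the keys are discarded. Below, the keys <code>'table_1'</code> and <code>'table_2'</code> identify each row's source table.</li>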
</ul>

<br>

In [12]:
# add keys
keys = pd.concat([desk, mob], ignore_index=False, keys=['table_1', 'table_2'])
keys

Unnamed: 0,Unnamed: 1,session_id,device_type
table_1,0,ad22f37f-3090,Desktop
table_1,7,fca2239f-5308,Desktop
table_1,29,60cf4a7f-1066,Desktop
table_1,51,91d5cdde-4196,Desktop
table_1,53,447fefda-3199,Desktop
...,...,...,...
table_2,8014,4f196e78-6730,Mobile
table_2,8016,8b0de634-382,Mobile
table_2,8018,66a2ced8-4037,Mobile
table_2,8019,92167cc7-9704,Mobile
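
Once the keys are in the index, they can be used to pull one source table back out or be turned into an ordinary column. The following is a minimal sketch on hypothetical stand-in tables; the level names <code>origin</code> and <code>row</code> are illustrative choices, not names used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical stand-ins for the desk and mob tables
desk_demo = pd.DataFrame({"session_id": ["a1", "a2"]}, index=[0, 7])
mob_demo = pd.DataFrame({"session_id": ["b1"]}, index=[4])

keyed = pd.concat([desk_demo, mob_demo], keys=["table_1", "table_2"])

# The outer index level holds the key, so .loc selects one source table
only_desk = keyed.loc["table_1"]

# Naming the index levels lets reset_index expose the key as a column
flat = pd.concat(
    [desk_demo, mob_demo],
    keys=["table_1", "table_2"],
    names=["origin", "row"],
).reset_index()

print(only_desk.index.tolist())  # [0, 7]
print(flat.columns.tolist())     # ['origin', 'row', 'session_id']
```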


<br>

<ul>
    <li><b><code>diff_cols</code></b> - Tables with different sets of columns can still be concatenated. Columns missing from one table are filled with <code>NaN</code> for that table's rows.</li>
</ul>

<br>

In [16]:
# tables have different fields
diff_cols = pd.concat([desk, mob_jrny], ignore_index=True)
diff_cols

Unnamed: 0,session_id,device_type,account_login,basket_page,checkout_page,order_placed
0,ad22f37f-3090,Desktop,,,,
1,fca2239f-5308,Desktop,,,,
2,60cf4a7f-1066,Desktop,,,,
3,91d5cdde-4196,Desktop,,,,
4,447fefda-3199,Desktop,,,,
...,...,...,...,...,...,...
4004,4f196e78-6730,Mobile,0.0,0.0,0.0,0.0
4005,8b0de634-382,Mobile,1.0,0.0,0.0,0.0
4006,66a2ced8-4037,Mobile,0.0,0.0,0.0,0.0
4007,92167cc7-9704,Mobile,0.0,0.0,0.0,0.0
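
A side effect worth knowing: when <code>NaN</code> is introduced into a column of whole numbers, pandas converts that column to floating point, which may be why flags can show up as <code>0.0</code> and <code>1.0</code> after concatenation. A small sketch on hypothetical stand-in tables (the single <code>order_placed</code> column stands in for the journey fields):

```python
import pandas as pd

# Hypothetical stand-ins: the second table has one extra integer column
desk_demo = pd.DataFrame(
    {"session_id": ["a1", "a2"], "device_type": ["Desktop", "Desktop"]}
)
mob_jrny_demo = pd.DataFrame(
    {
        "session_id": ["b1", "b2"],
        "device_type": ["Mobile", "Mobile"],
        "order_placed": [0, 1],
    }
)

combined = pd.concat([desk_demo, mob_jrny_demo], ignore_index=True)

# Rows from desk_demo have no order_placed value, so they become NaN
print(combined["order_placed"].isna().sum())  # 2

# The integer column is upcast to float to make room for NaN
print(pd.api.types.is_integer_dtype(mob_jrny_demo["order_placed"]))  # True
print(pd.api.types.is_float_dtype(combined["order_placed"]))         # True
```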


<br>

<ul>
    <li><b><code>diff_cols_inner</code></b> - Specifying <code>join='inner'</code> keeps only the columns present in every concatenated table.</li>
</ul>

<br>

In [17]:
# inner join: keep only the columns shared by both tables
diff_cols_inner = pd.concat([desk, mob_jrny], join='inner')
diff_cols_inner

Unnamed: 0,session_id,device_type
0,ad22f37f-3090,Desktop
7,fca2239f-5308,Desktop
29,60cf4a7f-1066,Desktop
51,91d5cdde-4196,Desktop
53,447fefda-3199,Desktop
...,...,...
3093,4f196e78-6730,Mobile
3094,8b0de634-382,Mobile
3095,66a2ced8-4037,Mobile
3096,92167cc7-9704,Mobile
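
For comparison, this sketch runs the default (outer) join and the inner join side by side on hypothetical stand-in tables, so the difference in the resulting columns is visible directly.

```python
import pandas as pd

# Hypothetical stand-ins: only the second table has order_placed
desk_demo = pd.DataFrame(
    {"session_id": ["a1", "a2"], "device_type": ["Desktop", "Desktop"]}
)
mob_jrny_demo = pd.DataFrame(
    {
        "session_id": ["b1", "b2"],
        "device_type": ["Mobile", "Mobile"],
        "order_placed": [0, 1],
    }
)

# Default join='outer': union of columns, gaps filled with NaN
outer = pd.concat([desk_demo, mob_jrny_demo], ignore_index=True)

# join='inner': intersection of columns, so no NaN is introduced here
inner = pd.concat([desk_demo, mob_jrny_demo], join="inner", ignore_index=True)

print(outer.columns.tolist())  # ['session_id', 'device_type', 'order_placed']
print(inner.columns.tolist())  # ['session_id', 'device_type']
```

The inner join trades information for cleanliness: the journey column is lost for every row, including the rows that had values for it.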


<br>

<ul>
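    <li><b><code>desk.append(mob)</code></b> - <code>.append()</code> works much like <code>pd.concat</code> and accepts many of the same arguments. Note that <code>DataFrame.append</code> was deprecated in pandas 1.4 and removed in pandas 2.0, so it only runs on older versions.</li>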
</ul>

<br>

In [18]:
# DataFrame.append was removed in pandas 2.0; this cell needs an older version
apnd = desk.append(mob, ignore_index=True, sort=True)
apnd

Unnamed: 0,device_type,session_id
0,Desktop,ad22f37f-3090
1,Desktop,fca2239f-5308
2,Desktop,60cf4a7f-1066
3,Desktop,91d5cdde-4196
4,Desktop,447fefda-3199
...,...,...
4004,Mobile,4f196e78-6730
4005,Mobile,8b0de634-382
4006,Mobile,66a2ced8-4037
4007,Mobile,92167cc7-9704
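
On pandas 2.0 and later, where <code>DataFrame.append</code> no longer exists, the same stacking can be done with <code>pd.concat</code>. The sketch below uses hypothetical stand-in tables and sorts the columns explicitly with <code>sort_index(axis=1)</code>, which is one way (an assumption about intent, not the notebook's original code) to reproduce the alphabetical column order shown above.

```python
import pandas as pd

# Hypothetical stand-ins for the desk and mob tables
desk_demo = pd.DataFrame(
    {"session_id": ["a1", "a2"], "device_type": ["Desktop", "Desktop"]},
    index=[0, 7],
)
mob_demo = pd.DataFrame(
    {"session_id": ["b1", "b2"], "device_type": ["Mobile", "Mobile"]},
    index=[4, 9],
)

# pd.concat replaces desk_demo.append(mob_demo, ignore_index=True)
apnd_demo = pd.concat([desk_demo, mob_demo], ignore_index=True)

# Explicitly order the columns alphabetically
apnd_sorted = apnd_demo.sort_index(axis=1)

print(apnd_sorted.columns.tolist())  # ['device_type', 'session_id']
print(apnd_sorted.index.tolist())    # [0, 1, 2, 3]
```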
