# Take Home Exam: Mobility Index Calculation and Profiling

## 1. Create "Mobility Index" and "Mobility Class"

Using the following features/indicators:<br>
<br>
    1. Total Distance Traveled<br>
    2. Radius of Gyration<br>
    3. Activity Entropy<br>
     
Create a calculated feature called **Mobility Index** (type: decimal/float) and a **Mobility Class** (categorized as Low, Mid, or High) for each subscriber.<br>
The team is free to use any method or technique to arrive at the **OPTIMAL** Mobility Index and Mobility Class, as long as it is supported by the literature.<br>
<br>
**The submission deadline is September 1, 2023.** <br>
<br>
<br>
**Criteria for scoring**<br>
1. Creation of mobility class - 50 pts  <br>
2. Creation of mobility index - 30 pts  <br>
3. Efficiency of process      - 20 pts  <br>
    Total                     - 100 pts


## 2. Example of Mobility Index

In [1]:
# Standard library
import random
from functools import partial, reduce

# Third-party
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pendulum
import pyproj
import shapely
from IPython.display import HTML, display
from scipy import stats

# Show up to 200 rows when displaying a DataFrame
pd.options.display.max_rows = 200

#### Sample ABT

In [2]:
file_path_sample_data = "C:/Users/10012425/Desktop/sds4gdsp/scoring_base.csv"
ABT_mobility = pd.read_csv(file_path_sample_data)
ABT_mobility.sample(5)

| Unnamed: 0.1 | Unnamed: 0 | sub_uid | gender | age | name | chi_indicator | ewallet_user_indicator | total_travel_distance | radius_of_gyration | activity_entropy |
|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 32 | glo-sub-023 | female | 62 | Brandi Taylor | True | Y | 187400.353151 | 2773.800971 | 1.393043 |
| 22 | 22 | glo-sub-061 | male | 64 | David Evans | False | Y | 168249.88212 | 1710.550891 | NaN |
| 17 | 17 | glo-sub-069 | male | 51 | Kevin Gibson | True | N | 162675.199376 | 1274.739666 | 1.279325 |
| 4 | 4 | glo-sub-076 | male | 46 | Colin Mejia | False | N | 123646.921636 | 1671.299423 | NaN |
| 98 | 98 | glo-sub-046 | male | 24 | Luis Jackson | True | Y | 385134.568227 | 2206.90806 | 1.228771 |


### Apply min-max scaling

In [3]:
from sklearn.preprocessing import MinMaxScaler

In [4]:
# Keep only the three mobility indicators used to build the index
column_headers = ['total_travel_distance', 'radius_of_gyration', 'activity_entropy']
df_capping = ABT_mobility[column_headers].copy()

In [5]:
# Create a MinMaxScaler instance
scaler = MinMaxScaler()

# Fit the scaler on the data and transform it
scaled_data = scaler.fit_transform(df_capping)

# Convert scaled data back to a DataFrame
scaled_df = pd.DataFrame(scaled_data, columns=column_headers)
scaled_df

| Unnamed: 0 | total_travel_distance | radius_of_gyration | activity_entropy |
|---|---|---|---|
| 0 | 0.0 | 0.015741 | 0.400992 |
| 1 | 0.076253 | 0.182373 | 0.504028 |
| 2 | 0.09854 | 0.175793 | 0.692796 |
| 3 | 0.100956 | 0.024102 | 0.004103 |
| 4 | 0.102464 | 0.369021 | NaN |
| 5 | 0.126786 | 0.0 | 0.128279 |
| 6 | 0.130826 | 0.088019 | 0.368835 |
| 7 | 0.13777 | 0.024837 | 0.115453 |
| 8 | 0.139332 | 0.223741 | 0.638321 |
| 9 | 0.141096 | 0.183562 | 0.408091 |
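
`MinMaxScaler` on its own does not cap outliers, so a single extreme travel distance can compress every other subscriber toward 0. The following is a minimal sketch of one way to cap before scaling, assuming percentile clipping (winsorizing) is acceptable for these indicators. The helper name `cap_and_scale`, the quantile choices, and the toy values are illustrative assumptions and are not part of the original notebook.

```python
import numpy as np
import pandas as pd


def cap_and_scale(df, lower_q=0.01, upper_q=0.99):
    """Clip each column to its [lower_q, upper_q] quantiles, then rescale to [0, 1]."""
    lo = df.quantile(lower_q)  # per-column lower cap (NaN values are skipped)
    hi = df.quantile(upper_q)  # per-column upper cap
    capped = df.clip(lower=lo, upper=hi, axis=1)
    # A constant column gives 0 / 0 here and therefore NaN
    return (capped - lo) / (hi - lo)


# Toy data (not taken from scoring_base.csv): one extreme distance, one missing entropy
toy = pd.DataFrame({
    "total_travel_distance": [100.0, 200.0, 300.0, 400.0, 100000.0],
    "activity_entropy": [0.5, 1.0, np.nan, 1.5, 2.0],
})

# A low upper quantile is used only because the toy sample has five rows
scaled_toy = cap_and_scale(toy, lower_q=0.0, upper_q=0.75)
print(scaled_toy)
```

In this sketch the extreme distance is pulled down to the cap, so the remaining values spread over the full 0 to 1 range, and the missing entropy stays missing rather than being silently filled.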


In [7]:
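# Missing indicators (e.g. activity_entropy) are treated as 0, the lowest scaled value
scaled_df = scaled_df.fillna(0)

# Mobility index: unweighted mean of the three scaled indicators
scaled_df["mobility_index"] = (scaled_df["total_travel_distance"] + scaled_df["radius_of_gyration"] + scaled_df["activity_entropy"]) / 3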

In [8]:
scaled_df.sample(5)

| Unnamed: 0 | total_travel_distance | radius_of_gyration | activity_entropy | mobility_index |
|---|---|---|---|---|
| 93 | 0.821048 | 0.578163 | 0.781067 | 0.726759 |
| 2 | 0.09854 | 0.175793 | 0.692796 | 0.322376 |
| 88 | 0.771381 | 0.683648 | 0.770484 | 0.741838 |
| 27 | 0.263391 | 0.734126 | 0.0 | 0.332506 |
| 10 | 0.145877 | 0.018146 | 0.150974 | 0.104999 |
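
Filling a missing indicator with 0 lowers that subscriber's index, as row 27 shows with an `activity_entropy` of 0.0. A possible alternative, sketched here on toy values and offered only as an assumption about what the team may prefer, is to average over the indicators that are actually available. The names `toy_scaled`, `index_zero_filled`, and `index_available` are hypothetical.

```python
import numpy as np
import pandas as pd

INDICATORS = ["total_travel_distance", "radius_of_gyration", "activity_entropy"]

# Toy scaled values (illustrative only); the second subscriber has no entropy
toy_scaled = pd.DataFrame({
    "total_travel_distance": [0.2, 0.8],
    "radius_of_gyration": [0.4, 0.6],
    "activity_entropy": [0.6, np.nan],
})

# Approach used in the notebook: a missing value counts as 0
index_zero_filled = toy_scaled[INDICATORS].fillna(0).mean(axis=1)

# Alternative: average only the indicators that are present
index_available = toy_scaled[INDICATORS].mean(axis=1, skipna=True)

print(pd.DataFrame({"zero_filled": index_zero_filled, "available": index_available}))
```

The two approaches agree when all three indicators are present and differ only for subscribers with gaps, so the choice mainly affects how incomplete records are ranked.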


In [9]:
low_threshold = 0.3
high_threshold = 0.7

# Map a mobility index value to a class:
#   Low  : value < low_threshold
#   Mid  : low_threshold <= value < high_threshold
#   High : value >= high_threshold
def categorize(value):
    if value < low_threshold:
        return 'Low'
    elif value < high_threshold:
        return 'Mid'
    else:
        # Includes value == high_threshold
        return 'High'

# Apply the categorize function to the 'mobility_index' column
scaled_df['Category'] = scaled_df['mobility_index'].apply(categorize)

scaled_df.sample(5)

| Unnamed: 0 | total_travel_distance | radius_of_gyration | activity_entropy | mobility_index | Category |
|---|---|---|---|---|---|
| 46 | 0.443289 | 0.37303 | 0.488329 | 0.434882 | Mid |
| 37 | 0.373848 | 0.392623 | 0.465124 | 0.410532 | Mid |
| 62 | 0.539789 | 0.722342 | 0.693652 | 0.651928 | Mid |
| 40 | 0.400479 | 0.487088 | 0.749415 | 0.545661 | Mid |
| 15 | 0.193627 | 0.094938 | 0.560008 | 0.282858 | Low |
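
Because efficiency of process is part of the scoring, the row-by-row `apply` could be replaced with a vectorized binning call. This is a minimal sketch using `pd.cut`, assuming the intended boundaries are Low below 0.3, Mid from 0.3 up to but not including 0.7, and High from 0.7 upward. The toy values are illustrative and include both boundary cases.

```python
import numpy as np
import pandas as pd

low_threshold = 0.3
high_threshold = 0.7

# Toy index values (illustrative), including both boundary values
toy_index = pd.Series([0.104999, 0.3, 0.651928, 0.7, 0.741838], name="mobility_index")

# right=False makes each bin closed on the left: [-inf, 0.3), [0.3, 0.7), [0.7, inf)
toy_class = pd.cut(
    toy_index,
    bins=[-np.inf, low_threshold, high_threshold, np.inf],
    labels=["Low", "Mid", "High"],
    right=False,
)
print(toy_class.tolist())
```

The result is a categorical column with an explicit Low, Mid, High ordering, which also makes later sorting and grouping follow the class order rather than alphabetical order.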


In [11]:
# Keep a handle on the classified table, then count subscribers per mobility class
mobility_class = pd.DataFrame(scaled_df)
scaled_df.groupby('Category').size()

Category
High    21
Low     18
Mid     61
dtype: int64
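
With fixed thresholds of 0.3 and 0.7, most subscribers land in the Mid class. If roughly equal class sizes are wanted instead, one option is quantile-based binning. The sketch below uses `pd.qcut` with terciles on toy values; treating terciles as the cut points is an assumption for illustration, not a method prescribed by the exam.

```python
import pandas as pd

# Toy index values (illustrative only, nine distinct values)
toy_index = pd.Series(
    [0.10, 0.28, 0.33, 0.41, 0.43, 0.55, 0.65, 0.73, 0.74],
    name="mobility_index",
)

# Split at the 1/3 and 2/3 quantiles so each class holds about a third of subscribers
tercile_class = pd.qcut(toy_index, q=3, labels=["Low", "Mid", "High"])

counts = tercile_class.value_counts()
print(counts)
```

Quantile cut points depend on the sample, so a subscriber's class can change when the population changes, whereas fixed thresholds stay comparable across runs.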