<!---------------------- Introduction Section ------------------->
<h1> NumMobility: A Mobility Data Preprocessing Library </h1>

<h2> Introduction </h2>

<p align='justify'>
NumMobility is a state-of-the-art mobility data preprocessing library that mainly deals with filtering trajectory data, generating features from it, and interpolating it.

<b><i> The main features of NumMobility are: </i></b>

<ol align='justify'>
<li> NumMobility relies primarily on parallel computation built
     on pandas and NumPy, which makes it fast compared with other
     available libraries.
</li>

<li> NumMobility harnesses the full power of the machine that
     it is running on by using all the cores available in the
     computer.
</li>

<li> NumMobility uses a customized DataFrame built on top of python
     pandas for representation and storage of Trajectory Data.
</li>

<li> NumMobility also provides several temporal and spatial features,
     most of which are calculated using parallel computation so that
     they are both fast and accurate.
</li>

<li> Moreover, NumMobility also provides several filtering and
     outlier detection methods for cleaning and noise reduction of
     the Trajectory Data.
</li>

<li> Apart from the features mentioned above, <i><b> four </b></i>
     different kinds of Trajectory Interpolation techniques are
     offered by NumMobility which is a first in the community.
</li>
</ol>
</p>

<!----------------- Dataset Link Section --------------------->
<hr style="height:6px;background-color:black">

<p align='justify'>
This introduction to the library uses the seagulls dataset,
which can be downloaded from the link below: <br>
<span> &#8618; </span>
<a href="https://github.com/YakshHaranwala/NumMobility/blob/main/examples/data/gulls.csv" target='_blank'> Seagulls Dataset </a>
</p>

<!----------------- NbViewer Link ---------------------------->
<hr style="height:6px;background-color:black">
<p align='justify'>
Note: Viewing this notebook in GitHub will not render JavaScript
elements. Hence, for a better experience, click the link below
to open the Jupyter notebook in NB viewer.

<span> &#8618; </span>
<a href="https://nbviewer.jupyter.org/github/YakshHaranwala/NumMobility/blob/main/examples/0.%20Intro%20to%20NumMobility.ipynb" target='_blank'> Click Here </a>
</p>

<!------------------------- Documentation Link ----------------->
<hr style="height:6px;background-color:black">
<p align='justify'>
The Link to NumMobility's Documentation is: <br>

<span> &#8618; </span>
<a href='https://nummobility.readthedocs.io/en/latest/' target='_blank'> <i> NumMobility Documentation </i> </a>
</p>
<hr style="height:6px;background-color:black">

<h2> Importing Trajectory Data into a NumPandasTraj Dataframe </h2>

<p align='justify'>
The NumMobility library stores mobility data (trajectories) in a specialised
pandas DataFrame structure called NumPandasTraj. As a result, the following
constraints are enforced on any data that is to be stored in a NumPandasTraj.

<ol align='justify'>
   <li>
        Firstly, to work with the NumMobility library, a mobility dataset must
        contain the following mandatory columns:
       <ul type='square'>
           <li> DateTime </li>
           <li> Trajectory ID </li>
           <li> Latitude </li>
           <li> Longitude </li>
       </ul>
   </li>
   <li>
       Secondly, NumPandasTraj places a specific constraint on the index of the
       dataframe: the library enforces a multi-index consisting of the
       <b><i> Trajectory ID, DateTime </i></b> columns, because its operations
       depend on these two columns. It is therefore recommended to keep the
       multi-index of <b><i> Trajectory ID, DateTime </i></b> unchanged
       at all times.
   </li>
   <li>
        Note that since the NumPandasTraj DataFrame is built on top of
        pandas, it places no restriction on the number of columns in
        the dataset. The only requirement is that the dataset contains
        at least the four columns mentioned above.
   </li>
</ol>
</p>
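
<p align='justify'>
To make the constraints above concrete, the following is a minimal sketch in
plain pandas of what they amount to: check that the four mandatory columns
exist, then move the trajectory ID and timestamp into a two-level index.
The helper name <code>to_indexed_frame</code> and the standardised column
names are assumptions made for illustration only; this is not how
NumPandasTraj is implemented internally.
</p>

```python
import pandas as pd


def to_indexed_frame(df, latitude, longitude, datetime, traj_id):
    """Hypothetical helper (not part of NumMobility): mimic the
    column and index constraints using plain pandas."""
    mandatory = [latitude, longitude, datetime, traj_id]
    missing = [name for name in mandatory if name not in df.columns]
    if missing:
        raise KeyError(f"Missing mandatory columns: {missing}")

    # Rename to one set of standard names (assumed names).
    out = df.rename(columns={latitude: "lat", longitude: "lon",
                             datetime: "DateTime", traj_id: "traj_id"})
    out["DateTime"] = pd.to_datetime(out["DateTime"])

    # Two-level index of (traj_id, DateTime), sorted for lookups.
    return out.set_index(["traj_id", "DateTime"]).sort_index()


raw = pd.DataFrame({
    "lat": [39.984198, 39.984224, 39.984094],
    "lon": [116.319402, 116.319322, 116.319402],
    "datetime": ["2008-10-23 05:53:11", "2008-10-23 05:53:06",
                 "2008-10-23 05:53:30"],
    "id": [1, 1, 1],
})
indexed = to_indexed_frame(raw, "lat", "lon", "datetime", "id")
print(list(indexed.index.names))
print(indexed.shape)
```

<p align='justify'>
Only the latitude and longitude remain as ordinary columns after indexing,
which is why a dataset with just the four mandatory columns reports two
columns in its shape.
</p>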

<hr style="height:6px;background-color:black">

In [1]:
"""
    METHOD - I:
        1. Enter the trajectory data into a list.
        2. Then, convert the list into a NumPandasTraj
           Dataframe to be used with NumMobility Library.
"""
import pandas as pd
from core.TrajectoryDF import NumPandasTraj

list_data = [
    [39.984094, 116.319236, '2008-10-23 05:53:05', 1],
    [39.984198, 116.319322, '2008-10-23 05:53:06', 1],
    [39.984224, 116.319402, '2008-10-23 05:53:11', 1],
    [39.984224, 116.319404, '2008-10-23 05:53:11', 1],
    [39.984224, 116.568956, '2008-10-23 05:53:11', 1],
    [39.984224, 116.568956, '2008-10-23 05:53:11', 1]
]
list_df = NumPandasTraj(data_set=list_data,
                        latitude='lat',
                        longitude='lon',
                        datetime='datetime',
                        traj_id='id')
print(f"The dimensions of the dataframe:{list_df.shape}")
print(f"Type of the dataframe: {type(list_df)}")

The dimensions of the dataframe:(6, 2)
Type of the dataframe: <class 'core.TrajectoryDF.NumPandasTraj'>


In [2]:
"""
    METHOD - II:
        1. Enter the trajectory data into a dictionary.
        2. Then, convert the dictionary into a NumPandasTraj
           Dataframe to be used with NumMobility Library.
"""
dict_data = {
    'lat': [39.984198, 39.984224, 39.984094, 40.98, 41.256],
    'lon': [116.319402, 116.319322, 116.319402, 116.3589, 117],
    'datetime': ['2008-10-23 05:53:11', '2008-10-23 05:53:06', '2008-10-23 05:53:30', '2008-10-23 05:54:06', '2008-10-23 05:59:06'],
    'id' : [1, 1, 1, 3, 3],
}
dict_df = NumPandasTraj(data_set=dict_data,
                        latitude='lat',
                        longitude='lon',
                        datetime='datetime',
                        traj_id='id')
print(f"The dimensions of the dataframe:{dict_df.shape}")
print(f"Type of the dataframe: {type(dict_df)}")

The dimensions of the dataframe:(5, 2)
Type of the dataframe: <class 'core.TrajectoryDF.NumPandasTraj'>


In [3]:
# Now, printing the head of the dataframe with data
# imported from a list.
list_df.head()

traj_id,DateTime,lat,lon
1,2008-10-23 05:53:05,39.984094,116.319236
1,2008-10-23 05:53:06,39.984198,116.319322
1,2008-10-23 05:53:11,39.984224,116.319402
1,2008-10-23 05:53:11,39.984224,116.319404
1,2008-10-23 05:53:11,39.984224,116.568956
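
<p align='justify'>
The list used in Method I contains four rows that share the timestamp
05:53:11 for trajectory 1, and all six rows are kept (the reported shape is
(6, 2)), so the (traj_id, DateTime) index is not unique for this data. The
sketch below uses plain pandas, not the NumMobility API, to show one way of
spotting such repeated index entries. The column names are assumptions chosen
to match the output above.
</p>

```python
import pandas as pd

rows = [
    [39.984094, 116.319236, "2008-10-23 05:53:05", 1],
    [39.984198, 116.319322, "2008-10-23 05:53:06", 1],
    [39.984224, 116.319402, "2008-10-23 05:53:11", 1],
    [39.984224, 116.319404, "2008-10-23 05:53:11", 1],
    [39.984224, 116.568956, "2008-10-23 05:53:11", 1],
    [39.984224, 116.568956, "2008-10-23 05:53:11", 1],
]
frame = pd.DataFrame(rows, columns=["lat", "lon", "DateTime", "traj_id"])
frame["DateTime"] = pd.to_datetime(frame["DateTime"])
frame = frame.set_index(["traj_id", "DateTime"]).sort_index()

# keep=False flags every row whose (traj_id, DateTime) pair occurs
# more than once, not only the later occurrences.
repeated = frame.index.duplicated(keep=False)

print(f"Index is unique: {frame.index.is_unique}")
print(f"Rows sharing an index entry: {int(repeated.sum())}")
print(frame[repeated])
```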


In [4]:
# Now, printing the head of the dataframe with data
# imported from a dictionary.
dict_df.head()

traj_id,DateTime,lat,lon
1,2008-10-23 05:53:06,39.984224,116.319322
1,2008-10-23 05:53:11,39.984198,116.319402
1,2008-10-23 05:53:30,39.984094,116.319402
3,2008-10-23 05:54:06,40.98,116.3589
3,2008-10-23 05:59:06,41.256,117.0


In [5]:
"""
    METHOD - III:
        1. First, import the seagulls dataset from the csv file
           using pandas into a pandas dataframe.
        2. Then, convert the pandas dataframe into a NumPandasTraj
           DataFrame to be used with NumMobility library.
"""
df = pd.read_csv('./data/gulls.csv')
seagulls_df = NumPandasTraj(data_set=df,
                            latitude='location-lat',
                            longitude='location-long',
                            datetime='timestamp',
                            traj_id='tag-local-identifier',
                            rest_of_columns=[])
print(f"The dimensions of the dataframe:{seagulls_df.shape}")
print(f"Type of the dataframe: {type(seagulls_df)}")

The dimensions of the dataframe:(89869, 8)
Type of the dataframe: <class 'core.TrajectoryDF.NumPandasTraj'>


In [6]:
# Now, print the head of the seagulls_df dataframe.
seagulls_df.head()

traj_id,DateTime,event-id,visible,lon,lat,sensor-type,individual-taxon-canonical-name,individual-local-identifier,study-name
91732,2009-05-27 14:00:00,1082620685,True,24.58617,61.24783,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...
91732,2009-05-27 20:00:00,1082620686,True,24.58217,61.23267,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...
91732,2009-05-28 05:00:00,1082620687,True,24.53133,61.18833,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...
91732,2009-05-28 08:00:00,1082620688,True,24.582,61.23283,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...
91732,2009-05-28 14:00:00,1082620689,True,24.5825,61.23267,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...


<h1> Kinematic Trajectory Features </h1>

<p align='justify'>

As mentioned above, NumMobility offers a multitude of features
that are calculated from both the DateTime and the coordinates
of the points given in the data. The two feature modules are named
as follows:

<ul align='justify'>
    <li> temporal_features (based on DateTime) </li>
    <li> spatial_features (based on geographical coordinates) </li>
</ul>
</p>
<hr style="background-color:black; height:7px">

<h2> NumMobility Temporal Features </h2>

<p align='justify'>

The following steps are performed to demonstrate the usage of
Temporal features present in NumMobility:

<ul type='square' align='justify'>
<li> Various features such as Date, Time, Week-day and Time of Day
     are calculated using the functions of the temporal_features.py
     module, and the results are appended to the original dataframe.
</li>
<li> Only a few of the functions in the module are demonstrated
     here, to keep the notebook short. The remaining functions can
     be explored in the documentation of the library, which is
     linked in the introduction section of this notebook.
</li>
</ul>
</p>

In [7]:
%%time

"""
    To demonstrate the temporal features, we will:
        1. First, import the temporal_features.py module from the
           features package.
        2. Generate Date, Day_Of_Week, Time_Of_day features on
           the seagulls dataset.
        3. Print the execution time of the code.
        4. Finally, check the head of the dataframe to
           see the results of feature generation.
"""
from features.temporal_features import TemporalFeatures as temporal

temporal_features_df = temporal.create_date_column(seagulls_df)
temporal_features_df = temporal.create_day_of_week_column(temporal_features_df)
temporal_features_df = temporal.create_time_of_day_column(temporal_features_df)
temporal_features_df.head()

CPU times: user 262 ms, sys: 20.2 ms, total: 282 ms
Wall time: 281 ms


traj_id,DateTime,event-id,visible,lon,lat,sensor-type,individual-taxon-canonical-name,individual-local-identifier,study-name,Date,Day_Of_Week,Time_Of_Day
91732,2009-05-27 14:00:00,1082620685,True,24.58617,61.24783,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...,2009-05-27,Wednesday,Noon
91732,2009-05-27 20:00:00,1082620686,True,24.58217,61.23267,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...,2009-05-27,Wednesday,Evening
91732,2009-05-28 05:00:00,1082620687,True,24.53133,61.18833,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...,2009-05-28,Thursday,Early Morning
91732,2009-05-28 08:00:00,1082620688,True,24.582,61.23283,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...,2009-05-28,Thursday,Early Morning
91732,2009-05-28 14:00:00,1082620689,True,24.5825,61.23267,gps,Larus fuscus,91732A,Navigation experiments in lesser black-backed ...,2009-05-28,Thursday,Noon
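
<p align='justify'>
For readers who want to see what these three features amount to, the sketch
below derives comparable Date, Day_Of_Week and Time_Of_Day columns from the
DateTime index level using plain pandas. It is an illustration only and not
NumMobility's implementation. In particular, the hour bins and the labels
"Late Night", "Morning" and "Night" are assumptions, chosen merely to be
consistent with the rows displayed above; the library's actual boundaries
may differ.
</p>

```python
import pandas as pd

times = pd.to_datetime([
    "2009-05-27 14:00:00", "2009-05-27 20:00:00",
    "2009-05-28 05:00:00", "2009-05-28 08:00:00",
])
index = pd.MultiIndex.from_arrays([[91732] * 4, times],
                                  names=["traj_id", "DateTime"])
frame = pd.DataFrame({"lat": [61.24783, 61.23267, 61.18833, 61.23283]},
                     index=index)

# Pull the timestamps out of the multi-index as a DatetimeIndex.
stamps = frame.index.get_level_values("DateTime")

frame["Date"] = stamps.date
frame["Day_Of_Week"] = stamps.day_name().to_numpy()

# Hypothetical hour bins (left-closed): [0,4), [4,9), [9,12), ...
bins = [0, 4, 9, 12, 16, 21, 24]
labels = ["Late Night", "Early Morning", "Morning",
          "Noon", "Evening", "Night"]
time_of_day = pd.cut(stamps.hour.to_numpy(), bins=bins,
                     labels=labels, right=False)
frame["Time_Of_Day"] = time_of_day.astype(str)

print(frame)
```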
