In [5]:
from pymodulo.SpatialStratification import GridStratification
from pymodulo.DataLoader import CSVDataLoader
from pymodulo.VehicleSelector import GreedyVehicleSelector, MaxPointsVehicleSelector, RandomVehicleSelector

## Spatial stratification process
In this example, we use grid stratification. At this step, you need to decide the spatial granularity, which for a grid means the side length of each grid cell. In the following example, we set this length to 1 km.

In [6]:
# Here, we keep a cellSide of length 1 km (the first argument)
spatial = GridStratification(1, 77.58467674255371, 12.958180959662695, 77.60617733001709, 12.977167633046893)
spatial.stratify()

Now, `spatial.input_geojson` holds the GeoJSON containing the strata (each with its stratum ID). Below, we print the first stratum that was generated. If desired, you can store this GeoJSON using Python's built-in `json` library.

In [7]:
spatial.input_geojson['features'][0]

{'type': 'Feature',
 'properties': {'stratum_id': 0},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[77.58619882699857, 12.958681092717548],
    [77.58619882699857, 12.967674296354794],
    [77.5954270362854, 12.967674296354794],
    [77.5954270362854, 12.958681092717548],
    [77.58619882699857, 12.958681092717548]]]}}
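
Storing the strata with the `json` library could look like the minimal sketch below. The `strata_geojson` dictionary is a stand-in for `spatial.input_geojson` built from the stratum printed above; the top-level `"type": "FeatureCollection"` key, the helper names `save_geojson` and `load_geojson`, and the file name are assumptions made for illustration.

```python
import json
import os
import tempfile

# Stand-in for spatial.input_geojson (assumed to be a GeoJSON FeatureCollection).
strata_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"stratum_id": 0},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [77.58619882699857, 12.958681092717548],
                    [77.58619882699857, 12.967674296354794],
                    [77.5954270362854, 12.967674296354794],
                    [77.5954270362854, 12.958681092717548],
                    [77.58619882699857, 12.958681092717548],
                ]],
            },
        }
    ],
}


def save_geojson(geojson, path):
    """Write a GeoJSON dictionary to disk (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(geojson, f, indent=2)


def load_geojson(path):
    """Read a GeoJSON dictionary back from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "strata.geojson")
    save_geojson(strata_geojson, path)
    reloaded = load_geojson(path)
```

In the notebook itself, you would pass `spatial.input_geojson` in place of `strata_geojson` and choose a permanent path.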

## Data loading process

In this step, we upload the vehicle mobility data to a [MongoDB](https://docs.mongodb.com/) database. You need to take care of a few things here:
1. You must ensure that you have a MongoDB server (local or remote) running before you continue with this process.
2. The input CSV file must contain the following columns: `vehicle_id`, `timestamp`, `latitude`, `longitude`.
3. You will need to decide upon a `temporal_granularity` (in seconds). In this example, we use a temporal granularity of 1 hour (= 3600 seconds).
4. You will need to decide on the database name and the collection name (inside that database) that you want to upload your data to.
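
Before loading, it can help to confirm that the CSV header has the four required columns from item 2. The sketch below uses only the standard library; `missing_columns` is a hypothetical helper (not part of `pymodulo`), and the in-memory sample row is invented for illustration.

```python
import csv
import io

# Column names required by the data loader, as listed above.
REQUIRED_COLUMNS = {"vehicle_id", "timestamp", "latitude", "longitude"}


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header (empty set if none)."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header


# Illustrative in-memory CSV; the row values are made up.
sample = io.StringIO(
    "vehicle_id,timestamp,latitude,longitude\n"
    "1,1589374800,12.9600,77.5870\n"
)
print(missing_columns(sample))  # set()
```

For a file on disk, open it with `open('sample_mobility_data.csv', newline='')` and pass the file object to the same function.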

In [8]:
dataloader = CSVDataLoader('sample_mobility_data.csv', 3600,
                           anonymize_data=False,
                           mongo_uri='mongodb://localhost:27017/',
                           db_name='modulo',
                           collection_name='mobility_data')

At this point, if you want, you can check your MongoDB database using a [MongoDB GUI](https://www.mongodb.com/products/compass). You should see your data uploaded in the database.

Now, we need to compute the stratum ID that each vehicle mobility datum falls into. Similarly, we also need to calculate the temporal ID that each datum falls into. Think of the temporal ID as referring to a "time bucket", each of length `temporal_granularity`. Each of the two methods below returns the number of records that were updated, with the `stratum_id` and the `temporal_id` respectively.

In [9]:
dataloader.compute_stratum_id_and_update_db(spatial)

80
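
To picture what assigning a stratum ID involves, here is a rough sketch that checks whether a point lies inside a rectangular grid cell. It is not the library's implementation (how `pymodulo` performs the lookup is not shown in this notebook); it assumes axis-aligned cells like the stratum printed earlier, and `bounding_box` and `find_stratum_id` are hypothetical helpers.

```python
# The first stratum printed earlier in this notebook.
first_stratum = {
    "type": "Feature",
    "properties": {"stratum_id": 0},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [77.58619882699857, 12.958681092717548],
            [77.58619882699857, 12.967674296354794],
            [77.5954270362854, 12.967674296354794],
            [77.5954270362854, 12.958681092717548],
            [77.58619882699857, 12.958681092717548],
        ]],
    },
}


def bounding_box(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) of a polygon's outer ring."""
    ring = feature["geometry"]["coordinates"][0]
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


def find_stratum_id(features, lon, lat):
    """Return the stratum_id of the first cell containing (lon, lat), else None.

    Assumes every cell is an axis-aligned rectangle, so a bounding-box
    test is equivalent to a point-in-polygon test.
    """
    for feature in features:
        min_lon, min_lat, max_lon, max_lat = bounding_box(feature)
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            return feature["properties"]["stratum_id"]
    return None


print(find_stratum_id([first_stratum], 77.59, 12.96))  # 0
print(find_stratum_id([first_stratum], 77.60, 12.96))  # None
```

For non-rectangular strata, a proper point-in-polygon test or a geospatial database query would be needed instead.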

In [10]:
dataloader.compute_temporal_id_and_update_db()

80
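
The "time bucket" idea can be illustrated by rounding each timestamp down to a multiple of `temporal_granularity`. This is an assumption about how the temporal ID is derived, not a confirmed description of the library; it does agree with the `timestamp` and `temporal_id` pairs printed by `df.head()` in this notebook. `temporal_bucket` is a hypothetical helper.

```python
def temporal_bucket(timestamp, temporal_granularity):
    """Round a Unix timestamp down to the start of its time bucket."""
    return timestamp - (timestamp % temporal_granularity)


# Timestamps taken from the sample rows in this notebook, with 1-hour buckets.
print(temporal_bucket(1589378141, 3600))  # 1589374800
print(temporal_bucket(1589385081, 3600))  # 1589382000
```

All timestamps within the same hour map to the same bucket start, which is what lets records be grouped by `temporal_id`.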

You can use the following helper function to fetch the vehicle mobility data stored in the database. It returns the stored values as a Pandas DataFrame, which is convenient for sanity checks and further analysis.

In [11]:
df = dataloader.fetch_data()
df.head()

| | vehicle_id | timestamp | location | stratum_id | temporal_id |
|---|---|---|---|---|---|
| 0 | 1 | 1589374800 | {'type': 'Point', 'coordinates': [77.586709756... | 0 | 1589374800 |
| 1 | 1 | 1589378141 | {'type': 'Point', 'coordinates': [77.588850139... | 1 | 1589374800 |
| 2 | 1 | 1589377691 | {'type': 'Point', 'coordinates': [77.596664352... | 2 | 1589374800 |
| 3 | 1 | 1589376922 | {'type': 'Point', 'coordinates': [77.602595282... | 3 | 1589374800 |
| 4 | 1 | 1589385081 | {'type': 'Point', 'coordinates': [77.591034132... | 0 | 1589382000 |


## Vehicle selection process

Now, we can finally use the available algorithms to select the desired number of vehicles. In the following example, we assume that we want to choose 2 vehicles.

The vehicle selection ("training") process requires the vehicle mobility data from the database. We use another helper method in `DataLoader` to fetch this data as a Pandas DataFrame.

In [12]:
selection_df = dataloader.fetch_data_for_vehicle_selection()

In [13]:
# Using greedy
greedy = GreedyVehicleSelector(2, selection_df, 1589389199)
selected_vehicles = greedy.train()
greedy.test(selected_vehicles)

{'selected_vehicles': [7, 1], 'coverage': 100.0}

In [14]:
# Using max-points
maxpoints = MaxPointsVehicleSelector(2, selection_df, 1589389199)
selected_vehicles = maxpoints.train()
maxpoints.test(selected_vehicles)

{'selected_vehicles': [5, 7], 'coverage': 62.5}

In [18]:
# Using random
random_algo = RandomVehicleSelector(2, selection_df, 1589389199)
selected_vehicles = random_algo.train()
random_algo.test(selected_vehicles)

{'selected_vehicles': [1, 2], 'coverage': 56.25}
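
To build intuition for why the selectors above can reach different coverage values, the sketch below implements a generic greedy maximum-coverage heuristic on toy data. It is not the `pymodulo` implementation: it assumes that coverage means the percentage of distinct (`stratum_id`, `temporal_id`) cells visited by the selected vehicles, which the notebook does not confirm. All function names and the toy records are hypothetical.

```python
from collections import defaultdict


def cells_by_vehicle(records):
    """Map each vehicle to the set of (stratum_id, temporal_id) cells it visited."""
    cells = defaultdict(set)
    for vehicle_id, stratum_id, temporal_id in records:
        cells[vehicle_id].add((stratum_id, temporal_id))
    return dict(cells)


def coverage_percent(cells, selected, all_cells):
    """Percentage of all_cells visited by at least one selected vehicle (assumed metric)."""
    covered = set()
    for vehicle_id in selected:
        covered |= cells.get(vehicle_id, set())
    return 100.0 * len(covered & all_cells) / len(all_cells)


def greedy_select(cells, k):
    """Pick up to k vehicles, each time taking the one that adds the most new cells."""
    selected, covered = [], set()
    for _ in range(k):
        candidates = (v for v in cells if v not in selected)
        best = max(candidates, key=lambda v: len(cells[v] - covered), default=None)
        if best is None:
            break
        selected.append(best)
        covered |= cells[best]
    return selected


# Toy records: (vehicle_id, stratum_id, temporal_id). Values are made up.
records = [
    (1, 0, 0), (1, 1, 0), (1, 2, 0),
    (2, 0, 0), (2, 1, 0),
    (3, 3, 0),
]
cells = cells_by_vehicle(records)
all_cells = set().union(*cells.values())

chosen = greedy_select(cells, 2)
print(chosen)                                      # [1, 3]
print(coverage_percent(cells, chosen, all_cells))  # 100.0
print(coverage_percent(cells, [1, 2], all_cells))  # 75.0
```

In this toy data, vehicles 1 and 2 have the most visited cells but overlap heavily, so choosing them covers fewer cells than the greedy pair. This mirrors the gap between the greedy and max-points results above, under the stated assumption about the metric.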