Code from Demo.ipynb, located in the main repo

In [1]:
import aedes

In [3]:
from aedes.remote_sensing_utils import df_to_ee_points, generate_random_ee_points
from aedes.remote_sensing_utils import visualize_on_map, get_satellite_measures_from_points
from aedes.automl_utils import perform_clustering
from aedes.osm_utils import initialize_OSM_network, get_OSM_network_data, reverse_geocode_points, reverse_geocode_center_of_geojson

In [9]:
import datetime
import pandas as pd
pd.options.display.max_rows = 999
pd.options.display.max_columns = 999

## Remote Sensing

In [4]:
# initialize AEDES
aedes.remote_sensing_utils.initialize()

In [6]:
# Location : QC
aoi_geojson = [[
                [120.98976275,14.58936896],
                [121.13383232,14.58936896],
                [121.13383232,14.77641364],
                [120.98976275,14.77641364],
                [120.98976275,14.58936896]
]]

In [7]:
# Sample long lat points and get satellite remote sensing data
points = generate_random_ee_points(aoi_geojson, sample_points=5)

The function generate_random_ee_points samples random points inside a bounding box. It does not accept a place name (string) as input.
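
Because only a box is accepted, the closed ring has to be written out by hand. A minimal sketch of a helper that builds the ring from two corners is shown here; `bbox_to_aoi_geojson` is a hypothetical name, not part of aedes, and the vertex order simply mirrors the `aoi_geojson` cell above. Turning a place name into such a box would additionally need a geocoder, which is outside this sketch.

```python
def bbox_to_aoi_geojson(min_lon, min_lat, max_lon, max_lat):
    """Build a closed rectangular ring in the same layout as aoi_geojson.

    Hypothetical helper (not part of aedes). Vertices are [lon, lat] pairs,
    ordered SW, SE, NE, NW, then SW again to close the ring.
    """
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("min corner must be strictly below/left of max corner")
    return [[
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]]


# Reproduces the QC box used in this notebook.
qc_aoi = bbox_to_aoi_geojson(120.98976275, 14.58936896, 121.13383232, 14.77641364)
```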

In [10]:
# Sample long lat points and get satellite remote sensing data
points = generate_random_ee_points(aoi_geojson, sample_points=5)
%time satellite_df = get_satellite_measures_from_points(points, aoi_geojson)

clustering_model = perform_clustering(satellite_df, n_clusters=3)
satellite_df['labels'] = pd.Series(clustering_model.labels_)

mapper = visualize_on_map(satellite_df, ignore_labels=[1])


Wall time: 37.6 s


Collecting satellite data is slow: 37.6 s for just 5 points.
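
For a rough sense of scale, the timing above works out to about 7.5 s per point. The sketch below extrapolates that figure to larger samples. It assumes collection time grows linearly with the number of points, which this single run does not establish, and `estimate_collection_seconds` is a name introduced here for illustration.

```python
def estimate_collection_seconds(n_points, measured_seconds=37.6, measured_points=5):
    """Extrapolate wall time, ASSUMING cost is linear in the number of points."""
    per_point = measured_seconds / measured_points
    return per_point * n_points


for n in (5, 100, 1000):
    seconds = estimate_collection_seconds(n)
    print(f"{n:>5} points: ~{seconds:,.0f} s (~{seconds / 60:.1f} min)")
```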

In [11]:
mapper

Data is collected at random points using get_satellite_measures_from_points. There is currently no function that returns raster data for the whole GeoJSON area. Judging from the source code, get_satellite_measures_from_points gathers data over each point's buffered_geometry but returns only the mean over that geometry, so each point yields a single value per measure.
```
def get_satellite_measures_from_points...
...
points_df['ndvi'] = points_df['buffered_geometry'].apply(lambda x: meanNDVICollection(sat_image, x))
...
def meanNDVICollection
...
    # Compute the mean of NDVI over the 'region'
    ndviValue = ndviImage.reduceRegion(**{
        'geometry': aoi.getInfo(),
        'reducer': ee.Reducer.mean(),
        'scale': 1000
    }).get('NDVI')

...
```

The code could be refactored to add a function that returns raster data that can be manipulated directly.
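
To make the difference concrete, the sketch below contrasts a per-pixel NDVI grid with the single mean that a mean reducer yields. It uses plain Python lists and made-up band values rather than Earth Engine, so every name and number in it is illustrative only. NDVI is computed with the standard (NIR - Red) / (NIR + Red) formula.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red); 0.0 if both bands are 0."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def ndvi_raster(nir_band, red_band):
    """Per-pixel NDVI for two equally shaped 2-D band grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def raster_mean(raster):
    """Collapse a raster to one number, as a mean reducer does."""
    values = [v for row in raster for v in row]
    return sum(values) / len(values)


# Made-up reflectance values for a 2x2 patch (illustrative only).
nir_band = [[3000, 2500], [2500, 1000]]
red_band = [[1000, 2500], [2500, 3000]]

raster = ndvi_raster(nir_band, red_band)
print(raster)               # full grid: spatial variation is preserved
print(raster_mean(raster))  # single value: variation is averaged away
```

The patch contains both clearly positive and clearly negative pixels, yet its mean gives no hint of that spread.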

## Social Listening

In [13]:
%%time
network = initialize_OSM_network(aoi_geojson)

Requesting network data within bounding box from Overpass API in 1 request(s)
Posting to http://www.overpass-api.de/api/interpreter with timeout=180, "{'data': '[out:json][timeout:180];(way["highway"]["highway"!~"motor|proposed|construction|abandoned|platform|raceway"]["foot"!~"no"]["pedestrians"!~"no"](14.58936896,120.98976275,14.77641364,121.13383232);>;);out;'}"
Downloaded 27,654.3KB from www.overpass-api.de in 20.42 seconds
Downloaded OSM network data within bounding box from Overpass API in 1 request(s) and 21.07 seconds
Returning OSM data with 185,475 nodes and 49,866 ways...
Edge node pairs completed. Took 30.52 seconds
Returning processed graph with 64,692 nodes and 89,341 edges...
Completed OSM data download and Pandana node and edge table creation in 53.98 seconds
Wall time: 55.3 s


The function initialize_OSM_network accepts only a GeoJSON bounding box with four corners. This rules out collecting data by place name or by the geometry of administrative boundaries.
```
def initialize_OSM_network(aoi_geojson):
...   
    # Set AOI CSV from geojson
    aoi_csv = aoi_geojson[0][0][1], aoi_geojson[0][3][0], aoi_geojson[0][2][1], aoi_geojson[0][1][0]
...

    network = osm.pdna_network_from_bbox(*aoi_csv)
```
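
The hard-coded indices above assume the ring starts at the south-west corner and runs in the same order as the `aoi_geojson` cell. One way to loosen that, sketched here, is to take the minimum and maximum of the ring's coordinates, which gives the enclosing box for any vertex order or polygon shape. `ring_to_bbox` is a hypothetical helper, and the (lat_min, lng_min, lat_max, lng_max) output order is inferred from the index expression above and the Overpass request in the log, not from the library's documentation.

```python
def ring_to_bbox(aoi_geojson):
    """Return (lat_min, lng_min, lat_max, lng_max) enclosing the outer ring.

    Hypothetical helper: works for any vertex order and for non-rectangular
    polygons, at the cost of including area outside the polygon itself.
    """
    ring = aoi_geojson[0]
    lons = [lon for lon, _lat in ring]
    lats = [lat for _lon, lat in ring]
    return min(lats), min(lons), max(lats), max(lons)


aoi_geojson = [[
    [120.98976275, 14.58936896],
    [121.13383232, 14.58936896],
    [121.13383232, 14.77641364],
    [120.98976275, 14.77641364],
    [120.98976275, 14.58936896],
]]

# Same tuple the index-based expression produces for this ring.
aoi_csv = ring_to_bbox(aoi_geojson)
```

An administrative boundary passed through this helper would still be queried as its enclosing rectangle, so points outside the boundary would need to be filtered afterwards.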

In [16]:
%%time
# Hospitals and clinics

final_with_hospital_df5k, hospital_amenities_df, hospital_count_distance_df = get_OSM_network_data(network,
                     satellite_df,
                     aoi_geojson,
                    ['clinic', 'hospital', 'doctors'],
                    5,
                    5000,
                    show_viz=True)

Wall time: 32.4 s


The show_viz option does not work: no map is displayed. Data collection is again time-consuming (32.4 s for 5 points), although here a distance parameter is available to speed up collection at the cost of accuracy.

In [18]:
final_with_hospital_df5k

Unnamed: 0,geometry,buffered_geometry,longitude,latitude,ndvi,fapar,ndbi,ndwi,ndmi,aerosol,surface_temperature,precipitation_rate,relative_humidity,labels,OSM_network_id,nearest_clinic_hospital_doctors_1,nearest_clinic_hospital_doctors_2,nearest_clinic_hospital_doctors_3,nearest_clinic_hospital_doctors_4,nearest_clinic_hospital_doctors_5,count_clinic_hospital_doctors_within_5.0km
0,POINT (121.11160 14.66217),"ee.Geometry({\n ""functionInvocationValue"": {\...",121.111599,14.662168,0.154349,0.0,-0.009122,-0.156812,0.009122,116.431122,36.436481,9.171103e-07,77.704048,0,335307290,1710.287964,1710.287964,1746.331055,1830.51001,1874.386963,33.0
1,POINT (121.12419 14.59517),"ee.Geometry({\n ""functionInvocationValue"": {\...",121.124187,14.595171,0.191877,0.047703,-0.076692,-0.117247,0.076692,147.654912,33.343179,9.171103e-07,77.704048,2,1298216111,2230.879883,3574.030029,5000.0,5000.0,5000.0,2.0
2,POINT (121.04999 14.62731),"ee.Geometry({\n ""functionInvocationValue"": {\...",121.049989,14.627311,0.096444,0.0,-0.027322,-0.076104,0.027322,208.057935,36.603021,9.171103e-07,77.704048,0,32088850,301.251007,566.177979,566.177979,566.177979,597.189026,91.0
3,POINT (121.12382 14.60733),"ee.Geometry({\n ""functionInvocationValue"": {\...",121.123821,14.607332,0.188294,0.052,-0.068768,-0.124935,0.068768,148.002541,35.082785,9.171103e-07,77.704048,0,5373695258,3630.863037,4351.748047,4473.473145,4531.158203,4926.420898,5.0
4,POINT (121.05697 14.76081),"ee.Geometry({\n ""functionInvocationValue"": {\...",121.056969,14.760809,0.130988,0.032,-0.038837,-0.101535,0.038837,209.410579,35.887674,9.090909e-07,81.611534,1,8332611161,905.219971,993.471008,1210.093018,1724.852051,1905.125,18.0
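
In the output above, the row indexed 1 reports 5000.0 for its third to fifth nearest facilities while counting only 2 facilities within 5 km, which suggests that 5000.0 is the search cap rather than a measured distance. Assuming that reading is right, a small post-processing step can mark capped values as missing so they are not mistaken for real distances. The helper name `mask_capped_distances` is introduced here for illustration.

```python
def mask_capped_distances(distances, max_distance):
    """Replace values at or above the search cap with None.

    ASSUMPTION: a distance equal to max_distance means "nothing found within
    range", as the row with only 2 facilities in range suggests.
    """
    return [d if d < max_distance else None for d in distances]


# Row 1 of the table above: nearest_clinic_hospital_doctors_1..5
row_1 = [2230.879883, 3574.030029, 5000.0, 5000.0, 5000.0]
masked = mask_capped_distances(row_1, max_distance=5000)
found = sum(d is not None for d in masked)
```

For this row, the number of unmasked distances matches the reported count within 5 km.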


## Trends

In [24]:
from aedes.social_listening_utils import get_search_trends

search_df = get_search_trends("PH-00")

get_search_trends takes only a geo-tag as a parameter. Other parameters that would be essential include:

- search term(s)
- a boolean that decides whether to look up related searches
- the number of related searches to return

The function does not support choosing a date range for the search.

Note also that geo-tags are available only down to the provincial level.
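
A minimal sketch of how search terms and a date range might be gathered is shown below. It only assembles keyword arguments in the shape that pytrends' `TrendReq.build_payload` expects (`kw_list`, `timeframe`, `geo`), using the 'YYYY-MM-DD YYYY-MM-DD' form for an explicit date range. `build_trends_request` and its parameters are hypothetical and not part of aedes, and the sketch does not call Google Trends itself.

```python
import datetime


def build_trends_request(geo, keywords, start, end):
    """Assemble keyword arguments for a pytrends build_payload call.

    Hypothetical helper (not part of aedes). `start` and `end` are
    datetime.date objects bounding the search period.
    """
    if not 1 <= len(keywords) <= 5:
        raise ValueError("Google Trends compares between 1 and 5 terms per request")
    if start > end:
        raise ValueError("start must not be after end")
    return {
        "kw_list": list(keywords),
        "timeframe": f"{start.isoformat()} {end.isoformat()}",
        "geo": geo,
    }


request = build_trends_request(
    geo="PH-00",
    keywords=["dengue", "dengue fever"],
    start=datetime.date(2017, 4, 1),
    end=datetime.date(2017, 8, 31),
)
# With pytrends installed, this could then be passed on as:
#   pytrend.build_payload(**request)
```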

In [28]:
monthly_max_interest_df = search_df.resample('M').max().reset_index()
monthly_max_interest_df['date'] = monthly_max_interest_df['date'].astype(str)
monthly_max_interest_df.head()

Unnamed: 0,date,dengue,dengue symptoms,symptoms,dengue fever,fever,isPartial
0,2017-04-30,1,0,17,0,5,False
1,2017-05-31,1,0,20,0,5,False
2,2017-06-30,3,1,19,0,6,False
3,2017-07-31,4,1,19,1,7,False
4,2017-08-31,5,1,25,1,8,False


get_search_trends also calls related_queries, which returns related search terms. It is worth noting that some related queries can be too general. As the output above shows, 'symptoms' and 'fever' are general terms that may not be of interest.
```
...
related_queries = pytrend.related_queries()
...
```
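
One simple way to drop overly general terms, sketched here, is to keep only related queries that contain the seed keyword. The function name `filter_related_terms` and the rule itself are assumptions for illustration; the term list is taken from the column names in the output above.

```python
def filter_related_terms(terms, seed, limit=None):
    """Keep terms containing the seed keyword, optionally capped at `limit`.

    Hypothetical filter: matching is a case-insensitive substring test,
    which is crude but removes generic terms such as 'symptoms'.
    """
    kept = [t for t in terms if seed.lower() in t.lower()]
    return kept if limit is None else kept[:limit]


columns = ["dengue", "dengue symptoms", "symptoms", "dengue fever", "fever"]
specific = filter_related_terms(columns, seed="dengue")
```

The optional `limit` argument also covers the "number of related searches" control that the function currently lacks.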