# Creating hurricane tracks using big data analytics

The sample code below uses big data analytics (GeoAnalytics) to reconstruct hurricane tracks from data registered on a big data file share in the GIS.

## Reconstruct tracks
Reconstruct tracks is a data aggregation tool available under the big data tools. It works with a time-enabled layer of point or polygon features. It first determines which points belong to a track using an identification number or identification string. Using the time at each location, the points of each track are then ordered sequentially and transformed into a line representing the path of movement.
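
The grouping and ordering described above can be sketched in plain Python. This is only an illustration of the idea, not the GeoAnalytics implementation: the function name `sketch_tracks`, the `time`, `x` and `y` keys, and the sample observations are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def sketch_tracks(points, track_field="Serial_Num", time_field="time"):
    """Group point records by track id and order each group by time."""
    grouped = defaultdict(list)
    for point in points:
        grouped[point[track_field]].append(point)

    tracks = {}
    for track_id, members in grouped.items():
        members.sort(key=lambda p: p[time_field])
        # Each track becomes the ordered list of (x, y) vertices of a line.
        tracks[track_id] = [(p["x"], p["y"]) for p in members]
    return tracks


# Hypothetical observations, deliberately out of order.
observations = [
    {"Serial_Num": "A", "time": datetime(1932, 1, 1, 6), "x": 61.0, "y": -12.5},
    {"Serial_Num": "B", "time": datetime(1932, 2, 3, 0), "x": 80.0, "y": -15.0},
    {"Serial_Num": "A", "time": datetime(1932, 1, 1, 0), "x": 60.0, "y": -12.0},
    {"Serial_Num": "A", "time": datetime(1932, 1, 1, 12), "x": 62.0, "y": -13.0},
]

tracks = sketch_tracks(observations)
print(tracks["A"])
```

A track with a single observation, such as `"B"` here, yields a one-vertex list; how the real tool treats such tracks is not covered by this sketch.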


## Inspect the data attributes
Let us read the subset as a FeatureLayer to view its attribute table.

In [4]:
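from arcgis.lyr import FeatureService
from pandas.io.json import json_normalize

# subset_search and ago_gis are assumed to be defined earlier in the sample
subset_FS = FeatureService(subset_search[0].url, ago_gis)
subset_FL = subset_FS.layers[0]

# fetch attributes only (no geometry) for the first few features
query_result = subset_FL.query(where='FID < 5',
                               out_fields="*",
                               returnGeometry=False)

# flatten the result into a DataFrame and drop the "attributes." column prefix
att_data_frame = json_normalize(query_result)
att_data_frame.columns = att_data_frame.columns.str.replace("attributes.", "")
att_data_frame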

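| | ATC_eye | ATC_grade | ATC_poci | ATC_pres | ATC_rmw | ATC_roci | ATC_w34_r1 | ATC_w34_r2 | ATC_w34_r3 | ATC_w34_r4 | ... | basin_1 | day | hour | min_ | month | wmo_pres | wmo_pres__ | wmo_wind | wmo_wind__ | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -999 | -999.0 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | ... | SI | 1 | 0 | 0 | 1 | -999 | -999 | -999 | -999 | 1932 |
| 1 | -999 | -999.0 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | ... | SI | 1 | 6 | 0 | 1 | 0 | -100 | 0 | -100 | 1932 |
| 2 | -999 | -999.0 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | ... | SI | 1 | 12 | 0 | 1 | -999 | -999 | -999 | -999 | 1932 |
| 3 | -999 | -999.0 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | -999 | ... | SI | 1 | 18 | 0 | 1 | -999 | -999 | -999 | -999 | 1932 |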
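
The flattening step used in the cell above can be tried without a server connection. The sketch below assumes the query result is a list of records that each carry an `attributes` dictionary; that shape and the sample values are assumptions made for illustration. It uses `pandas.json_normalize`, the top-level name available in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical payload shaped like a feature query response (an assumption).
sample_features = [
    {"attributes": {"FID": 0, "basin_1": "SI", "year": 1932, "hour": 0}},
    {"attributes": {"FID": 1, "basin_1": "SI", "year": 1932, "hour": 6}},
]

flat = pd.json_normalize(sample_features)
# Nested keys arrive as "attributes.FID", "attributes.basin_1", and so on.
# regex=False strips the literal prefix instead of treating "." as a wildcard.
flat.columns = flat.columns.str.replace("attributes.", "", regex=False)
print(list(flat.columns))
```

Passing `regex=False` matters because an unescaped `.` in a regular expression matches any character, which could strip more than the intended prefix.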


## Create a data store
For the GeoAnalytics server to process your big data, the data must be registered as a data store. In our case, the data is in multiple shapefiles, and we will register the folder containing the files as a data store of type `bigDataFileShare`.

Let us connect to an ArcGIS Enterprise portal.

In [34]:
from arcgis.gis import GIS
gis = GIS("https://dev002759.esri.com/portal","admin","esri.agp")

# find the server running GeoAnalytics
ga_tools = gis.tools.bigdata
ga_tools

<BigDataTools url:"https://dev002857.esri.com/arcgis/rest/services/System/GeoAnalyticsTools/GPServer">

The `datastores` property of the GIS object provides a list of DatastoreManager objects, one for each server federated with the portal. Each DatastoreManager lets you query, inspect and manipulate the data stores available to that server.

In [35]:
# Query the DatastoreManager objects available to the portal
data_mgr_bysite = gis.datastores
data_mgr_bysite

[<DatastoreManager for https://Dev002759.esri.com:6443/arcgis/admin>,
 <DatastoreManager for https://dev002857.esri.com:6443/arcgis/admin>]

In [36]:
# Find the data stores registered on the GeoAnalytics server
data_store_mgr = data_mgr_bysite[1]
data_store_mgr.search()

[<Datastore title:"/bigDataFileShares/Chicago_accidents" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/ht_ui2" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/ht_ui3" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/ht_ui5" type:"bigDataFileShare">]

In [50]:
data_item = data_store_mgr.add_bigdata("full_dataset", r"\\teton\atma_shared\datasets\Esri_Geoanalytics_datasets\hurricanes\entire_dataset")

Created Big Data file share for full_dataset
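
Windows UNC paths such as the one registered above contain backslashes, which Python treats as escape characters in ordinary string literals. The short sketch below, using a shortened form of that path, shows why a raw string (or doubled backslashes) is the safer way to write it.

```python
# The same UNC path written two ways.
raw_path = r"\\teton\atma_shared\datasets"
escaped_path = "\\\\teton\\atma_shared\\datasets"

print(raw_path == escaped_path)

# In an ordinary literal, "\a" is a single control character (ASCII bell),
# so an unescaped "\atma_shared" would silently lose its backslash.
print(len("\a"), ord("\a"))
```

Both spellings produce the same string; the raw form is simply easier to read and harder to get wrong.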


Once a big data file share is created, the GeoAnalytics server processes all the valid file types to discern the schema of the data. This can take a few minutes, depending on the size of your data. Once processing completes, querying the `manifest` property returns the schema, which is similar to that of the subset we observed earlier in this sample.

In [37]:
data_item = data_store_mgr.search()[2]

## Perform data aggregation using reconstruct tracks tool
When you add a big data file share datastore, a corresponding item gets created on your portal. You can search for it like a regular item and query its layers.

In [38]:
search_result = gis.content.search("", item_type = "big data file share")
search_result

[<Item title:"bigDataFileShares_ht_ui5" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_ht_ui2" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_Chicago_accidents" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_ht_ui3" type:Big Data File Share owner:admin>]

In [39]:
data_item = search_result[3]
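
Picking an item by position depends on the order in which the search happens to return results. A minimal sketch of selecting by title instead is shown below; `find_by_title` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without a portal (the only attribute it relies on is `title`).

```python
from types import SimpleNamespace


def find_by_title(items, title):
    """Return the first item whose title matches exactly, or None."""
    return next((item for item in items if item.title == title), None)


# Stand-ins for portal items, mirroring the titles listed above.
mock_results = [
    SimpleNamespace(title="bigDataFileShares_ht_ui5"),
    SimpleNamespace(title="bigDataFileShares_ht_ui2"),
    SimpleNamespace(title="bigDataFileShares_Chicago_accidents"),
    SimpleNamespace(title="bigDataFileShares_ht_ui3"),
]

match = find_by_title(mock_results, "bigDataFileShares_ht_ui3")
print(match.title)
```

Returning `None` for a missing title lets the caller decide whether that is an error, rather than failing with an index error.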

In [43]:
data_item.url

'https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer'

In [44]:
data_item.layers

[<Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1842_h1852">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1852_h1862">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1862_h1872">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1872_h1882">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1882_h1892">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1892_h1902">,
 <Layer url:"https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1902_h1912">,
 <Layer url:"https:/

In [45]:
years_1842_52 = data_item.layers[0]
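
The layer URLs listed above end in a segment of the form `h<start>_h<end>`, so a layer can also be chosen by year rather than by position. The sketch below works on URL strings only; the helper names and the shortened `example.com` URLs are hypothetical, and treating the end year as exclusive is an assumption about how the layers are split.

```python
import re


def layer_year_range(layer_url):
    """Parse the trailing 'h<start>_h<end>' segment of a layer URL."""
    match = re.search(r"h(\d{4})_h(\d{4})$", layer_url)
    if match is None:
        raise ValueError(f"no year range in {layer_url!r}")
    return int(match.group(1)), int(match.group(2))


def pick_layer_url(layer_urls, year):
    """Return the first URL whose range contains `year` (end year exclusive)."""
    for url in layer_urls:
        start, end = layer_year_range(url)
        if start <= year < end:
            return url
    return None


# Shortened, hypothetical URLs that follow the naming pattern above.
urls = [
    "https://example.com/BigDataCatalogServer/h1842_h1852",
    "https://example.com/BigDataCatalogServer/h1852_h1862",
]
print(pick_layer_url(urls, 1855))
```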

### Reconstruct tracks tool

The `reconstruct_tracks()` tool can be accessed through the tools.bigdata property of your GIS. In this example, we are using this tool to aggregate the numerous points into line segments showing the tracks followed by the hurricanes. The tool creates a feature service as an output which can be accessed once the processing is complete.

In [46]:
agg_result = gis.tools.bigdata.reconstruct_tracks(years_1842_52,
                                                 track_fields = 'Serial_Num',
                                                 output_name = 'hurricane_tracks_agg_result',
                                                 method = 'GEODESIC')

Submitted.
Executing...
Executing (ReconstructTracks): ReconstructTracks "Feature Set" Serial_Num Geodesic # # # # {"serviceProperties":{"name":"hurricane_tracks_agg_result","serviceUrl":"http://Dev002759.esri.com/server/rest/services/Hosted/hurricane_tracks_agg_result/FeatureServer"},"itemProperties":{"itemId":"94d90259707f4977a74cc8af479ae208"}} {}
Start Time: Sun Oct 16 18:51:33 2016
Using URL based GPRecordSet param: https://dev002857.esri.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_ht_ui3/BigDataCatalogServer/h1842_h1852
{"messageCode":"BD_101028","message":"Starting new distributed job with 3 tasks.","params":{"totalTasks":"3"}}
{"messageCode":"BD_101029","message":"1/3 distributed tasks completed.","params":{"completedTasks":"1","totalTasks":"3"}}
{"messageCode":"BD_101029","message":"2/3 distributed tasks completed.","params":{"completedTasks":"2","totalTasks":"3"}}
{"messageCode":"BD_101029","message":"3/3 distributed tasks completed.","params":{"completedTask
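
Several of the job messages above are JSON objects, so progress can be read programmatically. The sketch below assumes, based only on the log shown here, that messages with code `BD_101029` carry `completedTasks` and `totalTasks` parameters; `task_progress` is a hypothetical helper.

```python
import json


def task_progress(message_line):
    """Return (completed, total) for a BD_101029 message, else None."""
    try:
        payload = json.loads(message_line)
    except json.JSONDecodeError:
        return None  # plain-text lines such as "Submitted."
    if not isinstance(payload, dict) or payload.get("messageCode") != "BD_101029":
        return None
    params = payload["params"]
    return int(params["completedTasks"]), int(params["totalTasks"])


log_lines = [
    "Submitted.",
    '{"messageCode":"BD_101028","message":"Starting new distributed job with 3 tasks.","params":{"totalTasks":"3"}}',
    '{"messageCode":"BD_101029","message":"2/3 distributed tasks completed.","params":{"completedTasks":"2","totalTasks":"3"}}',
]

progress = [p for p in map(task_progress, log_lines) if p is not None]
print(progress)
```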

## Inspect the results
Let us create a map and load the processed result, which is a feature service.

In [47]:
processed_map = gis.map("USA",2)
processed_map

In [48]:
sr = gis.content.search("hurricane_tracks_agg_result")
sr

[<Item title:"hurricane_tracks_agg_result" type:Feature Service owner:admin>]

In [49]:
processed_map.add_layer(sr[0])

Thus we transformed a collection of points into tracks that represent the paths taken by the hurricanes over a period of 50 years. We can pull up another map and inspect the results more closely.

In [16]:
map2 = gis.map('USA', 2)
map2

In [17]:
tracks_layer = FeatureService(sr[0].url, gis).layers[0]
tracks_layer

<FeatureLayer url:"http://portalname.domain.com/server/rest/services/Hosted/hurricane_tracks_agg_result/FeatureServer/0">

In [18]:
map2.add_layer(tracks_layer)

Both our input data and the map widget are time enabled, so we can filter the data to show only the tracks from the years 1852 to 1860.

In [52]:
processed_map.start_time

''

In [19]:
map2.start_time = '1852'
map2.end_time = '1860'
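
The effect of such a time window can be illustrated offline. The sketch below keeps the tracks whose time span overlaps the window; the track records, their `start` and `end` keys, and the choice to treat 1860 as inclusive through the end of that year are assumptions, since the widget's exact interpretation of these values is not shown here.

```python
from datetime import datetime


def within_window(tracks, start, end):
    """Keep tracks whose time span overlaps the closed interval [start, end]."""
    return [t for t in tracks if t["start"] <= end and t["end"] >= start]


# Hypothetical track records with start and end timestamps.
hypothetical_tracks = [
    {"id": "A", "start": datetime(1851, 8, 16), "end": datetime(1851, 8, 27)},
    {"id": "B", "start": datetime(1852, 8, 19), "end": datetime(1852, 8, 30)},
    {"id": "C", "start": datetime(1859, 10, 2), "end": datetime(1859, 10, 6)},
]

window_start = datetime(1852, 1, 1)
window_end = datetime(1860, 12, 31, 23, 59, 59)
selected = within_window(hypothetical_tracks, window_start, window_end)
print([t["id"] for t in selected])
```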


## What can big data tools do for you?

With this sample we have only scratched the surface of what big data analysis can do for you. ArcGIS Enterprise at 10.5 packs a powerful set of tools that let you derive a lot of value from your data, provided you ask the right questions. For instance, a weather dataset such as this one could be used to answer questions such as:

 - Did the number of hurricanes per season increase over the years?
 - Which hurricanes travelled the longest distance?
 - Which hurricanes lasted the longest? Do we see a trend?
 - How are wind speed and distance travelled correlated?
 - My assets are located in a tornado corridor. How many times in the past century was there a hurricane within 50 miles of my assets?
 - My industry is dependent on tourism, which is heavily impacted by the vagaries of weather. From historical weather data, can I correlate my profits with major weather events? How well is my business insulated from freak weather events?
 - Over the years, do we see any shifts in major weather events? For example, do we notice a shift in when the hurricane season starts?
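
As a rough illustration of how two of these questions might be approached once tracks are available locally, the sketch below counts tracks per year and finds the longest track using the haversine great-circle distance. The sample tracks, their field names and the spherical-Earth approximation are all assumptions; this is not how the GeoAnalytics tools compute geodesic lengths.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points, in kilometres."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def track_length_km(vertices):
    """Sum the segment lengths along an ordered list of (lon, lat) vertices."""
    return sum(haversine_km(*a, *b) for a, b in zip(vertices, vertices[1:]))


# Hypothetical tracks: a year plus ordered (lon, lat) vertices.
sample_tracks = {
    "A": {"year": 1851, "vertices": [(-60.0, 15.0), (-62.0, 16.0), (-65.0, 18.0)]},
    "B": {"year": 1852, "vertices": [(-70.0, 20.0), (-71.0, 21.0)]},
    "C": {"year": 1852, "vertices": [(-50.0, 12.0), (-58.0, 14.0), (-66.0, 19.0), (-75.0, 25.0)]},
}

tracks_per_year = Counter(t["year"] for t in sample_tracks.values())
longest_id = max(sample_tracks, key=lambda k: track_length_km(sample_tracks[k]["vertices"]))
print(dict(tracks_per_year), longest_id)
```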
 
The ArcGIS Python API gives you a gateway to easily access the big data tools from your ArcGIS Enterprise. By combining it with other powerful libraries from the pandas and scipy stack and the rich visualization capabilities of the Jupyter notebook, you can extract a lot of value from your data, big or small.