## Create a Self-Aggregating Map Layer using GeoAnalytics

This notebook completes the following steps:

- Connect to your Enterprise portal
- Search through your big data file shares for your dataset of interest
- Run the GeoAnalytics Tool Copy to Data Store
- Publish the results of the tool as a Map Image Layer




In [5]:
# Import the required modules to use the ArcGIS API for Python 
# and the GeoAnalytics module

from arcgis.gis import GIS
import arcgis.geoanalytics

To adapt this notebook for use in your organization, set the following variables:

- The portal URL, username and password. The member is required to have [privileges](http://enterprise.arcgis.com/en/portal/latest/administer/windows/roles.htm) to run GeoAnalytics.
- The big data file share name you are using. It is assumed you have already registered a big data file share. If you haven't, you can register it manually using the [steps outlined here](http://enterprise.arcgis.com/en/server/latest/publish-services/windows/registering-your-data-with-arcgis-server-using-manager.htm#ESRI_SECTION1_0D55682C9D6E48E7857852A9E2D5D189), or using the API. [See this sample for details.](https://developers.arcgis.com/python/sample-notebooks/creating-hurricane-tracks-using-geoanalytics/#Create-a-data-store)
- The dataset in the big data file share

In [8]:
# Variables that you can set to make this run on your own portal

# This is the portal that you will be connecting to
portal_url = "https://mymachinename.domain.com/portal"

# This is the portal member and password that will be running analysis
portal_username = "username"
portal_password = "password"

# The name of the big data file share used as input
big_data_file_share_name = "bigDataFileShares_pyTest"

# The dataset name in the big data file share above used to run the analysis
big_data_file_share_dataset = "ChicagoCrimes"
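Hard-coded credentials are easy to leak when a notebook is shared. As a minimal sketch, the connection settings could instead be read from environment variables. The variable names `PORTAL_URL`, `PORTAL_USERNAME`, and `PORTAL_PASSWORD` and the helper `load_portal_settings` are hypothetical and not part of the original sample; the fallbacks are the placeholders used above.

```python
import os


def load_portal_settings(environ=os.environ):
    """Read portal connection settings from environment variables.

    The variable names are hypothetical; each falls back to the
    placeholder value used in this notebook when it is not set.
    """
    return {
        "portal_url": environ.get("PORTAL_URL", "https://mymachinename.domain.com/portal"),
        "portal_username": environ.get("PORTAL_USERNAME", "username"),
        "portal_password": environ.get("PORTAL_PASSWORD", "password"),
    }


settings = load_portal_settings()
portal_url = settings["portal_url"]
portal_username = settings["portal_username"]
portal_password = settings["portal_password"]
```

Passing a plain dictionary as `environ` makes the helper easy to try out without touching the real environment.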

In [9]:
# Setting up for the Enterprise portal

gis = GIS(portal_url, portal_username, portal_password, verify_cert=False)
if not arcgis.geoanalytics.is_supported():
    print("GeoAnalytics is not supported on the Enterprise portal. Please use a portal that GeoAnalytics is supported on.")

## Find the dataset to use for analysis
To run the analysis, we first need to find the input dataset. In this workflow, the input is a dataset in a big data file share. Like all GeoAnalytics tools, this analysis could also be run on a feature layer or collection. To find the dataset, we need to:
- Find the big data file shares registered with our GeoAnalytics Server
- Search through the big data file shares to find the one we want to use
- Search through our big data file share for the dataset to use.

Remember, you can have multiple big data file shares in a portal, and each big data file share may have one or more datasets. 

In [10]:
# Search for all the big data file shares in your portal
bigdata_fileshares = gis.content.search("", item_type = "big data file share", max_items=20)
bigdata_fileshares

[<Item title:"bigDataFileShares_Joyce_BDFS" type:Big Data File Share owner:Sarah_publisher>,
 <Item title:"bigDataFileShares_createSpaceTimeCube" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_pyTest" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_test_outputs" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_findPointClusters" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_pytest-S3" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_holisticdata-azure" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_TestTaxi" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_Joyce_BDFS" type:Big Data File Share owner:Sarah_publisher>,
 <Item title:"bigDataFileShares_createSpaceTimeCube" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_orc_and_par" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_geocod

In [11]:
# Iterate through all the big data file share items until we find the one we are interested in using

try:
    data_item = next(x for x in bigdata_fileshares if x.title == big_data_file_share_name)
    print("The item being used is: {0}".format(data_item))
except StopIteration:
    # next() raises StopIteration when no item title matches
    print("\nThe big data file share that you were looking for was not found. Please:")
    print(" - Verify you have registered the big data file share before running this code.")
    print(" - Check that you gave the correct name for the big data file share. Expected format: \"bigDataFileShares_name\"")
    print(" - Increase the max items returned when searching for your item.\n")

    print("The big data file shares listed in your portal are: ")
    for x in bigdata_fileshares:
        print("  -", x.title)


The item being used is: <Item title:"bigDataFileShares_pyTest" type:Big Data File Share owner:admin>


In [12]:
try:
    layer_to_use = next(x for x in data_item.layers if x.properties.name == big_data_file_share_dataset)
    print("The layer being used is: {0}".format(layer_to_use))
except StopIteration:
    # next() raises StopIteration when no dataset name matches
    print("\nThe dataset: {0} in big data file share: {1} was not found. Please:".format(big_data_file_share_dataset, big_data_file_share_name))
    print(" - Check that you gave the correct name for the dataset and big data file share.")

    print("The datasets listed in your big data file share are: ")
    for x in data_item.layers:
        print("  -", x.properties.name)


The layer being used is: <Layer url:"https://gpportal.esri.com/server/rest/services/DataStoreCatalogs/bigDataFileShares_pyTest/BigDataCatalogServer/ChicagoCrimes">
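
The two lookups above follow the same pattern: take the first element whose attribute equals a wanted value. A minimal sketch of that pattern as one helper is shown here. `find_by` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins that mimic only the `title` attribute, so the example runs without a portal connection.

```python
from types import SimpleNamespace


def find_by(items, key, wanted):
    """Return the first item for which key(item) == wanted, or None if there is none."""
    return next((item for item in items if key(item) == wanted), None)


# Stand-in objects; real portal items expose many more attributes.
shares = [
    SimpleNamespace(title="bigDataFileShares_pyTest"),
    SimpleNamespace(title="bigDataFileShares_TestTaxi"),
]

match = find_by(shares, lambda s: s.title, "bigDataFileShares_pyTest")
missing = find_by(shares, lambda s: s.title, "bigDataFileShares_missing")

print(match.title)
print(missing)
```

Supplying a default of `None` to `next()` avoids the exception path, so the caller can test the result with `is None`. With the objects from this notebook, the calls would presumably look like `find_by(bigdata_fileshares, lambda x: x.title, big_data_file_share_name)` and `find_by(data_item.layers, lambda x: x.properties.name, big_data_file_share_dataset)`.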


## Run the Analysis
Now that we have found the layer of interest, we can set up the analysis environment and run the [Copy to Data Store](http://enterprise.arcgis.com/en/portal/latest/use/geoanalytics-copy-to-data-store.htm) tool using GeoAnalytics. When running tools, we can set environment variables that apply to all subsequent tool runs. In this sample, we will set the following environment variables:

- Default aggregation styles. We will set this to true; by default, it is set to false.
- Verbose logging. We will turn this on to see the status of the tool as it runs.

There are a few other parameters we could set ([see the API guide here](https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.env.html)), but we will not set them in this sample. They include:

- Processing spatial reference
- Output data store
- Output spatial reference
- Extent of data used in the analysis.

In [13]:
# Import the Copy to Data Store tool
from arcgis.geoanalytics.manage_data import copy_to_data_store

# Set the environment variables for using GeoAnalytics. All of these parameters are optional.
arcgis.env.default_aggregation_styles = True
arcgis.env.verbose = True
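
Because these settings stay in effect for every tool run that follows, it can be useful to limit them to one block of code and restore the previous values afterwards. The sketch below shows one way to do that with a context manager. `temporary_settings` is a hypothetical helper, and `fake_env` is a stand-in object used so the example runs on its own; applying the same pattern to `arcgis.env` is an assumption that has not been tested here.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_settings(env, **overrides):
    """Set attributes on env inside a with block, then restore the old values."""
    # Remember the current values; raises AttributeError for unknown settings.
    saved = {name: getattr(env, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(env, name, value)
        yield env
    finally:
        # Restore the saved values even if the block raised an exception.
        for name, value in saved.items():
            setattr(env, name, value)


# Stand-in for the settings object, starting from the documented defaults.
fake_env = SimpleNamespace(default_aggregation_styles=False, verbose=False)

with temporary_settings(fake_env, default_aggregation_styles=True, verbose=True):
    inside = (fake_env.default_aggregation_styles, fake_env.verbose)

after = (fake_env.default_aggregation_styles, fake_env.verbose)
print(inside, after)
```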

In [14]:
# Run the tool. A random suffix makes a name collision less likely, but with
# only 100 possible values it does not guarantee a unique name.
import random
random_value = random.randint(1,100)

result_name = "CrimeDataset_" + str(random_value)
tool_output = copy_to_data_store(layer_to_use, output_name=result_name)

Submitted.
Executing...




Executing (CopyToDataStore): CopyToDataStore "Record Set" "{"serviceProperties": {"name": "CrimeDataset_38", "serviceUrl": "http://gpportal.esri.com/server/rest/services/Hosted/CrimeDataset_38/FeatureServer"}, "itemProperties": {"itemId": "15592be05cbc4b1a96512eab24414420"}}" "{"defaultAggregationStyles": true}"
Start Time: Fri Jul 20 12:56:14 2018
{"messageCode":"BD_101081","message":"Finished writing results:"}
{"messageCode":"BD_101082","message":"* Count of features = 271868","params":{"resultCount":"271868"}}
{"messageCode":"BD_101083","message":"* Spatial extent = {\"xmin\":-87.93433083735326,\"ymin\":41.64472422641647,\"xmax\":-87.52468393341654,\"ymax\":42.022654058892584}","params":{"extent":"{\"xmin\":-87.93433083735326,\"ymin\":41.64472422641647,\"xmax\":-87.52468393341654,\"ymax\":42.022654058892584}"}}
{"messageCode":"BD_101084","message":"* Temporal extent = Interval(MutableInstant(2014-01-01 00:00:00.000),MutableInstant(2014-12-31 23:58:00.000))","params":{"extent":"Inte



{"messageCode":"BD_101051","message":"Possible issues were found while reading 'inputLayer'.","params":{"paramName":"inputLayer"}}
{"messageCode":"BD_101054","message":"Some records have either missing or invalid geometries."}
Succeeded at Fri Jul 20 12:58:41 2018 (Elapsed Time: 2 minutes 26 seconds)
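
The cell above draws its suffix from `random.randint(1,100)`, so repeated runs will eventually produce a name that already exists. A minimal sketch of an alternative, using a random hexadecimal suffix from the standard `uuid` module, is shown here. `unique_result_name` is a hypothetical helper, and the eight-character suffix length is an arbitrary choice that makes collisions very unlikely rather than impossible.

```python
import uuid


def unique_result_name(prefix="CrimeDataset"):
    """Build an output name from a prefix and a random 8-character hex suffix."""
    return "{0}_{1}".format(prefix, uuid.uuid4().hex[:8])


result_name = unique_result_name()
print(result_name)
```

The resulting string contains only letters, digits, and an underscore, and could be passed as `output_name` in the same way as before.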


In [15]:
# Confirm that a layer has been created

processed_map = gis.map("Chicago")
processed_map

MapView(basemaps=['dark-gray', 'dark-gray-vector', 'gray', 'gray-vector', 'hybrid', 'national-geographic', 'oc…

In [16]:
processed_map.add_layer(tool_output)

## Publish the layer as a Map Image Layer

The result of the GeoAnalytics tool is a feature layer in your portal. To create a map image layer, publish the feature layer. Publishing automatically creates a map image layer of the same name in your portal that aggregates features depending on the zoom level.

In [17]:
# This publishes the feature layer as a map image layer
self_aggregating_map_service = tool_output.publish()

In [18]:
self_aggregating_map_service