# Grab a selection of census tract data
*This notebook provides an example of how you can extract features from an ArcGIS online resource and save them into a shapefile on your local machine.* 

→ A good resource for learning more is here: https://developers.arcgis.com/python/guide/working-with-feature-layers-and-features/

### • First, we need to import the `GIS` class from the arcgis package
We aren't accessing any 'premium' content here, so we can authenticate "anonymously".

In [1]:
#Import the GIS object and authenticate
from arcgis import GIS
gis = GIS()

### • Next, we need to find the content we want to grab. 
Here, we could open https://arcgis.com and search for `Census Tracts`. When I checked last, that search returned > 30,000 records! So we'd need to refine our search. If we knew the owner of the dataset, we could add `owner:` to our search. We can also filter by **item type** and even filter for **authoritative** datasets.  
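As a small sketch of how a refined search string is put together, the helper below joins free-text keywords with an `owner:` qualifier. Note that `build_search_query` is a hypothetical name made up for illustration; it is not part of the arcgis package.

```python
def build_search_query(keywords, owner=None):
    """Combine free-text keywords with an optional owner filter."""
    parts = [keywords.strip()]
    if owner:
        # "owner:<name>" restricts results to items owned by that account
        parts.append("owner:{}".format(owner))
    return " ".join(parts)

query = build_search_query("Census Tracts Areas", owner="esri_dm")
print(query)  # Census Tracts Areas owner:esri_dm
```

The resulting string can be pasted into the search box, or passed to a search call later on.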

####   → Searching for content via AGOL: 
_First we'll search for objects in ArcGIS Online and familiarize ourselves with various attributes of our results._
* Search ArcGIS.com for <u>`Census Tracts Areas`</u> <u>feature layers</u> owned by <u>`esri_dm`</u>.
* Open the [link](https://www.arcgis.com/home/item.html?id=db3f9c8728dd44e4ad455e0c27a85eea) to the one result.
* Note the URL for the link, particularly the *id* returned: db3f9c8728dd44e4ad455e0c27a85eea
* Scroll to the bottom of the page. On the right side, find the [URL](https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Census_Tract_Areas_analysis_trim/FeatureServer) associated with the feature layer and open it in your browser. 
 * Note this page also reveals the item's ID. 
 * This page shows that the feature layer service serves just the one layer: `tracts_trim`.
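* Open the [link](https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Census_Tract_Areas_analysis_trim/FeatureServer/0) to the `tracts_trim` feature layer's *REST endpoint*.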
 * What attributes are associated with this layer? 
 * How many records can be retrieved at one time from this service? 
* At the bottom of the page, find the link associated with the [Query](https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Census_Tract_Areas_analysis_trim/FeatureServer/0/query) interface for this layer.
* In the query interface enter `FIPS LIKE '37063%'` as the *Where clause*. Then scroll to the bottom and click the `Query(GET)` button. 
 * How many records are returned? 
 * Modify the query to return output format as `GeoJSON` and click `Query(GET)` again. 
 
What we just did was use AGOL to find a layer, access its REST endpoint, and use the REST API to query Census tracts for Durham County, setting the output to be a GeoJSON object. We can copy these results into a text file and convert the GeoJSON to a feature class using ArcGIS Pro's [JSON To Features](https://pro.arcgis.com/en/pro-app/tool-reference/conversion/json-to-features.htm) tool or through Python packages like Fiona or Geopandas (more on that later...).
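The same request can be expressed as a URL. The sketch below only builds that URL from the query endpoint linked above, using the standard REST parameters `where`, `outFields`, and `f`. The function name `build_query_url` is made up for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Query endpoint for layer 0, copied from the link in the steps above
QUERY_URL = ("https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/"
             "USA_Census_Tract_Areas_analysis_trim/FeatureServer/0/query")

def build_query_url(where, out_fields="*", fmt="geojson"):
    """Return a GET URL for the layer's query interface."""
    params = {"where": where, "outFields": out_fields, "f": fmt}
    # urlencode percent-encodes the quotes and the % wildcard in the where clause
    return QUERY_URL + "?" + urlencode(params)

url = build_query_url("FIPS LIKE '37063%'")
print(url)
```

Opening the printed URL in a browser should show the same GeoJSON that the query form produced, assuming the service is still available at that address.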

####   → Searching for content using the ArcGIS Python API
Instead of using a web browser, we'll now use the ArcGIS Python API's [GIS module](https://developers.arcgis.com/python/guide/the-gis-module/) (and its helpers) to search for and access census tract data for Durham Co.


##### Step 1 is to use the API's [Content Manager] to search AGOL just like we did via the web browser. 
 * Alter the code cell below filling in the same query string we used before for the `query=` option.
 * Next, specify the item_type to be a `Feature Layer`. 
 * Finally, we add the `outside_org=True` option. _As we are using an anonymous connection to the GIS module, this doesn't really do anything. However, if we authenticated using our ArcGIS Pro app (`gis = GIS('pro')`) or with a username and password, we might default to our organization's portal, in which case `outside_org=True` would be important._

In [18]:
#Use the API's content helper to search for feature layers matching "Census Tracts Areas" and owned by "esri_dm"
results = gis.content.search(query='Census Tracts Areas owner:esri_dm',
                             item_type='Feature Layer',
                             outside_org=True)
#Show the list of results returned
results

[<Item title:"USA Census Tract Areas" type:Feature Layer Collection owner:esri_dm>]

→ More info and examples on searching: https://developers.arcgis.com/python/guide/accessing-and-creating-content/

---
 
Now that we have a set of results, let's drill into the items that are returned (well, just the one item in our case).
* First, we'll extract the one item as its own variable - `tractsItem` - and then examine properties of that object. 

In [22]:
#Extract the one returned item in the list to the "tractsItem" variable
tractsItem = results[0]
#Reveal the data type of this object
type(tractsItem)

arcgis.gis.Item

In [24]:
#We can display the formatted AGOL info on that item:
tractsItem

In [21]:
#Show help documentation on the "arcgis.gis.Item" object
?tractsItem

Or, more detailed documentation on ArcGIS Item object is here:<br>
→ https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html#item

Open that link and view the properties and methods associated with the object. What does the `content_status` property reveal? The `id` property? The `download` method? _Note that not all of these will work on this item. Some are for modifying the actual feature layer hosted on AGOL, which we don't have privileges to do._
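One way to peek at several item attributes at once is sketched below. `describe_item` is a hypothetical helper, and the stand-in object only mimics the few attributes shown earlier in this notebook so the snippet runs without a connection.

```python
from types import SimpleNamespace

def describe_item(item, names=("title", "owner", "id", "url")):
    """Collect a few attributes of an item into a dict (None if absent)."""
    return {name: getattr(item, name, None) for name in names}

# Stand-in for the real item; in the notebook, call describe_item(tractsItem)
demo_item = SimpleNamespace(title="USA Census Tract Areas",
                            owner="esri_dm",
                            id="db3f9c8728dd44e4ad455e0c27a85eea")
summary = describe_item(demo_item)
for name, value in summary.items():
    print(name, "->", value)
```

Using `getattr` with a default keeps the loop from failing on attributes an item doesn't carry.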

* Next, reveal the id associated with the item and compare it to the one you found by searching AGOL in your browser.

In [29]:
#Reveal the id associated with this item
tractsItem.id

'db3f9c8728dd44e4ad455e0c27a85eea'

**TIP**: A feature layer's item ID is useful to know because we can use it to access the item directly, i.e., without having to search for it.

In [30]:
#Extract the Census tracts layer directly, via its ID
other_tractsItem = gis.content.get('db3f9c8728dd44e4ad455e0c27a85eea')
other_tractsItem

→ More info on the ArcGIS `item` object: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.gis.toc.html#item

In [None]:
#Get the layer from the item
tractsLyr = tractsItem.layers[0]
type(tractsLyr)

→ More info on the ArcGIS `layer` object: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.gis.toc.html#layer<br>
→ More info on the ArcGIS `FeatureLayer` object: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#featurelayer
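Our item holds a single layer, but an item can hold several, so it can help to list them before picking one by index. The sketch below assumes each layer exposes its name as `properties.name`; `list_layer_names` is a made-up helper, and the stand-in objects replace the real item so the code runs offline.

```python
from types import SimpleNamespace

def list_layer_names(item):
    """Return (index, name) pairs for each layer in a feature layer item."""
    return [(i, lyr.properties.name) for i, lyr in enumerate(item.layers)]

# Stand-in mimicking an item with one layer; in the notebook use tractsItem
demo_item = SimpleNamespace(
    layers=[SimpleNamespace(properties=SimpleNamespace(name="tracts_trim"))]
)
layer_names = list_layer_names(demo_item)
print(layer_names)  # [(0, 'tracts_trim')]
```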

---
### Subsetting and downloading content
Now that we have what we want as a FeatureLayer object, we can query that layer for just the features we want to download. Before diving into our query, we'll need to familiarize ourselves with the data. 

Below is a set of steps modeled on ESRI's documentation on querying feature layers:<br>
https://developers.arcgis.com/python/guide/working-with-feature-layers-and-features/#Querying-feature-layers

* First, we could just examine the ESRI REST endpoint for this dataset. To do that, we'd just reveal the URL associated with the data layer...

In [None]:
print(tractsLyr.url)

* Or we could remain in our Python coding environment and reveal key properties using API functions:

In [None]:
#List the fields associated with the feature layer
for f in tractsLyr.properties.fields:
    print(f['name'],end=', ')
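Field definitions carry more than a name. The sketch below pairs each field name with its type, assuming every definition has `name` and `type` keys as in the REST endpoint's JSON. The two sample fields are illustrative placeholders, not the layer's actual schema, and `summarize_fields` is a hypothetical helper.

```python
def summarize_fields(fields):
    """Map each field name to its declared type."""
    return {f["name"]: f["type"] for f in fields}

# Placeholder definitions; in the notebook pass tractsLyr.properties.fields
demo_fields = [
    {"name": "OBJECTID", "type": "esriFieldTypeOID"},
    {"name": "FIPS", "type": "esriFieldTypeString"},
]
field_types = summarize_fields(demo_fields)
for name, ftype in field_types.items():
    print("{:<10} {}".format(name, ftype))
```

Knowing which fields are strings matters when writing a where clause, since string values need quotes and support `LIKE`.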

* Now we'll construct a query, much as if we were invoking the REST interface, but using the API functions, which streamline these things. 

In [None]:
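#Query the tracts feature layer for records falling within Durham Co (county FIPS code 37063)
#Note: the field in the where clause must be one of the fields listed above; the browser query earlier used FIPS
query_result = tractsLyr.query(where="GEOID LIKE '37063%'")
type(query_result)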

→ More info on the `Feature Set` object: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.features.toc.html#featureset
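To reuse the same pattern for other counties, the where clause can be built from a field name and a code prefix. This is a minimal sketch: both function names are hypothetical, and it assumes the layer's `query` method accepts `where` and `out_fields` keywords. A stand-in layer echoes the arguments back so the example runs without a connection.

```python
from types import SimpleNamespace

def like_prefix_clause(field, prefix):
    """Build a where clause matching values that start with `prefix`."""
    safe = prefix.replace("'", "''")  # escape single quotes for SQL
    return "{} LIKE '{}%'".format(field, safe)

def query_by_prefix(layer, field, prefix, out_fields="*"):
    """Query `layer` for features whose `field` starts with `prefix`."""
    return layer.query(where=like_prefix_clause(field, prefix),
                       out_fields=out_fields)

# Stand-in layer that just returns the keyword arguments it received
echo_layer = SimpleNamespace(query=lambda **kwargs: kwargs)
sent = query_by_prefix(echo_layer, "GEOID", "37063", out_fields="GEOID")
print(sent)
```

In the notebook, `query_by_prefix(tractsLyr, "GEOID", "37063")` would stand in for the cell above.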

In [None]:
#Reveal how many features were extracted: 
len(query_result)
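Counting with `len` requires downloading every feature first. If only the count is needed, the query can ask the service for that alone. The sketch assumes the layer's `query` method accepts `return_count_only=True` and then returns an integer. `count_matches` is a hypothetical helper, and the stand-in layer returns an arbitrary number purely so the code runs offline.

```python
from types import SimpleNamespace

def count_matches(layer, where):
    """Ask the service how many features match, without fetching them."""
    return layer.query(where=where, return_count_only=True)

# Stand-in layer; the value 3 is arbitrary, not the real tract count
demo_layer = SimpleNamespace(
    query=lambda where, return_count_only=False: 3 if return_count_only else None
)
n_features = count_matches(demo_layer, "GEOID LIKE '37063%'")
print(n_features)
```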

In [None]:
#Save the feature set as shapefile
outFN = query_result.save('.','MyFeatures')
print("Output saved as {}".format(outFN))
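A shapefile is really a set of sibling files: at minimum `.shp`, `.shx`, and `.dbf`. The sketch below checks that those parts exist next to a given `.shp` path. `missing_shapefile_parts` is a made-up helper, and the demo writes empty placeholder files to a temporary folder rather than touching real output.

```python
from pathlib import Path
import tempfile

REQUIRED_EXTS = (".shp", ".shx", ".dbf")

def missing_shapefile_parts(shp_path):
    """Return the required extensions that are absent next to shp_path."""
    shp = Path(shp_path)
    return [ext for ext in REQUIRED_EXTS if not shp.with_suffix(ext).exists()]

# Demo with placeholder files: the .shx part is deliberately left out
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "MyFeatures.shp"
    demo.touch()
    demo.with_suffix(".dbf").touch()
    missing = missing_shapefile_parts(demo)
print(missing)  # ['.shx']
```

In the notebook, `missing_shapefile_parts(outFN)` should return an empty list, assuming `outFN` holds the path to the saved `.shp` file.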

### Or, analyzing the data here, as a dataframe

In [None]:
#Convert the feature set data to a dataframe
df = query_result.sdf
df.head()

* Note the output has a column called "SHAPE". These values are ArcGIS API `geometry` objects. 

#### Analyzing geometry

In [None]:
#Get the value in the first row of the "SHAPE" column
feat = df.loc[0,'SHAPE']
type(feat)

→ More info on the `get_area` method: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.geometry.html#arcgis.geometry.Geometry.get_area

In [None]:
#Get the area, in square miles (get_area expects an areal unit keyword)
feat.get_area(method='GEODESIC',units='SQUAREMILES')
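The same call can be applied to every geometry rather than just the first row. This is a sketch under stated assumptions: `areas_in_units` is a hypothetical helper, `SQUAREMILES` is assumed to be the areal keyword the API accepts (substitute whichever keyword works in your environment), and `FakeGeometry` is a stand-in with made-up areas so the code runs without arcgis installed.

```python
def areas_in_units(geometries, method="GEODESIC", units="SQUAREMILES"):
    """Call get_area on each geometry and return the results as a list."""
    return [g.get_area(method=method, units=units) for g in geometries]

class FakeGeometry:
    """Stand-in exposing only get_area, with a preset (made-up) value."""
    def __init__(self, area):
        self._area = area

    def get_area(self, method, units=None):
        return self._area

areas = areas_in_units([FakeGeometry(1.5), FakeGeometry(2.0)])
total = sum(areas)
print(areas, total)  # [1.5, 2.0] 3.5
```

In the notebook, `areas_in_units(df['SHAPE'])` would return one area per tract.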

#### Analyzing age demographics

In [None]:
#Grab the first 9 columns (positions 0-8) into a new dataframe
ageColsDF = df.iloc[:,:9]

In [None]:
#Summarize those columns
ageColsDF.describe()

In [None]:
#Plot demographics: count within each age group
ageColsDF.sum()

In [None]:
%matplotlib inline
ageColsDF.sum().plot(kind='bar');