# Enrich Using Point Geometries

Starting off, we import a few required Python resources. While there are quite a few in there, the import of note is `from arcgis.geoenrichment import Country`. We are going to use this object to discover variables and perform our analysis.

In [1]:
import os
from pathlib import Path

from arcgis.features import GeoAccessor
from arcgis.geoenrichment import Country
from arcgis.geometry import Geometry
from arcgis.gis import GIS
from dotenv import load_dotenv, find_dotenv
import pandas as pd

load_dotenv(find_dotenv())

True

Next, we need a connection to ArcGIS Online to demonstrate using ArcGIS Online for geoenrichment. This is accomplished by instantiating a `GIS` object with valid credentials read from environment variables.

In [2]:
gis_agol = GIS(
    url=os.getenv('ESRI_GIS_URL'), 
    username=os.getenv('ESRI_GIS_USERNAME'),
    password=os.getenv('ESRI_GIS_PASSWORD')
)

gis_agol
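
Because the credentials come from environment variables, an unset variable reaches `GIS` as `None` without any warning. The following is a minimal sketch of a pre-flight check, assuming the three variable names used above; the helper `missing_env_vars` is hypothetical and not part of the ArcGIS API for Python.

```python
import os

# Variable names taken from the connection cell above.
REQUIRED_VARS = ("ESRI_GIS_URL", "ESRI_GIS_USERNAME", "ESRI_GIS_PASSWORD")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given mapping.

    Hypothetical helper for illustration; defaults to the real process
    environment when no mapping is supplied.
    """
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Use an explicit mapping so the sketch behaves the same on any machine.
example_env = {
    "ESRI_GIS_URL": "https://www.arcgis.com",
    "ESRI_GIS_USERNAME": "analyst",
}
missing = missing_env_vars(environ=example_env)
print(missing)  # ['ESRI_GIS_PASSWORD']
```

Calling `missing_env_vars()` with no arguments right after `load_dotenv(find_dotenv())` would report anything absent from the `.env` file before a connection is attempted.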

Next, we need some test data to work with. We are pulling business locations from an ArcGIS Online layer.

In [3]:
pt_itm_id = '581de875ae96477298f7b5ac598b1dec'
pt_lyr = GIS().content.get(pt_itm_id).layers[0]
pt_df = pt_lyr.query().sdf.drop(columns='OBJECTID')
pt_df.spatial.set_geometry('SHAPE')

pt_df.info()
pt_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype   
---  ------  --------------  -----   
 0   LOCNUM  251 non-null    object  
 1   SHAPE   251 non-null    geometry
dtypes: geometry(1), object(1)
memory usage: 4.0+ KB


Unnamed: 0,LOCNUM,SHAPE
0,177692910,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."


To enrich, we start by creating a `Country` object instance. As part of the constructor, we need to tell the object what Business Analyst source to use in the `gis` parameter. In this case, we are telling the object to use the ArcGIS Online connection created above.

In [4]:
usa_agol = Country('usa', gis=gis_agol)

usa_agol

<Country - United States (GIS @ https://baqa.mapsqa.arcgis.com version:9.4)>

## Select Variables

Next, we need some enrichment variables to use. We can discover what is available using the `enrich_variables` property of the country object, which returns a Pandas Data Frame of the variables available for the country.

In [5]:
ev = usa_agol.enrich_variables

ev

Unnamed: 0,name,alias,data_collection,enrich_name,enrich_field_name,description,vintage,units
0,AGE0_CY,2021 Population Age <1,1yearincrements,1yearincrements.AGE0_CY,F1yearincrements_AGE0_CY,2021 Total Population Age <1 (Esri),2021,count
1,AGE1_CY,2021 Population Age 1,1yearincrements,1yearincrements.AGE1_CY,F1yearincrements_AGE1_CY,2021 Total Population Age 1 (Esri),2021,count
2,AGE2_CY,2021 Population Age 2,1yearincrements,1yearincrements.AGE2_CY,F1yearincrements_AGE2_CY,2021 Total Population Age 2 (Esri),2021,count
3,AGE3_CY,2021 Population Age 3,1yearincrements,1yearincrements.AGE3_CY,F1yearincrements_AGE3_CY,2021 Total Population Age 3 (Esri),2021,count
4,AGE4_CY,2021 Population Age 4,1yearincrements,1yearincrements.AGE4_CY,F1yearincrements_AGE4_CY,2021 Total Population Age 4 (Esri),2021,count
...,...,...,...,...,...,...,...,...
19148,MOEMEDYRMV,2019 Median Year Householder Moved In MOE (ACS...,yearmovedin,yearmovedin.MOEMEDYRMV,yearmovedin_MOEMEDYRMV,2019 Median Year Householder Moved into Unit M...,2015-2019,count
19149,RELMEDYRMV,2019 Median Year Householder Moved In REL (ACS...,yearmovedin,yearmovedin.RELMEDYRMV,yearmovedin_RELMEDYRMV,2019 Median Year Householder Moved into Unit R...,2015-2019,count
19150,ACSOWNER,2019 Owner Households (ACS 5-Yr),yearmovedin,yearmovedin.ACSOWNER,yearmovedin_ACSOWNER,2019 Owner Households (ACS 5-Yr),2015-2019,count
19151,MOEOWNER,2019 Owner Households MOE (ACS 5-Yr),yearmovedin,yearmovedin.MOEOWNER,yearmovedin_MOEOWNER,2019 Owner Households MOE (ACS 5-Yr),2015-2019,count


Nearly twenty thousand variables is more than a few too many to deal with, so we can pare this down using some Pandas Data Frame filtering to get just the key United States variables for the current year.

In [6]:
kv = ev[
    (ev.data_collection.str.lower().str.contains('key'))
    & (ev.name.str.lower().str.endswith('cy'))
].reset_index(drop=True)

kv

Unnamed: 0,name,alias,data_collection,enrich_name,enrich_field_name,description,vintage,units
0,TOTPOP_CY,2021 Total Population,KeyUSFacts,KeyUSFacts.TOTPOP_CY,KeyUSFacts_TOTPOP_CY,2021 Total Population (Esri),2021,count
1,GQPOP_CY,2021 Group Quarters Population,KeyUSFacts,KeyUSFacts.GQPOP_CY,KeyUSFacts_GQPOP_CY,2021 Group Quarters Population (Esri),2021,count
2,DIVINDX_CY,2021 Diversity Index,KeyUSFacts,KeyUSFacts.DIVINDX_CY,KeyUSFacts_DIVINDX_CY,2021 Diversity Index (Esri),2021,count
3,TOTHH_CY,2021 Total Households,KeyUSFacts,KeyUSFacts.TOTHH_CY,KeyUSFacts_TOTHH_CY,2021 Total Households (Esri),2021,count
4,AVGHHSZ_CY,2021 Average Household Size,KeyUSFacts,KeyUSFacts.AVGHHSZ_CY,KeyUSFacts_AVGHHSZ_CY,2021 Average Household Size (Esri),2021,count
5,MEDHINC_CY,2021 Median Household Income,KeyUSFacts,KeyUSFacts.MEDHINC_CY,KeyUSFacts_MEDHINC_CY,2021 Median Household Income (Esri),2021,currency
6,AVGHINC_CY,2021 Average Household Income,KeyUSFacts,KeyUSFacts.AVGHINC_CY,KeyUSFacts_AVGHINC_CY,2021 Average Household Income (Esri),2021,currency
7,PCI_CY,2021 Per Capita Income,KeyUSFacts,KeyUSFacts.PCI_CY,KeyUSFacts_PCI_CY,2021 Per Capita Income (Esri),2021,currency
8,TOTHU_CY,2021 Total Housing Units,KeyUSFacts,KeyUSFacts.TOTHU_CY,KeyUSFacts_TOTHU_CY,2021 Total Housing Units (Esri),2021,count
9,OWNER_CY,2021 Owner Occupied HUs,KeyUSFacts,KeyUSFacts.OWNER_CY,KeyUSFacts_OWNER_CY,2021 Owner Occupied Housing Units (Esri),2021,count


## Enrich with Defaults

Finally, we can enrich using the points and variables collected above. Note that we are not specifying the area around the input points, so the proximity defaults are used: a straight-line distance of one kilometer around each point. This circular area is then used to apportion data to the locations specified by the point geometries.

In [7]:
pt_df.head()

Unnamed: 0,LOCNUM,SHAPE
0,177692910,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."


In [8]:
pt1_enrich_df = usa_agol.enrich(
    geographies=pt_df,
    enrich_variables=kv
)

pt1_enrich_df.info()
pt1_enrich_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype   
---  ------                             --------------  -----   
 0   locnum                             251 non-null    object  
 1   source_country                     251 non-null    object  
 2   area_type                          251 non-null    object  
 3   buffer_units                       251 non-null    object  
 4   buffer_units_alias                 251 non-null    object  
 5   buffer_radii                       251 non-null    int64   
 6   aggregation_method                 251 non-null    object  
 7   population_to_polygon_size_rating  251 non-null    float64 
 8   apportionment_confidence           251 non-null    float64 
 9   has_data                           251 non-null    int64   
 10  totpop_cy                          251 non-null    int64   
 11  gqpop_cy                           251 non-nu

Unnamed: 0,locnum,source_country,area_type,buffer_units,buffer_units_alias,buffer_radii,aggregation_method,population_to_polygon_size_rating,apportionment_confidence,has_data,...,vacant_cy,medval_cy,avgval_cy,popgrw10_cy,hhgrw10_cy,famgrw10_cy,dpop_cy,dpopwrk_cy,dpopres_cy,SHAPE
0,177692910,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,348,390413,429594,0.43,0.41,0.19,11715,4849,6866,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,420,394983,441629,0.81,0.67,0.61,20885,8823,12062,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,3773,628835,706736,2.86,2.81,3.17,91264,75188,16076,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,94,376746,403178,2.64,2.52,2.44,5727,2799,2928,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,158,575127,605756,0.41,0.47,0.2,22915,18367,4548,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."
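
The idea of apportioning data to a ring around a point can be illustrated with a deliberately simplified sketch: sum the population of every block point that falls inside the radius. This is an assumption-laden toy and not Esri's apportionment implementation; the block coordinates and populations below are made up, and `haversine_km` and `apportion` are hypothetical helper names.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def apportion(center, block_points, radius_km):
    """Sum the population of block points within radius_km of center."""
    lon, lat = center
    return sum(
        pop
        for blon, blat, pop in block_points
        if haversine_km(lon, lat, blon, blat) <= radius_km
    )


center = (-122.578461, 45.490779)
# (lon, lat, population); values invented for illustration only.
blocks = [
    (-122.578461, 45.490779, 120),  # at the center
    (-122.578461, 45.495779, 80),   # roughly 0.56 km north
    (-122.578461, 45.590779, 300),  # roughly 11 km north
]

print(apportion(center, blocks, radius_km=1.0))  # 200
```

Only the first two block points lie within one kilometer, so the third contributes nothing until the radius grows past about eleven kilometers.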


## Specify Proximity Value

Specifying a value other than the default of one kilometer is highly recommended, and is easily done using the `proximity_value` parameter.

In [11]:
pt2_enrich_df = usa_agol.enrich(
    geographies=pt_df,
    enrich_variables=kv,
    proximity_value=3
)

pt2_enrich_df.info()
pt2_enrich_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype   
---  ------                             --------------  -----   
 0   locnum                             251 non-null    object  
 1   source_country                     251 non-null    object  
 2   area_type                          251 non-null    object  
 3   buffer_units                       251 non-null    object  
 4   buffer_units_alias                 251 non-null    object  
 5   buffer_radii                       251 non-null    int64   
 6   aggregation_method                 251 non-null    object  
 7   population_to_polygon_size_rating  251 non-null    float64 
 8   apportionment_confidence           251 non-null    float64 
 9   has_data                           251 non-null    int64   
 10  totpop_cy                          251 non-null    int64   
 11  gqpop_cy                           251 non-nu

Unnamed: 0,locnum,source_country,area_type,buffer_units,buffer_units_alias,buffer_radii,aggregation_method,population_to_polygon_size_rating,apportionment_confidence,has_data,...,vacant_cy,medval_cy,avgval_cy,popgrw10_cy,hhgrw10_cy,famgrw10_cy,dpop_cy,dpopwrk_cy,dpopres_cy,SHAPE
0,177692910,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,348,390413,429594,0.43,0.41,0.19,11715,4849,6866,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,420,394983,441629,0.81,0.67,0.61,20885,8823,12062,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,3773,628835,706736,2.86,2.81,3.17,91264,75188,16076,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,94,376746,403178,2.64,2.52,2.44,5727,2799,2928,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,158,575127,605756,0.41,0.47,0.2,22915,18367,4548,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."


## Specify Proximity Value and Metric

To use a different unit of distance, such as miles, specify it using the `proximity_metric` parameter.

In [12]:
pt3_enrich_df = usa_agol.enrich(
    geographies=pt_df,
    enrich_variables=kv,
    proximity_value=3,
    proximity_metric='miles'
)

pt3_enrich_df.info()
pt3_enrich_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype   
---  ------                             --------------  -----   
 0   locnum                             251 non-null    object  
 1   source_country                     251 non-null    object  
 2   area_type                          251 non-null    object  
 3   buffer_units                       251 non-null    object  
 4   buffer_units_alias                 251 non-null    object  
 5   buffer_radii                       251 non-null    int64   
 6   aggregation_method                 251 non-null    object  
 7   population_to_polygon_size_rating  251 non-null    float64 
 8   apportionment_confidence           251 non-null    float64 
 9   has_data                           251 non-null    int64   
 10  totpop_cy                          251 non-null    int64   
 11  gqpop_cy                           251 non-nu

Unnamed: 0,locnum,source_country,area_type,buffer_units,buffer_units_alias,buffer_radii,aggregation_method,population_to_polygon_size_rating,apportionment_confidence,has_data,...,vacant_cy,medval_cy,avgval_cy,popgrw10_cy,hhgrw10_cy,famgrw10_cy,dpop_cy,dpopwrk_cy,dpopres_cy,SHAPE
0,177692910,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,348,390413,429594,0.43,0.41,0.19,11715,4849,6866,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,420,394983,441629,0.81,0.67,0.61,20885,8823,12062,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,3773,628835,706736,2.86,2.81,3.17,91264,75188,16076,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,94,376746,403178,2.64,2.52,2.44,5727,2799,2928,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,158,575127,605756,0.41,0.47,0.2,22915,18367,4548,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."
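
Switching the metric changes the study area more than the raw number suggests, since the area of a ring grows with the square of its radius. A short worked sketch, assuming a flat circular buffer; `ring_area_km2` is a hypothetical helper, not an ArcGIS function.

```python
import math

KM_PER_MILE = 1.609344  # exact international mile


def ring_area_km2(radius, metric="kilometers"):
    """Area in square kilometers of a circular buffer with the given radius."""
    if metric == "kilometers":
        radius_km = radius
    elif metric == "miles":
        radius_km = radius * KM_PER_MILE
    else:
        raise ValueError(f"unsupported metric: {metric!r}")
    return math.pi * radius_km ** 2


area_3_km = ring_area_km2(3, "kilometers")
area_3_mi = ring_area_km2(3, "miles")

print(f"3 km ring:   {area_3_km:.1f} sq km")
print(f"3 mile ring: {area_3_mi:.1f} sq km")
print(f"ratio:       {area_3_mi / area_3_km:.2f}")
```

For the same numeric radius, a ring measured in miles covers about 2.59 times the area of one measured in kilometers, so enriched counts should be expected to rise accordingly.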


## Determine Proximity Type

The above examples all use the default `proximity_type` of `straight_line`. However, depending on the transportation network available with the GIS source you are using, other methods are also available. These can be discovered using the `travel_modes` property of the `Country`. Any of the values in the `name` column are valid values for `proximity_type`, in addition to the default `straight_line`.

In [15]:
usa_agol.travel_modes

Exception: Token Required
(Error Code: 499)

For example, if we want to use both paved _and_ gravel roads (because gravel roads are _fun_), we can use `rural_driving_time`. Before selecting it, we can investigate the details of the method by looking at its description.

In [16]:
usa_agol.travel_modes[usa_agol.travel_modes.name == 'rural_driving_time'].iloc[0]['description']

Exception: Token Required
(Error Code: 499)
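
A lookup that chains a filter with `.iloc[0]` raises a bare `IndexError` when the name is misspelled or unavailable. The sketch below shows one way to guard that lookup, using a toy Data Frame in place of `usa_agol.travel_modes`; the `name` and `description` column names follow the lookup above, while the rows, the placeholder descriptions, and the helper `describe_travel_mode` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for `usa_agol.travel_modes`; descriptions are placeholders.
travel_modes_demo = pd.DataFrame(
    {
        "name": ["driving_time", "rural_driving_time"],
        "description": ["placeholder description A", "placeholder description B"],
    }
)


def describe_travel_mode(modes, mode_name):
    """Return the description for mode_name, or raise a readable KeyError."""
    match = modes.loc[modes["name"] == mode_name, "description"]
    if match.empty:
        available = sorted(modes["name"])
        raise KeyError(f"{mode_name!r} not found; available: {available}")
    return match.iloc[0]


print(describe_travel_mode(travel_modes_demo, "rural_driving_time"))
```

With a valid connection, passing the real `travel_modes` frame in place of `travel_modes_demo` should work the same way, provided the column names match.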

### Enrich using Proximity Parameters

_Most_ people aren't going to be driving as fast on a gravel road as they are on an interstate. A drive-time travel mode enables us to take into consideration the differences in speed based on the road type. Using drive time to define proximity around a location is a much better representation of how people actually move around and interact with their surrounding environment, such as finding food at a grocery store.

In [14]:
pt4_enrich_df = usa_agol.enrich(
    geographies=pt_df,
    enrich_variables=kv,
    proximity_type='rural_driving_time',
    proximity_value=8,
    proximity_metric='minutes'
)

pt4_enrich_df.info()
pt4_enrich_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 31 columns):
 #   Column                             Non-Null Count  Dtype   
---  ------                             --------------  -----   
 0   locnum                             251 non-null    object  
 1   source_country                     251 non-null    object  
 2   area_type                          251 non-null    object  
 3   buffer_units                       251 non-null    object  
 4   buffer_units_alias                 251 non-null    object  
 5   buffer_radii                       251 non-null    int64   
 6   aggregation_method                 251 non-null    object  
 7   population_to_polygon_size_rating  251 non-null    float64 
 8   apportionment_confidence           251 non-null    float64 
 9   has_data                           251 non-null    int64   
 10  totpop_cy                          251 non-null    int64   
 11  gqpop_cy                           251 non-nu

Unnamed: 0,locnum,source_country,area_type,buffer_units,buffer_units_alias,buffer_radii,aggregation_method,population_to_polygon_size_rating,apportionment_confidence,has_data,...,vacant_cy,medval_cy,avgval_cy,popgrw10_cy,hhgrw10_cy,famgrw10_cy,dpop_cy,dpopwrk_cy,dpopres_cy,SHAPE
0,177692910,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,348,390413,429594,0.43,0.41,0.19,11715,4849,6866,"{""x"": -122.63332499999997, ""y"": 45.41751900000..."
1,207178880,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,420,394983,441629,0.81,0.67,0.61,20885,8823,12062,"{""x"": -122.578461, ""y"": 45.49077900000012, ""sp..."
2,213604069,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,3773,628835,706736,2.86,2.81,3.17,91264,75188,16076,"{""x"": -122.68257749999998, ""y"": 45.52993800000..."
3,214402802,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,94,376746,403178,2.64,2.52,2.44,5727,2799,2928,"{""x"": -122.28194789999999, ""y"": 45.40177785000..."
4,215047911,US,RingBuffer,esriMiles,Miles,1,BlockApportionment:US.BlockGroups;PointsLayer:...,2.191,2.576,1,...,158,575127,605756,0.41,0.47,0.2,22915,18367,4548,"{""x"": -122.73786899999999, ""y"": 45.41748299999..."
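
Since the three proximity parameters have to agree with each other (minutes make no sense for a straight-line distance), it can help to assemble them in one place before calling `enrich`. The following is a rough sketch only: the helper `proximity_kwargs` is hypothetical, and both the rule that time-based types end in `_time` and the sets of accepted metrics are assumptions rather than documented behavior.

```python
# Assumed metric names; the service may accept others.
TIME_METRICS = {"minutes", "hours"}
DISTANCE_METRICS = {"kilometers", "miles"}


def proximity_kwargs(proximity_type="straight_line", proximity_value=1,
                     proximity_metric="kilometers"):
    """Build keyword arguments for `enrich`, rejecting mismatched units.

    Assumption: a proximity type whose name ends in "_time" is time-based;
    everything else is treated as distance-based.
    """
    allowed = TIME_METRICS if proximity_type.endswith("_time") else DISTANCE_METRICS
    if proximity_metric not in allowed:
        raise ValueError(
            f"{proximity_metric!r} does not fit {proximity_type!r}; "
            f"expected one of {sorted(allowed)}"
        )
    if proximity_value <= 0:
        raise ValueError("proximity_value must be positive")
    return {
        "proximity_type": proximity_type,
        "proximity_value": proximity_value,
        "proximity_metric": proximity_metric,
    }


params = proximity_kwargs("rural_driving_time", 8, "minutes")
print(params)
```

The resulting dictionary could then be unpacked into the call, for example `usa_agol.enrich(geographies=pt_df, enrich_variables=kv, **params)`.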
