# ArcGIS API for Python

## GeoEnrichment

**GeoEnrichment** provides the ability to get facts about a location or area. Using GeoEnrichment, you can get information about the people and places in a specific area or within a certain distance or drive time from a location. It enables you to query and use information from a large collection of data sets including population, income, housing, consumer behavior, and the natural environment.

This module enables you to answer questions about locations that you can't answer with maps alone. For example: What kind of people live here? What do people like to do in this area? What are their habits and lifestyles?

The `enrich()` method can be used to retrieve demographics and other relevant characteristics associated with the area surrounding the requested places. You can also use the `arcgis.geoenrichment` **module** to obtain additional geographic context (for example, the **ZIP Code** of a location) and geographic boundaries (for example, the geometry for a drive-time service area).

**Site analysis** is a popular application of this type of **data enrichment**. For example, GeoEnrichment can be leveraged to study the population that would be affected by the development of a new community center within their neighborhood. With the `enrich()` method, the proposed site can be submitted, and the demographics and other relevant characteristics associated with the area around the site will be returned.

You must be signed in to a GIS to use GeoEnrichment. The functionality is available in the `arcgis.geoenrichment` **module**.

In [84]:
from arcgis.gis import GIS
from arcgis.geoenrichment import *

gis = GIS(api_key="AAPK6ee8947e8f9549d0ada59f3e054f57e2fs3A0lQeIqocbpFkvx1hDM0cgFXyKXef3hmZNzOEwh79qVXTWqLFIbVAUjpGlPQ7")
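
Hard-coding an API key in a notebook makes it easy to leak. A minimal sketch of one alternative, assuming the key is stored in an environment variable. The variable name `ARCGIS_API_KEY` and both helper functions are hypothetical and not part of the ArcGIS API; only `GIS(api_key=...)` comes from the cell above.

```python
import os


def load_api_key(env_var="ARCGIS_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def connect(env_var="ARCGIS_API_KEY"):
    """Create a GIS connection using the key read from the environment."""
    # Imported here so load_api_key() can be used without arcgis installed
    from arcgis.gis import GIS

    return GIS(api_key=load_api_key(env_var))
```

With the variable set, `gis = connect()` would take the place of the hard-coded call.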

### 1. GeoEnrichment coverage

In [2]:
# Query the countries for which there is GeoEnrichment data
countries = get_countries()
print("Number of countries, for which GeoEnrichment data is available: " + str(len(countries)))

Number of countries, for which GeoEnrichment data is available: 154


In [3]:
# Print a few countries for a sample
countries[0:10]

[<Country - Albania (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Algeria (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Andorra (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Angola (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Anguilla (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Argentina (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Armenia (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Aruba (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Australia (GIS @ https://www.arcgis.com version:10.1)>,
 <Country - Austria (GIS @ https://www.arcgis.com version:10.1)>]

### 2. Filtering countries by properties

In [4]:
# Gets the countries in Oceania
[country.properties.country_name for country in countries if country.properties.continent == 'Oceania']

['Australia',
 'Fiji',
 'French Polynesia',
 'New Caledonia',
 'New Zealand',
 'Papua New Guinea']
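
The same two properties can be used to group every country by continent in one pass. The sketch below assumes only that each item exposes `properties.continent` and `properties.country_name`, as in the cell above. The helper name `group_by_continent` is hypothetical.

```python
from collections import defaultdict


def group_by_continent(countries):
    """Map each continent name to a sorted list of its country names."""
    grouped = defaultdict(list)
    for country in countries:
        props = country.properties
        grouped[props.continent].append(props.country_name)
    # Return a plain dict with alphabetically sorted names per continent
    return {continent: sorted(names) for continent, names in grouped.items()}
```

Calling `group_by_continent(countries)["Oceania"]` should give the same names as the list comprehension above, provided the service returns the same properties.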

### 3. Discovering information for a country

In [85]:
# The Country class can be used to discover the data collections, subgeographies and available reports for a country
usa = Country.get('USA')
type(usa)

arcgis.geoenrichment.enrichment.Country

In [6]:
# Properties for the country are accessible using Country.properties
usa.properties

iso2                                                              US
iso3                                                             USA
country_name                                           United States
datasets           [USA_ESRI_2021, USA_ACS_2021, USA_ASR_2021, US...
default_dataset                                        USA_ESRI_2021
alt_name                                               UNITED STATES
continent                                              North America
Name: 147, dtype: object

In [7]:
usa.properties.country_name

'United States'

In [8]:
usa.properties.datasets

['USA_ESRI_2021',
 'USA_ACS_2021',
 'USA_ASR_2021',
 'USA_CRM_2021',
 'USA_DATAAXLE_2022',
 'USA_RMP_2021',
 'USA_SAFEGRAPH_2022',
 'USA_TRFCNT_2022',
 'USA_PL_2020',
 'Landscape']

### 4. Data collections and analysis variables

**GeoEnrichment** uses the **concept of a data collection** to define the data attributes returned by the enrichment service.

A **data collection** is a preassembled list of attributes that will be used to enrich the input features. Collection attributes can describe various types of information, such as demographic characteristics and geographic context of the locations or areas submitted as input features.

In [9]:
# The data_collections property of a Country object lists its available data collections and analysis variables
data_frame = usa.data_collections
data_frame

| dataCollectionID | analysisVariable | alias | fieldCategory | vintage |
|---|---|---|---|---|
| 1yearincrements | 1yearincrements.AGE0_CY | 2021 Population Age <1 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE1_CY | 2021 Population Age 1 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE2_CY | 2021 Population Age 2 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE3_CY | 2021 Population Age 3 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE4_CY | 2021 Population Age 4 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| ... | ... | ... | ... | ... |
| yearmovedin | yearmovedin.MOEMEDYRMV | 2019 Median Year Householder Moved In MOE (ACS... | 2015-2019 Year Householder Moved In (ACS) | 2015-2019 |
| yearmovedin | yearmovedin.RELMEDYRMV | 2019 Median Year Householder Moved In REL (ACS... | 2015-2019 Year Householder Moved In (ACS) | 2015-2019 |
| yearmovedin | yearmovedin.ACSOWNER | 2019 Owner Households (ACS 5-Yr) | 2015-2019 Key Demographic Indicators (ACS) | 2015-2019 |
| yearmovedin | yearmovedin.MOEOWNER | 2019 Owner Households MOE (ACS 5-Yr) | 2015-2019 Key Demographic Indicators (ACS) | 2015-2019 |


In [10]:
# Print a few rows of the DataFrame
data_frame.head()

| dataCollectionID | analysisVariable | alias | fieldCategory | vintage |
|---|---|---|---|---|
| 1yearincrements | 1yearincrements.AGE0_CY | 2021 Population Age <1 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE1_CY | 2021 Population Age 1 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE2_CY | 2021 Population Age 2 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE3_CY | 2021 Population Age 3 | 2021 Age: 1 Year Increments (Esri) | 2021 |
| 1yearincrements | 1yearincrements.AGE4_CY | 2021 Population Age 4 | 2021 Age: 1 Year Increments (Esri) | 2021 |


In [11]:
# Call the shape property to get the total number of rows and columns
data_frame.shape

(19153, 4)

- Each **data collection** and **analysis variable** has a unique ID.
- When calling the `enrich()` method (explained later in this guide), these IDs can be passed in the `data_collections` and `analysis_variables` parameters.
- You can filter the `data_collections` DataFrame and query a collection's analysis variables using Pandas expressions.
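
As a rough illustration of such Pandas expressions, the sketch below builds a small stand-in DataFrame from rows shown in the output above and applies three common filters. The stand-in assumes the real frame is indexed by `dataCollectionID` and stores `vintage` as text. Names such as `toy` and `acs_variables` are hypothetical.

```python
import pandas as pd

# Toy stand-in for usa.data_collections, using rows from the output above
rows = [
    ("1yearincrements", "1yearincrements.AGE0_CY", "2021 Population Age <1",
     "2021 Age: 1 Year Increments (Esri)", "2021"),
    ("Age", "Age.MALE0", "2021 Males Age 0-4",
     "2021 Age: 5 Year Increments (Esri)", "2021"),
    ("Age", "Age.MALE5", "2021 Males Age 5-9",
     "2021 Age: 5 Year Increments (Esri)", "2021"),
    ("yearmovedin", "yearmovedin.ACSOWNER", "2019 Owner Households (ACS 5-Yr)",
     "2015-2019 Key Demographic Indicators (ACS)", "2015-2019"),
    ("yearmovedin", "yearmovedin.MOEOWNER", "2019 Owner Households MOE (ACS 5-Yr)",
     "2015-2019 Key Demographic Indicators (ACS)", "2015-2019"),
]
columns = ["dataCollectionID", "analysisVariable", "alias", "fieldCategory", "vintage"]
toy = pd.DataFrame(rows, columns=columns).set_index("dataCollectionID")

# 1. All rows of one data collection (a list keeps the result a DataFrame)
age_rows = toy.loc[["Age"]]

# 2. Variable IDs whose alias mentions a keyword
acs_mask = toy["alias"].str.contains("ACS", regex=False)
acs_variables = toy.loc[acs_mask, "analysisVariable"].tolist()

# 3. Rows of a given vintage
vintage_2021 = toy[toy["vintage"] == "2021"]
```

The same expressions should carry over to the real `data_frame` if its layout matches the output shown in this section.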

In [12]:
# Get all the unique data collections available
data_frame.index.unique()

Index(['1yearincrements', '5yearincrements', 'ACS_Housing_Summary_rep',
       'ACS_Population_Summary_rep', 'Age', 'AgeDependency',
       'Age_50_Profile_rep', 'Age_by_Sex_Profile_rep',
       'Age_by_Sex_by_Race_Profile_rep', 'AtRisk',
       ...
       'transportation', 'travelMPI', 'unitsinstructure',
       'urbanizationgroupsNEW', 'vacant', 'vehiclesavailable', 'veterans',
       'women', 'yearbuilt', 'yearmovedin'],
      dtype='object', name='dataCollectionID', length=161)

In [13]:
# Query the Age data collection
age_dc = data_frame.loc['Age']
age_dc

| dataCollectionID | analysisVariable | alias | fieldCategory | vintage |
|---|---|---|---|---|
| Age | Age.MALE0 | 2021 Males Age 0-4 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE5 | 2021 Males Age 5-9 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE10 | 2021 Males Age 10-14 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE15 | 2021 Males Age 15-19 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE20 | 2021 Males Age 20-24 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE25 | 2021 Males Age 25-29 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE30 | 2021 Males Age 30-34 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE35 | 2021 Males Age 35-39 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE40 | 2021 Males Age 40-44 | 2021 Age: 5 Year Increments (Esri) | 2021 |
| Age | Age.MALE45 | 2021 Males Age 45-49 | 2021 Age: 5 Year Increments (Esri) | 2021 |


In [14]:
# Get all the unique analysis variables in the Age collection
age_dc['analysisVariable'].unique()

array(['Age.MALE0', 'Age.MALE5', 'Age.MALE10', 'Age.MALE15', 'Age.MALE20',
       'Age.MALE25', 'Age.MALE30', 'Age.MALE35', 'Age.MALE40',
       'Age.MALE45', 'Age.MALE50', 'Age.MALE55', 'Age.MALE60',
       'Age.MALE65', 'Age.MALE70', 'Age.MALE75', 'Age.MALE80',
       'Age.MALE85', 'Age.FEM0', 'Age.FEM5', 'Age.FEM10', 'Age.FEM15',
       'Age.FEM20', 'Age.FEM25', 'Age.FEM30', 'Age.FEM35', 'Age.FEM40',
       'Age.FEM45', 'Age.FEM50', 'Age.FEM55', 'Age.FEM60', 'Age.FEM65',
       'Age.FEM70', 'Age.FEM75', 'Age.FEM80', 'Age.FEM85'], dtype=object)

### 5. Available reports

**GeoEnrichment** also enables you to create many types of high quality **reports** for a variety of use cases describing the **input area**.

The `reports` property of a `Country` object lists its available reports as a Pandas DataFrame. The report `id` is used as an input in the `create_report()` method to create reports.

In [15]:
# Print a sample of the reports available for the USA
usa.reports.head(5)

| | id | title | categories | formats |
|---|---|---|---|---|
| 0 | census2010_profile | 2010 Census Profile | [Demographics] | [pdf, xlsx] |
| 1 | acs_housing | ACS Housing Summary | [Demographics] | [pdf, xlsx] |
| 2 | acs_keyfacts | ACS Key Population & Household Facts | [Demographics] | [pdf, xlsx] |
| 3 | acs_population | ACS Population Summary | [Demographics] | [pdf, xlsx] |
| 4 | 55plus | Age 50+ Profile | [Demographics] | [pdf, xlsx] |


In [16]:
# Total number of reports available
usa.reports.shape

(53, 4)
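
Because the reports come back as a DataFrame, a report `id` can be looked up by title before it is handed to `create_report()`. The sketch below uses a stand-in built from the sample rows above and assumes the `formats` column holds Python lists, as the printed output suggests. The helper `find_report_ids` is hypothetical.

```python
import pandas as pd

# Toy stand-in for usa.reports, built from the sample rows shown above
toy_reports = pd.DataFrame(
    {
        "id": ["census2010_profile", "acs_housing", "55plus"],
        "title": ["2010 Census Profile", "ACS Housing Summary", "Age 50+ Profile"],
        "categories": [["Demographics"], ["Demographics"], ["Demographics"]],
        "formats": [["pdf", "xlsx"], ["pdf", "xlsx"], ["pdf", "xlsx"]],
    }
)


def find_report_ids(reports, keyword, export_format="pdf"):
    """Return ids of reports whose title contains keyword (case-insensitive)
    and that list export_format among their formats."""
    title_match = reports["title"].str.contains(keyword, case=False, regex=False)
    format_match = reports["formats"].apply(lambda formats: export_format in formats)
    return reports.loc[title_match & format_match, "id"].tolist()


acs_ids = find_report_ids(toy_reports, "acs")
```

With the real data, `find_report_ids(usa.reports, "acs")` would be the equivalent call, assuming the column layout matches.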

### 6. Finding named statistical areas

Each country has several **named statistical areas** in a hierarchy of geography levels (such as **states**, **counties**, **ZIP Codes**, etc.).

The `subgeographies` property of a country can be used to discover these standard geographic/statistical areas within that country.

This information is available through a hierarchy of dynamic properties (such as states, counties, tracts, and ZIP Codes). Each dynamic property reflects a geographic level within that country, with subgeographies grouped logically under the higher levels. The properties are dictionaries whose keys are the names of the standard geographic places and whose values are instances of the `NamedArea` class. `NamedArea` objects can be used as **study areas** in the `enrich()` method.

Note: Setting the `IPCompleter.greedy=True` configuration option in a **Jupyter notebook** lets you discover the various levels of subgeographies dynamically through tab completion, as in the example below.

In [17]:
%config IPCompleter.greedy=True

In [18]:
%config IPCompleter

IPCompleter(Completer) options
----------------------------
IPCompleter.backslash_combining_completions=<Bool>
    Enable unicode completions, e.g. \alpha<tab> . Includes completion of latex
    commands, unicode names, and expanding unicode characters back to latex
    commands.
    Current: True
IPCompleter.debug=<Bool>
    Enable debug for the Completer. Mostly print extra information for
    experimental jedi integration.
    Current: False
IPCompleter.greedy=<Bool>
    Activate greedy completion
            PENDING DEPRECATION. this is now mostly taken care of with Jedi.
            This will enable completion on elements of lists, results of function calls, etc.,
            but can be unsafe because the code is actually evaluated on TAB.
    Current: True
IPCompleter.jedi_compute_type_timeout=<Int>
    Experimental: restrict time (in milliseconds) during which Jedi can compute types.
            Set to 0 to stop computing types. Non-zero value lower than 100ms may hurt
         

In [19]:
%config IPCompleter.greedy

True

In [21]:
named_area_usa = usa.subgeographies
named_area_usa

<NamedArea name:"147" area_id="01", level="US.WholeUSA", country="147">

In [22]:
named_area_usa.states['California']

<NamedArea name:"California" area_id="06", level="US.States", country="147">

In [23]:
named_area_usa.states['California'].counties['San_Bernardino_County']

<NamedArea name:"San Bernardino County" area_id="06071", level="US.Counties", country="147">

In [24]:
named_area_usa.states['California'].counties['San_Bernardino_County'].tracts['060710001.03']

<NamedArea name:"060710001.03" area_id="06071000103", level="US.Tracts", country="147">

In [25]:
usa.subgeographies.states['California'].zip5['92373']

<NamedArea name:"Redlands" area_id="92373", level="US.ZIP5", country="147">
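
The chained lookups above can also be driven by data, which helps when the path comes from user input or a configuration file. This is a minimal sketch, assuming only that each level is reachable as an attribute holding a dictionary-like object, as described earlier. The helper `resolve_named_area` is hypothetical.

```python
def resolve_named_area(root, path):
    """Walk a chain of (level_property, key) pairs starting from root.

    Example path: [("states", "California"), ("zip5", "92373")]
    """
    area = root
    for level_property, key in path:
        # e.g. area.states, area.counties, area.zip5
        children = getattr(area, level_property)
        try:
            area = children[key]
        except KeyError:
            raise KeyError(f"{key!r} not found under {level_property!r}") from None
    return area
```

For instance, `resolve_named_area(usa.subgeographies, [("states", "California"), ("zip5", "92373")])` is intended to return the same object as the last cell above.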

**The named areas can also be drawn on a map, as they include a `geometry` property:**

In [105]:
redlands = gis.map('Redlands, CA', zoomlevel=12)
redlands

MapView(layout=Layout(height='400px', width='100%'))

In [34]:
# Draw a subgeography on the map above
redlands.draw(usa.subgeographies.states['California'].zip5['92373'].geometry)

In [104]:
india_map = gis.map('India', zoomlevel=5)
india_map

MapView(layout=Layout(height='400px', width='100%'))

In [56]:
india = Country.get('India')

In [57]:
# Print the available datasets
india.properties.datasets

['IND_MBR_2020']

In [58]:
# View the current dataset of the country
india.dataset

'IND_MBR_2020'

In [65]:
# Inspect the various subgeographies
india.subgeographies.states['Uttar_Pradesh'].districts['Baghpat'].subdistricts['Baraut']

<NamedArea name:"Baraut" area_id="0913900735", level="IN.Subdistricts", country="63">

### 7. Searching for named areas within a country

In [86]:
riversides_in_usa = usa.search('Riverside')
print("Number of riversides in USA: " + str(len(riversides_in_usa)))

Number of riversides in USA: 147


In [91]:
# List a few of them
riversides_in_usa[:5]

[<NamedArea name:"Riverside" area_id="147435", level="Cities", country="147">,
 <NamedArea name:"Riverside" area_id="147436", level="Cities", country="147">,
 <NamedArea name:"Riverside" area_id="147437", level="Cities", country="147">,
 <NamedArea name:"Riverside" area_id="147438", level="Cities", country="147">,
 <NamedArea name:"Riverside" area_id="147439", level="Cities", country="147">]

**You can make a map of all the riversides in the US:**

In [94]:
usa_map = gis.map('United States', zoomlevel=4)
usa_map

MapView(layout=Layout(height='400px', width='100%'))

In [95]:
# Draw on the map above
for riverside in riversides_in_usa:
    usa_map.draw(riverside.geometry)

**Filtering named areas by geography level:**

In [97]:
usa_levels = [level for level in usa.levels]
usa_levels

[{'id': 'US.WholeUSA',
  'name': 'Entire Country',
  'isWholeCountry': True,
  'adminLevel': 'Admin1',
  'singularName': 'United States of America',
  'pluralName': 'United States of America',
  'defaultGeneralizationLevel': 6},
 {'id': 'US.States',
  'name': 'States',
  'isWholeCountry': False,
  'adminLevel': 'Admin2',
  'singularName': 'State',
  'pluralName': 'States',
  'defaultGeneralizationLevel': 5},
 {'id': 'US.DMA',
  'name': 'DMAs',
  'isWholeCountry': False,
  'adminLevel': '',
  'singularName': 'DMA',
  'pluralName': 'DMAs',
  'description': 'A Designated Market Area (DMA), also referred to as a media market, is a region of the United States that is used to define television and radio markets. DMAs are determined by the Nielsen Company and are usually defined based on metropolitan areas, with suburbs often being combined within. Esri creates these boundaries by dissolving the TIGER 2019 Block Groups with Esri modified Shoreline.',
  'defaultGeneralizationLevel': 2},
 {'id'

In [99]:
usa_levels_ids = [level['id'] for level in usa.levels]
usa_levels_ids

['US.WholeUSA',
 'US.States',
 'US.DMA',
 'US.CD',
 'US.CBSA',
 'US.Counties',
 'US.CSD',
 'US.ZIP5',
 'US.Places',
 'US.Tracts',
 'US.BlockGroups']

In [101]:
usa.search(query='Riverside', layers=['US.Counties'])

[<NamedArea name:"Riverside County" area_id="06065", level="US.Counties", country="147">]
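
When the exact level id is not known in advance, the level dictionaries can be filtered by name to build the `layers` argument. The sketch below works on a shortened copy of the `usa.levels` output shown earlier. The helper `level_ids_matching` is hypothetical.

```python
def level_ids_matching(levels, keyword):
    """Return the ids of levels whose 'name' contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [level["id"] for level in levels if keyword in level.get("name", "").lower()]


# Shortened copy of three entries from the usa.levels output above
sample_levels = [
    {"id": "US.WholeUSA", "name": "Entire Country", "adminLevel": "Admin1"},
    {"id": "US.States", "name": "States", "adminLevel": "Admin2"},
    {"id": "US.DMA", "name": "DMAs", "adminLevel": ""},
]

state_layers = level_ids_matching(sample_levels, "state")
```

The resulting list could then be passed along, for example `usa.search(query='Riverside', layers=level_ids_matching(usa.levels, 'counties'))`, assuming the level names match those in the output above.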