## Spatial Data Science: The New Frontier in Analytics

The purpose of this MOOC is to discuss and explore spatial data science. 

You can [register here](https://www.esri.com/training/catalog/5d76dcf7e9ccda09bef61294/spatial-data-science%3A-the-new-frontier-in-analytics/?adumkts=training&aduc=email&adum=drip&aduca=mi_moocs&adut=2119512-2021-sds-mooc-reg1d&adulb=multiple&adusn=multiple&adupt=awareness&aducp=event_body_cta#!). Note that registration closes on 14 September 2013.

Instructors:

- Mogahid Hussein (Host)
- Lauren Bennett 
- Shannon Kalisky
- Flora Vale
- Atma Mani

## Exercise 1: Perform data engineering tasks

### **Introduction**

Data engineering is a fundamental part of every analysis. The term refers to the planning, preparation, and processing of data to make it more useful for analysis. It can include simple tasks like identifying and correcting imperfections in your data and calculating new fields. It can also include more complex tasks like reducing the dimensions of a multivariate dataset.

Data engineering also involves the process of geoenriching your data. Geoenrichment can include various tasks:

- Adding a spatial location to your data, referred to as geocoding
- Using other data sources to extract information and add, or enrich, these values to your dataset
- Calculating new fields that represent spatial characteristics, like the distance from a particular feature in a landscape
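As a minimal sketch of the third task, the snippet below calculates a great-circle (haversine) distance between a record and a reference feature and stores it as a new field. The record, the coordinates, the field name `dist_to_feature_km`, and the spherical Earth radius are illustrative assumptions, not part of the exercise data. In ArcGIS Pro, a geoprocessing tool would typically handle this kind of calculation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical record and reference feature (coordinates are made up for illustration).
record = {"name": "Sample County", "lat": 40.0, "lon": -100.0}
feature_lat, feature_lon = 41.0, -100.0

# Calculate a new field that represents a spatial characteristic of the record.
record["dist_to_feature_km"] = haversine_km(
    record["lat"], record["lon"], feature_lat, feature_lon
)
print(round(record["dist_to_feature_km"], 1))
```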

In this exercise, you will use ArcGIS Notebooks and the Data Engineering view in ArcGIS Pro to perform data engineering tasks. These tasks will use the built-in tools that are available with these products as well as tools that are available by integrating open-source libraries. Using ArcGIS Notebooks allows you to document and share the steps you take to prepare your data for an analysis. You will then have transparent and reproducible research or analysis.

### **Scenario**

Because voting is voluntary in the United States, the level of voter participation (referred to as "voter turnout") has a significant impact on the election results and resulting public policy.

Modeling voter turnout, and understanding where low turnout is prevalent, can inform outreach efforts to increase voter participation. With the ultimate goal of predicting voter turnout, in this exercise, you will focus on performing various data engineering tasks to prepare election result data for predictive analysis.

The data for this section is obtained from the [Harvard Dataverse](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ) and the [United States Census Bureau](https://www.census.gov/). The voter turnout dataset from Harvard Dataverse has vote totals from each U.S. county for U.S. presidential elections from 2000 to 2020.

### **Step 1: Download the exercise data files**

To begin, you will open the ArcGIS Pro project package that you downloaded previously.

1. Start ArcGIS Pro.

2. If necessary, sign in with the provided course ArcGIS account.

3. Near Recent Projects, click Open Another Project.

>Note: If you have configured ArcGIS Pro to start without a project template or with a default project, you will not see the Start page. On the Project tab, click Open, and then click Open Another Project.

4. Browse to the `DataEngineering_and_Visualization` folder that you saved on your computer.

5. Click `DataEngineering_and_Visualization.aprx` to select it, and then click OK.

6. In the Catalog pane, expand Maps.

7. Right-click Data Visualization and choose Open.

A Data Visualization map tab opens to a gray basemap with a map layer that contains the 2020 election results for counties that have been enriched with demographic variables. The CountyElections2020 layer has also been projected to the USA Contiguous Equidistant Conic projected coordinate system. The feature class that you created using the Data Engineering Notebook is projected in the WGS84 coordinate system, which is a standard coordinate system for web mapping applications. However, this projection does not preserve areas, distances, or angles. Because this layer will be used for a distance-based analysis, it is best practice to use an equidistant projection to preserve true distance measurements on your map.
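To see why distances measured directly in longitude and latitude degrees are unreliable, the sketch below estimates the ground length of one degree of longitude at two latitudes. It assumes a spherical Earth with a 6,371 km radius, and the two latitudes are illustrative values that roughly span the contiguous United States. None of these numbers come from the exercise data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation of the Earth


def km_per_degree_longitude(latitude_deg):
    """Approximate east-west ground distance covered by one degree of longitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))


# Compare a southern and a northern latitude (illustrative values).
for latitude in (25.0, 49.0):
    print(f"{latitude:.0f} degrees north: {km_per_degree_longitude(latitude):.1f} km per degree")
```

Because the same one-degree step covers less ground farther north, a distance-based analysis benefits from a projected coordinate system, such as the equidistant conic projection used for this layer.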


In this step, you will download the exercise data files. 

1. Open a new web browser tab or window.

2. Go to https://links.esri.com/Section1_Data and download the exercise data ZIP file.

>Note: The complete URL to the exercise data file is https://www.arcgis.com/home/item.html?id=9487775690064b159099d152ad04eec5.

3. Create a folder on your local computer and name it EsriTraining.

>Throughout this course, you will save all your data to this folder. When you create the folder, do not include any spaces or special characters in the folder name.

4. Extract the exercise data files to the EsriTraining folder on your local computer.

5. After you extract the folder, confirm that the data files are stored in the `DataEngineering_and_Visualization` folder.

6. Leave the `DataEngineering_and_Visualization` folder open.

You downloaded and extracted the exercise data files that you will need to complete the first section of the MOOC.

### **Step 2: Confirm that your computer can run ArcGIS Pro**

In this step, you will run a test to confirm that your computer can support ArcGIS Pro. Even if you have ArcGIS Pro installed, you should confirm that your computer can support ArcGIS Pro 3.1.

>Note: This test uses a third-party executable file. If you prefer not to run this test due to security reasons, you can review the Common Questions or go to ArcGIS Pro Help: ArcGIS Pro 3.1 system requirements.

1. Go to [Can your computer run ArcGIS Pro 3.0 and 3.1](https://www.systemrequirementslab.com/clientapp?refid=1256&appkey=6D681CD0-BA6C-4B6B-9A82-639759CFD094&itemid=21760&type=form)

2. Click Run Tech Check.

3. Follow the steps to open and run the test.

The site generates a report that lists the minimum requirements and identifies whether your machine meets these requirements.

4. Save the report.

The MOOC team may ask you to share the report if you need help in later ArcGIS Pro exercises.

> Note: If the report says that you need to update Microsoft .NET, the next step of this exercise, which is titled Install Microsoft .NET Desktop Runtime, will guide you through the process.

5. If your computer does not meet these requirements, check the Common Questions to find links to complete any other recommended updates, and then run the test again.

> Note: If your computer does not meet the requirements, you may need to use a different computer or update your graphics card. For more information about graphics card requirements, go to [ArcGIS Pro Help: ArcGIS Pro 3.1 system requirements > Hardware requirements](https://pro.arcgis.com/en/pro-app/latest/get-started/arcgis-pro-system-requirements.htm#GUID-70D8DACC-78A9-4C57-A986-DFE62A9145E9).

6. If your computer meets the requirements, continue to the step that is titled *Locate your course account to install ArcGIS Pro*.

You ran and saved a report that told you whether your computer can support ArcGIS Pro 3.1.

### **Step 3: Install Microsoft .NET Desktop Runtime**

ArcGIS Pro 3.1 is built on .NET 6.0, Microsoft’s latest edition of .NET that has long-term support.  Moving to this version of .NET positions Esri and other ArcGIS Pro developers well for future development and enhancements. Because certain third-party components may start to be compatible only with .NET 6.0, it is best to use the most updated software framework.

Before you can install ArcGIS Pro 3.1 to use in the MOOC exercises, you must update your system to use .NET 6.0. 

> Note: If you already have Microsoft .NET Desktop Runtime 6.0 on your computer, continue to the next step to install ArcGIS Pro 3.1. Alternatively, if you already have both ArcGIS Pro and Microsoft .NET 6.0 on your computer, you can update your version of ArcGIS Pro directly from the Settings tab on the Start page of ArcGIS Pro.

1. In a web browser, go to ArcGIS Pro Help: [ArcGIS Pro 3.1 system requirements > Software requirements](https://pro.arcgis.com/en/pro-app/latest/get-started/arcgis-pro-system-requirements.htm#GUID-26B35CD3-0388-4F44-87E6-9D5C35BFC084).

2. For Microsoft .NET, click the link in the Minimum Requirement column, and follow the instructions to download and install Microsoft .NET Desktop Runtime 6.0.x - Windows x64. 

After you successfully download and install Microsoft .NET Desktop Runtime 6.0, you are ready to download and install ArcGIS Pro 3.1.



### **Step 4: Locate your course account to install ArcGIS Pro**

For this MOOC, you will use ArcGIS Pro 3.1. First, you need to visit the MOOC home page to locate your course account username and password. Then, if necessary, you will install ArcGIS Pro 3.1 from ArcGIS Online.

1. On the MOOC home page, next to Dashboard, click Lessons.

2. Under Lessons, locate your ArcGIS account information.

You will use your course ArcGIS account username and password to download ArcGIS Pro and complete all the MOOC exercises. The username for your account ends with `_sds` (for example, jdoe_sds). You may want to write down the username and password for quick reference, or you can always return to the Lessons tab to locate your credentials.

> Note: If you registered in the last few hours, your account may not be ready. Refresh the page in an hour or so to see whether your account information is available. It took a couple of days in my case - try to be patient!

If you already installed ArcGIS Pro 3.1, you can skip the remaining actions and move to the next step, which is titled *Explore an ArcGIS Pro project*.

3. Open a new web browser in private or incognito mode.

>Note: To learn how to enable private browsing, go to [How to Enable Private Browsing on any Web Browser](https://www.howtogeek.com/269265/how-to-enable-private-browsing-on-any-web-browser/).

4. In the address bar, type ***www.arcgis.com*** and press Enter.

5. Click Sign In.

6. Under ArcGIS Login, copy and paste or type your course ArcGIS username and password.

7. Click Sign In.

The first time that you sign in, you may be asked to change your password and to set a security question.

8. If necessary, follow the on-screen instructions to change your password.

9. If necessary, follow the on-screen instructions to set your security question.

> Note: An automated email will be sent to the email address that is associated with the account, telling you that your account was recently modified. No action is required.

After the sign-in process is complete, you will see the home page of the MOOC organization.

10. In the upper-right corner, click your account username, and then click My Settings.

11. On the left side of the page, under My Settings, click the Licenses tab.

12. Under Licensed Products, locate ArcGIS Pro.

13. To the right of the software name, click Download ArcGIS Pro.

The Download window opens.

> Note: You can run ArcGIS Pro in a different language by clicking the down arrow next to English (Version 3.1) and choosing a different supported language. Keep in mind that this course is taught in English, which means that all screenshots and exercises use the English version of ArcGIS Pro.

14. Click Download.

If the default download location does not have enough space, you can change the location by following the steps in [How to Change the File Download Location in Your Browser](https://www.lifewire.com/change-the-file-download-location-4046428).

15. After the download completes, double-click the .exe file.

16. Follow the installation instructions, accepting all defaults.

17. When you are finished installing ArcGIS Pro, close the incognito web browser window.

>Note: Best practice for ArcGIS Pro is to uninstall an older version of ArcGIS Pro before installing a new version. 

### **Step 5: Explore an ArcGIS Pro project**

In this step, you will use your course ArcGIS account username and password to sign in to ArcGIS Pro. You will need to use your course ArcGIS account to license ArcGIS Pro and to access other software applications that are used throughout the MOOC exercises.

1. In File Explorer, navigate to the `DataEngineering_and_Visualization` folder.

The `DataEngineering_and_Visualization` folder shows all the data files that you need to complete both exercises in this section. You will open the ArcGIS Pro project from File Explorer.

2. Double-click the `DataEngineering_and_Visualization` ArcGIS project file.

3. Sign in to ArcGIS Pro with the provided course ArcGIS account username that ends in _sds.

>Note: The course ArcGIS account username and password are listed on the MOOC home page under Lessons. If you are already signed in to ArcGIS Pro with a different account, in the top-right corner, click your username. Then click Sign Out. Click the Not Signed In link and then click Sign In.

You signed in to ArcGIS Pro with your MOOC ArcGIS account credentials, and the ArcGIS Pro project opened to show the Data Engineering map. Next, you will explore the `DataEngineering_and_Visualization` ArcGIS Pro project that you downloaded previously.

4. On the ribbon, click the View tab.

5. In the Windows group, click Reset Panes and choose Reset Panes For Mapping (Default).

![reset_panes.PNG](attachment:72962fc2-eec2-4fa6-af9e-4f2dec932e1a.PNG)

Your ArcGIS Pro project is open to a gray reference map, which is called a basemap. Because you are preparing U.S. election data, the basemap is currently focused on the contiguous United States.

Above the map is the ArcGIS Pro ribbon. ArcGIS Pro uses this horizontal ribbon to display and organize functionality into a series of tabs. On the Map tab is the Navigate group, which provides the tools that you need to navigate the map. The default tool is the Explore tool ![image.png](attachment:4ffc204a-a3ce-45cc-95ea-f129ad5f6988.png), which you can use to pan and zoom in and out of maps. To explore different areas of the world on this basemap, pan the map by clicking your mouse and holding down the button while you move the map. When you pan a map with the mouse, the pointer becomes a hand. Zoom in or out of the map by using the mouse wheel or by using the Fixed Zoom In button ![image.png](attachment:e7886067-5a7d-428a-ba20-d1e04ce08a56.png) or Fixed Zoom Out button ![image.png](attachment:fe12ad1f-f5a2-420c-bc20-0019d4109723.png) in the Navigate group.

You reset the panes to show the default mapping panes. On the left side of the map is the Contents pane, which lists the layers that have been added to the map. On the right side of the map is the Catalog pane, which lists the items that are associated with this ArcGIS Pro package: Maps, Toolboxes, Notebooks, Databases, Styles, Folders, and Locations.

To learn more about the ArcGIS Pro interface, go to ArcGIS Pro Help: [ArcGIS Pro user interface](https://pro.arcgis.com/en/pro-app/get-started/get-started.htm#ESRI_SECTION1_75CF36E07CE441BC9573B0EF5BDEB120). To learn more about ArcGIS Pro projects, go to ArcGIS Pro Help: [Projects in ArcGIS Pro](https://pro.arcgis.com/en/pro-app/help/projects/what-is-a-project.htm).

6. On the ribbon, click the Map tab.

You explored an ArcGIS Pro project. Next, you will open an ArcGIS notebook.

### **Step 6: Open an ArcGIS notebook**

This exercise uses ArcGIS Notebooks in ArcGIS Pro. The ArcGIS Notebooks interface is built on top of Jupyter Notebook, which structures content using cells. Code cells contain executable Python code, and Markdown cells contain explanatory text and media. In this step, you will open the ArcGIS notebook that is used in this exercise.

1. In the Catalog pane, expand Notebooks.

2. Right-click Data Engineering Notebook.ipynb and choose Open Notebook.

![open_notebook.PNG](attachment:91b91688-6b0c-49b7-bcd3-73f32a3a06fc.PNG)

A notebook opened in the ArcGIS Pro project. You will use this notebook to complete most of this exercise.

### **Step 7: Modify a Markdown cell**

> Note: The steps below were carried out inside ArcGIS Pro using an ArcGIS notebook; however, I recreated them locally. Please refer to [Data Engineering Notebook.ipynb](https://github.com/Stephen137/ESRI-Spatial-Data-Science/blob/main/Data%20Engineering%20Notebook.ipynb).

The first few cells in this notebook are Markdown cells that help to explain the exercise. In this step, you will learn how to use the Markdown cells in the notebook.

1. In the notebook, double-click the first Markdown cell that is titled Data Engineering.

Markdown cells use hashtags to determine the size and format of the explanatory text. Adding additional hashtags will decrease the size of the font. If you are familiar with HTML, you can think of this action as switching between header tags (`<h1>`, `<h2>`, `<h3>`). Be sure to maintain a space between the hashtag and your text; otherwise, the font style and size will appear as regular text.
    
2. In front of Data Engineering, type a hashtag (`#`).
    
3. Add a space between the hashtag and the words Data Engineering.

The text font style and size change to make the text appear more like a heading.

4. From the ArcGIS Notebooks toolbar, click Run ![image.png](attachment:ce3d551c-fc0d-4004-9daf-e5c8bd6a348a.png).

>Note: Alternatively, you can select the cell and press Shift + Enter on your keyboard.

Running a Markdown cell will apply the formatting that you have indicated in the cell. Similarly, running a Code cell will execute the code that you have written in the cell.
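The relationship between hashtags and HTML header tags described in this step can be sketched in a few lines of Python. This is a simplified illustration, not the renderer that ArcGIS Notebooks or Jupyter actually uses; the function name `render_heading` and the pattern are hypothetical.

```python
import re

# One to six hashtags, at least one space, then the heading text.
HEADING_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")


def render_heading(line):
    """Return an HTML header tag for a Markdown heading line, or the text unchanged."""
    match = HEADING_PATTERN.match(line)
    if match is None:
        return line  # no space after the hashtags, so it stays regular text
    level = len(match.group(1))
    return f"<h{level}>{match.group(2).strip()}</h{level}>"


print(render_heading("# Data Engineering"))   # one hashtag: largest heading
print(render_heading("## Data Engineering"))  # two hashtags: smaller heading
print(render_heading("#Data Engineering"))    # missing space: left as regular text
```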

### **Step 8: Import Python modules**

In this step, you will import the necessary Python modules to execute the cells in the notebook. A Python module is a file that contains Python definitions and statements. A module can define functions, classes, and variables, and it can include runnable code. You will use the import statement to import the modules.

1. Click the Markdown cell that is titled Import Needed Modules, to select the cell.

2. From the ArcGIS Notebooks toolbar, click the Insert Cell Below button ![image.png](attachment:6392d551-145a-4258-a6a6-6efdd5adf66e.png).

A Code cell is added under the Markdown cell. You will use this cell to import the Python modules that are required to complete this exercise.

3. Use the import syntax to import the following Python modules, pressing Enter on your keyboard after each line:

- arcgis
- pandas
- os
- arcpy

This Code cell will call the modules from the ArcGIS Pro conda environment. To the left of the Code cell is blue text with brackets. When you run a Code cell, an asterisk appears inside the brackets to indicate that the cell is running. When the cell has completed running, the asterisk is replaced with a number.

4. From the ArcGIS Notebooks toolbar, click Run.

The number 1 appears in the brackets to indicate that the cell has been executed, which means that the modules were successfully loaded.

You will use the pandas module quite often in this exercise. Instead of typing pandas each time, you will shorten pandas to pd.

5. Modify the line of code that says import pandas to say `import pandas as pd`.

6. Click Run.

You used pd as a variable. A variable is a name that references an object. The object could be a dataset or, in this case, a Python module. You could have shortened pandas to any variable name. You used pd because it is the most common local name for pandas. The remaining Code cells will use pd when using pandas functionality.
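If you recreate the notebook outside ArcGIS Pro, `arcpy` and `arcgis` may not be installed. The sketch below first checks which of the four modules can be found and runs the import cell from this step only when all of them are available. The helper name `find_missing` is hypothetical and is not part of the course notebook.

```python
import importlib.util

# Modules the exercise asks you to import.
REQUIRED_MODULES = ("arcgis", "pandas", "os", "arcpy")


def find_missing(module_names):
    """Return the names that cannot be found in the active Python environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)

if missing:
    # Typical outside the ArcGIS Pro conda environment, where arcpy is unavailable.
    print("Not available in this environment:", ", ".join(missing))
else:
    # The import cell from the exercise, with pandas shortened to pd.
    import arcgis
    import pandas as pd
    import os
    import arcpy
```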

### **Step 9: Create a pandas data frame**

Next, you will use the pandas functionality to create a data frame. A pandas data frame is a tabular data structure of columns and rows. The columns are referred to as the attributes, or attribute fields, and the rows are referred to as the records. To create a data frame, your first step is to define a variable for the dataset.

1. Click the gray arrow to the left of the Load And Prepare Election Data section to expand the section.

2. Click the Markdown cell under Read Data Into Python.

3. Under the green explanation text beginning with The CSV file . . . , create a variable that is called `elections_data_path` for the CSV dataset by typing `elections_data_path = "countypres_2000-2020.csv"`.


The equal sign (=) assigns the dataset name, enclosed within quotation marks, to the variable. You can now use the `elections_data_path` variable throughout the script to refer to the county election dataset (countypres_2000-2020.csv).

4. Press Enter to start a new line of code.

You will use the pandas `read_csv` function to load the county election dataset into the data frame.

5. In the Code cell, create a variable that is called `elections_complete_df`.

6. Add the `pd.read_csv` function with `elections_data_path` as the input parameter.

You want to specify that the `county_fips` attribute field in this data frame will be an object, so that pandas reads the codes as text and keeps any leading zeros. You will use the `dtype` parameter to specify this field's data type.

7. After elections_data_path, add a comma and a space, and then type `dtype={"county_fips":object}`.

8. Press Enter to start a new line of code.

You want to confirm that the dataset loaded properly before moving on in the notebook. 

9. In the Code cell, type `elections_complete_df`.

10. Run the Code cell.

You created a data frame for the county elections dataset that you will use to prepare, reformat, and geoenable your data.

11. In ArcGIS Pro, from the Notebook tab, in the Notebook group, click Save.

Before moving to the next step in this exercise, you must expand each section and execute the rest of the steps in the notebook in ArcGIS Pro.

12. Expand each section, select each cell, and either click the Run button ![image.png](attachment:3b53f8d3-78d2-4486-8539-74ffe341641f.png) or press Shift + Enter on your keyboard.

13. Review the step as you run each cell.

>Note: You must run each cell in the notebook before proceeding to the next step.

14. After you have finished running all cells in the notebook, save and close the Data Engineering Notebook.

15. Return to the exercise to continue with the rest of the steps.

>Note: Although you did not write all the Python code, it is recommended that you carefully look at the Python syntax and logic in each cell. Reviewing each cell can help familiarize you with the ArcGIS Notebooks interface and the relevant Python syntax. The notebook can also act as sample code that you can reference for data engineering tasks.
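The cell assembled at the start of this step can be sketched as follows. To keep the example self-contained, a small in-memory CSV stands in for `countypres_2000-2020.csv`; its rows are made up, and the `year` column is included only for illustration. In the notebook, you would pass `elections_data_path` to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for countypres_2000-2020.csv; the rows and values are made up.
sample_csv = io.StringIO(
    "year,county_fips,candidatevotes\n"
    "2020,01001,100\n"
    "2020,01003,250\n"
)

# Same call as in the exercise, with the in-memory sample in place of elections_data_path.
elections_complete_df = pd.read_csv(sample_csv, dtype={"county_fips": object})

# Display the data frame and its column types to confirm that it loaded properly.
print(elections_complete_df)
print(elections_complete_df.dtypes)
```

Reading `county_fips` as an object keeps codes such as `01001` intact; a numeric type would drop the leading zero.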

### **Step 10: Modify environment settings**

Before continuing with your data engineering task, you will set the geoprocessing environments. Environments are additional settings that affect geoprocessing tools and provide a powerful way to ensure that geoprocessing is performed in a controlled environment.

1. In the Catalog pane, expand Databases and expand DataEngineering_and_Visualization.gdb.

2. Right-click the `county_elections_pres` layer and choose Add To Current Map.

![county_elections_pres.PNG](attachment:b395ce8e-0f93-49b9-90ec-3fb0868b593f.PNG)

>Note: If you do not see the layer, return to the Create a pandas data frame step and verify that you have executed each cell in the notebook.

You added the feature class that you created by running all the cells in the Data Engineering Notebook. The color of the data will vary every time it is added to the map.

3. On the ribbon, click the Analysis tab.

4. In the Geoprocessing group, click Environments.

![environments.PNG](attachment:7f359809-dee0-408a-974a-759caf9ef3fb.PNG)

The Environments dialog box opens. Here, you can set parameters that apply to geoprocessing tools, such as the processing extent that limits processing to a specific geographic area, a coordinate system for all output geodatasets, or the cell size of output raster datasets.

5. Under Processing Extent, click the Extent down arrow and choose the `county_elections_pres` layer.

>Note: Your extent will again be As Specified Below, but your extent figures now match the `county_elections_pres` layer.

![processing_extent.PNG](attachment:c995ae4b-eb2b-420a-a457-d5de562bf978.PNG)

Next, you will set the data source for the Enrich tool to use demographic variables from the United States because the United States is your study area.

6. Scroll to the bottom of the Environments dialog box.

7. Under Business Analyst, next to Data Source, click the Browse button ![image.png](attachment:25ca31cb-e0df-412a-9600-1b631a5c34dd.png).

The Business Analyst Data Source dialog box opens, which is where you can set the data source for geoenrichment to a specific country. You will set the data source to the United States.

8. In the Business Analyst Data Source dialog box, on the left side, under Portal, click All Countries.

9. Scroll through the countries listed to see which countries and regions have demographic data available through Esri.

Esri's GeoEnrichment service enables you to query authoritative global data for more than 150 countries and regions. This extensive global data portfolio allows you to integrate global demographics, business, behavioral, environmental, and places datasets into your own data.

10. On the left side of the dialog box, click North America.

11. From the options that display under United States, click `Esri 2022`.

![data_source.PNG](attachment:9e69c74d-3eb0-4e80-a97a-6b627ebe75da.PNG)

12. Click OK.

Because your study area is the United States, you set the region to select demographic variables from the United States to geoenrich your data.

13. In the Environments dialog box, click OK.

>Note: For more information on demographic data from Esri, go to the Esri Location Data Resources web page.

### **Step 11: Open the Data Engineering view**

Geoenrichment will use the location of your data to add demographic variables as attributes to your feature class. Geoenrichment can be performed using ArcPy in a notebook, but the Data Engineering view in ArcGIS Pro allows you to explore potential variables that you would like to add to the feature class.

1. In the Contents pane, click the `county_elections_pres` layer, if necessary.

2. On the ribbon, click the Analysis tab, if necessary.

3. In the Workflows group, click Data Engineering.

![data_eng.PNG](attachment:e1062083-5ec6-4ad3-8ed1-a779ab52c6fc.PNG)

The Data Engineering view opens in a dockable window that can be moved and docked in the same way that you dock maps, layouts, and attribute tables. In addition to the view, a Data Engineering contextual tab is available. The tab provides access to commands that are used for data engineering.

The Data Engineering view contains two panels: a fields panel and a statistics panel. The fields panel lists the fields in the layer that you used to open the view. The fields panel allows you to explore fields, change symbology, and produce charts for fields in the layer. 

The statistics panel allows you to explore the values and distribution of your data by viewing statistics and data quality metrics. The panel's statistics table is empty by default until you add fields from the fields panel.

4. From the Data Engineering contextual tab, in the Tools group, click Integrate and choose Enrich.

![enrich.PNG](attachment:97f66ca2-7eeb-4069-b29f-6cb185f3169c.PNG)

The Enrich dialog box opens.

The four tool galleries on the Data Engineering contextual tab (Clean, Construct, Integrate, and Format) each contain a subset of geoprocessing tools that can be used for data engineering tasks. You selected the Enrich tool, which enables you to add demographic variables as attributes to your feature class. The Enrich tool lists the parameters that are required to run the tool. Parameters define the values that are used to run the tool and its underlying algorithms. To run the Enrich tool, you will need to define the input feature class, a name for the output feature class, and the variables that will be added to the output feature class.

5. Leave the Enrich dialog box open.

### **Step 12: Explore geoenrichment options**

You will review the workflow for geoenriching your data using the Enrich tool in the Data Engineering view.

1. Return to the Enrich dialog box, and confirm that the Input Features parameter is set to `county_elections_pres`.

The tool will automatically create an output feature class name that reflects the input. You can keep this name or modify it to be more meaningful for your analysis.

2. For Output Feature Class, replace the current text with `county_elections_pres_enrich`.

>Note: This parameter represents a file path that leads to the ArcGIS Pro project's file geodatabase (DataEngineering_and_Visualization.gdb). In ArcGIS Pro, the Current Workspace environment defaults to the project's default geodatabase.

3. Next to Variables, click the Add button ![image.png](attachment:ad79234d-9cbe-4243-9d89-9432fbdbfcad.png).

The Data Browser window is where you can explore the different demographic variables that are available for data enrichment. Esri provides various demographic variables that are regularly updated with the latest available data. For the United States, Esri also provides attributes from previous censuses (2000 and 2010) that are recalibrated with the most current census (2020) geography. You can quickly add various demographic variables to your data using the Enrich tool. You can also add variables that you created or that were shared with you.

4. In the Data Browser window, in the Search Variables field, type `Median Age` and press Enter.

![median_age.PNG](attachment:67c1b891-725d-4a3c-a30a-b715a50bceb4.PNG)

On the left, you have the option to filter the available variables so that you can easily focus your search. To the right of the Median Age variables, you see a hashtag and the word Index. For each variable, these icons, along with a percent sign icon, are used to specify whether you want a total count (hashtag), index, or percentage (percent sign) for the variable.

5. Click the Show/Hide Details Panel button ![image.png](attachment:7275ffe1-b993-494f-8dda-34af5699c51d.png).

The Details panel helps you keep track of the variables that you select. When a variable is selected, it is automatically listed in the Details panel. 

6. Select the Median Age variable closest in time to 2020.

7. Search for and select the Per Capita Income variable closest in time to 2020.

8. In the Data Browser window, click OK.

![median_age_per_capita.PNG](attachment:62da2fdd-9d7b-4a7c-87a9-2fb649b4627a.PNG)

The variables that you selected are added to the Variables parameter.

9. At the top of the Enrich dialog box, click the Estimate Credits link.

![estimated_credits.PNG](attachment:c7ca35b7-879f-4287-905c-731786aaa64c.PNG)

The Enrich tool consumes ArcGIS credits when it is run. When you click Estimate Credits, the banner displays an estimate of the number of ArcGIS credits that the tool will use, as well as the number of available credits that you have. For this course, your ArcGIS Online organizational account is allocated 300 credits. You will **NOT** enrich the data because an enriched data layer has been provided for you in the next exercise.

10. At the top of the Enrich dialog box, click the Close button ![image.png](attachment:5e89653c-b13b-4e2d-be04-2a5a8f2832c0.png).

You explored the workflow for geoenriching your data using the Enrich tool. 

11. Under the map, on the Data Engineering view tab, click the Close button ![image.png](attachment:a4e46ba1-0e40-4ff4-890e-b4ef0bbd3037.png).

After completing various data engineering techniques, you cleaned and prepared the election data. Geoenabling and geoenriching the data provides demographic variables that you can use to model or predict voter turnout.

In the next exercise, you will use various visualization techniques to explore relationships between voter turnout and these variables. You will use this information to identify potential variables to use in your prediction model later in the MOOC.

12. If you would like to perform additional data engineering tasks, proceed to the optional stretch goal. Otherwise, close the Data Engineering map, **SAVE THE PROJECT**, and then exit ArcGIS Pro.
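For a rough sense of what the Estimate Credits link reports, the arithmetic can be sketched as below. The rate of 10 credits per 1,000 attributes (features multiplied by variables) is an assumption based on Esri's published credit rates and may change; the feature count of 3,000 is a hypothetical round number, not the actual number of counties in the layer. Always rely on the estimate shown in the tool.

```python
def estimate_enrich_credits(feature_count, variable_count, credits_per_1000_attributes=10.0):
    """Estimate credits for (features x variables) attributes at an assumed rate."""
    attributes = feature_count * variable_count
    return attributes * credits_per_1000_attributes / 1000.0


# Hypothetical inputs: 3,000 county features enriched with the 2 selected variables.
estimate = estimate_enrich_credits(feature_count=3000, variable_count=2)
print(f"Estimated credits: {estimate}")

# Compare against the 300 credits allocated to the course account.
print("Within course allocation:", estimate <= 300)
```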

### **Step 13: Stretch goal (Optional)**

Throughout this course, you will see exercise stretch goals. These goals include ways that you can continue or enhance the work that you completed during the exercise.

Stretch goals are community-supported (meaning that your fellow MOOC participants can assist you with the steps to complete the stretch goal using the Lesson Forum), and they are a great opportunity to work together to learn.

If you would like to continue engineering your data, you can modify the ArcGIS notebook to include the following tasks:

1. Identify and remove records with null candidatevotes values in the election data.

2. Apply a symbology layer (default.lyrx) to the 2020 election turnout feature class (out_2020_fc_name).

The default.lyrx file is located in the `DataEngineering_and_Visualization` folder. The ArcGIS Pro Help: [Apply Symbology From Layer (Data Management)](https://pro.arcgis.com/en/pro-app/tool-reference/data-management/apply-symbology-from-layer.htm) documentation describes the process of applying a symbology layer and includes the syntax to use in your script.

3. Determine how to incorporate Alaska into this analysis.

>Note: Alaska does not have counties. Research its administrative and political subdivisions to determine how the data would need to be engineered to address this issue.

4. Use the Lesson Forum to post your questions, observations, and syntax examples. Be sure to include the #stretch hashtag in the posting title.

5. When you are finished, close the Data Engineering map and notebook tabs, save the project, and then exit ArcGIS Pro.
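As one possible starting point for the first stretch task, the sketch below identifies and removes records with null `candidatevotes` values using pandas. The in-memory rows are made up for illustration, and the variable names `elections_df` and `elections_clean_df` are hypothetical; in the notebook, you would apply the same calls to the data frame loaded from the election CSV file.

```python
import io

import pandas as pd

# Made-up rows standing in for the election data; one record has no candidatevotes value.
sample_csv = io.StringIO(
    "county_fips,candidatevotes\n"
    "01001,100\n"
    "01003,\n"
    "01005,250\n"
)
elections_df = pd.read_csv(sample_csv, dtype={"county_fips": object})

# Identify the records with null candidatevotes values.
null_votes = elections_df[elections_df["candidatevotes"].isna()]
print(f"Records with null candidatevotes: {len(null_votes)}")

# Remove them, keeping every record that has a vote count.
elections_clean_df = elections_df.dropna(subset=["candidatevotes"])
print(elections_clean_df)
```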