# **Fetching Census data**
Author: Liubov Dumarevskaya

GitHub page of the author: https://liubovd.github.io/

Last Updated: June 13, 2025


Description: This workflow fetches ACS (American Community Survey) 5-year data for Rhode Island at the census tract level. You can choose different variables and years.



**Make sure you run all the cells in order!**

### **Run the next cell to download the library of available variables:**

In [None]:
import pandas as pd
url = 'https://raw.githubusercontent.com/LiubovD/liubovd.github.io/refs/heads/main/workshops/table_asc_for_workflow_modified.csv'
df = pd.read_csv(url)
my_dict = dict(zip(df.iloc[:, 1], df.iloc[:, 0]))
print(my_dict)

{'Population': 'B01001_001E', 'Sex by Age:Total: Male:': 'B01001_002E', 'Sex by Age:Total: Male: Under 5 years': 'B01001_003E', 'Sex by Age:Total: Male: 5 to 9 years': 'B01001_004E', 'Sex by Age:Total: Male: 10 to 14 years': 'B01001_005E', 'Sex by Age:Total: Male: 15 to 17 years': 'B01001_006E', 'Sex by Age:Total: Male: 18 and 19 years': 'B01001_007E', 'Sex by Age:Total: Male: 20 years': 'B01001_008E', 'Sex by Age:Total: Male: 21 years': 'B01001_009E', 'Sex by Age:Total: Male: 22 to 24 years': 'B01001_010E', 'Sex by Age:Total: Male: 25 to 29 years': 'B01001_011E', 'Sex by Age:Total: Male: 30 to 34 years': 'B01001_012E', 'Sex by Age:Total: Male: 35 to 39 years': 'B01001_013E', 'Sex by Age:Total: Male: 40 to 44 years': 'B01001_014E', 'Sex by Age:Total: Male: 45 to 49 years': 'B01001_015E', 'Sex by Age:Total: Male: 50 to 54 years': 'B01001_016E', 'Sex by Age:Total: Male: 55 to 59 years': 'B01001_017E', 'Sex by Age:Total: Male: 60 and 61 years': 'B01001_018E', 'Sex by Age:Total: Male: 62 t
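
The printed dictionary is long, so it can help to search it by keyword before using the dropdown. A minimal sketch, using a small sample in place of the downloaded mapping; the `find_variables` helper and the `sample_dict` name are assumptions for illustration and are not part of the workflow:

```python
# Small sample standing in for the downloaded label -> code mapping.
sample_dict = {
    'Population': 'B01001_001E',
    'Sex by Age:Total: Male:': 'B01001_002E',
    'Sex by Age:Total: Male: Under 5 years': 'B01001_003E',
}

def find_variables(mapping, keyword):
    """Return the label -> code pairs whose label contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return {label: code for label, code in mapping.items() if keyword in label.lower()}

matches = find_variables(sample_dict, 'under 5')
print(matches)
```

In the notebook itself, the same helper could be called on `my_dict` instead of the sample.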

### **Select variable to map:**

In [None]:
import ipywidgets as widgets
from IPython.display import display

dropdown = widgets.Dropdown(
    options=[(desc, code) for desc, code in my_dict.items()],
    description='Variable:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='80%')
)

display(dropdown)

Dropdown(description='Variable:', layout=Layout(width='80%'), options=(('Population', 'B01001_001E'), ('Sex by…

### **Set the desired year (2020-2023) and double-check your chosen parameters.**

In [None]:
variable = dropdown.value
year = input("Enter the year: ")
url = f"https://api.census.gov/data/{year}/acs/acs5"
print("Constructed URL:", url)

params = {
    "get": f"NAME,{variable}",
    "for": "tract:*",
    "in": "state:44"  # Rhode Island
}

print("Constructed parameters:", params)

Enter the year: 2022
Constructed URL: https://api.census.gov/data/2022/acs/acs5
Constructed parameters: {'get': 'NAME,B01001_001E', 'for': 'tract:*', 'in': 'state:44'}
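
Because `input()` returns free text, a typo in the year only shows up later as a failed request. A minimal sketch of validating the year first, assuming the 2020-2023 range stated in the heading; the `parse_year` helper is hypothetical:

```python
def parse_year(text, first=2020, last=2023):
    """Convert user input to an int and check it falls in the supported range."""
    try:
        year = int(text.strip())
    except ValueError:
        raise ValueError(f"Year must be a whole number, got {text!r}") from None
    if not first <= year <= last:
        raise ValueError(f"Year must be between {first} and {last}, got {year}")
    return year

year = parse_year(" 2022 ")
url = f"https://api.census.gov/data/{year}/acs/acs5"
print("Constructed URL:", url)
```

In the cell above, this would be used as `year = parse_year(input("Enter the year: "))`.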


### **Request the data and check your dataset. Some variables are not available for every year at the tract level in Rhode Island; in that case the table will contain either None or -666666666.**

In [23]:
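import requests

# A timeout keeps the cell from hanging if api.census.gov is unreachable.
response = requests.get(url, params=params, timeout=30)
response.raise_for_status()  # stop early on an HTTP error (for example, an invalid year)
data = response.json()

# The first row holds the column names; the remaining rows are tracts.
df = pd.DataFrame(data[1:], columns=data[0])
print(df.head())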

ConnectTimeout: HTTPSConnectionPool(host='api.census.gov', port=443): Max retries exceeded with url: /data/2022/acs/acs5?get=NAME%2CB01001_001E&for=tract%3A%2A&in=state%3A44 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7ded58c84350>, 'Connection to api.census.gov timed out. (connect timeout=None)'))
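
Missing estimates are easier to handle once they are converted to NaN, so they cannot distort rates or the map legend. A minimal sketch on hypothetical rows; the `clean_estimates` helper is an assumption, and the placeholder code `-666666666` is assumed here, so adjust the list to the codes that actually appear in your table:

```python
import pandas as pd

# Assumed placeholder codes; extend the list with any others you encounter.
SENTINELS = [-666666666]

def clean_estimates(df, column, sentinels=SENTINELS):
    """Return a copy with `column` numeric and placeholder codes set to NaN."""
    out = df.copy()
    out[column] = pd.to_numeric(out[column], errors='coerce')
    out[column] = out[column].where(~out[column].isin(list(sentinels)))
    return out

# Hypothetical rows in the shape the API returns (values are strings or None).
sample = pd.DataFrame({
    'NAME': ['Tract A', 'Tract B', 'Tract C'],
    'B01001_001E': ['4970', '-666666666', None],
})
cleaned = clean_estimates(sample, 'B01001_001E')
print(cleaned)
```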

### **Add a GEOID field, which contains the full tract identifier and allows joining the table to the map:**

In [None]:
df['GEOID'] = df['state'] + df['county'] + df['tract']
print(df)

                                                  NAME B08006_013E state  \
0       Census Tract 301; Bristol County; Rhode Island           0    44   
1       Census Tract 302; Bristol County; Rhode Island           0    44   
2       Census Tract 303; Bristol County; Rhode Island           0    44   
3       Census Tract 304; Bristol County; Rhode Island           0    44   
4       Census Tract 305; Bristol County; Rhode Island           0    44   
..                                                 ...         ...   ...   
245  Census Tract 515.02; Washington County; Rhode ...           0    44   
246  Census Tract 515.03; Washington County; Rhode ...           0    44   
247  Census Tract 515.04; Washington County; Rhode ...           0    44   
248  Census Tract 9901; Washington County; Rhode Is...           0    44   
249  Census Tract 9902; Washington County; Rhode Is...           0    44   

    county   tract        GEOID  
0      001  030100  44001030100  
1      001  030200 
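
The concatenation above works because the API returns the codes as zero-padded strings: 2 digits for the state, 3 for the county and 6 for the tract, giving an 11-character GEOID. If the table is ever reloaded from a CSV, those columns may be read as integers and lose their leading zeros. A minimal sketch of rebuilding the GEOID in that case; the `build_geoid` helper and the sample rows are hypothetical:

```python
import pandas as pd

def build_geoid(state, county, tract):
    """Zero-pad each part (2 + 3 + 6 digits) and join into an 11-character GEOID."""
    return str(state).zfill(2) + str(county).zfill(3) + str(tract).zfill(6)

# Hypothetical rows where leading zeros were lost because codes were read as integers.
sample = pd.DataFrame({'state': [44, 44], 'county': [1, 9], 'tract': [30100, 990100]})
sample['GEOID'] = [
    build_geoid(s, c, t)
    for s, c, t in zip(sample['state'], sample['county'], sample['tract'])
]
print(sample['GEOID'].tolist())
```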

### **Get the total population for each tract:**

In [None]:
url_population = f"https://api.census.gov/data/{year}/acs/acs5?get=NAME,B01001_001E&for=tract:*&in=state:44"

response = requests.get(url_population, timeout=30)
response.raise_for_status()
pop_data = response.json()

df_pop = pd.DataFrame(pop_data[1:], columns=pop_data[0])

print(df_pop.head())

                                             NAME B01001_001E state county  \
0  Census Tract 301; Bristol County; Rhode Island        4970    44    001   
1  Census Tract 302; Bristol County; Rhode Island        3417    44    001   
2  Census Tract 303; Bristol County; Rhode Island        4536    44    001   
3  Census Tract 304; Bristol County; Rhode Island        4190    44    001   
4  Census Tract 305; Bristol County; Rhode Island        3343    44    001   

    tract  
0  030100  
1  030200  
2  030300  
3  030400  
4  030500  


### **Add population numbers to the table:**

In [None]:
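# Rename before merging so the column cannot collide when the selected variable is itself B01001_001E.
pop = df_pop[['NAME', 'B01001_001E']].rename(columns={'B01001_001E': 'population'})
df = df.merge(pop, on='NAME', how='left')
print(df)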

                                                  NAME B08006_013E state  \
0       Census Tract 301; Bristol County; Rhode Island           0    44   
1       Census Tract 302; Bristol County; Rhode Island           0    44   
2       Census Tract 303; Bristol County; Rhode Island           0    44   
3       Census Tract 304; Bristol County; Rhode Island           0    44   
4       Census Tract 305; Bristol County; Rhode Island           0    44   
..                                                 ...         ...   ...   
245  Census Tract 515.02; Washington County; Rhode ...           0    44   
246  Census Tract 515.03; Washington County; Rhode ...           0    44   
247  Census Tract 515.04; Washington County; Rhode ...           0    44   
248  Census Tract 9901; Washington County; Rhode Is...           0    44   
249  Census Tract 9902; Washington County; Rhode Is...           0    44   

    county   tract        GEOID population  
0      001  030100  44001030100       4970

### **Calculate the rate (selected variable divided by population):**

In [None]:
df[variable] = pd.to_numeric(df[variable], errors='coerce')
df['population'] = pd.to_numeric(df['population'], errors='coerce')
df['rate'] = (df[variable]/df['population'])
print(df.head())

                                             NAME  B08006_013E state county  \
0  Census Tract 301; Bristol County; Rhode Island            0    44    001   
1  Census Tract 302; Bristol County; Rhode Island            0    44    001   
2  Census Tract 303; Bristol County; Rhode Island            0    44    001   
3  Census Tract 304; Bristol County; Rhode Island            0    44    001   
4  Census Tract 305; Bristol County; Rhode Island            0    44    001   

    tract        GEOID  population  rate  
0  030100  44001030100        4970   0.0  
1  030200  44001030200        3417   0.0  
2  030300  44001030300        4536   0.0  
3  030400  44001030400        4190   0.0  
4  030500  44001030500        3343   0.0  
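
Tracts with a population of zero (for example, water-only tracts) make the plain division return `inf` or NaN. A minimal sketch of a division that yields NaN for such tracts and can optionally scale the result, for instance per 100,000 people; the `safe_rate` helper and the sample values are hypothetical:

```python
import numpy as np
import pandas as pd

def safe_rate(values, population, per=1):
    """Divide values by population, returning NaN where population is zero or missing."""
    values = pd.to_numeric(values, errors='coerce')
    population = pd.to_numeric(population, errors='coerce')
    return values / population.replace(0, np.nan) * per

# Hypothetical counts and populations, stored as strings like the API output.
sample = pd.DataFrame({'count': ['10', '5', '0'], 'population': ['4970', '0', '0']})
sample['rate'] = safe_rate(sample['count'], sample['population'])
sample['rate_per_100k'] = safe_rate(sample['count'], sample['population'], per=100_000)
print(sample)
```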


### **Create a CSV table in your virtual Colab environment:**

In [None]:
df.to_csv(f"ASC_RI_{variable}_{year}.csv", index=False)

### **Download the table to your computer.**

In [None]:
from google.colab import files
files.download(f"ASC_RI_{variable}_{year}.csv")
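
The `google.colab` module is only available inside Colab, so the download cell fails elsewhere. A minimal sketch of an export that falls back to reporting the saved path when the module cannot be imported; the `export_csv` helper and the example file name are hypothetical:

```python
import os
import tempfile
import pandas as pd

def export_csv(df, path):
    """Write df to path, then offer a browser download when running in Colab."""
    df.to_csv(path, index=False)
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        print(f"Not running in Colab; file saved at {os.path.abspath(path)}")
    else:
        files.download(path)
    return path

# Hypothetical one-row table written to the system temporary directory.
sample = pd.DataFrame({'GEOID': ['44001030100'], 'population': [4970]})
out_path = export_csv(sample, os.path.join(tempfile.gettempdir(), 'ACS_RI_example.csv'))
```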


### **Skip the next cell**

In [None]:
import geopandas as gpd

input_file = "https://liubovd.github.io/maps/ri_tract_with_area.geojson"
gdf = gpd.read_file(input_file)

merged_gdf = gdf.merge(df, left_on='GEOID', right_on='GEOID', how='left')
merged_gdf = merged_gdf.drop_duplicates(subset=['GEOID'])
merged_gdf[variable] = pd.to_numeric(merged_gdf[variable], errors='coerce')
merged_gdf["ALAND"] = pd.to_numeric(merged_gdf["ALAND"], errors='coerce')
merged_gdf["rate"] = (merged_gdf[variable] / merged_gdf["ALAND"])*1000000
print(merged_gdf.head())
output_file = "merged_tracts_RI.geojson"
merged_gdf.to_file(output_file, driver="GeoJSON")  # write the file before reporting success

print(f"\nSuccessfully saved merged data to {output_file}")

   OBJECTID        GEOID    ALAND  Shape_Length  Shape_Area  \
0         1  44003021002  2706612      0.096981    0.000296   
1         2  44003021300  4168478      0.155531    0.000451   
2         3  44003021100  7158162      0.210567    0.000790   
3         4  44003020904  8168014      0.204804    0.000894   
4         5  44003020200  2891313      0.135714    0.000338   

                                            geometry  \
0  MULTIPOLYGON (((-71.38035 41.74992, -71.38031 ...   
1  MULTIPOLYGON (((-71.38047 41.74979, -71.3804 4...   
2  POLYGON ((-71.41908 41.76189, -71.41924 41.761...   
3  POLYGON ((-71.45829 41.66371, -71.45853 41.663...   
4  POLYGON ((-71.50237 41.71148, -71.5029 41.7116...   

                                             NAME  B29002_008E state county  \
0  Census Tract 210.02; Kent County; Rhode Island          495    44    003   
1     Census Tract 213; Kent County; Rhode Island          748    44    003   
2     Census Tract 211; Kent County; Rhode Isla
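
A left merge keeps every tract polygon, so tracts without a matching ACS row silently end up with NaN values. A minimal sketch of checking the join with the `indicator` option of pandas `merge`, using plain DataFrames as hypothetical stand-ins for the tract layer and the ACS table:

```python
import pandas as pd

# Hypothetical stand-ins for the tract layer attributes and the ACS table.
tracts = pd.DataFrame({'GEOID': ['44003021002', '44003021300', '44003021100']})
acs = pd.DataFrame({'GEOID': ['44003021002', '44003021300'], 'value': [495, 748]})

# indicator=True adds a `_merge` column saying where each row came from.
checked = tracts.merge(acs, on='GEOID', how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', 'GEOID'].tolist()
print(f"{len(unmatched)} tract(s) without ACS data: {unmatched}")
```

A non-empty list usually points to a GEOID formatting difference or to tract boundaries from a different vintage than the selected year.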

### **Create a base map with tract polygons**

Choose whether to map the rate (the selected variable divided by total population) or the raw value.

In [None]:
import ipywidgets as widgets
from IPython.display import display
mapping_choice = widgets.ToggleButtons(
    options=['Map rate',  'Map value'],
    description='Choose:',
    disabled=False,
    button_style='',
    tooltips=['Map the variable divided by total population', 'Map the raw variable value'],
)
display(mapping_choice)

ToggleButtons(description='Choose:', options=('Map rate', 'Map value'), tooltips=('Description of slow', 'Desc…

In [None]:
import folium

if mapping_choice.value == 'Map rate':
    value_to_map = "rate"
else:
    value_to_map = variable


m = folium.Map(location=[41.58, -71.47], zoom_start=9)
geojson_path = "https://liubovd.github.io/maps/ri_tract_with_muni.geojson"
folium.GeoJson(geojson_path).add_to(m)

folium.Choropleth(
    geo_data=geojson_path,
    name="Choropleth",
    data=df,
    columns=["GEOID", value_to_map],
    key_on="feature.properties.GEOID",
    fill_color="YlOrRd",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name=(f"{variable} per total population" if value_to_map == "rate" else variable),
    highlight=True
).add_to(m)

m

NameError: name 'variable' is not defined
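
When a few tracts have much larger values than the rest, equal-width color classes leave most of the map in a single color. A minimal sketch of computing quantile class breaks from the finite values only; the `quantile_bins` helper and the sample series are hypothetical, and passing the result to the `bins` argument of `folium.Choropleth` is an assumption about how you might use it:

```python
import numpy as np
import pandas as pd

def quantile_bins(series, n_classes=5):
    """Return sorted, de-duplicated class breaks covering the finite values of series."""
    values = pd.to_numeric(series, errors='coerce')
    values = values.replace([np.inf, -np.inf], np.nan).dropna()
    breaks = values.quantile(np.linspace(0, 1, n_classes + 1)).tolist()
    return sorted(set(breaks))

# Hypothetical rates, including an infinite and a missing value.
sample = pd.Series([0.0, 0.1, 0.2, 0.3, 0.4, np.inf, np.nan])
bins = quantile_bins(sample, n_classes=4)
print(bins)
```

De-duplicating the breaks matters for variables where many tracts share the same value (often zero), since repeated thresholds would otherwise produce empty classes.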