# Illinois Data Importing & Exploration

@authors: vcle, bpuhani

All data retrieved in April 2025: <br>
| **Dataset**                                                                                                        | **Description**                                                                           | **Metadata**                                                                                      |
|--------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| [2020 Population data](https://redistrictingdatahub.org/dataset/illinois-block-pl-94171-2020-by-table/)            | Based on the decennial census at the Census Block level on 2020 Census Redistricting Data | [Link](https://redistrictingdatahub.org/wp-content/uploads/2021/09/readme_il_pl2020_5f_b_shp.txt) |
| [2020 County data](https://redistrictingdatahub.org/dataset/illinois-county-pl-94171-2020/)                        | From 2020 Census Redistricting Data (P.L. 94-171) Shapefiles                              | [Link](https://redistrictingdatahub.org/wp-content/uploads/2021/08/readme_il_pl2020_cnty_shp.txt) |
| [2020 election data](https://redistrictingdatahub.org/dataset/vest-2020-illinois-precinct-and-election-results/)   | VEST 2020 Illinois precinct and election results                                          | [Link](https://redistrictingdatahub.org/wp-content/uploads/2021/06/readme_il_vest_20.txt)         |
| [2021 State Senate District plan](https://redistrictingdatahub.org/dataset/2021-oregon-state-senate-adopted-plan/) | 2021 Illinois State Senate Approved Plan                                                  | [Link](https://redistrictingdatahub.org/wp-content/uploads/2021/12/readme_il_sldu_2021.txt)       |
| [2021 Congressional District plan](https://redistrictingdatahub.org/dataset/2021-illinois-congressional-districts-approved-plan/) | 2021 Illinois Congressional Districts                                                     | [Link](https://redistrictingdatahub.org/wp-content/uploads/2021/11/readme_il_cong_adopted_2021.txt)        |


Importing the needed libraries.

In [1]:
import utilities as util
import os

## Importing the data

Prerequisites:
* Downloaded the data from the links above.
* Unzipped the data and placed it in the `il_data` folder.
* The data is in the following format:
    * `il_pl2020_b/il_pl2020_b` for the population data,
        * This contains multiple shapefiles, but we are only using the following:
            * P2: Hispanic or Latino, and Not Hispanic or Latino by Race
            * P4: Hispanic or Latino, and Not Hispanic or Latino by Race for the Population 18 Years and Over
        * Taken from the [Documentation](https://www2.census.gov/programs-surveys/decennial/2020/technical-documentation/complete-tech-docs/summary-file/2020Census_PL94_171Redistricting_StatesTechDoc_English.pdf)
    * `il_vest_20/il_vest_20.shp` for the election data,
    * `il_pl2020_cnty/il_pl2020_cnty.shp` for the county data,
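    * `il_sldu_2021/il_sldu_2021.shp` for the senate data,
    * `il_cong_adopted_2021/HB 1291 FA #1.shp` for the congressional data (renamed to `HB_1291_FA_1.shp` in the first code cell below).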

1. Setting all the paths to the data.

In [2]:
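# Rename the congressional shapefile components to avoid issues
# with the spaces and '#' in the original file names.

old_base = "il_data/il_cong_adopted_2021/HB 1291 FA #1"
new_base = "il_data/il_cong_adopted_2021/HB_1291_FA_1"

# old_base has no file extension, so it never exists as a file on its own;
# check each component file individually instead of the bare base name.
for ext in [".shp", ".dbf", ".shx", ".prj"]:
    old = old_base + ext
    new = new_base + ext
    if os.path.exists(old):
        os.rename(old, new)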

In [3]:
# Paths to the data
population_path = "il_data/il_pl2020_b/il_pl2020_p2_b.shp"
vap_path = "il_data/il_pl2020_b/il_pl2020_p4_b.shp"
vest20_path = "il_data/il_vest_20/il_vest_20.shp"
county_path = "il_data/il_pl2020_cnty/il_pl2020_cnty.shp"
sen_path = "il_data/il_sldu_2021/il_sldu_2021.shp"
cong_path = "il_data/il_cong_adopted_2021/HB_1291_FA_1.shp"
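
Before loading, it can help to confirm that every expected shapefile is actually in place. The following is a minimal sketch of such a check; `find_missing` is a hypothetical helper introduced only for illustration and is not part of `utilities.py`.

```python
from pathlib import Path


def find_missing(paths):
    """Hypothetical helper: return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


# These strings mirror the path variables defined in the cell above.
expected = [
    "il_data/il_pl2020_b/il_pl2020_p2_b.shp",
    "il_data/il_pl2020_b/il_pl2020_p4_b.shp",
    "il_data/il_vest_20/il_vest_20.shp",
    "il_data/il_pl2020_cnty/il_pl2020_cnty.shp",
    "il_data/il_sldu_2021/il_sldu_2021.shp",
    "il_data/il_cong_adopted_2021/HB_1291_FA_1.shp",
]

missing = find_missing(expected)
if missing:
    print("Missing files:", missing)
else:
    print("All shapefiles found.")
```

An empty result means all six files are present under `il_data`; otherwise the printed list shows which downloads still need to be unzipped or renamed.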

2. Loading the data using the `load_shapefile` function from the `utilities.py` file.

In [4]:
# population data
print("Loading population data...")
population_df = util.load_shapefile(population_path)

# voting age population data
print("\nLoading voting age population data...")
vap_df = util.load_shapefile(vap_path)

# election data
print("\nLoading election data...")
vest20_df = util.load_shapefile(vest20_path)

# county data
print("\nLoading county data...")
county_df = util.load_shapefile(county_path)

# senate data
print("\nLoading senate data...")
sen_df = util.load_shapefile(sen_path)

# congressional data
print("\nLoading congressional data...")
cong_df = util.load_shapefile(cong_path)

Loading population data...
Loading shapefile from il_data/il_pl2020_b/il_pl2020_p2_b.shp...
Shapefile data loaded from cache.

Loading voting age population data...
Loading shapefile from il_data/il_pl2020_b/il_pl2020_p4_b.shp...
Shapefile data loaded from cache.

Loading election data...
Loading shapefile from il_data/il_vest_20/il_vest_20.shp...
Shapefile data loaded from cache.

Loading county data...
Loading shapefile from il_data/il_pl2020_cnty/il_pl2020_cnty.shp...
Shapefile data loaded from cache.

Loading senate data...
Loading shapefile from il_data/il_sldu_2021/il_sldu_2021.shp...
Shapefile data loaded from cache.

Loading congressional data...
Loading shapefile from il_data/il_cong_adopted_2021/HB_1291_FA_1.shp...
Shapefile data loaded from cache.
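
The "loaded from cache" messages suggest that `load_shapefile` stores parsed data and reuses it on later runs. The implementation in `utilities.py` is not shown here, so the following is only a generic sketch of the caching idea, not the project's actual code. The helper name `load_with_cache`, the `.pkl` cache suffix, and the freshness check based on modification times are all assumptions.

```python
import os
import pickle


def load_with_cache(path, loader, cache_path=None):
    """Hypothetical helper: return loader(path), reusing a pickled copy when it is up to date."""
    cache_path = cache_path or path + ".pkl"  # assumed cache location

    # Reuse the cache only if it is at least as new as the source file.
    if os.path.exists(cache_path) and os.path.getmtime(cache_path) >= os.path.getmtime(path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)

    # Otherwise load from the source and refresh the cache.
    data = loader(path)
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh)
    return data
```

With GeoPandas installed, `loader` could for example be `geopandas.read_file`, so that the slow parsing of a large block-level shapefile only happens on the first call.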


## Exploring the data
1. Looking at the columns and the first few rows of each dataframe.

In [5]:
# Population data
population_df.head()

Unnamed: 0,GEOID20,SUMLEV,LOGRECNO,GEOID,COUNTY,P0020001,P0020002,P0020003,P0020004,P0020005,...,P0020065,P0020066,P0020067,P0020068,P0020069,P0020070,P0020071,P0020072,P0020073,geometry
0,170579538001047,750,395466,7500000US170579538001047,57,12,2,10,10,9,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.15877 40.39479, -90.15876 40.395..."
1,171699702003023,750,558628,7500000US171699702003023,169,10,0,10,10,8,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.56039 40.12145, -90.55996 40.121..."
2,171030001004063,750,461247,7500000US171030001004063,103,6,0,6,6,6,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-88.96107 41.83323, -88.95948 41.833..."
3,170579530004028,750,393793,7500000US170579530004028,57,27,0,27,26,26,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.30324 40.57410, -90.30309 40.574..."
4,170579539003025,750,395787,7500000US170579539003025,57,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.37469 40.24967, -90.37434 40.250..."


In [6]:
# Voting Age Population data
vap_df.head()

Unnamed: 0,GEOID20,SUMLEV,LOGRECNO,GEOID,COUNTY,P0040001,P0040002,P0040003,P0040004,P0040005,...,P0040065,P0040066,P0040067,P0040068,P0040069,P0040070,P0040071,P0040072,P0040073,geometry
0,170579538001047,750,395466,7500000US170579538001047,57,7,2,5,5,4,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.15877 40.39479, -90.15876 40.395..."
1,171699702003023,750,558628,7500000US171699702003023,169,6,0,6,6,5,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.56039 40.12145, -90.55996 40.121..."
2,171030001004063,750,461247,7500000US171030001004063,103,6,0,6,6,6,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-88.96107 41.83323, -88.95948 41.833..."
3,170579530004028,750,393793,7500000US170579530004028,57,21,0,21,20,20,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.30324 40.57410, -90.30309 40.574..."
4,170579539003025,750,395787,7500000US170579539003025,57,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"POLYGON ((-90.37469 40.24967, -90.37434 40.250..."


In [7]:
# Election data
vest20_df.head()

Unnamed: 0,STATEFP20,COUNTYFP20,VTDST20,GEOID20,NAME20,G20PREDBID,G20PRERTRU,G20PRELJOR,G20PREGHAW,G20PREACAR,G20PRESLAR,G20USSDDUR,G20USSRCUR,G20USSIWIL,G20USSLMAL,G20USSGBLA,geometry
0,17,19,CN0100,17019CN0100,Cunningham 1,753,62,7,9,2,5,684,51,70,12,15,"POLYGON ((-88.23247 40.13302, -88.23175 40.134..."
1,17,19,CC0600,17019CC0600,City of Champaign 06,1035,264,16,16,4,4,958,253,56,21,35,"POLYGON ((-88.25798 40.13331, -88.25798 40.134..."
2,17,19,CC0100,17019CC0100,City of Champaign 01,590,34,2,2,1,6,532,28,58,5,4,"POLYGON ((-88.24012 40.11728, -88.24012 40.117..."
3,17,19,CC0900,17019CC0900,City of Champaign 09,618,98,6,8,2,1,578,84,37,15,14,"POLYGON ((-88.27716 40.13611, -88.27702 40.136..."
4,17,19,CC0300,17019CC0300,City of Champaign 03,1073,209,28,9,3,6,1007,232,10,35,18,"POLYGON ((-88.23540 40.11265, -88.23359 40.112..."


In [8]:
# County data
county_df.head()

Unnamed: 0,STATEFP20,COUNTYFP20,COUNTYNS20,GEOID20,NAME20,NAMELSAD20,LSAD20,CLASSFP20,MTFCC20,CSAFP20,...,P0050002,P0050003,P0050004,P0050005,P0050006,P0050007,P0050008,P0050009,P0050010,geometry
0,17,63,424233,17063,Grundy,Grundy County,6,H1,G4020,176.0,...,204,23,0,181,0,14,0,0,14,"POLYGON ((-88.36529 41.46049, -88.36050 41.460..."
1,17,109,1784729,17109,McDonough,McDonough County,6,H1,G4020,,...,304,0,0,304,0,2189,2057,0,132,"POLYGON ((-90.90733 40.46259, -90.90727 40.466..."
2,17,127,1784730,17127,Massac,Massac County,6,H1,G4020,424.0,...,168,0,0,168,0,79,0,0,79,"POLYGON ((-88.83844 37.33571, -88.83834 37.335..."
3,17,73,424238,17073,Henry,Henry County,6,H1,G4020,209.0,...,712,328,0,384,0,44,0,0,44,"POLYGON ((-90.43247 41.41422, -90.43244 41.415..."
4,17,61,424232,17061,Greene,Greene County,6,H1,G4020,,...,130,22,0,108,0,0,0,0,0,"POLYGON ((-90.62176 39.36211, -90.62229 39.365..."


In [9]:
# Senate data
sen_df.head()

Unnamed: 0,ID,DISTRICT,DISTRICTN,geometry
0,1,1,1,"POLYGON ((-87.75941 41.86568, -87.75843 41.865..."
1,2,2,2,"POLYGON ((-87.74589 41.89869, -87.74590 41.898..."
2,3,3,3,"POLYGON ((-87.69339 41.77912, -87.69340 41.779..."
3,4,4,4,"POLYGON ((-87.90444 41.81876, -87.90473 41.818..."
4,5,5,5,"POLYGON ((-87.73993 41.86596, -87.73994 41.866..."


In [10]:
# Congressional data
cong_df.head()

Unnamed: 0,ID,DISTRICT,DISTRICTN,geometry
0,1,1,1,"POLYGON ((-87.85513 41.14803, -87.85581 41.148..."
1,2,2,2,"POLYGON ((-87.63785 40.02176, -87.63788 40.022..."
2,3,3,3,"POLYGON ((-88.15112 41.81388, -88.15086 41.813..."
3,4,4,4,"POLYGON ((-87.76055 41.72702, -87.76056 41.727..."
4,5,5,5,"POLYGON ((-88.19463 42.08930, -88.19462 42.089..."


2. Calculating the number of districts in the senate data.

In [9]:
nr_of_districts = sen_df.shape[0]
print(f"Number of State Senate Seats in Illinois: {nr_of_districts}")

Number of State Senate Seats in Illinois: 59


3. Calculating the number of districts in the congressional data.

In [11]:
nr_of_districts_cong = cong_df.shape[0]
print(f"Number of Congressional Seats in Illinois: {nr_of_districts_cong}")

Number of Congressional Seats in Illinois: 17


4. Calculating the total population and the total voting age population of the state.

In [10]:
# P0020001 is the total population
print(f"Total pop in Illinois:\t {population_df['P0020001'].sum():_}")
# P0040001 is the total voting age population
print(f"Total vap in Illinois:\t {vap_df['P0040001'].sum():_}")

Total pop in Illinois:	 12_812_508
Total vap in Illinois:	 9_999_469
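
As a quick sanity check on these totals, the short sketch below derives the share of the population that is of voting age and the average population per district. The numbers are copied from the outputs above; the variable names are chosen here for illustration only.

```python
# Totals and district counts as printed in the cells above.
total_pop = 12_812_508   # sum of P0020001
total_vap = 9_999_469    # sum of P0040001
senate_districts = 59
congressional_districts = 17

# Share of the total population that is 18 years and over.
vap_share = total_vap / total_pop

# Average population per district if every district were equally populated.
avg_senate_pop = total_pop / senate_districts
avg_cong_pop = total_pop / congressional_districts

print(f"Voting age share:\t\t {vap_share:.2%}")
print(f"Avg pop per senate district:\t {avg_senate_pop:_.0f}")
print(f"Avg pop per cong. district:\t {avg_cong_pop:_.0f}")
```

In the notebook itself, the hard-coded totals could be replaced by `population_df['P0020001'].sum()` and `vap_df['P0040001'].sum()`.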


5. View a unique list (set) of all districts in the senate data.

In [15]:
print("Entries in 'DISTRICT'", set(sen_df["DISTRICT"])) # contains strings
print("Entries in 'DISTRICTN'", set(sen_df["DISTRICTN"])) # contains numbers

Entries in 'DISTRICT' {'8', '49', '56', '26', '22', '43', '45', '34', '9', '57', '6', '30', '15', '20', '44', '2', '53', '38', '47', '40', '25', '24', '23', '42', '28', '35', '16', '19', '58', '31', '51', '32', '12', '50', '4', '29', '36', '3', '27', '46', '13', '54', '17', '52', '59', '39', '21', '48', '41', '37', '14', '5', '55', '7', '10', '1', '11', '33', '18'}
Entries in 'DISTRICTN' {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59}


6. View a unique list (set) of all districts in the congressional data.

In [16]:
print("Entries in 'DISTRICT'", set(cong_df["DISTRICT"])) # contains strings
print("Entries in 'DISTRICTN'", set(cong_df["DISTRICTN"])) # contains numbers

Entries in 'DISTRICT' {'8', '12', '4', '5', '9', '14', '3', '7', '10', '16', '6', '1', '11', '15', '2', '13', '17'}
Entries in 'DISTRICTN' {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17}
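
Reading long printed sets by eye is error-prone, so the numbering can also be verified programmatically. The sketch below uses a hypothetical helper, `is_consecutive`, that accepts either the string labels from `DISTRICT` or the integers from `DISTRICTN`.

```python
def is_consecutive(values, start=1):
    """Hypothetical helper: True if `values` are exactly start, start+1, ... with no gaps or duplicates."""
    nums = [int(v) for v in values]  # accepts strings such as '8' as well as ints
    return sorted(nums) == list(range(start, start + len(nums)))


# Small illustrative inputs (not the real district columns).
print(is_consecutive(["3", "1", "2"]))   # unordered string labels
print(is_consecutive([1, 2, 4]))         # gap at 3
print(is_consecutive([1, 2, 2, 3]))      # duplicate entry
```

In the notebook, the same check could be applied as `is_consecutive(sen_df["DISTRICTN"])` and `is_consecutive(cong_df["DISTRICTN"])`.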


### Conclusion
* According to the 2020 Census, Illinois has a total population of 12,812,508 and a voting age population of 9,999,469.
* The state has 59 seats/districts for the state senate.
* State Senate districts are numbered consecutively from 1 to 59 in the field `DISTRICTN`.
* The congressional data has 17 districts.
* Congressional districts are numbered consecutively from 1 to 17 in the field `DISTRICTN`.