# **Week 4 Race Data Analysis Update**
### **Jia Ni**
In this assignment, I assess how evenly racial groups are distributed across LA County, with particular attention to areas where one group heavily outnumbers the others. I calculated the Shannon-Wiener Index for each census tract as a measure of racial diversity and plotted the index on an interactive base map. A higher index value indicates a more even distribution across racial groups and therefore greater diversity, while a lower value indicates that one or a few racial groups dominate.
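
For reference, the index for a single census tract can be written as follows, where $p_i$ is the share of racial group $i$ in the tract and $S$ is the number of groups considered (seven in this notebook). The natural logarithm is used here to match the `np.log` call in the calculation function.

$$
H = -\sum_{i=1}^{S} p_i \ln p_i
$$

Groups with $p_i = 0$ contribute nothing to the sum. With seven groups, $H$ ranges from $0$ (one group makes up the whole tract) to $\ln 7 \approx 1.946$ (all seven groups have equal shares).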

### **Import the libraries**

In [None]:
import pandas as pd
import geopandas as gpd
import numpy as np
import folium

### **Read the GeoJSON file into the notebook**

In [None]:
tracts_race = gpd.read_file("data/tracts_race.geojson")
tracts_race.head()

### **Extract the columns of racial percentages**

In [None]:
percent_columns = [
    "White_Percent",
    "Black or African American_Percent",
    "American Indian and Alaska Native_Percent",
    "Asian_Percent",
    "Native Hawaiian and Other Pacific Islander_Percent",
    "Some Other Race_Percent",
    "Two or More Races_Percent"
]

In [None]:
# Check the data types
tracts_race[percent_columns].dtypes

In [None]:
# Convert the percentage columns to numeric type, with non-numeric values converted to NaN
tracts_race[percent_columns] = tracts_race[percent_columns].apply(pd.to_numeric, errors="coerce")

# Check for NaN
print(tracts_race[percent_columns].isna().sum())
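
To illustrate what `errors="coerce"` does, here is a minimal sketch on made-up data. The column values and the placeholder strings `"-"` and `"N/A"` are assumptions for illustration; the actual file may contain different non-numeric values, or none at all.

```python
import pandas as pd

# Hypothetical raw values: "-" and "N/A" stand in for whatever
# non-numeric placeholders the real file might contain.
raw = pd.DataFrame({
    "White_Percent": ["45.2", "-", "30.0"],
    "Asian_Percent": ["10.1", "20.5", "N/A"],
})

# Strings that cannot be parsed as numbers become NaN instead of raising an error
cleaned = raw.apply(pd.to_numeric, errors="coerce")

print(cleaned.dtypes)        # both columns are now floating point
print(cleaned.isna().sum())  # number of coerced (NaN) values per column
```

Counting the NaN values right after the conversion shows how many entries were silently coerced, which is why the check above follows the conversion.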

### **Data cleaning**

In [None]:
# Filter rows containing NaN
rows_with_nan = tracts_race[tracts_race[percent_columns].isna().any(axis=1)]
rows_with_nan

In [None]:
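# Drop the rows with NaN in any percentage column
# (.copy() gives an independent frame, so later column assignments do not trigger warnings)
tracts_race = tracts_race.dropna(subset=percent_columns).copy()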

In [None]:
# Confirm that no NaN values remain (every column should show False)
print(tracts_race[percent_columns].isna().any())

In [None]:
# Convert percentages to decimals
tracts_race[percent_columns] = tracts_race[percent_columns].div(100)
tracts_race.head()
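
The Shannon-Wiener formula assumes that the shares within each tract sum to 1. Whether the seven columns used here do so depends on how the source table was built, which this notebook does not show, so a quick check may be worthwhile. The sketch below uses made-up data; the column names and the 1% tolerance are hypothetical.

```python
import pandas as pd

# Hypothetical tracts: the second one deliberately sums to 90%, not 100%
toy = pd.DataFrame({
    "A_Percent": [50.0, 60.0],
    "B_Percent": [30.0, 20.0],
    "C_Percent": [20.0, 10.0],
})

# Same conversion as in the notebook: percentages to proportions
proportions = toy.div(100)
row_sums = proportions.sum(axis=1)

# Flag rows whose proportions are not within the assumed tolerance of 1
off_rows = proportions[(row_sums - 1).abs() > 0.01]
print(off_rows)
```

Applied to the real data, the same pattern would use `tracts_race[percent_columns]` in place of `proportions`. Rows that deviate noticeably from 1 would have an index computed over only part of the population.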

### **Define the Shannon-Wiener Index calculation function**

In [None]:
def calculate_shannon_wiener(row):
    proportions = row[row > 0]
    return -np.sum(proportions * np.log(proportions))

# Calculate the Shannon-Wiener Index for each census tract
tracts_race["Shannon_Wiener_Index"] = tracts_race[percent_columns].apply(calculate_shannon_wiener, axis=1)

In [None]:
tracts_race.head()
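
As a sanity check on how to read the index values, the following sketch evaluates the same formula on three made-up tracts. It uses only the standard library, and `shannon_index` is a hypothetical helper for illustration, not the function applied to the data above.

```python
import math

def shannon_index(proportions):
    """Shannon-Wiener index with the natural log; zero shares are skipped."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

# Hypothetical tracts with seven racial categories each
single_group = [1.0, 0, 0, 0, 0, 0, 0]   # one group makes up the whole tract
two_equal = [0.5, 0.5, 0, 0, 0, 0, 0]    # two groups split the tract evenly
seven_equal = [1 / 7] * 7                # all seven groups have equal shares

for name, tract in [("single", single_group),
                    ("two equal", two_equal),
                    ("seven equal", seven_equal)]:
    print(name, round(shannon_index(tract), 4))
```

The three cases evaluate to 0, ln 2 (about 0.693), and ln 7 (about 1.946), so any tract in the dataset should fall between 0 and roughly 1.946 when seven categories are used.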

### **Create the histogram of the Shannon-Wiener Index in LA**

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(14, 8))
plt.hist(tracts_race["Shannon_Wiener_Index"], bins=50, color="lightyellow", edgecolor='#80795b')

plt.title("Distribution of Shannon-Wiener Index in LA County", fontsize=14, fontweight='bold', pad=10)
plt.xlabel("Shannon-Wiener Index", fontsize=12, fontweight='bold', labelpad=10)
plt.ylabel("Frequency", fontsize=12, fontweight='bold', labelpad=10)
plt.grid(axis="y", linestyle="--", alpha=0.3)

plt.show()

### **Ensure compatibility with mapping libraries**

In [None]:
if tracts_race.crs.to_epsg() != 4326:
    tracts_race = tracts_race.to_crs(epsg=4326)

### **Find the center of the spatial extent**

In [None]:
bounds = tracts_race.total_bounds
minx, miny, maxx, maxy = bounds
center_lat = (miny + maxy) / 2
center_lon = (minx + maxx) / 2

### **Create an interactive map of the Shannon-Wiener Index**

In [None]:
# Set up the base map
m_sw = folium.Map(location=[center_lat, center_lon], tiles="cartodb positron")
m_sw.fit_bounds([[miny, minx], [maxy, maxx]])

In [None]:
# Create a choropleth map of the Shannon-Wiener Index
folium.Choropleth(
    geo_data=tracts_race,
    data=tracts_race,
    columns=["FIPS", "Shannon_Wiener_Index"],
    key_on="feature.properties.FIPS",
    fill_color="YlGnBu",
    fill_opacity=0.7,
    line_opacity=0,
    legend_name="Shannon-Wiener Index"
).add_to(m_sw)

In [None]:
# Add borders and tooltips
folium.GeoJson(
    tracts_race,
    name="Borders",
    style_function=lambda feature: {
        "fillOpacity": 0,
        "color": "white",
        "weight": 0.3
    },
    tooltip=folium.GeoJsonTooltip(fields=["FIPS", "Shannon_Wiener_Index"],
                                   aliases=["Census Tract ID", "Shannon-Wiener Index"],
                                   localize=True)
).add_to(m_sw)

In [None]:
# Show the map
m_sw

In [None]:
# Save the map to an HTML file
m_sw.save('Shannon-Wiener Index.html')