# Analyzing Segregation in NYC Schools

In this notebook, we use data from the NYC Schools dataset to analyze segregation in NYC schools. Specifically, we calculate an Atkinson segregation index for each school district, which measures inequality in the distribution of racial and ethnic groups across schools.

## <u>Part 1</u>: <u>Loading and Preprocessing the Data</u>

To begin, we install the `nycschools` package and use it to load the school demographics data:

In [None]:
%pip install nycschools
from nycschools import schools
demo = schools.load_school_demographics()

We then keep only the columns we are interested in and restrict the data to the 2018 academic year:

In [6]:
# Keep identifying columns, enrollment counts, and demographic breakdowns
demo = demo[['dbn', 'beds', 'district', 'geo_district', 'zip', 'boro',
             'school_name', 'short_name', 'ay', 'year', 'school_type',
             'total_enrollment', 'female_n', 'female_pct', 'male_n', 'male_pct',
             'asian_n', 'asian_pct', 'black_n', 'black_pct',
             'hispanic_n', 'hispanic_pct', 'multi_racial_n', 'multi_racial_pct',
             'native_american_n', 'native_american_pct', 'white_n', 'white_pct',
             'missing_race_ethnicity_data_n', 'missing_race_ethnicity_data_pct',
             'poverty_n', 'poverty_pct', 'eni_pct']]

# Restrict to rows for the 2018 academic year
demo = demo[demo['ay'] == 2018]

Next, we use `pandas` to group the data by district and sum the number of students in each racial or ethnic group:

In [7]:
import pandas as pd

# Group the data by district and sum the number of students per race
district_data = demo.groupby('district').agg({
    'white_n': 'sum',
    'black_n': 'sum',
    'asian_n': 'sum',
    'hispanic_n': 'sum',
    'multi_racial_n': 'sum',
    'native_american_n': 'sum',
    'total_enrollment': 'sum'
})
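As a small illustration of the aggregation step, the sketch below applies the same group-and-sum pattern to a made-up three-school table. The toy values and the `count_cols` name are assumptions for illustration only; they are not drawn from the NYC dataset.

```python
import pandas as pd

# Toy data (hypothetical values, not from the NYC dataset):
# two schools in district 1 and one school in district 2.
toy = pd.DataFrame({
    "district": [1, 1, 2],
    "white_n": [10, 20, 5],
    "black_n": [30, 10, 15],
    "total_enrollment": [40, 30, 20],
})

# Hypothetical helper list of the count columns to aggregate.
count_cols = ["white_n", "black_n", "total_enrollment"]

# Selecting the columns and calling .sum() gives the same totals as
# passing a dict of {'column': 'sum'} entries to .agg().
toy_by_district = toy.groupby("district")[count_cols].sum()
print(toy_by_district)
```

Each row of `toy_by_district` holds the district-wide totals, which is the shape the proportion calculation in Part 2 expects.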

## <u>Part 2</u>: <u>Calculating the Atkinson Segregation Index</u>

Now that we have preprocessed the data, we can calculate the Atkinson segregation index for each district. The form of the index used in this notebook is:

`1 - (sum(p_i^(1 - beta)) / n)`

where:

- `p_i` is the proportion of the district's students in group i (e.g. white, black, Asian, Hispanic, etc.)

- `beta` is a parameter that sets the exponent `1 - beta` applied to each proportion and so determines the degree of inequality we want to measure; the code below uses a default of `beta = 0.5`

- `n` is the total number of groups (six in this notebook).
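To make the formula concrete, here is a minimal worked example for a hypothetical district with three groups and `beta = 0.5`. The proportions and the name `toy_index` are made up for illustration.

```python
# Hypothetical group proportions for one district (they sum to 1).
proportions = [0.25, 0.25, 0.5]
beta = 0.5
n = len(proportions)

# Apply the formula term by term: 1 - (sum of p_i^(1 - beta)) / n
powered = [p ** (1 - beta) for p in proportions]
toy_index = 1 - sum(powered) / n

print(round(toy_index, 4))  # prints 0.431
```

With `beta = 0.5` each proportion is replaced by its square root, so the three terms are 0.5, 0.5, and about 0.7071 before averaging.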

We first compute each group's share of district enrollment, then define a function `atkinson_index` that calculates the index for a given district and apply it to each district using `pandas`:

In [8]:
# Calculate the proportions for each race in each district
for race in ['white_n', 'black_n', 'asian_n', 'hispanic_n', 'multi_racial_n', 'native_american_n']:
    district_data[race + '_prop'] = district_data[race] / district_data['total_enrollment']


def atkinson_index(district, beta=0.5):
    proportions = [
        district['white_n_prop'],
        district['black_n_prop'],
        district['asian_n_prop'],
        district['hispanic_n_prop'],
        district['multi_racial_n_prop'],
        district['native_american_n_prop']
    ]
    return 1 - sum(p ** (1 - beta) for p in proportions if p > 0) / len(proportions)


# Calculate the Atkinson segregation index for each district
district_data['atkinson_index'] = district_data.apply(atkinson_index, axis=1)
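Because interpretation depends on which values the function can return, it may help to check its range. The sketch below re-implements the same formula as a standalone helper (the name `index_from_proportions` is an assumption, not part of the notebook) and evaluates it at two extremes for six groups whose proportions sum to 1: an even split and a single group holding all students.

```python
def index_from_proportions(proportions, beta=0.5):
    """Same formula as atkinson_index, applied to a plain list."""
    powered = sum(p ** (1 - beta) for p in proportions if p > 0)
    return 1 - powered / len(proportions)

n_groups = 6

# Even split: every group holds 1/6 of the students.
even = index_from_proportions([1 / n_groups] * n_groups)

# Single group: one group holds all students, the rest hold none.
single = index_from_proportions([1.0] + [0.0] * (n_groups - 1))

print(round(even, 4), round(single, 4))  # prints 0.5918 0.8333
```

Assuming the six proportions sum to 1 and `beta = 0.5`, these two cases are the smallest and largest values the formula can produce, so results from this function fall between roughly 0.59 and 0.83 rather than spanning 0 to 1. Districts with missing race or ethnicity data may not satisfy that assumption exactly.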

Finally, we reset the index of our `district_data` DataFrame and print the results:

In [9]:
district_data = district_data.reset_index()
print(district_data[['district', 'atkinson_index']])

    district  atkinson_index
0          1        0.638541
1          2        0.635476
2          3        0.642181
3          4        0.667669
4          5        0.664208
5          6        0.720841
6          7        0.711855
7          8        0.680269
8          9        0.713777
9         10        0.681613
10        11        0.660476
11        12        0.702780
12        13        0.643602
13        14        0.666580
14        15        0.640118
15        16        0.696803
16        17        0.696642
17        18        0.721763
18        19        0.677406
19        20        0.672386
20        21        0.647573
21        22        0.641533
22        23        0.707116
23        24        0.685135
24        25        0.664023
25        26        0.652383
26        27        0.635525
27        28        0.626423
28        29        0.665985
29        30        0.657361
30        31        0.652436
31        32        0.717511
32        75        0.652208
33        79  

## <u>Part 3</u>: <u>Interpreting the Atkinson Segregation Index Results</u>

We have calculated the Atkinson segregation index for each school district in NYC. Among the values displayed above, the index ranges from about 0.626 (district 28) to about 0.722 (district 18).

An Atkinson segregation index value of 0 indicates perfect integration, where each racial or ethnic group is represented in each school proportionally to their share of the overall student population. An Atkinson segregation index value of 1 indicates complete segregation, where each racial or ethnic group attends separate schools and there is no mixing between groups.

Using these benchmarks, we can interpret our results as follows:

- Districts with an Atkinson segregation index of less than 0.1 can be considered to have low levels of segregation.

- Districts with an Atkinson segregation index between 0.1 and 0.6 can be considered to have moderate levels of segregation.

- Districts with an Atkinson segregation index of greater than 0.6 can be considered to have high levels of segregation.

Based on these guidelines, our results suggest that NYC school districts have high levels of segregation: all 33 districts with a displayed value have an index above 0.6. In particular, 7 of them (districts 6, 7, 9, 12, 18, 23, and 32) have Atkinson segregation indices of 0.7 or higher.
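The banding and the count can be checked against the values printed in Part 2. The sketch below copies the 33 displayed index values by hand (district 79 is omitted because its value is cut off in the output) and tallies them. The names `displayed`, `band`, and `at_least_07` are introduced here for illustration.

```python
# Index values copied from the output in Part 2, keyed by district.
# District 79 is left out because its value is not shown.
displayed = {
    1: 0.638541, 2: 0.635476, 3: 0.642181, 4: 0.667669, 5: 0.664208,
    6: 0.720841, 7: 0.711855, 8: 0.680269, 9: 0.713777, 10: 0.681613,
    11: 0.660476, 12: 0.702780, 13: 0.643602, 14: 0.666580, 15: 0.640118,
    16: 0.696803, 17: 0.696642, 18: 0.721763, 19: 0.677406, 20: 0.672386,
    21: 0.647573, 22: 0.641533, 23: 0.707116, 24: 0.685135, 25: 0.664023,
    26: 0.652383, 27: 0.635525, 28: 0.626423, 29: 0.665985, 30: 0.657361,
    31: 0.652436, 32: 0.717511, 75: 0.652208,
}

def band(value):
    """Apply the cut-offs listed above to one index value."""
    if value < 0.1:
        return "low"
    if value <= 0.6:
        return "moderate"
    return "high"

# Count how many districts fall into each band.
band_counts = {}
for value in displayed.values():
    label = band(value)
    band_counts[label] = band_counts.get(label, 0) + 1

# Districts whose index is 0.7 or higher.
at_least_07 = sorted(d for d, v in displayed.items() if v >= 0.7)

print(band_counts)
print(len(at_least_07), at_least_07)
```

Running the tally on the notebook's own `district_data` DataFrame instead of hand-copied values would also cover district 79.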

These findings have important implications for educational outcomes and opportunities in NYC, as segregation has been shown to be associated with a range of negative outcomes, such as lower academic achievement, reduced access to resources and opportunities, and increased social isolation. By using these results to inform policy decisions and interventions aimed at promoting diversity, equity, and inclusion in education, we can work towards creating a more equitable and inclusive educational system for all students.