In [1]:
import warnings
# Silence FutureWarnings so they do not clutter the notebook output
warnings.simplefilter(action='ignore', category=FutureWarning)
from modules.data_cleaning import combine_data, cleaning_data, making_covid_column, create_borough_column
import pandas as pd

# Read both halves of the dataset; entries recorded as 's' are read as missing values
school_data_chunk1 = pd.read_csv('data/school_data_chunk1.csv', na_values='s')
school_data_chunk2 = pd.read_csv('data/school_data_chunk2.csv', na_values='s')
school_data_combined = combine_data(school_data_chunk1, school_data_chunk2)
school_data = cleaning_data(school_data_combined)

# Derive categorical columns: test year, COVID period, and borough
school_data["Year"] = pd.Categorical(school_data["Year"])
school_data["COVID"] = pd.Categorical(school_data['Year'].apply(making_covid_column))
# The first two characters of the DBN give the school district number
school_data['school_dist'] = school_data['dbn'].astype(str).str[:2]
school_data['borough'] = pd.Categorical(school_data['school_dist'].apply(create_borough_column))




# **<span style='font-family:"Times New Roman"'> <span style='color:black'> English Language Arts Scores and Literacy Levels </span></span>**
### **<span style='font-family:"Times New Roman"'> <span style='color:black'> Visualizing the Effects of the COVID-19 Pandemic on English Education </span></span>**

### **<span style='font-family:"Times New Roman"'> Introduction </span>**

<span style='font-family:"Times New Roman"'> The COVID-19 pandemic created obstacles to children's learning and development. Lockdowns, online schooling, and limited socialization increased stress and anxiety in children. In schools, practices adopted during COVID-19, such as hybrid learning options and lenient grading, have continued into the present. This project analyzes the English Language Arts (ELA) scores of children from third grade to eighth grade, with the goal of comparing reading levels before and after the pandemic. The data covers New York City, across all 32 school districts in the five boroughs, from 2013 to 2023. It has been filtered to include only the years 2018, 2019, and 2022, because changes to the standardized tests over the decade make results from the other years incomparable. ELA scores are split into four levels based on state-administered testing. Level 1 indicates that a student has not met learning expectations, and Level 4 indicates that a student has exceeded expectations with distinction. </span>
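
<span style='font-family:"Times New Roman"'> The year filter and the district-to-borough grouping described above can be sketched in a few lines. This is a minimal illustration, not the code in `modules.data_cleaning`: the names `covid_period` and `borough_for_district` are hypothetical, the "Pre-COVID" and "Post-COVID" labels are assumed, and the district ranges assume the standard numbering of New York City's 32 community school districts. </span>

```python
# Years kept for the analysis, as stated in the introduction.
COMPARABLE_YEARS = (2018, 2019, 2022)


def covid_period(year):
    """Label a test year as pre- or post-COVID (the labels are an assumption)."""
    year = int(year)
    if year not in COMPARABLE_YEARS:
        raise ValueError(f"{year} is not one of the comparable years")
    return "Pre-COVID" if year < 2020 else "Post-COVID"


def borough_for_district(district):
    """Map a district code such as '07' to its borough.

    Assumes the standard numbering of the 32 community school districts.
    """
    number = int(district)
    if 1 <= number <= 6:
        return "Manhattan"
    if 7 <= number <= 12:
        return "Bronx"
    if 13 <= number <= 23 or number == 32:
        return "Brooklyn"
    if 24 <= number <= 30:
        return "Queens"
    if number == 31:
        return "Staten Island"
    raise ValueError(f"unknown school district: {district!r}")
```

<span style='font-family:"Times New Roman"'> Applied to a DataFrame, helpers like these would be used as `school_data['Year'].apply(covid_period)` and `school_data['school_dist'].apply(borough_for_district)`. </span>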

<span style='font-family:"Times New Roman"'> I hope to gain insight into learning loss across grades, school districts, and boroughs of New York City. Young learners have been denied a fair education, especially those who live in low-income areas and belong to marginalized communities throughout the city. To explore this data, I will use several representations of multivariate data, primarily bar graphs and interactive widgets, to assess the levels and scores of each grade per year. I will also apply geographic visualizations to assess the distribution of Level 1 readers by school district, comparing each district's demographic makeup with its percentage of students who do not meet expectations. </span>



In [6]:
from modules.data_visualizations import plotting_sums
plotting_sums(school_data)

### **<span style='font-family:"Times New Roman"'> Motivation </span>**

<span style='font-family:"Times New Roman"'> This research was inspired by my own experiences learning and working in low-income public schools. I went to school in Woodstown, New Jersey, in the center of one of the most segregated counties in the state, with a long-standing history of racism. My high school had an overwhelming white majority, and both peers and administration showed an obvious disdain for non-white students. I became most aware of the racial divide in Salem County when I began substitute teaching in Salem, NJ, a poverty-stricken town and the only majority-Black town in the county. The county has neglected this area for years: its schools are severely underfunded and lack essential educational resources, such as technology and teachers. 
Racial education gaps have always been prevalent in the United States; however, the COVID-19 pandemic has widened this divide, and governments have abandoned school districts. 
  </span>

### **<span style='font-family:"Times New Roman"'>Methods </span>**

#### **<span style='font-family:"Times New Roman"'> Plots of Multivariate Data </span>**

<span style='font-family:"Times New Roman"'> The first method I used was plotting with interactive widgets and color-coding. These plots include color-coded bar charts and an interactive line chart. The interactive bar chart is the most essential part of this dashboard, as the user can freely manipulate the variables of interest. Interactive plots of multivariate data allow the user to explore trends through color-coding and widgets, and every plot uses color to make patterns easier to read. The dashboard focuses mostly on the distribution of mean test scores and reading levels by grade, to model general declines in education between the pre-pandemic and post-pandemic years. </span>
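
<span style='font-family:"Times New Roman"'> As a rough sketch of this approach, the code below pairs a Panel dropdown with a color-coded bar chart. It is not the `Dashboard` class used later in this notebook. The column names `Grade` and `Level 1 %` through `Level 4 %` are assumptions about the cleaned data, and `level_shares` and `make_grade_explorer` are hypothetical helpers. </span>

```python
import pandas as pd

# Assumed names for the per-level percentage columns.
LEVEL_COLS = ["Level 1 %", "Level 2 %", "Level 3 %", "Level 4 %"]


def level_shares(df, grade):
    """Mean percentage of students at each level, per year, for one grade."""
    subset = df[df["Grade"] == grade]  # "Grade" is an assumed column name
    return subset.groupby("Year", observed=True)[LEVEL_COLS].mean()


def make_grade_explorer(df):
    """Build a Panel layout: a grade selector above a color-coded bar chart."""
    # Imported here so the aggregation above works without the plotting stack.
    import panel as pn
    from matplotlib.figure import Figure

    grades = sorted(df["Grade"].unique().tolist())
    select = pn.widgets.Select(name="Grade", options=grades)

    def draw(grade):
        fig = Figure(figsize=(6, 4))
        ax = fig.subplots()
        # One color per reading level, one group of bars per year.
        level_shares(df, grade).plot.bar(ax=ax, rot=0)
        ax.set_ylabel("Mean % of students")
        ax.set_title(f"Grade {grade}: reading levels by year")
        return fig

    # pn.bind redraws the chart whenever the selected grade changes.
    return pn.Column(select, pn.bind(draw, grade=select))
```

<span style='font-family:"Times New Roman"'> Under these assumptions, `make_grade_explorer(school_data)` returns a layout that can be displayed in a notebook cell or opened with `.show()`. </span>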

#### **<span style='font-family:"Times New Roman"'> Geographic Visualization: Choropleth Maps </span>**

<span style='font-family:"Times New Roman"'>Geographic visualization techniques are used to map the mean percentage of students who score at Level 1 in each year. This is done with a GeoJSON file of all New York City school district boundaries. The boundary coordinates are attached to each school district in a GeoDataFrame, which includes the mean percentage of each reading level (1-4) per district. The maps use a color scale whose legend corresponds to each district's mean Level 1 percentage. This dashboard gives insight into how the distribution of district means, and the sociodemographic patterns behind it, has been affected by the COVID-19 pandemic. The maps give the reader geographical context for the question at hand, making the information accessible and letting the reader identify trends through recognizable locations in New York City.
</span>
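
<span style='font-family:"Times New Roman"'> The join between district means and district boundaries can be sketched as follows. This is an illustration under stated assumptions, not the `MapDashboard` implementation: the `Level 1 %` column name, the GeoJSON property `school_dist`, and the helpers `district_level1_means` and `plot_level1_choropleth` are all hypothetical. </span>

```python
import pandas as pd


def district_level1_means(df, year, value_col="Level 1 %"):
    """Mean Level 1 percentage per school district for one year."""
    one_year = df[df["Year"] == year]
    means = one_year.groupby("school_dist")[value_col].mean()
    return means.rename("level1_mean").reset_index()


def plot_level1_choropleth(geojson_path, means, key="school_dist"):
    """Shade each district polygon by its mean Level 1 percentage."""
    # Imported here so the aggregation above works without the mapping stack.
    import geopandas as gpd
    import matplotlib.pyplot as plt

    districts = gpd.read_file(geojson_path)
    # Normalize both join keys to integers so that '01' and 1 match.
    # Assumes every boundary feature carries a numeric district value.
    districts[key] = districts[key].astype(int)
    means = means.assign(**{key: means["school_dist"].astype(int)})
    merged = districts.merge(means, on=key, how="left")

    fig, ax = plt.subplots(figsize=(7, 7))
    # legend=True draws the color scale for the mean Level 1 percentage.
    merged.plot(column="level1_mean", cmap="OrRd", legend=True,
                edgecolor="black", linewidth=0.3, ax=ax)
    ax.set_axis_off()
    return fig
```

<span style='font-family:"Times New Roman"'> Calling `plot_level1_choropleth('school_districts.geojson', district_level1_means(school_data, 2022))` would then draw one map for 2022, provided the file and columns match these assumptions. </span>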

### **<span style='font-family:"Times New Roman"'>Main Results </span>**

#### <span style='font-family:"Times New Roman"'> Dashboard: Visualization of Multivariate Data and Results </span>

In [None]:
from modules.interactive_dashboard import Dashboard
import panel as pn  

dashboard = Dashboard(data = school_data)

dashboard.layout.show()

#### <span style='font-family:"Times New Roman"'> Dashboard: Geographic Visualization and Results </span>

In [None]:
from modules.map_dashboard import MapDashboard

geojson = 'school_districts.geojson'
dashboard2 = MapDashboard(geojson, school_data)

dashboard2.layout.show()

### **<span style='font-family:"Times New Roman"'>Conclusion </span>**

<span style='font-family:"Times New Roman"'> Overall, comparing test scores before and after COVID shows a rise in means from 2019 to 2022. However, 2022 averages do not exceed the scores from 2018: ELA performance has been rising since 2019, the year before the pandemic started, but has not surpassed its 2018 level. Results varied across grades. The analysis found that the younger grades (third and fourth) are primarily the ones that experienced ELA learning loss. This is presumably due to formative years spent in lockdown, which inhibited behavioral and academic development. Geographic analysis by school district demonstrated the academic disparity within New York City. As predicted, sociodemographic factors such as race and income are associated with elevated mean percentages of students falling behind. </span>

<span style='font-family:"Times New Roman"'> These analyses support earlier research on the effects of the COVID-19 pandemic on education loss. While the pandemic greatly affected students of all socioeconomic backgrounds, its effects are most visible in school districts that had already been neglected and lack the resources to improve. This work is important for advancing policies that ensure equitable education. While there is still work to do in a post-pandemic world, recovery is possible.
 </span>