# Analysis of NJ public schools results in ELA and math grades 6-8.

<span style="color: red;">**If kernel can't connect to server again run command:**
*netsh winsock reset*<span>

<a id="TOC"></a> 
## Table of Contents
1. [Data sources and definitions](#data)
2. [Imports: modules](#modules)
3. [Read and prepare data](#read)
4. [Best middle schools by math](#best)
5. [Maps of middle schools by math results](#maps) 


<a id="data"></a> 
#### Data:
1. Data New Jersey Student Learning Assessments (NJSLA) results 2015-2023 for grades 6-8:
<br>State of New Jersey, Department of Education:
Statewide Assessment Reports
<br>https://www.nj.gov/education/assessment/results/reports/
2. NJ schools locations: NJGIN Open Data <br>
https://njogis-newjersey.opendata.arcgis.com/datasets/d8223610010a4c3887cfb88b904545ff/explore
3. School districts: NJGIN Open Data 
<br>https://njogis-newjersey.opendata.arcgis.com/datasets/ca144194df66491d83b8f8bf338e0172/explore

####  Performance levels for New Jersey Student Learning Standards for English Language Arts and Math  

**Level 1**: Did Not Yet Meet Expectations 

**Level 2**: Partially Met Expectations 

**Level 3**: Approached Expectations  

**Level 4**: Met Expectations  

**Level 5**: Exceeded Expectations  

*Source: New Jersey Assessments Resource Center, 2022, https://nj.mypearsonsupport.com/resources/reporting/NJSLA_Score_Interpretation_Guide_Spring2022.pdf*

## Questions
*1. How the test results changed?*
<br>Compare last year test results in a school with the school 10-year average as percentage of average:
<br> school_change = (school_current_year - school_10year_average)/school_10year_average
<br> citywide_change = (city_current_year - city_10year_average)/city_10year_average
<br> relative_school_change = school_change - citywide_change
<br><br>
*2. How good the school is?* 
<br>Last three testing period results (2019, 2022, 2023) are different for some schools: due to COVID disruptions, testing procedures changes, in Destrict 15 due to admission rules changed. Therefore average 10 years scores do not reflect well schools situation now. Results for these 3 last testing years are teken instead.
<br><br>
*3. Is the school citywide or borowide?*
<br><br>
*4. Diversity?*
<br><br>
*5. Size?*

<a id="modules"></a> 
#### Imports: modules

In [1]:
import os
import pandas as pd
import geopandas as gpd
# import folium
import matplotlib.pyplot as plt
import base64
from io import BytesIO
import math
from tqdm import tqdm
from utils import match_name, create_plot, process_schools, create_chart

pd.set_option('display.float_format', '{:.3f}'.format)



<a id="read"></a> 
#### Read data

In [2]:
basePath = r"G:\My Drive\Kids\NJ_schools_mapped"
dataFolder = r"raw_data"
outputFolder = r"processed_data"

The excel files downloaded from NJ DOE were cleaned from 'DFG' columns and case in columns headers was unified.

In [128]:
## Read data from annual files with results by schools

# Initialize an empty list to store your DataFrames
math_DFs = []

directory = os.path.join(basePath, dataFolder)

# Loop through each file in the directory
for filename in tqdm(os.listdir(directory), desc = 'Processing files'):
    if filename.endswith('.xlsx') and filename.startswith('MAT') and 'NJSLA DATA'  in filename:
        print(filename)
        
        # Construct the full file path
        file_path = os.path.join(directory, filename)
        
        # Read the Excel file
        df = pd.read_excel(file_path, skiprows=2)
        
        # Filter the DataFrame 
        filtered_df = df[(df['Subgroup'].str.lower() == 'total') & (df['School Name'].str.lower() != 'district total') & pd.notna(df['School Name']) & (df['School Name'].str.strip() != '')]
        
        # Add a column with type of assessment and grade (ex: MAT06),
        # it is in the first 5 characters of the filename
        filtered_df['Assessment'] = filename[:5] 
        
        # Add a column with year, it is in the last 4 characters before file extention in the filename
        filtered_df['Year'] = filename[-9:-5] 
        
        # Harmonizing cases in columns between different tables
        column_to_upper = ['County Name', 'District Name', 'School Name', 'Subgroup', 'Subgroup_Type']
        for col in column_to_upper:
            filtered_df[col] = filtered_df[col].str.upper()
        
        # Append the filtered DataFrame to your list
        math_DFs.append(filtered_df)

print("Concatinatinating dataframes")        
# Concatenate all DataFrames into one
mathResultsDF = pd.concat(math_DFs, ignore_index=True)

print("mathResultsDF is ready.")


Processing files:   0%|                                                                         | 0/46 [00:00<?, ?it/s]

MAT06 NJSLA DATA 2022-2023.xlsx


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Processing files:  13%|████████▍                                                        | 6/46 [00:13<01:30,  2.26s/it]

MAT07 NJSLA DATA 2022-2023.xlsx


Processing files:  15%|█████████▉                                                       | 7/46 [00:20<02:03,  3.17s/it]

MAT08 NJSLA DATA 2022-2023.xlsx


Processing files:  17%|███████████▎                                                     | 8/46 [00:25<02:13,  3.53s/it]

MAT06 NJSLA DATA 2021-2022.xlsx


Processing files:  26%|████████████████▋                                               | 12/46 [00:32<01:28,  2.61s/it]

MAT07 NJSLA DATA 2021-2022.xlsx


Processing files:  28%|██████████████████                                              | 13/46 [00:40<01:51,  3.38s/it]

MAT08 NJSLA DATA 2021-2022.xlsx


Processing files:  30%|███████████████████▍                                            | 14/46 [00:56<03:07,  5.87s/it]

MAT06 NJSLA DATA 2018-2019.xlsx


Processing files:  39%|█████████████████████████                                       | 18/46 [01:04<01:48,  3.88s/it]

MAT07 NJSLA DATA 2018-2019.xlsx


Processing files:  41%|██████████████████████████▍                                     | 19/46 [01:09<01:47,  4.00s/it]

MAT08 NJSLA DATA 2018-2019.xlsx


Processing files:  43%|███████████████████████████▊                                    | 20/46 [01:13<01:42,  3.94s/it]

MAT06 NJSLA DATA 2017-2018.xlsx


Processing files:  54%|██████████████████████████████████▊                             | 25/46 [01:17<00:46,  2.23s/it]

MAT07 NJSLA DATA 2017-2018.xlsx


Processing files:  57%|████████████████████████████████████▏                           | 26/46 [01:21<00:49,  2.47s/it]

MAT08 NJSLA DATA 2017-2018.xlsx


Processing files:  59%|█████████████████████████████████████▌                          | 27/46 [01:25<00:51,  2.70s/it]

MAT06 NJSLA DATA 2016-2017.xlsx


Processing files:  67%|███████████████████████████████████████████▏                    | 31/46 [01:30<00:30,  2.04s/it]

MAT07 NJSLA DATA 2016-2017.xlsx


Processing files:  70%|████████████████████████████████████████████▌                   | 32/46 [01:35<00:34,  2.46s/it]

MAT08 NJSLA DATA 2016-2017.xlsx


Processing files:  72%|█████████████████████████████████████████████▉                  | 33/46 [01:39<00:35,  2.72s/it]

MAT06 NJSLA DATA 2015-2016.xlsx


Processing files:  80%|███████████████████████████████████████████████████▍            | 37/46 [01:44<00:17,  1.92s/it]

MAT07 NJSLA DATA 2015-2016.xlsx


Processing files:  83%|████████████████████████████████████████████████████▊           | 38/46 [01:47<00:17,  2.19s/it]

MAT08 NJSLA DATA 2015-2016.xlsx


Processing files:  85%|██████████████████████████████████████████████████████▎         | 39/46 [01:51<00:17,  2.47s/it]

MAT06 NJSLA DATA 2014-2015.xlsx


Processing files:  93%|███████████████████████████████████████████████████████████▊    | 43/46 [01:55<00:05,  1.69s/it]

MAT07 NJSLA DATA 2014-2015.xlsx


Processing files:  96%|█████████████████████████████████████████████████████████████▏  | 44/46 [01:58<00:03,  1.91s/it]

MAT08 NJSLA DATA 2014-2015.xlsx


Processing files: 100%|████████████████████████████████████████████████████████████████| 46/46 [02:01<00:00,  2.64s/it]

Concatinatinating dataframes
mathResultsDF is ready.





In [4]:
unique_values2 = mathResultsDF['Year'].unique()
print(unique_values2)

['2023' '2022' '2019' '2018' '2017' '2016' '2015']


In [129]:
## Read data from annual files with results by schools

# Initialize an empty list to store your DataFrames
ELA_DFs = []

directory = os.path.join(basePath, dataFolder)

# Loop through each file in the directory
for filename in tqdm(os.listdir(directory), desc = 'Processing files'):
    if filename.endswith('.xlsx') and filename.startswith('ELA'):
        print(filename)
        
        # Construct the full file path
        file_path = os.path.join(directory, filename)
        
        # Read the Excel file
        df = pd.read_excel(file_path, skiprows=2)
        
        # Filter the DataFrame 
        filtered_df = df[(df['Subgroup'].str.lower() == 'total') & (df['School Name'].str.lower() != 'district total') & pd.notna(df['School Name']) & (df['School Name'].str.strip() != '')]
        
        # Add a column with type of assessment and grade (ex: MAT06),
        # it is in the first 5 characters of the filename
        filtered_df['Assessment'] = filename[:5] 
        
        # Add a column with year, it is in the last 4 characters before file extention in the filename
        filtered_df['Year'] = filename[-9:-5] 
        
        # Harmonizing cases in columns between different tables
        column_to_upper = ['County Name', 'District Name', 'School Name', 'Subgroup', 'Subgroup_Type']
        for col in column_to_upper:
            filtered_df[col] = filtered_df[col].str.upper()
        
        # Append the filtered DataFrame to your list
        ELA_DFs.append(filtered_df)

print("Concatinating dataframes")        
# Concatenate all DataFrames into one
ELAResultsDF = pd.concat(ELA_DFs, ignore_index=True)

print("ELAResultsDF is ready.")

Processing files:   0%|                                                                         | 0/46 [00:00<?, ?it/s]

ELA06 NJSLA DATA 2022-2023.xlsx


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Processing files:   7%|████▏                                                            | 3/46 [00:04<01:09,  1.62s/it]

ELA07 NJSLA DATA 2022-2023.xlsx


Processing files:   9%|█████▋                                                           | 4/46 [00:09<01:47,  2.56s/it]

ELA08 NJSLA DATA 2022-2023.xlsx


Processing files:  11%|███████                                                          | 5/46 [00:13<02:09,  3.16s/it]

ELA06 NJSLA DATA 2021-2022.xlsx


Processing files:  20%|████████████▋                                                    | 9/46 [00:18<01:10,  1.91s/it]

ELA07 NJSLA DATA 2021-2022.xlsx


Processing files:  22%|█████████████▉                                                  | 10/46 [00:22<01:24,  2.34s/it]

ELA08 NJSLA DATA 2021-2022.xlsx


Processing files:  24%|███████████████▎                                                | 11/46 [00:27<01:37,  2.77s/it]

ELA06 NJSLA DATA 2018-2019.xlsx


Processing files:  33%|████████████████████▊                                           | 15/46 [00:31<00:58,  1.88s/it]

ELA07 NJSLA DATA 2018-2019.xlsx


Processing files:  35%|██████████████████████▎                                         | 16/46 [00:35<01:06,  2.23s/it]

ELA08 NJSLA DATA 2018-2019.xlsx


Processing files:  37%|███████████████████████▋                                        | 17/46 [00:39<01:14,  2.56s/it]

ELA06 NJSLA DATA 2017-2018.xlsx


Processing files:  48%|██████████████████████████████▌                                 | 22/46 [00:44<00:39,  1.63s/it]

ELA07 NJSLA DATA 2017-2018.xlsx


Processing files:  50%|████████████████████████████████                                | 23/46 [00:48<00:46,  2.02s/it]

ELA08 NJSLA DATA 2017-2018.xlsx


Processing files:  52%|█████████████████████████████████▍                              | 24/46 [00:52<00:51,  2.33s/it]

ELA06 NJSLA DATA 2016-2017.xlsx


Processing files:  61%|██████████████████████████████████████▉                         | 28/46 [00:57<00:31,  1.73s/it]

ELA07 NJSLA DATA 2016-2017.xlsx


Processing files:  63%|████████████████████████████████████████▎                       | 29/46 [01:01<00:34,  2.04s/it]

ELA08 NJSLA DATA 2016-2017.xlsx


Processing files:  65%|█████████████████████████████████████████▋                      | 30/46 [01:04<00:37,  2.36s/it]

ELA06 NJSLA DATA 2015-2016.xlsx


Processing files:  74%|███████████████████████████████████████████████▎                | 34/46 [01:09<00:20,  1.74s/it]

ELA07 NJSLA DATA 2015-2016.xlsx


Processing files:  76%|████████████████████████████████████████████████▋               | 35/46 [01:13<00:22,  2.05s/it]

ELA08 NJSLA DATA 2015-2016.xlsx


Processing files:  78%|██████████████████████████████████████████████████              | 36/46 [01:16<00:23,  2.35s/it]

ELA06 NJSLA DATA 2014-2015.xlsx


Processing files:  87%|███████████████████████████████████████████████████████▋        | 40/46 [01:20<00:09,  1.60s/it]

ELA07 NJSLA DATA 2014-2015.xlsx


Processing files:  89%|█████████████████████████████████████████████████████████       | 41/46 [01:23<00:09,  1.84s/it]

ELA08 NJSLA DATA 2014-2015.xlsx


Processing files: 100%|████████████████████████████████████████████████████████████████| 46/46 [01:26<00:00,  1.89s/it]

Concatinating dataframes
ELAResultsDF is ready.





In [6]:
len(ELA_DFs)

21

In [8]:
ELAResultsDF.tail()

Unnamed: 0,County Code,County Name,District Code,District Name,School Code,School Name,Subgroup,Subgroup_Type,Registered To Test,Not Tested ** (See Below),Valid Scores,Mean Scale Score,L1 Percent,L2 Percent,L3 Percent,L4 Percent,L5 Percent,Assessment,Year
16229,80,CHARTERS,7890,TEANECK COMMUNITY CS,920.0,TEANECK COMMUNITY CHARTER SCHOOL,TOTAL,ALL STUDENTS,*,*,25,741,4,24,32,40,0,ELA08,2015
16230,80,CHARTERS,8010,UNION COUNTY TEAMS CS,980.0,UNION COUNTY TEAMS CHARTER SCHOOL,TOTAL,ALL STUDENTS,*,*,32,738,6.3,28.1,31.3,31.3,3.1,ELA08,2015
16231,80,CHARTERS,8050,UNITY CS,990.0,UNITY CHARTER SCHOOL,TOTAL,ALL STUDENTS,*,*,*,*,*,*,*,*,*,ELA08,2015
16232,80,CHARTERS,8065,UNIVERSITY HEIGHTS CS,980.0,UNIVERSITY HEIGHTS CHARTER SCHOOL,TOTAL,ALL STUDENTS,*,*,47,747,6.4,17,31.9,40.4,4.3,ELA08,2015
16233,80,CHARTERS,8140,VILLAGE CS,990.0,THE VILLAGE CHARTER SCHOOL,TOTAL,ALL STUDENTS,*,*,37,732,13.5,18.9,40.5,27,0,ELA08,2015


In [7]:
unique_values = ELAResultsDF['Year'].unique()
print(unique_values)

['2023' '2022' '2019' '2018' '2017' '2016' '2015']


In [None]:
mathResultsDF.tail()

In [130]:
subjects = ['Math', 'ELA']
resultsDFs = {'Math': mathResultsDF, 'ELA': ELAResultsDF}

In [131]:
for subject in subjects:
    resultsDF = resultsDFs[subject]
    resultsDF.info()
    print(len(resultsDF))
    
del resultsDF    

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15943 entries, 0 to 15942
Data columns (total 19 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   County Code                15943 non-null  object 
 1   County Name                15943 non-null  object 
 2   District Code              15943 non-null  object 
 3   District Name              15943 non-null  object 
 4   School Code                15943 non-null  float64
 5   School Name                15943 non-null  object 
 6   Subgroup                   15943 non-null  object 
 7   Subgroup_Type              15943 non-null  object 
 8   Registered To Test         15943 non-null  object 
 9   Not Tested ** (See Below)  15943 non-null  object 
 10  Valid Scores               15943 non-null  object 
 11  Mean Scale Score           15943 non-null  object 
 12  L1 Percent                 15943 non-null  object 
 13  L2 Percent                 15943 non-null  obj

In [132]:
# resultsDF.info() showed that most of the columns are objects instead of numbers and needed to be converted
for subject in subjects:
    resultsDF = resultsDFs[subject]
    resultsDF_colToConvert = ['Valid Scores',
     'Mean Scale Score',
     'L1 Percent',                             
     'L2 Percent',
     'L3 Percent',
     'L4 Percent',
     'L5 Percent']
    resultsDF[resultsDF_colToConvert] = resultsDF[resultsDF_colToConvert].apply(pd.to_numeric, errors = 'coerce')
    resultsDF['School Code'] = resultsDF['School Code'].astype(str)
    resultsDF.info()
    print(len(resultsDF))
    
del resultsDF

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15943 entries, 0 to 15942
Data columns (total 19 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   County Code                15943 non-null  object 
 1   County Name                15943 non-null  object 
 2   District Code              15943 non-null  object 
 3   District Name              15943 non-null  object 
 4   School Code                15943 non-null  object 
 5   School Name                15943 non-null  object 
 6   Subgroup                   15943 non-null  object 
 7   Subgroup_Type              15943 non-null  object 
 8   Registered To Test         15943 non-null  object 
 9   Not Tested ** (See Below)  15943 non-null  object 
 10  Valid Scores               15481 non-null  float64
 11  Mean Scale Score           15489 non-null  float64
 12  L1 Percent                 15489 non-null  float64
 13  L2 Percent                 15489 non-null  flo

In [133]:
for subject in subjects:
    resultsDF = resultsDFs[subject]
    assessment = resultsDF['Assessment']
    resultsDF['Grade'] = assessment.str[-1]
    resultsDF['Grade'] = pd.to_numeric(resultsDF['Grade'])
    levels = ['L1', 'L2', 'L3', 'L4', 'L5']
    for l in levels:        
        resultsDF[f'{l} Number'] = (resultsDF[f'{l} Percent']*0.01)*resultsDF['Valid Scores']
    
    print(resultsDF.head())
   
del resultsDF

  County Code County Name District Code                    District Name  \
0          01    ATLANTIC        10.000  ABSECON PUBLIC SCHOOLS DISTRICT   
1          01    ATLANTIC       110.000    ATLANTIC CITY SCHOOL DISTRICT   
2          01    ATLANTIC       110.000    ATLANTIC CITY SCHOOL DISTRICT   
3          01    ATLANTIC       110.000    ATLANTIC CITY SCHOOL DISTRICT   
4          01    ATLANTIC       110.000    ATLANTIC CITY SCHOOL DISTRICT   

  School Code              School Name Subgroup Subgroup_Type  \
0        50.0           EMMA C ATTALES    TOTAL  ALL STUDENTS   
1        30.0  SOVEREIGN AVENUE SCHOOL    TOTAL  ALL STUDENTS   
2        50.0   CHELSEA HEIGHTS SCHOOL    TOTAL  ALL STUDENTS   
3        60.0      TEXAS AVENUE SCHOOL    TOTAL  ALL STUDENTS   
4        70.0   NEW YORK AVENUE SCHOOL    TOTAL  ALL STUDENTS   

  Registered To Test Not Tested ** (See Below)  ...  L4 Percent  L5 Percent  \
0                  *                         *  ...      28.000       1.0

## Analysis

#### Prepare schools dataframe with only middle school tests results (grades 6-8)

In [134]:
# Select middle school grades results from the dataframes with Math and ELA tests results

resultsMS_bySchl_Norm ={}

for subject in subjects:
        
    resultsDF = resultsDFs[subject]
    
    # Dataframe with only grades 6-8 results (middle schools and K-8) by years
    resultsMS_bySchl = resultsDF.groupby(['School Code', 'School Name', 'Year'])[['L1 Number','L2 Number','L3 Number','L4 Number','L5 Number']].sum()
    
    # Change column names to include subject
    resultsMS_bySchl.columns = [f'Level 1 {subject}',f'Level 2 {subject}',f'Level 3 {subject}',f'Level 4 {subject}', f'Level 5 {subject}']
    
    # Dataframe for middle schools by years with normalized values
    resultsMS_bySchl_Norm[subject] = resultsMS_bySchl.div(resultsMS_bySchl.sum(axis=1), axis=0)
    resultsMS_bySchl_Norm[subject].reset_index(inplace=True)
    
    print(resultsMS_bySchl_Norm[subject].head())
    
del resultsDF, resultsMS_bySchl

  School Code                       School Name  Year  Level 1 Math  \
0        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2015         0.127   
1        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2016         0.196   
2        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2017         0.185   
3        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2018         0.143   
4        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2019         0.165   

   Level 2 Math  Level 3 Math  Level 4 Math  Level 5 Math  
0         0.365         0.349         0.159         0.000  
1         0.196         0.411         0.178         0.018  
2         0.231         0.462         0.123         0.000  
3         0.414         0.328         0.114         0.000  
4         0.341         0.365         0.129         0.000  
  School Code                       School Name  Year  Level 1 ELA  \
0        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2015        0.111   
1        10.0  ALICE COSTELLO ELEMENTARY SCHOOL  2016        0.140   
2  

In [135]:
# Make a merged dataframe with both Math and ELA results
DFs = list(resultsMS_bySchl_Norm.values())
allResultsDF = pd.merge(DFs[0], DFs[1], on = ['School Name', 'Year'], how = 'inner')
allResultsDF.head(5)

Unnamed: 0,School Code_x,School Name,Year,Level 1 Math,Level 2 Math,Level 3 Math,Level 4 Math,Level 5 Math,School Code_y,Level 1 ELA,Level 2 ELA,Level 3 ELA,Level 4 ELA,Level 5 ELA
0,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2015,0.127,0.365,0.349,0.159,0.0,10.0,0.111,0.238,0.302,0.318,0.032
1,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2016,0.196,0.196,0.411,0.178,0.018,10.0,0.14,0.105,0.369,0.316,0.07
2,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2017,0.185,0.231,0.462,0.123,0.0,10.0,0.092,0.132,0.408,0.289,0.079
3,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2018,0.143,0.414,0.328,0.114,0.0,10.0,0.123,0.235,0.346,0.284,0.012
4,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2019,0.165,0.341,0.365,0.129,0.0,10.0,0.09,0.19,0.28,0.38,0.06


In [136]:
# Add colomn with sum of shares of level4 students by Math and level4 students ELa
allResultsDF['Level 5 Math+Ela'] = allResultsDF[f'Level 5 {subjects[0]}']+allResultsDF[f'Level 5 {subjects[1]}']
allResultsDF.head(10)

Unnamed: 0,School Code_x,School Name,Year,Level 1 Math,Level 2 Math,Level 3 Math,Level 4 Math,Level 5 Math,School Code_y,Level 1 ELA,Level 2 ELA,Level 3 ELA,Level 4 ELA,Level 5 ELA,Level 5 Math+Ela
0,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2015,0.127,0.365,0.349,0.159,0.0,10.0,0.111,0.238,0.302,0.318,0.032,0.032
1,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2016,0.196,0.196,0.411,0.178,0.018,10.0,0.14,0.105,0.369,0.316,0.07,0.088
2,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2017,0.185,0.231,0.462,0.123,0.0,10.0,0.092,0.132,0.408,0.289,0.079,0.079
3,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2018,0.143,0.414,0.328,0.114,0.0,10.0,0.123,0.235,0.346,0.284,0.012,0.012
4,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2019,0.165,0.341,0.365,0.129,0.0,10.0,0.09,0.19,0.28,0.38,0.06,0.06
5,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2022,0.149,0.264,0.356,0.218,0.012,10.0,0.144,0.196,0.402,0.216,0.041,0.053
6,10.0,ALICE COSTELLO ELEMENTARY SCHOOL,2023,0.186,0.291,0.279,0.221,0.023,10.0,0.212,0.131,0.283,0.283,0.091,0.114
7,10.0,ALLAMUCHY TOWNSHIP SCHOOL,2015,0.084,0.221,0.295,0.347,0.053,10.0,0.046,0.101,0.202,0.422,0.229,0.282
8,10.0,ALLAMUCHY TOWNSHIP SCHOOL,2016,0.047,0.196,0.383,0.337,0.037,10.0,0.058,0.123,0.295,0.401,0.123,0.16
9,10.0,ALLAMUCHY TOWNSHIP SCHOOL,2017,0.048,0.173,0.385,0.365,0.029,10.0,0.049,0.122,0.195,0.52,0.114,0.143


In [137]:
unique_values = allResultsDF['Year'].unique()
print(unique_values)

['2015' '2016' '2017' '2018' '2019' '2022' '2023']


#### Select schools with the best results for all middle school grades in 2023

#### Create dataframe with average 2015-2023 math and ela test results for all middle school grades

In [138]:
# Make a merged dataframe with both Math and ELA average 2015-2023 results 

resultsMS_AVG2015_23 = {}

for subject in subjects:
    
    resultsDF = resultsDFs[subject]
    
    # Dataframe with only grades 6-8 results (middle schools and K-8) by schools
    resultsMS_bySchl_sumed = resultsDF.groupby(['School Code', 'School Name'])[['L1 Number','L2 Number','L3 Number','L4 Number','L5 Number']].sum()
    
    # Rename columns
    resultsMS_bySchl_sumed.columns = [f'# Level 1 {subject}',f'# Level 2 {subject}',f'# Level 3 {subject}',f'# Level 4 {subject}', f'# Level 5 {subject}']

    
    # Dataframe for middle schools by years with normalized values
    resultsMS_bySchl_sumed_Norm = resultsMS_bySchl_sumed.div(resultsMS_bySchl_sumed.sum(axis=1), axis=0)
    resultsMS_bySchl_sumed_Norm.columns = [f'8yrs avg Lvl 1 {subject}',f'8yrs avg Lvl 2 {subject}',f'8yrs avg Lvl 3 {subject}', f'8yrs avg Lvl 4 {subject}', f'8yrs avg Lvl 5 {subject}']
    resultsMS_bySchl_sumed_Norm.reset_index(inplace = True)
    
    # Add the dataframe to the respective dictionnary 
    resultsMS_AVG2015_23[subject] = resultsMS_bySchl_sumed_Norm
    print(subject)
    print(len(resultsMS_AVG2015_23[subject]))
    

# del resultsDF, resultsMS_bySchl_sumed_Norm, resultsMS_bySchl_sumed_sorted, fileName, filePath, resultsMS_bySchl_sumed
del resultsDF, resultsMS_bySchl_sumed_Norm, resultsMS_bySchl_sumed

Math
1235
ELA
1235


In [139]:
# Make a merged dataframe with both Math and ELA average 2013-2023 results 

AVG2015_23_DFs = list(resultsMS_AVG2015_23.values())
allResultsAVG2015_23DF = pd.merge(AVG2015_23_DFs[0], AVG2015_23_DFs[1], on = ['School Code','School Name'], how = 'inner')
allResultsAVG2015_23DF['8yrs avg Lvl 5 Math+Ela'] = allResultsAVG2015_23DF[f'8yrs avg Lvl 5 {subjects[0]}']+allResultsAVG2015_23DF[f'8yrs avg Lvl 5 {subjects[1]}']

del AVG2015_23_DFs

In [140]:
allResultsAVG2015_23DF.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1234 entries, 0 to 1233
Data columns (total 13 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   School Code              1234 non-null   object 
 1   School Name              1234 non-null   object 
 2   8yrs avg Lvl 1 Math      1224 non-null   float64
 3   8yrs avg Lvl 2 Math      1224 non-null   float64
 4   8yrs avg Lvl 3 Math      1224 non-null   float64
 5   8yrs avg Lvl 4 Math      1224 non-null   float64
 6   8yrs avg Lvl 5 Math      1224 non-null   float64
 7   8yrs avg Lvl 1 ELA       1225 non-null   float64
 8   8yrs avg Lvl 2 ELA       1225 non-null   float64
 9   8yrs avg Lvl 3 ELA       1225 non-null   float64
 10  8yrs avg Lvl 4 ELA       1225 non-null   float64
 11  8yrs avg Lvl 5 ELA       1225 non-null   float64
 12  8yrs avg Lvl 5 Math+Ela  1224 non-null   float64
dtypes: float64(11), object(2)
memory usage: 135.0+ KB


In [127]:
allResultsAVG2015_23DF.head()

Unnamed: 0,School Name,School Name_x,School Name_x.1,School Code,8yrs avg Lvl 1 Math,8yrs avg Lvl 2 Math,8yrs avg Lvl 3 Math,8yrs avg Lvl 4 Math,8yrs avg Lvl 5 Math,8yrs avg Lvl 1 ELA,...,plot Math_x,plot ELA_x,School Name_y,plot Math_y,School Name_y.1,plot ELA_y,School Name_y.2,plot Math,School Name_y.3,plot ELA
0,ALICE COSTELLO ELEMENTARY SCHOOL,ALICE COSTELLO ELEMENTARY SCHOOL,ALICE COSTELLO ELEMENTARY SCHOOL,10.0,0.164,0.303,0.359,0.166,0.008,0.133,...,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,ALICE COSTELLO ELEMENTARY SCHOOL,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,ALICE COSTELLO ELEMENTARY SCHOOL,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,ALICE COSTELLO ELEMENTARY SCHOOL,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,ALICE COSTELLO ELEMENTARY SCHOOL,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...


In [110]:
# Make plots for popups in the map and add them as columns to the mappable dataframe

# list of schools names

schoolsNames = allResultsDF['School Name'].to_list()
testResults = allResultsDF

print("Schools' list ready.")
# Create disctionnary to hold the dataframes by schools
schoolDFs = {}

# Make dataframes by schools 
for name in schoolsNames:
    dfName = name
    schoolDFs[dfName] = testResults[testResults['School Name'] == name]
print('Dataframes by schools ready.')


plotsDFs = {}


print("Making plots of test results ...")

for subject in subjects:
    plots = []
    columns_to_plot = [f'Level 1 {subject}',f'Level 2 {subject}',f'Level 3 {subject}',f'Level 4 {subject}', f'Level 5 {subject}']  

    # Plot dataframes by school

    for schoolDF, current_dataframe in tqdm(schoolDFs.items()):
        # schoolDF contains the name of the dataframe
        # current_dataframe contains the dataframe itself
        # Do something with current_dataframe
        # Create a plot
        fig = create_plot(current_dataframe, schoolDF, columns_to_plot)

        # Convert the plot to a PNG image and then encode it
        io_buf = BytesIO()
        fig.savefig(io_buf, format='png', bbox_inches='tight')
        # Close the figure
        plt.close()
        #Reading file to get the base64 string
        io_buf.seek(0)
        base64_string = base64.b64encode(io_buf.read()).decode('utf8')

        pair = (schoolDF, base64_string)

        plots.append(pair) 
            
    # add the plots to the dataframe of middle schools subject results 
    plotsDFs[subject] = pd.DataFrame(plots, columns=['School Name', f'plot {subject}'])

           
# Concatenate all plots DataFrames along the columns before merging
combined_plots_df = pd.concat(plotsDFs.values(), axis=1)


print('Adding plots to the data frame with test results.')    
allResultsAVG2015_23DF = pd.merge(allResultsAVG2015_23DF, combined_plots_df, left_on = 'School Name', right_on=combined_plots_df.iloc[:, 0])
print('Done.')   

Schools' list ready.
Dataframes by schools ready.
Making plots of test results ...


100%|██████████████████████████████████████████████████████████████████████████████| 1202/1202 [03:56<00:00,  5.08it/s]


plotsDFs[subject]:                                             School Name  \
0                      ALICE COSTELLO ELEMENTARY SCHOOL   
1                             ALLAMUCHY TOWNSHIP SCHOOL   
2                                  ALPHA BOROUGH SCHOOL   
3                              ALPINE ELEMENTARY SCHOOL   
4                                      ARTS HIGH SCHOOL   
...                                                 ...   
1197  INTERNATIONAL ACADEMY OF ATLANTIC CITY CHARTER...   
1198                   PRINCIPLE ACADEMY CHARTER SCHOOL   
1199             HUDSON ARTS AND SCIENCE CHARTER SCHOOL   
1200        PHILIP'S ACADEMY CHARTER SCHOOL OF PATERSON   
1201                     CAMDENS PROMISE CHARTER SCHOOL   

                                              plot Math  
0     iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...  
1     iVBORw0KGgoAAAANSUhEUgAAAbkAAAEhCAYAAADiYd4GAA...  
2     iVBORw0KGgoAAAANSUhEUgAAAYUAAAEhCAYAAACQrrywAA...  
3     iVBORw0KGgoAAAANSUhEUgAAAZ8AAAEhCA

100%|██████████████████████████████████████████████████████████████████████████████| 1202/1202 [03:54<00:00,  5.12it/s]

plotsDFs[subject]:                                             School Name  \
0                      ALICE COSTELLO ELEMENTARY SCHOOL   
1                             ALLAMUCHY TOWNSHIP SCHOOL   
2                                  ALPHA BOROUGH SCHOOL   
3                              ALPINE ELEMENTARY SCHOOL   
4                                      ARTS HIGH SCHOOL   
...                                                 ...   
1197  INTERNATIONAL ACADEMY OF ATLANTIC CITY CHARTER...   
1198                   PRINCIPLE ACADEMY CHARTER SCHOOL   
1199             HUDSON ARTS AND SCIENCE CHARTER SCHOOL   
1200        PHILIP'S ACADEMY CHARTER SCHOOL OF PATERSON   
1201                     CAMDENS PROMISE CHARTER SCHOOL   

                                               plot ELA  
0     iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...  
1     iVBORw0KGgoAAAANSUhEUgAAAbkAAAEhCAYAAADiYd4GAA...  
2     iVBORw0KGgoAAAANSUhEUgAAAYUAAAEhCAYAAACQrrywAA...  
3     iVBORw0KGgoAAAANSUhEUgAAAZ8AAAEhCA




In [124]:
len(combined_plots_df)

1202

In [141]:
print('Adding plots to the data frame with test results.')    
allResultsAVG2015_23DF = pd.merge(allResultsAVG2015_23DF, combined_plots_df, left_on = 'School Name', right_on=combined_plots_df.iloc[:, 0])
print('Done.')   

Adding plots to the data frame with test results.
Done.


In [142]:
allResultsAVG2015_23DF.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1234 entries, 0 to 1233
Data columns (total 18 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   School Name              1234 non-null   object 
 1   School Code              1234 non-null   object 
 2   School Name_x            1234 non-null   object 
 3   8yrs avg Lvl 1 Math      1224 non-null   float64
 4   8yrs avg Lvl 2 Math      1224 non-null   float64
 5   8yrs avg Lvl 3 Math      1224 non-null   float64
 6   8yrs avg Lvl 4 Math      1224 non-null   float64
 7   8yrs avg Lvl 5 Math      1224 non-null   float64
 8   8yrs avg Lvl 1 ELA       1225 non-null   float64
 9   8yrs avg Lvl 2 ELA       1225 non-null   float64
 10  8yrs avg Lvl 3 ELA       1225 non-null   float64
 11  8yrs avg Lvl 4 ELA       1225 non-null   float64
 12  8yrs avg Lvl 5 ELA       1225 non-null   float64
 13  8yrs avg Lvl 5 Math+Ela  1224 non-null   float64
 14  School Name_y           

In [100]:
allResultsAVG2015_23DF=allResultsAVG2015_23DF.drop(['plot Math_y', 'plot ELA_y', 'plot Math_x', 'plot ELA_x','School Name_x', 'School Name_y'], axis = 1)

In [104]:
allResultsAVG2015_23DF=allResultsAVG2015_23DF.drop(['plot Math', 'plot ELA'], axis = 1)

In [143]:
allResultsAVG2015_23DF=allResultsAVG2015_23DF.drop(['School Name_x', 'School Name_y'], axis = 1)

In [83]:
allResultsAVG2015_23DF[['plot Math', 'plot ELA']]

Unnamed: 0,plot Math,plot ELA
0,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...,iVBORw0KGgoAAAANSUhEUgAAAfEAAAEhCAYAAABvDJlSAA...


In [93]:
allResultsAVG2015_23DF[['plot Math', 'plot ELA']]

2

In [103]:
allResultsAVG2015_23DF.loc[0, 'plot Math'] == allResultsAVG2015_23DF.loc[0, 'plot ELA']

False

<a id="maps"></a> 
### Preparing geoJSONs for mapping

#### Read schools geolocation file

In [114]:
## Read GeoJSON into data frame
SchoolsFile = 'School_Point_Locations_of_NJ_(Public%2C_Private_and_Charter).geojson'
NJSchoolsPath = os.path.join(basePath, dataFolder, SchoolsFile)
NJSchoolsData = gpd.read_file(NJSchoolsPath)

DistrictsFile = 'School_Districts_-_Unified_for_New_Jersey.geojson'
NJDistrictsPath = os.path.join(basePath, dataFolder, DistrictsFile)
NJDistrictsData = gpd.read_file(NJDistrictsPath)

In [None]:
NJSchoolsData.info()

#### Merge the GeoJSON and the results dataframe

In [115]:
#NYCSchoolsData.info() #Too many columns --> make a smaller copy
NJSchoolsDataShort = NJSchoolsData[['OBJECTID', 'DIST_CODE', 'DIST_NAME', 'SCHOOLCODE', 'SCHOOLTYPE', 'SCHOOL', 'SCHOOLNAME', 'CITY', 'geometry']]
NJSchoolsDataShort.head()

Unnamed: 0,OBJECTID,DIST_CODE,DIST_NAME,SCHOOLCODE,SCHOOLTYPE,SCHOOL,SCHOOLNAME,CITY,geometry
0,1,1658,Glen Rock,24R,"DAY CARE, TRANSITIONAL K",Glen Rock Cooperative Nursery School,Glen Rock Cooperative Nusery School,Glen Rock,POINT (-74.12435 40.96150)
1,2,632,North Brunswick Twp,82F,CHILD CARE/PRE-SCHOOL,Creative Nursery School Childcare & Learning C...,Creative Nursery School,North Brunswick,POINT (-74.47660 40.44052)
2,3,1681,Old Bridge Twp.,18G,PRE-SCHOOL/PRE-K,Good Shepherd Children's Center,Good Shepherd Children's Center,Old Bridge,POINT (-74.30568 40.40041)
3,4,1784,Lakewood Twp,05D,SPECIAL EDUCATION,TREE OF KNOWLEDGE LEARNING ACADEMY,Tree Of Knowledge,Lakewood,POINT (-74.21612 40.09282)
4,5,63,Hackensack City,58A,DAY CARE,Sarkis & Siran Gabrellian Child Care Center,Sarkis & Siran Gabrellian Child Care and Learn...,Hackensack,POINT (-74.05591 40.88239)


In [24]:
# Matching the school all data file with spatial data (geojson of schools locations)
# Matching names from allResultsAVG2015_23DF to NYCSchoolsDataShort

tqdm.pandas(desc="Matching Names")

matched_tuples = allResultsAVG2015_23DF['School Name'].progress_apply(
    lambda x: match_name(x, NJSchoolsDataShort['SCHOOL'], min_score=80))

print('Done.')

Matching Names: 100%|██████████████████████████████████████████████████████████████| 1234/1234 [09:04<00:00,  2.27it/s]

Done.





In [144]:
print('Appending mathes to the dataframe.')
allResultsAVG2015_23DF['matched_name'] = list(zip(*matched_tuples_2))[0]
allResultsAVG2015_23DF['matched_score'] = list(zip(*matched_tuples_2))[1]
print('Done.')

Appending mathes to the dataframe.
Done.


In [145]:
allResultsAVG2015_23DF.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1234 entries, 0 to 1233
Data columns (total 17 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   School Name              1234 non-null   object 
 1   School Code              1234 non-null   object 
 2   8yrs avg Lvl 1 Math      1224 non-null   float64
 3   8yrs avg Lvl 2 Math      1224 non-null   float64
 4   8yrs avg Lvl 3 Math      1224 non-null   float64
 5   8yrs avg Lvl 4 Math      1224 non-null   float64
 6   8yrs avg Lvl 5 Math      1224 non-null   float64
 7   8yrs avg Lvl 1 ELA       1225 non-null   float64
 8   8yrs avg Lvl 2 ELA       1225 non-null   float64
 9   8yrs avg Lvl 3 ELA       1225 non-null   float64
 10  8yrs avg Lvl 4 ELA       1225 non-null   float64
 11  8yrs avg Lvl 5 ELA       1225 non-null   float64
 12  8yrs avg Lvl 5 Math+Ela  1224 non-null   float64
 13  plot Math                1234 non-null   object 
 14  plot ELA                

In [146]:
(allResultsAVG2015_23DF['matched_score'] == -1).sum()

86

In [147]:
name = 'NJTestResults2023_tempMatched4.csv'
path = os.path.join(basePath, outputFolder, name)
print(f'Saving to {path} ...')
allResultsAVG2015_23DF.to_csv(path)
print('Saved.')
del name, path

Saving to G:\My Drive\Kids\NJ_schools_mapped\processed_data\NJTestResults2023_tempMatched4.csv ...
Saved.


In [148]:
# Unmatched or matched incorrectly names identified by 
# visual observations on the map or by analysing the geoJSON in prefered software
# allResultsAVG2015_23DF['School Name']:NJSchoolsDataShort['SCHOOL']

unmatched = {
'Brookside':'Brookside School',
'Markham Place':'Markham Place School',
'Roosevelt Science Technology Engineering And Mathematics (St':'Roosevelt STEM School',
'John Witherspoon Middle School':'Princeton Middle School',
'Park Middle School':'Park Elementary School',
'Mountain View':'Mountain View School',
'Hammarskjold Middle School':'Hammarskjold Upper Elementary School',
}

In [149]:
# Replacing the erroneus matches in the allResultsDF_2023 data frame

def replace_values(row):
    if row['School Name'] in unmatched:
        row['matched_name'] = unmatched[row['School Name']]
    return row

allResultsAVG2015_23DF = allResultsAVG2015_23DF.apply(replace_values, axis = 1)

In [150]:
allResultsAVG2015_23DF.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1234 entries, 0 to 1233
Data columns (total 17 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   School Name              1234 non-null   object 
 1   School Code              1234 non-null   object 
 2   8yrs avg Lvl 1 Math      1224 non-null   float64
 3   8yrs avg Lvl 2 Math      1224 non-null   float64
 4   8yrs avg Lvl 3 Math      1224 non-null   float64
 5   8yrs avg Lvl 4 Math      1224 non-null   float64
 6   8yrs avg Lvl 5 Math      1224 non-null   float64
 7   8yrs avg Lvl 1 ELA       1225 non-null   float64
 8   8yrs avg Lvl 2 ELA       1225 non-null   float64
 9   8yrs avg Lvl 3 ELA       1225 non-null   float64
 10  8yrs avg Lvl 4 ELA       1225 non-null   float64
 11  8yrs avg Lvl 5 ELA       1225 non-null   float64
 12  8yrs avg Lvl 5 Math+Ela  1224 non-null   float64
 13  plot Math                1234 non-null   object 
 14  plot ELA                

In [40]:
allResultsAVG2015_23DF=allResultsAVG2015_23DF.drop(['School Name_x', 'School Name_y',], axis = 1)

In [152]:
# Merging DataFrames based on the matched name
print('Merging dataframes.')
schoolsData_mappable = pd.merge(NJSchoolsDataShort,allResultsAVG2015_23DF, left_on='SCHOOL', right_on='matched_name')

data_Name = 'NJpublicSchoolsData.geojson'
data_Path = os.path.join(basePath,outputFolder, data_Name)

print(f"Saving data to GeoJSON file {data_Path}...")
schoolsData_mappable.to_file(data_Path, driver="GeoJSON")

print('Saved.')
del data_Name, data_Path

Merging dataframes.
Saving data to GeoJSON file G:\My Drive\Kids\NJ_schools_mapped\processed_data\NJpublicSchoolsData.geojson...
Saved.


In [153]:
# allResultsAVG2013_23DF_norm.reset_index()
schoolsData_mappable.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Int64Index: 1499 entries, 0 to 1498
Data columns (total 26 columns):
 #   Column                   Non-Null Count  Dtype   
---  ------                   --------------  -----   
 0   OBJECTID                 1499 non-null   int64   
 1   DIST_CODE                1499 non-null   object  
 2   DIST_NAME                1499 non-null   object  
 3   SCHOOLCODE               1499 non-null   object  
 4   SCHOOLTYPE               1479 non-null   object  
 5   SCHOOL                   1499 non-null   object  
 6   SCHOOLNAME               1499 non-null   object  
 7   CITY                     1499 non-null   object  
 8   geometry                 1499 non-null   geometry
 9   School Name              1499 non-null   object  
 10  School Code              1499 non-null   object  
 11  8yrs avg Lvl 1 Math      1492 non-null   float64 
 12  8yrs avg Lvl 2 Math      1492 non-null   float64 
 13  8yrs avg Lvl 3 Math      1492 non-null   float64 
 14  

In [None]:
print(schoolsAllData_mappable['matched_name'].isnull().sum())

In [None]:
allDataNJ.head().to_txt()