In [1]:
import pandas as pd 
df = pd.read_csv('data/school-district-asbestos-download-12-29.csv')

Here are some helpful explanations of abbreviations from the Office of the City Controller's dashboard About page:

## Asbestos Types

**ACM**: Asbestos-Containing Material

**ACBM**: Asbestos-Containing Building Material

**ACPI**: Asbestos-Containing Pipe Insulation

**VAT**: Vinyl Asbestos Tiling

**FRI**: Friable Asbestos

## Units

**LF**: Linear Feet

**SF**: Square Feet

**CF**: Cubic Feet

## Abatement Methods

**REM**: Removal

**CAP**: Encapsulation

**CLO**: Enclosure

https://controller.phila.gov/philadelphia-audits/interactive-asbestos-dashboard/#/about

In [2]:
df.count()

Application Date           1827
Planned Completion Date    1827
Facility Name              1827
Facility Owner                0
Operation Type             1827
Permit Number              1827
Project Type               1827
School Address                0
School Level               1827
School Name                1827
School Website                0
Status                     1826
URL                        1827
Description                1826
Planned Start Date         1827
dtype: int64

In [3]:
import re
def get_total_quantity_per_project(description, unit='LF'):
  """Get total quantity of asbestos-containing material to be removed, encapsulated, or enclosed in a given project using naïve regex."""

  description = str(description)
  # Raw f-string so that \s is passed to the regex engine rather than treated as a string escape
  matches = re.findall(rf'[0-9]+\s{unit}', description)
  counts = []
  for match in matches:
    counts.append(int(re.match('[0-9]+', match).group(0)))
  return sum(counts)
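
The regex only sees whole runs of digits immediately before the unit, so some number formats are miscounted. Below is a minimal sketch of that behavior. The helper name `sum_quantities` is hypothetical, and apart from the first description the strings are made up for illustration rather than taken from the dataset.

```python
import re

def sum_quantities(description, unit="LF"):
    """Hypothetical stand-in for the notebook helper: sum every '<digits> <unit>' match."""
    return sum(int(n) for n in re.findall(rf"([0-9]+)\s{unit}", str(description)))

# A plain integer quantity is picked up as expected.
plain = sum_quantities("REM 15 LF ACPI IN 312.", "LF")   # 15

# Made-up edge cases: only the digits directly before the unit are captured.
with_comma = sum_quantities("REM 1,200 SF VAT", "SF")    # 200, not 1200
with_decimal = sum_quantities("REM 2.5 LF ACPI", "LF")   # 5, not 2.5
less_than = sum_quantities("REM <1 LF ACPI", "LF")       # 1, the "<" is ignored
```

If descriptions with thousands separators or decimals turn out to be common, the totals below would be skewed accordingly.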

Add columns with total quantities of material per project.

In [4]:
# Linear feet
df['Total LF'] = df['Description'].apply(get_total_quantity_per_project, args=('LF',))

In [5]:
# Square feet
df['Total SF'] = df['Description'].apply(get_total_quantity_per_project, args=('SF',))

In [6]:
# Cubic feet
df['Total CF'] = df['Description'].apply(get_total_quantity_per_project, args=('CF',))

Get total quantities across district.

In [7]:
df['Total LF'].sum()

29572

In [8]:
df['Total SF'].sum()

299806

Divide square feet by 43560 to get acres.

In [9]:
df['Total SF'].sum() / 43560

6.882598714416896

In [10]:
df['Total CF'].sum()

0

Get total quantities per school.

In [11]:
lf_material_by_school = df.groupby('School Name')['Total LF'].sum()

In [12]:
sf_material_by_school = df.groupby('School Name')['Total SF'].sum()

In [13]:
cf_material_by_school = df.groupby('School Name')['Total CF'].sum()

In [14]:
sf_material_by_school

School Name
A. Philip Randolph Career and Technical High School     250
A.L. Fitzpatrick School                                 725
Abraham Lincoln High School                               0
Abram S. Jenks School                                    57
Academy at Palumbo                                     1564
                                                       ... 
William McKinley School                                   0
William Rowen School                                    528
William T. Tilden School                                  0
William W. Bodine High School                          1751
Woodrow Wilson School                                   191
Name: Total SF, Length: 214, dtype: int64

Get projects per school.

In [15]:
projects_by_school = df.groupby('School Name').size()

In [16]:
projects_by_school

School Name
A. Philip Randolph Career and Technical High School     1
A.L. Fitzpatrick School                                12
Abraham Lincoln High School                             6
Abram S. Jenks School                                  11
Academy at Palumbo                                     11
                                                       ..
William McKinley School                                 3
William Rowen School                                   13
William T. Tilden School                                5
William W. Bodine High School                          15
Woodrow Wilson School                                  11
Length: 214, dtype: int64

In [17]:
pd.concat([projects_by_school, lf_material_by_school, sf_material_by_school], axis=1)

School Name,0,Total LF,Total SF
A. Philip Randolph Career and Technical High School,1,0,250
A.L. Fitzpatrick School,12,121,725
Abraham Lincoln High School,6,41,0
Abram S. Jenks School,11,159,57
Academy at Palumbo,11,124,1564
...,...,...,...
William McKinley School,3,0,0
William Rowen School,13,861,528
William T. Tilden School,5,312,0
William W. Bodine High School,15,96,1751
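
The project-count column above is labeled `0` because the Series returned by `size()` has no name. A small sketch of one way to label it, using a made-up three-row frame (the school names `A` and `B` and the column label `Projects` are assumptions for illustration):

```python
import pandas as pd

# Toy stand-in for the projects dataframe.
toy = pd.DataFrame({
    "School Name": ["A", "A", "B"],
    "Total LF": [1, 2, 3],
    "Total SF": [10, 0, 5],
})

# Give the count Series a name so the concatenated column is labeled.
projects = toy.groupby("School Name").size().rename("Projects")
totals = toy.groupby("School Name")[["Total LF", "Total SF"]].sum()
summary = pd.concat([projects, totals], axis=1)
```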


Look at distributions of project severity.

From the dashboard About page:
## Projects

Sorted from smallest to largest by amount of asbestos material.

**Incidental Project**: Abatement of less than 1 linear or 5 square feet.

**Small Project**: Abatement greater than or equal to 1 linear or 5 square feet but less than 3 linear or 12 square feet.

**Minor Project**: Abatement greater than or equal to 3 linear or 12 square feet but less than 40 linear or 80 square feet.

**Major Project, Permitted**: Abatement greater than or equal to 40 linear or 80 square feet. A Major Project must have a permit approved and issued by the Department of Public Health.

**Major Project, Non-Permitted**: Any demolition project except single residences with three dwelling units or less where there is no asbestos present.

**NESHAP Project**: Abatement greater than or equal to 260 linear or 160 square feet.
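
The size thresholds above can be sketched as a lookup. This assumes that a project's tier is the higher of the tiers reached by its linear and square footage, which is one reading of the "or" in the definitions, not something the dashboard states. NESHAP and the non-permitted category are left out, and the names `size_tier`, `LF_BOUNDS`, and `SF_BOUNDS` are hypothetical.

```python
import bisect

# Lower bounds (inclusive) of the Small, Minor, and Major tiers.
LF_BOUNDS = [1, 3, 40]
SF_BOUNDS = [5, 12, 80]
TIERS = ["Incidental", "Small", "Minor", "Major"]

def size_tier(total_lf, total_sf):
    """Return the larger of the tiers implied by linear and square footage."""
    lf_tier = bisect.bisect_right(LF_BOUNDS, total_lf)
    sf_tier = bisect.bisect_right(SF_BOUNDS, total_sf)
    return TIERS[max(lf_tier, sf_tier)]

tier_a = size_tier(0, 0)    # "Incidental"
tier_b = size_tier(15, 0)   # "Minor"
tier_c = size_tier(0, 80)   # "Major"
```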

In [18]:
df['Project Type'].unique()

array(['Small Removal Project', 'Minor Removal Project',
       'Major Removal Project', 'Incidental Removal Project',
       'Non Friable Removal'], dtype=object)

I don't see any projects in the category of "Major Project, Non-Permitted." I'm therefore going to assume that every project in this dataset is for asbestos abatement.

In [19]:
df.groupby('Project Type').size()

Project Type
Incidental Removal Project    284
Major Removal Project         410
Minor Removal Project         867
Non Friable Removal           160
Small Removal Project         106
dtype: int64

In [20]:
df.groupby(['School Name', 'Project Type']).size()

School Name                                          Project Type              
A. Philip Randolph Career and Technical High School  Non Friable Removal           1
A.L. Fitzpatrick School                              Incidental Removal Project    4
                                                     Major Removal Project         1
                                                     Minor Removal Project         4
                                                     Non Friable Removal           3
                                                                                  ..
William W. Bodine High School                        Small Removal Project         1
Woodrow Wilson School                                Incidental Removal Project    2
                                                     Major Removal Project         1
                                                     Minor Removal Project         7
                                                     Non Friable Remov

The description above says a "minor" removal project should be less than 40 linear or 80 square feet, yet the mean for minor projects is about 101 square feet. The median is 0, so a small number of very large values is pulling the mean up. Maybe there's an issue with the way I determined total square feet per project.

In [21]:
df[df['Project Type']=='Minor Removal Project']['Total SF'].describe()

count      867.000000
mean       100.648212
std        590.462030
min          0.000000
25%          0.000000
50%          0.000000
75%         10.000000
max      12345.000000
Name: Total SF, dtype: float64

Let's look more closely at the descriptions for minor removal projects.

In [22]:
df[(df['Project Type']=='Minor Removal Project') & (df['Total SF'] > 80)][['Project Type', 'Description', 'Total SF']]

Unnamed: 0,Project Type,Description,Total SF
26,Minor Removal Project,HARTRANFT ES - REVISION - CHANGED REM 15 SF VA...,146
28,Minor Removal Project,"ELKIN ES - REM 340 SF VAT IN RM 116, REM 100 S...",2225
72,Minor Removal Project,"BLAINE ES - REM 120 SF OF VAT IN STAIR #2, 205...",815
79,Minor Removal Project,Alternative Education Program - REM 240 SF of ...,240
82,Minor Removal Project,LINGELBACH ES - REM 10 SF OF ACOUSTICAL CEILIN...,500
...,...,...,...
1682,Minor Removal Project,SPRUANCE ES - REM 288 SF VAT IN PRINCIPALS OFF...,442
1744,Minor Removal Project,"DAY ES - REM 70 SF OF VAT, 3 LF ACPFI & 10 LF ...",402
1757,Minor Removal Project,HILL FREEDMAN WORLD ACADEMY @ LEEDS MS - REM 1...,100
1761,Minor Removal Project,SPRUANCE ES - REM 45 SF OF TRANSITE STALLS IN ...,95


In [23]:
df[(df['Project Type']=='Minor Removal Project') & (df['Total SF'] > 80)][['Project Type', 'Description', 'Total SF']].loc[28]['Description']

'ELKIN ES - REM 340 SF VAT IN RM 116, REM 100 SF VAT IN HALLWAY OUTSIDE RM 116, REM 1440 SF OV VAT IN ART ROOM, REM 295 SF VAT IN HALLWAY OUTSIDE ART ROOM & 50 SF VAT IN NURSES OFFICE.'

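This project is in the "Minor Removal Project" category but clearly covers more than 80 square feet: the five quantities in the description add up to 2,225 SF, which matches the parsed `Total SF`, so the regex is not at fault for this row.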
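
The hand check for this row can be reproduced in a few lines. This is a sketch that re-implements the parsing under the hypothetical name `sum_quantities`; the description string is the one printed above.

```python
import re

def sum_quantities(description, unit="LF"):
    """Hypothetical stand-in for the notebook helper: sum every '<digits> <unit>' match."""
    return sum(int(n) for n in re.findall(rf"([0-9]+)\s{unit}", str(description)))

elkin = (
    "ELKIN ES - REM 340 SF VAT IN RM 116, REM 100 SF VAT IN HALLWAY OUTSIDE RM 116, "
    "REM 1440 SF OV VAT IN ART ROOM, REM 295 SF VAT IN HALLWAY OUTSIDE ART ROOM "
    "& 50 SF VAT IN NURSES OFFICE."
)

parsed_sf = sum_quantities(elkin, "SF")
by_hand_sf = 340 + 100 + 1440 + 295 + 50   # quantities read off the description
```

Both values come to 2225, so for this project the category label, not the parsing, is what disagrees with the published thresholds.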

In [24]:
geocoded_addresses = pd.read_csv('data/geocoded_addresses.csv')

In [25]:
schools = pd.read_csv('data/schools.csv')

In [26]:
schools = schools[['Publication Name', 'GPS Location', 'Street Address', 'Year Opened']]

Add school location information and year opened to asbestos projects dataframe.

In [27]:
df_with_addresses = df.merge(schools, left_on='School Name', right_on='Publication Name', how='left')

In [28]:
df_with_addresses.head()

Unnamed: 0,Application Date,Planned Completion Date,Facility Name,Facility Owner,Operation Type,Permit Number,Project Type,School Address,School Level,School Name,...,URL,Description,Planned Start Date,Total LF,Total SF,Total CF,Publication Name,GPS Location,Street Address,Year Opened
0,1/2/18,2/2/18,HOWE ACADEMICS PLUS ES,,Renovation,AN18-000004,Small Removal Project,,Elementary,Julia W. Howe School,...,https://www.citizenserve.com/Portal/PortalCont...,HOWE ACADEMICS PLUS ES - CAP 1 LF ACPI IN STOR...,1/3/18,4,6,0,Julia W. Howe School,"40.04113187, -75.14211131",5800 N 13TH ST,1913.0
1,1/2/20,1/5/20,FURNESS HS,,Renovation,AN20-000003,Minor Removal Project,,High,Furness High School,...,https://www.citizenserve.com/Portal/PortalCont...,FURNESS HS - REM 15 LF ACPI IN 312.,1/3/20,15,0,0,Furness High School,"39.923762, -75.150585",1900 S 3RD ST,1912.0
2,1/3/18,2/2/18,TM PEIRCE ES,,Renovation,AN18-000008,Minor Removal Project,,Elementary,Thomas M. Peirce School,...,https://www.citizenserve.com/Portal/PortalCont...,TM PEIRCE ES - CAP 1 LF ACPI IN MAIN GYM ADJAC...,1/4/18,24,0,0,Thomas M. Peirce School,"39.998839, -75.168424",3300 HENRY AVE,1908.0
3,1/3/18,1/3/18,KENDERTON ES,,Emergency Renovation,AN18-000009,Small Removal Project,,Elementary/Middle,Kenderton Elementary School,...,https://www.citizenserve.com/Portal/PortalCont...,KENDERTON ES - REM 3 LF ACPI IN CAFETERIA ON L...,1/3/18,4,0,0,Kenderton Elementary School,"40.004940, -75.154241",1500 W ONTARIO ST,2016.0
4,1/3/18,2/2/18,LINCOLN HS POOL & FIELD HOUSES,,Renovation,AN18-000015,Minor Removal Project,,High,Abraham Lincoln High School,...,https://www.citizenserve.com/Portal/PortalCont...,LINCOLN HS FIELD HOUSE & POOL HOUSE - REM <1 L...,1/4/18,11,0,0,Abraham Lincoln High School,"40.04310838, -75.04479221",3201 RYAN AVE,1950.0


There are some cases where the school name was not found in the schools CSV, so we don't have an address for those schools.

In [29]:
df_with_addresses[df_with_addresses['Street Address'].isna()]['School Name'].unique()

array(['Anna B. Pratt School (Closed)', 'John L. Kinsey School (Closed)',
       'Roberts Vaux School (Closed)',
       'Communication Technology High School (Closed)',
       'General David B.\xa0Birney\xa0Charter School',
       'High School for Business & Technology (Closed)',
       'General John F. Reynolds School (Closed)'], dtype=object)

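Most of these are schools that are now closed. The Birney charter school's name contains non-breaking spaces (`\xa0`), which may be why it did not match. Add their addresses and coordinates manually.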
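
As an alternative to a manual entry for names that differ only by whitespace, the names could be normalized before merging. The sketch below uses toy frames. Whether `schools.csv` actually lists the Birney school with ordinary spaces is an assumption, and `unmatched_names` is a hypothetical helper.

```python
import pandas as pd

projects = pd.DataFrame({
    "School Name": ["General David B.\xa0Birney\xa0Charter School", "Furness High School"],
})
# Hypothetical lookup table; the real schools.csv may spell these names differently.
schools = pd.DataFrame({
    "Publication Name": ["General David B. Birney Charter School", "Furness High School"],
    "Street Address": ["900 W LINDLEY AVE", "1900 S 3RD ST"],
})

def unmatched_names(left):
    """List School Name values that find no partner in the lookup table."""
    merged = left.merge(schools, left_on="School Name", right_on="Publication Name",
                        how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only", "School Name"].tolist()

before = unmatched_names(projects)

# Replace non-breaking spaces with ordinary spaces, then retry the match.
normalized = projects.copy()
normalized["School Name"] = normalized["School Name"].str.replace("\xa0", " ", regex=False)
after = unmatched_names(normalized)
```

The `indicator=True` column also gives a direct way to list unmatched schools without inspecting a joined column for missing values.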

In [30]:

manual_addresses = {'School Name': ['Anna B. Pratt School (Closed)', 'Roberts Vaux School (Closed)', 'John L. Kinsey School (Closed)',
                                    'Communication Technology High School (Closed)', 'General David B.\xa0Birney\xa0Charter School', 
                                    'High School for Business & Technology (Closed)', 'General John F. Reynolds School (Closed)'],
                    'Street Address': ['2200 N 22ND ST', '2300 W MASTER ST', '6501 LIMEKILN PIKE', '8110 LYONS AVE', '900 W LINDLEY AVE',
                                       '540 N 13TH ST', '1429 N 24TH ST'],
                    'GPS Location': [[39.989027, -75.16974],
                                     [39.97633317966743, -75.1742140217961],
                                     [40.053635, -75.152142],
                                     [39.89885149816822, -75.24594565947879],
                                     [40.02899190864331, -75.13856487296891],
                                     [39.962982, -75.159876],
                                     [39.977536, -75.174784]]}

In [31]:
manual_addresses_df = pd.DataFrame.from_dict(manual_addresses)

In [32]:
manual_addresses_df

Unnamed: 0,School Name,Street Address,GPS Location
0,Anna B. Pratt School (Closed),2200 N 22ND ST,"[39.989027, -75.16974]"
1,Roberts Vaux School (Closed),2300 W MASTER ST,"[39.97633317966743, -75.1742140217961]"
2,John L. Kinsey School (Closed),6501 LIMEKILN PIKE,"[40.053635, -75.152142]"
3,Communication Technology High School (Closed),8110 LYONS AVE,"[39.89885149816822, -75.24594565947879]"
4,General David B. Birney Charter School,900 W LINDLEY AVE,"[40.02899190864331, -75.13856487296891]"
5,High School for Business & Technology (Closed),540 N 13TH ST,"[39.962982, -75.159876]"
6,General John F. Reynolds School (Closed),1429 N 24TH ST,"[39.977536, -75.174784]"


In [33]:
df_with_addresses_all = df_with_addresses.merge(manual_addresses_df, on='School Name', how='left')

In [34]:
df_with_addresses_all['GPS Location_coalesced'] = df_with_addresses_all['GPS Location_x'].combine_first(df_with_addresses_all['GPS Location_y'])
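
`combine_first` keeps the left-hand value where it exists and falls back to the right-hand value otherwise. A toy sketch with made-up coordinates, mixing a string and a list the way the two sources do here:

```python
import pandas as pd

# Values from the schools file arrive as "lat, long" strings; manual ones are lists.
primary = pd.Series(["40.04, -75.14", None, None])
fallback = pd.Series([None, [39.98, -75.16], None])

coalesced = primary.combine_first(fallback)
still_missing = int(coalesced.isna().sum())   # rows with no value in either source
```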

Check to make sure we have GPS locations for all schools in the dataframe.

In [35]:
df_with_addresses_all[df_with_addresses_all['GPS Location_coalesced'].isna()]

Unnamed: 0,Application Date,Planned Completion Date,Facility Name,Facility Owner,Operation Type,Permit Number,Project Type,School Address,School Level,School Name,...,Total LF,Total SF,Total CF,Publication Name,GPS Location_x,Street Address_x,Year Opened,Street Address_y,GPS Location_y,GPS Location_coalesced


Break out latitude and longitude into their own columns.

In [36]:
def get_lat_long_columns(gps_location_column):
  """Return (lat, long) as floats from a 'lat, long' string or a [lat, long] list."""
  if isinstance(gps_location_column, str):
    gps_location_column = gps_location_column.split(', ')
  return float(gps_location_column[0]), float(gps_location_column[1])

df_with_addresses_all['lat'], df_with_addresses_all['long'] = zip(*df_with_addresses_all['GPS Location_coalesced'].map(get_lat_long_columns))
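
A standalone sketch of the same parsing step, showing that both input shapes yield the same kind of result. The name `split_gps` is hypothetical. Unlike the helper above, it splits on a bare comma, which also tolerates strings without a space after the comma.

```python
def split_gps(value):
    """Return (lat, long) floats from a 'lat, long' string or a two-item list."""
    if isinstance(value, str):
        value = value.split(",")
    lat, lon = (float(part) for part in value)   # float() ignores surrounding spaces
    return lat, lon

from_string = split_gps("40.04113187, -75.14211131")
from_list = split_gps([39.989027, -75.16974])
```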

In [37]:
# Count of projects per school
df_with_addresses_all.groupby(['School Name', 'lat', 'long']).size()

School Name                                          lat        long      
A. Philip Randolph Career and Technical High School  40.008504  -75.179609     1
A.L. Fitzpatrick School                              40.080306  -74.976529    12
Abraham Lincoln High School                          40.043108  -75.044792     6
Abram S. Jenks School                                39.918773  -75.168367    11
Academy at Palumbo                                   39.940198  -75.161949    11
                                                                              ..
William McKinley School                              39.982769  -75.141655     3
William Rowen School                                 40.059307  -75.148647    13
William T. Tilden School                             39.920936  -75.232276     5
William W. Bodine High School                        39.967930  -75.143434    15
Woodrow Wilson School                                40.052367  -75.069086    11
Length: 214, dtype: int64

In [38]:
# Count of projects by type per school
df_with_addresses_all.groupby(['School Name', 'lat', 'long', 'Project Type']).size()

School Name                                          lat        long        Project Type              
A. Philip Randolph Career and Technical High School  40.008504  -75.179609  Non Friable Removal           1
A.L. Fitzpatrick School                              40.080306  -74.976529  Incidental Removal Project    4
                                                                            Major Removal Project         1
                                                                            Minor Removal Project         4
                                                                            Non Friable Removal           3
                                                                                                         ..
William W. Bodine High School                        39.967930  -75.143434  Small Removal Project         1
Woodrow Wilson School                                40.052367  -75.069086  Incidental Removal Project    2
                                 

In [39]:
# Total linear footage of projects per school
df_with_addresses_all.groupby(['School Name', 'lat', 'long'])['Total LF'].sum()

School Name                                          lat        long      
A. Philip Randolph Career and Technical High School  40.008504  -75.179609      0
A.L. Fitzpatrick School                              40.080306  -74.976529    121
Abraham Lincoln High School                          40.043108  -75.044792     41
Abram S. Jenks School                                39.918773  -75.168367    159
Academy at Palumbo                                   39.940198  -75.161949    124
                                                                             ... 
William McKinley School                              39.982769  -75.141655      0
William Rowen School                                 40.059307  -75.148647    861
William T. Tilden School                             39.920936  -75.232276    312
William W. Bodine High School                        39.967930  -75.143434     96
Woodrow Wilson School                                40.052367  -75.069086     60
Name: Total LF, Length:

In [40]:
# Total square footage of projects per school
df_with_addresses_all.groupby(['School Name', 'lat', 'long'])['Total SF'].sum()

School Name                                          lat        long      
A. Philip Randolph Career and Technical High School  40.008504  -75.179609     250
A.L. Fitzpatrick School                              40.080306  -74.976529     725
Abraham Lincoln High School                          40.043108  -75.044792       0
Abram S. Jenks School                                39.918773  -75.168367      57
Academy at Palumbo                                   39.940198  -75.161949    1564
                                                                              ... 
William McKinley School                              39.982769  -75.141655       0
William Rowen School                                 40.059307  -75.148647     528
William T. Tilden School                             39.920936  -75.232276       0
William W. Bodine High School                        39.967930  -75.143434    1751
Woodrow Wilson School                                40.052367  -75.069086     191
Name: Total 

Add in information about 2021-2022 school year budgets.

In [41]:
district_school_budgets = pd.read_csv('data/district_school_budgets.csv')

In [42]:
budget_rows = district_school_budgets.values
budget_df = pd.DataFrame.from_records(budget_rows[1:], columns=['School Name', 'Organization Code', 'Economically Disadvantaged Percentage', 'Enrollment (FY 22 Projected)', 'FY 22 Budget (Total Positions)', 'FY 22 Budget (Contracts, etc.)'])

In [43]:
budget_df.head()

Unnamed: 0,School Name,Organization Code,Economically Disadvantaged Percentage,Enrollment (FY 22 Projected),FY 22 Budget (Total Positions),"FY 22 Budget (Contracts, etc.)"
0,A.L. Fitzpatrick School,8390.0,60.42,759.0,7448260.0,195924.0
1,Abraham Lincoln High School,8010.0,67.82,1996.0,23048260.0,292120.0
2,Abram S. Jenks School,2520.0,71.03,288.0,3484980.0,53305.0
3,Academy at Palumbo,2620.0,52.83,1115.0,7860860.0,297961.0
4,Academy for the Middle Years at Northwest,6480.0,71.01,280.0,3093600.0,124943.0


In [44]:
df_with_addresses_all_budget = df_with_addresses_all.merge(budget_df, on='School Name', how='left')

In [45]:
import numpy as np
df_with_addresses_all_budget = df_with_addresses_all_budget.replace(r'^\s*$', np.nan, regex=True)

Add a column dividing the total positions budget by projected enrollment to get a rough budgeted amount per student.

In [46]:
# Convert both columns to numbers; missing values become NaN, so the ratio is NaN for schools without budget data
positions_budget = pd.to_numeric(df_with_addresses_all_budget['FY 22 Budget (Total Positions)'])
projected_enrollment = pd.to_numeric(df_with_addresses_all_budget['Enrollment (FY 22 Projected)'])
df_with_addresses_all_budget['FY 22 Budget (Total Positions) Per Proj Enrolled Student'] = positions_budget / projected_enrollment
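
A toy sketch of the blank-to-NaN cleanup followed by the per-student division. The Fitzpatrick figures are the ones shown in `budget_df.head()`; the second row is a hypothetical school with blank cells, and storing the values as strings is an assumption about how they arrive.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "School Name": ["A.L. Fitzpatrick School", "Hypothetical School"],
    "FY 22 Budget (Total Positions)": ["7448260", "  "],
    "Enrollment (FY 22 Projected)": ["759", ""],
})

# Whitespace-only cells become NaN, then the text columns are converted to numbers.
toy = toy.replace(r"^\s*$", np.nan, regex=True)
positions = pd.to_numeric(toy["FY 22 Budget (Total Positions)"])
enrollment = pd.to_numeric(toy["Enrollment (FY 22 Projected)"])

per_student = positions / enrollment   # NaN wherever either input is missing
```

The first value comes to roughly 9813.25, in line with the per-student figure for Fitzpatrick in the grouped output.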

In [47]:
# Count of projects per school with EDP and Budget per student
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened', 'lat', 'long']).size()

School Name                                Economically Disadvantaged Percentage  FY 22 Budget (Total Positions) Per Proj Enrolled Student  Year Opened  lat        long      
A.L. Fitzpatrick School                    60.42                                  9813.254282                                               1960.0       40.080306  -74.976529    12
Abraham Lincoln High School                67.82                                  11547.224449                                              1950.0       40.043108  -75.044792     6
Abram S. Jenks School                      71.03                                  12100.625000                                              1897.0       39.918773  -75.168367    11
Academy at Palumbo                         52.83                                  7050.098655                                               2006.0       39.940198  -75.161949    11
Academy for the Middle Years at Northwest  71.01                                  11048.571429       
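
A. Philip Randolph Career and Technical High School leads the earlier per-school outputs but is absent here. One likely cause is that `groupby` drops any group whose key contains NaN by default, so schools with no budget row or no `Year Opened` would vanish from these tables. A toy sketch with made-up data:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "School Name": ["A", "A", "B"],
    "Year Opened": [1960.0, 1960.0, np.nan],   # school B has no opening year
})

# Default behavior: the group for B is silently dropped.
default_counts = toy.groupby(["School Name", "Year Opened"]).size()

# dropna=False keeps groups whose key includes NaN.
all_counts = toy.groupby(["School Name", "Year Opened"], dropna=False).size()
```

Passing `dropna=False` (available in pandas 1.1 and later) would keep those schools in the exported CSVs, at the cost of NaN values in the key columns.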

In [48]:
# Count of projects by type per school with EDP and Budget per student
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened', 'lat', 'long', 'Project Type']).size()

School Name                    Economically Disadvantaged Percentage  FY 22 Budget (Total Positions) Per Proj Enrolled Student  Year Opened  lat        long        Project Type              
A.L. Fitzpatrick School        60.42                                  9813.254282                                               1960.0       40.080306  -74.976529  Incidental Removal Project    4
                                                                                                                                                                    Major Removal Project         1
                                                                                                                                                                    Minor Removal Project         4
                                                                                                                                                                    Non Friable Removal           3
Abraham Lincoln High Scho

In [49]:
# Total linear footage of projects per school with EDP and Budget per student
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened','lat', 'long'])['Total LF'].sum()

School Name                                Economically Disadvantaged Percentage  FY 22 Budget (Total Positions) Per Proj Enrolled Student  Year Opened  lat        long      
A.L. Fitzpatrick School                    60.42                                  9813.254282                                               1960.0       40.080306  -74.976529    121
Abraham Lincoln High School                67.82                                  11547.224449                                              1950.0       40.043108  -75.044792     41
Abram S. Jenks School                      71.03                                  12100.625000                                              1897.0       39.918773  -75.168367    159
Academy at Palumbo                         52.83                                  7050.098655                                               2006.0       39.940198  -75.161949    124
Academy for the Middle Years at Northwest  71.01                                  11048.571429   

In [51]:
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened','lat', 'long'])['Total LF'].sum().to_csv('asbestos_lf_per_school.csv')

In [52]:
# Total square footage of projects per school with EDP and Budget per student
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened', 'lat', 'long'])['Total SF'].sum()

School Name                                Economically Disadvantaged Percentage  FY 22 Budget (Total Positions) Per Proj Enrolled Student  Year Opened  lat        long      
A.L. Fitzpatrick School                    60.42                                  9813.254282                                               1960.0       40.080306  -74.976529     725
Abraham Lincoln High School                67.82                                  11547.224449                                              1950.0       40.043108  -75.044792       0
Abram S. Jenks School                      71.03                                  12100.625000                                              1897.0       39.918773  -75.168367      57
Academy at Palumbo                         52.83                                  7050.098655                                               2006.0       39.940198  -75.161949    1564
Academy for the Middle Years at Northwest  71.01                                  11048.57142

In [53]:
df_with_addresses_all_budget.groupby(['School Name', 'Economically Disadvantaged Percentage', 'FY 22 Budget (Total Positions) Per Proj Enrolled Student', 'Year Opened', 'lat', 'long'])['Total SF'].sum().to_csv('asbestos_sf_per_school.csv')