## 2021 Best School Districts in the Chicago Area
### Web Scraping from Niche
- https://www.niche.com/k12/search/best-school-districts/m/chicago-metro-area/

Niche's 2021 Best School Districts ranking is based on rigorous analysis of key statistics from the U.S. Department of Education and millions of reviews from students and parents. Ranking factors include state test scores, college readiness, graduation rates, SAT/ACT scores, teacher quality, public school district ratings, and more.

## Data Sources
### K-12 Schools and Districts
- U.S. Department of Education K-12 data on graduation rates and state-level test scores.
- Private School Universe Survey (PSS) from the National Center for Education Statistics (NCES)
  Source for the list of private schools and their information, such as enrollment figures.
- Common Core of Data (CCD) from the National Center for Education Statistics (NCES)
  Source for the list of schools and school districts and their information, such as enrollment figures.
- Common Core of Data (CCD) School District Finance Survey (F-33) from the National Center for Education Statistics (NCES)
  School district finance data.
- Civil Rights Data Collection
  K-12 data on AP/IB classes, disciplinary actions, athletics, etc.
- School Attendance Boundary Survey (SABS) from the National Center for Education Statistics (NCES)
  Source for school boundaries.
- Niche K-12 Student and Parent Surveys
  Survey administered to millions of parents, high school students, and recent alumni on Niche.com.

In [3]:
from splinter import Browser
from bs4 import BeautifulSoup
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [4]:
#Set up splinter
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)



Current google-chrome version is 93.0.4577
Get LATEST driver version for 93.0.4577
Driver [/Users/zzj8988/.wdm/drivers/chromedriver/mac64/93.0.4577.63/chromedriver] found in cache


In [5]:
# 2021 Best School Districts in the Chicago Area
url = 'https://www.niche.com/k12/search/best-school-districts/m/chicago-metro-area/'

browser.visit(url)

In [16]:
district_info = []

In [17]:
#Only the first 2 pages show a ranking badge; the remaining eight pages do not.
#browser.visit(url) has already loaded page 1, so the loop starts scraping there.
for x in range(1,3):
    
    html = browser.html
    soup = BeautifulSoup(html,'html.parser')
    
    districts = soup.find_all('div',class_='card__inner')
    print(f'----------------------------------Page {x}---------------------------------------------')  
    for district in districts:
        try:
            #district name
            name = district.h2.text
            #district rating
            rating = district.find('div',class_='search-result-badge').text
            #city and state, the second item in the unordered list
            city = district.ul.find_all('li')[1].text.strip()
                
            #grade, number, ratio info
            district_fact = district.find('ul',class_='search-result-fact-list').find_all('li')
            #Overall Niche Grade
            grade = district_fact[0].find('div',class_='niche__grade').text
            #total number of school
            number_school = district_fact[1].find('span').text
            #total students number
            number_student = district_fact[2].find('span').text
            info = [name,rating,city,grade,number_school,number_student]
            district_info.append(info)
                
            print(name)
            print(rating)
            print(city)
            print('-'*50)
            
        except (AttributeError, IndexError):
            #skip cards that are missing any of the expected tags
            continue
    
    #go to the next page (the target is a CSS selector, so use find_by_css)
    target = 'span[class="icon-arrowright-thin--pagination"]'
    browser.find_by_css(target).click()

----------------------------------Page 1---------------------------------------------
Adlai E. Stevenson High School District No. 125
#1 Best School Districts in Chicago Area
LINCOLNSHIRE, IL
--------------------------------------------------
Community High School District 128
#2 Best School Districts in Chicago Area
VERNON HILLS, IL
--------------------------------------------------
New Trier Township High School District No. 203
#3 Best School Districts in Chicago Area
NORTHFIELD, IL
--------------------------------------------------
Glenbrook High Schools District 225
#4 Best School Districts in Chicago Area
GLENVIEW, IL
--------------------------------------------------
Township High School District No. 113
#5 Best School Districts in Chicago Area
HIGHLAND PARK, IL
--------------------------------------------------
Hinsdale Township High School District No. 86
#6 Best School Districts in Chicago Area
HINSDALE, IL
--------------------------------------------------
Naperville Communi
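
The loop above relies on `try`/`except` to skip result cards that lack one of the expected tags. As an alternative, here is a minimal sketch that checks each lookup explicitly and returns `None` for incomplete cards. The `SAMPLE_HTML` markup, the `parse_card` helper, and the district values in it are hypothetical; only the class names come from the scraping code, and the real Niche page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names are taken from the scraping code.
SAMPLE_HTML = """
<div class="card__inner">
  <div class="search-result-badge">#1 Best School Districts in Chicago Area</div>
  <h2>Example District A</h2>
  <ul>
    <li>School District</li>
    <li> SPRINGFIELD, IL </li>
  </ul>
  <ul class="search-result-fact-list">
    <li><div class="niche__grade">grade A+</div></li>
    <li><span>2</span></li>
    <li><span>4271</span></li>
  </ul>
</div>
<div class="card__inner">
  <h2>Card without a ranking badge</h2>
</div>
"""


def parse_card(card):
    """Return one record for a result card, or None if a required tag is missing."""
    badge = card.find('div', class_='search-result-badge')
    facts = card.find('ul', class_='search-result-fact-list')
    if card.h2 is None or card.ul is None or badge is None or facts is None:
        return None

    location_items = card.ul.find_all('li')
    fact_items = facts.find_all('li')
    if len(location_items) < 2 or len(fact_items) < 3:
        return None

    grade = fact_items[0].find('div', class_='niche__grade')
    number_school = fact_items[1].find('span')
    number_student = fact_items[2].find('span')
    if grade is None or number_school is None or number_student is None:
        return None

    return {
        'name': card.h2.text.strip(),
        'rating': badge.text.strip(),
        'city': location_items[1].text.strip(),
        'niche_grade': grade.text.strip(),
        'number_school': number_school.text.strip(),
        'number_student': number_student.text.strip(),
    }


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
cards = soup.find_all('div', class_='card__inner')
records = [record for record in map(parse_card, cards) if record is not None]
print(records)
```

Returning `None` keeps the skip decision visible at each step, which can make it easier to see why a card was dropped than a blanket exception handler.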

In [22]:
browser.quit()

In [18]:
len(district_info)

50

In [19]:
district_df = pd.DataFrame(district_info,columns = ['name','rating','city','niche_grade','number_school','number_student'])
district_df

| | name | rating | city | niche_grade | number_school | number_student |
|---|---|---|---|---|---|---|
| 0 | Adlai E. Stevenson High School District No. 125 | #1 Best School Districts in Chicago Area | LINCOLNSHIRE, IL | grade A+ | 2 | 4271 |
| 1 | Community High School District 128 | #2 Best School Districts in Chicago Area | VERNON HILLS, IL | grade A+ | 2 | 3287 |
| 2 | New Trier Township High School District No. 203 | #3 Best School Districts in Chicago Area | NORTHFIELD, IL | grade A+ | 2 | 4040 |
| 3 | Glenbrook High Schools District 225 | #4 Best School Districts in Chicago Area | GLENVIEW, IL | grade A+ | 4 | 5201 |
| 4 | Township High School District No. 113 | #5 Best School Districts in Chicago Area | HIGHLAND PARK, IL | grade A+ | 2 | 3467 |
| 5 | Hinsdale Township High School District No. 86 | #6 Best School Districts in Chicago Area | HINSDALE, IL | grade A+ | 3 | 4146 |
| 6 | Naperville Community Unit School District No. 203 | #7 Best School Districts in Chicago Area | NAPERVILLE, IL | grade A+ | 22 | 16586 |
| 7 | Niles Township Community High School District ... | #8 Best School Districts in Chicago Area | SKOKIE, IL | grade A+ | 3 | 4592 |
| 8 | Barrington Community Unit School District No. 220 | #9 Best School Districts in Chicago Area | BARRINGTON, IL | grade A+ | 12 | 8557 |
| 9 | Township High School District No. 211 | #10 Best School Districts in Chicago Area | PALATINE, IL | grade A+ | 7 | 11855 |
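
The scraped fields are text such as `#1 Best School Districts in Chicago Area` and `grade A+`, which are awkward to sort or compute on. The following is a minimal sketch of one way to turn a row of `district_info` into typed values. The `clean_row` helper is hypothetical, and it assumes every field was scraped as a string, possibly with thousands separators in the counts. The two sample rows are copied from the output above.

```python
import re


def clean_row(row):
    """Convert one [name, rating, city, grade, schools, students] row to typed values."""
    name, rating, city, grade, number_school, number_student = row

    # "#10 Best School Districts ..." -> 10
    rank_match = re.match(r'#(\d+)', rating)

    # "PALATINE, IL" -> ("PALATINE", "IL"); without ", " the whole string lands in state
    town, _, state = city.rpartition(', ')

    return {
        'name': name,
        'rank': int(rank_match.group(1)) if rank_match else None,
        'city': town,
        'state': state,
        'niche_grade': grade.replace('grade ', '', 1),
        'number_school': int(number_school.replace(',', '')),
        'number_student': int(number_student.replace(',', '')),
    }


sample_rows = [
    ['Adlai E. Stevenson High School District No. 125',
     '#1 Best School Districts in Chicago Area', 'LINCOLNSHIRE, IL', 'grade A+', '2', '4271'],
    ['Township High School District No. 211',
     '#10 Best School Districts in Chicago Area', 'PALATINE, IL', 'grade A+', '7', '11855'],
]

cleaned = [clean_row(row) for row in sample_rows]
for record in cleaned:
    print(record['rank'], record['city'], record['state'], record['niche_grade'])
```

The resulting list of dictionaries can be passed straight to `pd.DataFrame(cleaned)` if numeric columns are wanted.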


In [21]:
district_df.groupby('city').agg(Count=('city','count')).sort_values('Count', ascending=False)

| city | Count |
|---|---|
| VALPARAISO, IN | 2 |
| ALGONQUIN, IL | 2 |
| ADDISON, IL | 1 |
| PARK RIDGE, IL | 1 |
| MCHENRY, IL | 1 |
| MUNSTER, IN | 1 |
| NAPERVILLE, IL | 1 |
| NEW LENOX, IL | 1 |
| NORTHFIELD, IL | 1 |
| OAK LAWN, IL | 1 |
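
The per-city tally can also be produced with `value_counts`, and because the `city` strings end in a state code, the same call gives a per-state tally. This is a sketch on a toy sample: the city strings are taken from the output above, but `demo_df` is a hypothetical seven-row frame, not the full 50-row result, so the counts are illustrative only.

```python
import pandas as pd

# Toy sample, not the full scraped result.
demo_df = pd.DataFrame({'city': [
    'VALPARAISO, IN', 'VALPARAISO, IN', 'ALGONQUIN, IL',
    'ALGONQUIN, IL', 'MUNSTER, IN', 'NAPERVILLE, IL', 'ADDISON, IL',
]})

# Districts per city, sorted from most to fewest.
city_counts = demo_df['city'].value_counts()

# Split once from the right on ", " and keep the last piece (the state code).
state_counts = demo_df['city'].str.rsplit(', ', n=1).str[-1].value_counts()

print(city_counts)
print(state_counts)
```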


In [20]:
district_df.to_csv('top50_school_district.csv')
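
By default `to_csv` also writes the DataFrame index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. The sketch below shows the difference that `index=False` makes. It uses a hypothetical two-row `demo_df` and in-memory buffers instead of files, so it does not touch `top50_school_district.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for district_df.
demo_df = pd.DataFrame({'name': ['District A', 'District B'], 'number_school': [2, 4]})

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
demo_df.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# index=False: only the data columns are written.
without_index = io.StringIO()
demo_df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

Whether to keep the index depends on how the file is used later. Here the index only repeats the row position, so dropping it is a reasonable choice.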