# Campus ratings

## A scripted lesson

This notebook takes my [Excel pivot table lesson](https://docs.google.com/document/d/1PRM1ozgbqkq69ZwpRue1ttho-FCHeKKR7Thybz6AAak/edit#heading=h.h6x8isam3qkn) and scripts it using agate.

## About the data

The Texas Education Agency rates public schools based on test scores and other factors. This lesson is based on the 2017 ratings released August 15, 2017. The file we are using, 2017-school-ratings.csv, is a version of the [downloadable data](https://rptsvr1.tea.texas.gov/perfreport/account/2017/download.html), but some processing has been done to cut down and rename the columns we need.

(This data set is actually a bit contrived for the purposes of learning a couple of skills, so perhaps we can come back later and add the preprocessing to this script.)


## Goal

We want to answer three questions with this data:
- What percentage of charter schools received an "Improvement Required" rating, compared to traditional public schools?
- Which schools in Austin ISD received an "Improvement Required" rating?
- Which schools in Region 13 received an "Improvement Required" rating?
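
The two "which schools" questions reduce to row filters: match the district or region, then keep the rows whose rating code is `I`. The sketch below shows that logic in plain Python rather than agate. The column names come from this data set, but the rows are invented, and the exact spelling `AUSTIN ISD` and the helper name `needs_improvement` are assumptions.

```python
import csv
import io

# Invented sample rows that reuse the column names from this data set.
sample = io.StringIO(
    "DISTNAME,CAMPNAME,REGION,C_RATING\n"
    "AUSTIN ISD,EXAMPLE EL,13,I\n"
    "AUSTIN ISD,SAMPLE MIDDLE,13,M\n"
    "OTHER ISD,DEMO H S,13,I\n"
    "FAR ISD,TEST EL,20,I\n"
)
rows = list(csv.DictReader(sample))

def needs_improvement(row):
    """True when the campus rating code is 'I'."""
    return row['C_RATING'].strip() == 'I'

# Filter by district name, then by region (kept as text, not a number).
austin = [r['CAMPNAME'] for r in rows
          if r['DISTNAME'] == 'AUSTIN ISD' and needs_improvement(r)]
region_13 = [r['CAMPNAME'] for r in rows
             if r['REGION'] == '13' and needs_improvement(r)]

print(austin)     # ['EXAMPLE EL']
print(region_13)  # ['EXAMPLE EL', 'DEMO H S']
```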


In [1]:
import agate

In [5]:
# set column types: read the ID columns as text so they keep their leading zeros
specified_types = {
    'District_ID': agate.Text(),
    'Campus_ID': agate.Text(),
    'REGION': agate.Text(),
}

# create table named raw from csv
raw = agate.Table.from_csv('../data/2017-school-ratings.csv', column_types=specified_types)
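
The ID columns are forced to text because they are zero-padded and the later steps pick out a character by position. If type inference read them as numbers, the padding would likely be lost. A minimal plain-Python sketch of what goes wrong in that case, using a campus ID that appears in this data (the variable names are illustrative):

```python
# A campus ID as it appears in the CSV: nine characters, zero-padded.
campus_id = '003801001'

# Read as a number, the leading zeros are lost.
as_number = int(campus_id)
round_tripped = str(as_number)

print(round_tripped)     # 3801001
print(campus_id[3])      # 8
print(round_tripped[3])  # 1
```

The 4th character is only meaningful while the ID keeps all nine characters.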

In [6]:
# print the column names and types
print(raw)

| column      | data_type |
| ----------- | --------- |
| DISTNAME    | Text      |
| CAMPNAME    | Text      |
| District_ID | Text      |
| Campus_ID   | Text      |
| REGION      | Text      |
| CFLALTED    | Boolean   |
| C_RATING    | Text      |
| CI1         | Number    |
| CI1_CUT     | Number    |
| CI1_MET     | Text      |
| CI2         | Number    |
| CI2_CUT     | Number    |
| CI2_MET     | Text      |
| CI3         | Number    |
| CI3_CUT     | Number    |
| CI3_MET     | Text      |
| CI4         | Number    |
| CI4_CUT     | Number    |
| CI4_MET     | Text      |



In [13]:
# show that we can pick out charter schools
# .where is a method that keeps only the rows for which a test returns True
# we are feeding .where a test:
#  For each row, look at the 4th character of 'Campus_ID' (index 3). If it is '8', the row is kept.
#  If it is not '8', the row is skipped.
raw.where(lambda row: row['Campus_ID'][3] == '8').limit(5).print_table()

| DISTNAME             | CAMPNAME             | District_ID | Campus_ID | REGION | CFLALTED | ... |
| -------------------- | -------------------- | ----------- | --------- | ------ | -------- | --- |
| PINEYWOODS COMMUN... | PINEYWOODS COMMUN... | 003801      | 003801001 | 7      |    False | ... |
| PINEYWOODS COMMUN... | DR TERRY ROBBINS ... | 003801      | 003801042 | 7      |    False | ... |
| PINEYWOODS COMMUN... | SARAH STRINDEN EL    | 003801      | 003801103 | 7      |    False | ... |
| ST MARY'S ACADEMY... | ST MARY'S ACADEMY... | 013801      | 013801101 | 2      |    False | ... |
| RICHARD MILBURN A... | RICHARD MILBURN A... | 014801      | 014801001 | 20     |     True | ... |
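
The test handed to `.where` is an ordinary function of one row that returns True or False. The same check can be sketched without agate as a list comprehension. The IDs below are copied from outputs in this notebook, and `is_charter` is a hypothetical helper name.

```python
def is_charter(campus_id):
    """Return True when the 4th character (index 3) of the ID is '8'."""
    return len(campus_id) > 3 and campus_id[3] == '8'

campus_ids = ['003801001', '001902001', '014801001', '001903041']

# Keep only the IDs that pass the test, as .where does with rows.
charters = [cid for cid in campus_ids if is_charter(cid)]
print(charters)  # ['003801001', '014801001']
```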


In [48]:
# this is a function for the .compute method below
# it checks whether the value sent to it is '8' and, if
# so, returns 'Charter'. If not, it returns 'Not charter'.
def set_charter(value):
    if value == '8':
        return 'Charter'
    else:
        return 'Not charter'

# We are creating a new column called 'charter'. To get the value to insert
# for each row, we are feeding the 4th position of the 'Campus_ID' column
# to the set_charter function, which is telling us what to put in, either
# 'Charter' or 'Not charter'
charter_set = raw.compute([
    ('charter', agate.Formula(
        agate.Text(),
        lambda r: set_charter(r['Campus_ID'][3])
    ))
])

In [55]:
# peek at charter records
charter_set.select([
        'Campus_ID',
        'charter'
    ]).where(lambda row: row['Campus_ID'][3] == '8').limit(5).print_table()

| Campus_ID | charter |
| --------- | ------- |
| 003801001 | Charter |
| 003801042 | Charter |
| 003801103 | Charter |
| 013801101 | Charter |
| 014801001 | Charter |


In [56]:
# peek at non-charter records
charter_set.select([
        'Campus_ID',
        'charter'
    ]).where(lambda row: row['Campus_ID'][3] != '8').limit(5).print_table()

| Campus_ID | charter     |
| --------- | ----------- |
| 001902001 | Not charter |
| 001902041 | Not charter |
| 001902103 | Not charter |
| 001903001 | Not charter |
| 001903041 | Not charter |
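
With a charter label on every row, the first goal becomes a share within each group: campuses rated `I` divided by all campuses in that group. A minimal sketch in plain Python, using made-up rows (not real ratings) and a hypothetical `improvement_share` helper:

```python
from collections import Counter

# (charter label, C_RATING code) pairs, invented for illustration only.
rows = [
    ('Charter', 'I'),
    ('Charter', 'M'),
    ('Charter', 'M'),
    ('Charter', 'A'),
    ('Not charter', 'I'),
    ('Not charter', 'M'),
    ('Not charter', 'M'),
    ('Not charter', 'M'),
    ('Not charter', 'M'),
]

def improvement_share(rows):
    """Percent of campuses in each group whose rating code is 'I'."""
    totals = Counter(label for label, _ in rows)
    flagged = Counter(label for label, code in rows if code == 'I')
    return {label: 100 * flagged[label] / totals[label] for label in totals}

shares = improvement_share(rows)
print(shares)  # {'Charter': 25.0, 'Not charter': 20.0}
```

In agate, the underlying counts could presumably come from a pivot or group-by on the two columns. The arithmetic on top of those counts is the same.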


## Create column of explained ratings

In [27]:
# These are the values for the rating.
# The C_RATING code is on the left, its definition is on the right:
# M=Met Standard, A=Met Alternative Standard, I=Improvement Required,
# X/Z=Not Rated, T=Not Rated: Annexation
# Note: 'T' maps to 'Not Rated' (capital R), which is a different value from 'Not rated'.
rating_values = {
    'I': 'Improvement required',
    'M': 'Met standard',
    'A': 'Met alternative standard',
    'X': 'Not rated',
    'Z': 'Not rated',
    'T': 'Not Rated',
    '': 'Not rated',
}

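def map_rating(rating):
    # treat a missing rating as an empty string, which maps to 'Not rated'
    rating = (rating or '').strip()
    return rating_values[rating]

rated = raw.compute([
  ('mapped_rating',
   agate.Formula(agate.Text(),
   lambda r: map_rating(r['C_RATING']))
  )
])

# peek at the new column
rated.select(['mapped_rating']).print_table()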

| mapped_rating |
| ------------- |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| Met standard  |
| ...           |


In [31]:
rated.columns['C_RATING'].values_distinct()

('A', 'I', 'X', 'Z', 'T', 'M')

In [30]:
rated.columns['mapped_rating'].values_distinct()

('Met standard',
 'Not Rated',
 'Improvement required',
 'Met alternative standard',
 'Not rated')
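
The distinct values include both 'Not Rated' and 'Not rated', which any later count would treat as two separate categories. A small plain-Python check for labels that differ only by capitalization (the function name `case_variants` is hypothetical; the labels are the ones printed above):

```python
from collections import defaultdict

def case_variants(labels):
    """Group labels that are identical once case is ignored."""
    groups = defaultdict(set)
    for label in labels:
        groups[label.casefold()].add(label)
    # Keep only the groups with more than one spelling.
    return {key: sorted(spellings)
            for key, spellings in groups.items() if len(spellings) > 1}

labels = ('Met standard', 'Not Rated', 'Improvement required',
          'Met alternative standard', 'Not rated')

print(case_variants(labels))  # {'not rated': ['Not Rated', 'Not rated']}
```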

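In [25]:
# leftover check from an earlier run: `test` presumably held the tuple that
# values_distinct() returns, not an agate Table, so it has no print_table method
test.print_table()

AttributeError: 'tuple' object has no attribute 'print_table'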