## Starter Code for Assignment 2 - Big Data with H2O

**Regression**

This notebook provides starter code for Assignment 2 - Big Data with H2O. It is not a complete solution: you will still need to extend it.  


[H2O Docs](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/index.html)  

In [1]:
import h2o
from h2o.automl import H2OAutoML
import random, os, sys
from datetime import datetime
import pandas as pd
import logging
import csv
import optparse
import time
import json
from distutils.util import strtobool

Set up some parameters for the analysis.  

In [2]:
data_path=None
all_variables=None
test_path=None
target=None
nthreads=1 
min_mem_size=6 
run_time=333
classification=False
scale=False
max_models=9    
model_path=None
balance_y=False 
balance_threshold=0.2
name=None 
server_path=None  
analysis=0 
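
One way to keep these settings together is to group them in a small dataclass, which also gives a single place for sanity checks. The sketch below is only an illustration: `RunConfig` and its `validate` method are hypothetical names that do not appear in the starter code, and the checks are assumptions about which values make sense.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RunConfig:
    """Hypothetical container mirroring some of the parameters set above."""
    data_path: Optional[str] = None
    target: Optional[str] = None
    nthreads: int = 1
    min_mem_size: int = 6        # GB, passed to h2o.init in the notebook
    run_time: int = 333          # assumed to be seconds
    classification: bool = False
    scale: bool = False
    max_models: int = 9
    balance_y: bool = False
    balance_threshold: float = 0.2

    def validate(self):
        # Assumed sanity checks; adjust them to the assignment's needs.
        if self.run_time <= 0:
            raise ValueError("run_time must be positive")
        if self.max_models <= 0:
            raise ValueError("max_models must be positive")
        if not 0.0 <= self.balance_threshold <= 1.0:
            raise ValueError("balance_threshold must be in [0, 1]")
        return self


config = RunConfig().validate()
print(asdict(config))
```

Keeping the defaults in one object makes it easier to log exactly which settings produced a given run.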

The next section contains helper functions for automating the analysis.

In [3]:
# Functions

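def alphabet(n):
  # return a random alphanumeric string of length n (used as the run ID)
  alpha='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'    
  s=''   # named s rather than str, to avoid shadowing the built-in
  r=len(alpha)-1   
  while len(s)<n:
    i=random.randint(0,r)
    s+=alpha[i]   
  return s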
  
  
def set_meta_data(run_id,analysis,target,run_time,classification,scale,model,balance,balance_threshold,name,nthreads,min_mem_size):
  m_data={}
  m_data['run_id'] =run_id
  m_data['start_time'] = time.time()
  m_data['target']=target
  m_data['max_models']=model
  m_data['run_time']=run_time
  m_data['scale']=scale
  m_data['classification']=classification
  m_data['balance']=balance
  m_data['balance_threshold']=balance_threshold
  m_data['project'] =name
  m_data['end_time'] = time.time()
  m_data['execution_time'] = 0.0
  m_data['nthreads'] = nthreads
  m_data['min_mem_size'] = min_mem_size
  m_data['analysis'] = analysis
  return m_data


def dict_to_json(dct,n):
  # write dct to file n as indented JSON; the with-block closes the file even on error
  j = json.dumps(dct, indent=4)
  with open(n, 'w') as f:
    print(j, file=f)
  
  
def stackedensemble(mod):
    coef_norm=None
    try:
      metalearner = h2o.get_model(mod.metalearner()['name'])
      coef_norm=metalearner.coef_norm()
    except:
      pass        
    return coef_norm

def stackedensemble_df(df):
    bm_algo={ 'GBM': None,'GLM': None,'DRF': None,'XRT': None,'Dee': None}
    for index, row in df.iterrows():
      if len(row['model_id'])>3:
        key=row['model_id'][0:3]
        if key in bm_algo:
          if bm_algo[key] is None:
                bm_algo[key]=row['model_id']
    # keep only the families for which a model was found
    bm=[m for m in bm_algo.values() if m is not None]
    return bm

def se_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['auc']=modl.auc()   
    d['roc']=modl.roc()
    d['mse']=modl.mse()   
    d['null_degrees_of_freedom']=modl.null_degrees_of_freedom()
    d['null_deviance']=modl.null_deviance()
    d['residual_degrees_of_freedom']=modl.residual_degrees_of_freedom()   
    d['residual_deviance']=modl.residual_deviance()
    d['rmse']=modl.rmse()
    return d

def get_model_by_algo(algo,models_dict):
    mod=None
    mod_id=None    
    for m in list(models_dict.keys()):
        if m[0:3]==algo:
            mod_id=m
            mod=h2o.get_model(m)      
    return mod,mod_id     
    
    
def gbm_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    return d
    
    
def dl_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    return d
    
    
def drf_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    d['roc']=modl.roc()      
    return d
    
def xrt_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    d['roc']=modl.roc()      
    return d
    
    
def glm_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['coef']=modl.coef()  
    d['coef_norm']=modl.coef_norm()      
    return d
    
def model_performance_stats(perf):
    d={}
    try:    
      d['mse']=perf.mse()
    except:
      pass      
    try:    
      d['rmse']=perf.rmse() 
    except:
      pass      
    try:    
      d['null_degrees_of_freedom']=perf.null_degrees_of_freedom()
    except:
      pass      
    try:    
      d['residual_degrees_of_freedom']=perf.residual_degrees_of_freedom()
    except:
      pass      
    try:    
      d['residual_deviance']=perf.residual_deviance() 
    except:
      pass      
    try:    
      d['null_deviance']=perf.null_deviance() 
    except:
      pass      
    try:    
      d['aic']=perf.aic() 
    except:
      pass      
    try:
      d['logloss']=perf.logloss() 
    except:
      pass    
    try:
      d['auc']=perf.auc()
    except:
      pass  
    try:
      d['gini']=perf.gini()
    except:
      pass    
    return d
    
def impute_missing_values(df, x, scal=False):
    # determine column types
    ints, reals, enums = [], [], []
    for key, val in df.types.items():
        if key in x:
            if val == 'enum':
                enums.append(key)
            elif val == 'int':
                ints.append(key)            
            else: 
                reals.append(key)    
    _ = df[reals].impute(method='mean')
    _ = df[ints].impute(method='median')
    if scal:
        df[reals] = df[reals].scale()
        df[ints] = df[ints].scale()    
    return


def get_independent_variables(df, targ):
    C = [name for name in df.columns if name != targ]
    # determine column types
    ints, reals, enums = [], [], []
    for key, val in df.types.items():
        if key in C:
            if val == 'enum':
                enums.append(key)
            elif val == 'int':
                ints.append(key)            
            else: 
                reals.append(key)    
    x=ints+enums+reals
    return x
    
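def get_all_variables_csv(i):
    # read a two-row CSV (row 0: column names, row 1: column types) into a dict
    ivd={}
    try:
      iv = pd.read_csv(i,header=None)
    except Exception as e:
      print('Could not read',i,':',e)
      sys.exit(1)    
    col=iv.values.tolist()[0]
    dt=iv.values.tolist()[1]
    for c,t in zip(col,dt):
      ivd[c.strip()]=t.strip()
    return ivd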
    
    

def check_all_variables(df,dct,y=None):     
    targ=list(dct.keys())     
    for key, val in df.types.items():
        if key in targ:
          if dct[key] not in ['real','int','enum']:                      
            targ.remove(key)  
    for key, val in df.types.items():
        if key in targ:            
          if dct[key] != val:
            print('convert ',key,' ',dct[key],' ',val)
            if dct[key]=='enum':
                try:
                  df[key] = df[key].asfactor() 
                except:
                  targ.remove(key)                 
            if dct[key]=='int': 
                try:                
                  df[key] = df[key].asnumeric() 
                except:
                  targ.remove(key)                  
            if dct[key]=='real':
                try:                
                  df[key] = df[key].asnumeric()  
                except:
                  targ.remove(key)                  
    if y is None:
      y=df.columns[-1] 
    if y in targ:
      targ.remove(y)
    else:
      y=targ.pop()            
    return targ    
    
def predictions(mod,data,run_id):
    # score the model passed in as mod (not a global) on the file at data
    test = h2o.import_file(data)
    mod_perf=mod.model_performance(test)
              
    stats_test=model_performance_stats(mod_perf)

    n=run_id+'_test_stats.json'
    dict_to_json(stats_test,n) 

    try:    
      cf=mod_perf.confusion_matrix(metrics=["f1","f2","f0point5","accuracy","precision","recall","specificity","absolute_mcc","min_per_class_accuracy","mean_per_class_accuracy"])
      cf_df=cf[0].table.as_data_frame()
      cf_df.to_csv(run_id+'_test_confusion_matrix.csv')
    except:
      # no confusion matrix available (e.g. for a regression model)
      pass

    preds = mod.predict(test)
    predictions_df=test.cbind(preds).as_data_frame() 
    predictions_df.to_csv(run_id+'_predictions.csv')
    return

def predictions_test(mod,test,run_id):
    # same as predictions(), but test is an H2OFrame that is already loaded
    mod_perf=mod.model_performance(test)          
    stats_test=model_performance_stats(mod_perf)
    n=run_id+'_test_stats.json'
    dict_to_json(stats_test,n) 
    try:
      cf=mod_perf.confusion_matrix(metrics=["f1","f2","f0point5","accuracy","precision","recall","specificity","absolute_mcc","min_per_class_accuracy","mean_per_class_accuracy"])
      cf_df=cf[0].table.as_data_frame()
      cf_df.to_csv(run_id+'_test_confusion_matrix.csv')
    except:
      # no confusion matrix available (e.g. for a regression model)
      pass
    preds = mod.predict(test)    
    predictions_df=test.cbind(preds).as_data_frame() 
    predictions_df.to_csv(run_id+'_predictions.csv')
    return preds

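def check_X(x,df):
    # drop names that are not columns of df; rebuild the list in place instead of
    # calling x.remove() inside the loop, which skips elements
    x[:]=[name for name in x if name in df.columns]
    return x    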
    
    
def get_stacked_ensemble(lst):
    # prefer the BestOfFamily ensemble; fall back to the AllModels ensemble
    se=None
    for model in lst:
      if 'BestOfFamily' in model:
        se=model
    if se is None:     
      for model in lst:
        if 'AllModels' in model:
          se=model           
    return se       
    
def get_variables_types(df):
    d={}
    for key, val in df.types.items():
        d[key]=val           
    return d    
    
#  End Functions
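
To see what `stackedensemble_df` does without a running cluster, the same first-match-per-prefix logic can be tried on a plain list of model IDs. This is a sketch only: the IDs below are made up for illustration (real AutoML IDs will differ), and `first_model_per_family` is a hypothetical name.

```python
def first_model_per_family(model_ids, prefixes=("GBM", "GLM", "DRF", "XRT", "Dee")):
    """Return the first model ID seen for each three-letter prefix.

    Mirrors the prefix matching used by stackedensemble_df, but takes a
    plain list of IDs (assumed to be ordered best-first, like a leaderboard).
    """
    best = {p: None for p in prefixes}
    for model_id in model_ids:
        key = model_id[:3]
        if len(model_id) > 3 and key in best and best[key] is None:
            best[key] = model_id
    return [m for m in best.values() if m is not None]


# Made-up IDs for illustration only.
leaderboard = [
    "StackedEnsemble_AllModels_example",
    "GBM_example_1",
    "GBM_example_2",
    "DeepLearning_example",
    "GLM_example",
]
family_best = first_model_per_family(leaderboard)
print(family_best)
```

The result follows the order of the prefixes, not the order of the input list, because the dictionary is filled in prefix order.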

In [4]:
data_path='data/multiple_regression_school_revenue.csv'

In [5]:
data_path = os.path.join(os.path.abspath(os.curdir),data_path)

In [6]:
all_variables=None

In [7]:
run_id=alphabet(9)
# print run_id to stdout
print(run_id)

pNYSLwZc7
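
The run ID is just a random alphanumeric string, so it can also be produced with the standard library's `random.choices`. A minimal sketch, assuming the same 62-character alphabet as `alphabet`; `make_run_id` is a hypothetical name and not part of the starter code.

```python
import random
import string

# Digits, then lowercase, then uppercase: the same characters as in alphabet().
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def make_run_id(n=9, rng=random):
    """Return n random characters drawn from the 62-character alphabet."""
    return "".join(rng.choices(ALPHABET, k=n))


run_id_example = make_run_id()
print(run_id_example)
```

IDs generated this way are not guaranteed to be unique, although collisions are unlikely for a handful of runs.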


In [8]:
server_path=os.path.abspath(os.curdir)
os.chdir(server_path) 
run_dir = os.path.join(server_path,run_id)
os.mkdir(run_dir)
os.chdir(run_dir) 
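
`os.mkdir` raises `FileExistsError` if the directory already exists, which can happen when a cell is re-run with the same `run_id`. The sketch below shows a more forgiving variant using `os.makedirs(..., exist_ok=True)`. It works inside a temporary directory so it can be tried safely, and `make_run_dir` is a hypothetical helper name.

```python
import os
import tempfile


def make_run_dir(base_dir, run_id):
    """Create base_dir/run_id if needed and return its absolute path."""
    run_dir = os.path.join(os.path.abspath(base_dir), run_id)
    os.makedirs(run_dir, exist_ok=True)  # no error if it already exists
    return run_dir


with tempfile.TemporaryDirectory() as base:
    first = make_run_dir(base, "example_run")
    second = make_run_dir(base, "example_run")  # safe to call again
    created = os.path.isdir(first)
    print(first == second, created)
```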

In [9]:
# pick a random port for the H2O server (the highest valid port number is 65535)
port_no=random.randint(5555,55555)
h2o.init(strict_version_check=False,min_mem_size_GB=min_mem_size,port=port_no)

Checking whether there is an H2O instance running at http://localhost:47402..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_121"; OpenJDK Runtime Environment (Zulu 8.20.0.5-macosx) (build 1.8.0_121-b15); OpenJDK 64-Bit Server VM (Zulu 8.20.0.5-macosx) (build 25.121-b15, mixed mode)
  Starting server from /Users/bear/anaconda/lib/python3.6/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/lh/42j8mfjx069d1bkc2wlf2pw40000gn/T/tmp9cj7hgka
  JVM stdout: /var/folders/lh/42j8mfjx069d1bkc2wlf2pw40000gn/T/tmp9cj7hgka/h2o_bear_started_from_python.out
  JVM stderr: /var/folders/lh/42j8mfjx069d1bkc2wlf2pw40000gn/T/tmp9cj7hgka/h2o_bear_started_from_python.err
  Server is running at http://127.0.0.1:47402
Connecting to H2O server at http://127.0.0.1:47402... successful.


H2O cluster uptime: 03 secs
H2O cluster timezone: America/New_York
H2O data parsing timezone: UTC
H2O cluster version: 3.20.0.1
H2O cluster version age: 3 months and 25 days !!!
H2O cluster name: H2O_from_python_bear_8yjhzf
H2O cluster total nodes: 1
H2O cluster free memory: 5.750 Gb
H2O cluster total cores: 8
H2O cluster allowed cores: 8
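
Picking a port with `random.randint` can occasionally land on one that is already in use. A common alternative is to ask the operating system for a free port by binding to port 0. The sketch below uses only the standard library; `find_free_port` is a hypothetical helper, and there is still a small window in which another process could take the port before H2O starts.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))          # port 0 lets the OS choose
        return s.getsockname()[1]


port_example = find_free_port()
print(port_example)
```

The result could then be passed as `port=` to `h2o.init`, in place of `port_no` in the cell above.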


In [10]:
# meta data
meta_data = set_meta_data(run_id,analysis,target,run_time,classification,scale,max_models,balance_y,balance_threshold,name,nthreads,min_mem_size)
print(meta_data)  

{'run_id': 'pNYSLwZc7', 'start_time': 1538542842.277271, 'target': None, 'max_models': 9, 'run_time': 333, 'scale': False, 'classification': False, 'balance': False, 'balance_threshold': 0.2, 'project': None, 'end_time': 1538542842.2772748, 'execution_time': 0.0, 'nthreads': 1, 'min_mem_size': 6, 'analysis': 0}
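
Note that `start_time` and `end_time` are nearly identical here and `execution_time` is 0.0, because the dictionary is built before any modeling happens. Once the analysis finishes, these fields need updating. A minimal sketch of that step, using a stand-in dictionary and `time.sleep` in place of real work; `finalize_meta_data` is a hypothetical name.

```python
import json
import time


def finalize_meta_data(m_data):
    """Stamp the end time and compute elapsed seconds since start_time."""
    m_data["end_time"] = time.time()
    m_data["execution_time"] = m_data["end_time"] - m_data["start_time"]
    return m_data


# Stand-in for the dictionary returned by set_meta_data.
meta_example = {"run_id": "example", "start_time": time.time(),
                "end_time": None, "execution_time": 0.0}
time.sleep(0.01)                      # placeholder for the actual analysis
meta_example = finalize_meta_data(meta_example)
print(json.dumps(meta_example, indent=4))
```

The updated dictionary can be written out with `dict_to_json` so that each run directory records how long the run took.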


In [11]:
print(data_path)

/Users/bear/Downloads/AutoML/data/multiple_regression_school_revenue.csv


In [12]:
df = h2o.import_file(data_path)

Parse progress: |█████████████████████████████████████████████████████████| 100%


In [13]:
df.head()

School Name,SED Code,Location Code,District,Latitude,Longitude,Address,City,Zip,Grades,Grade Low,Grade High,Community School?,Economic Need Index,Percent ELL,Percent Asian,Percent Black,Percent Hispanic,Percent Black / Hispanic,Percent White,Student Attendance Rate,Percent of Students Chronically Absent,Rigorous Instruction %,Rigorous Instruction Rating,Collaborative Teachers %,Collaborative Teachers Rating,Supportive Environment %,Supportive Environment Rating,Effective School Leadership %,Effective School Leadership Rating,Strong Family-Community Ties %,Strong Family-Community Ties Rating,Trust %,Trust Rating,Student Achievement Rating,Average ELA Proficiency,Average Math Proficiency,Grade 3 ELA - All Students Tested,Grade 3 ELA 4s - All Students,Grade 3 ELA 4s - American Indian or Alaska Native,Grade 3 ELA 4s - Black or African American,Grade 3 ELA 4s - Hispanic or Latino,Grade 3 ELA 4s - Asian or Pacific Islander,Grade 3 ELA 4s - White,Grade 3 ELA 4s - Multiracial,Grade 3 ELA 4s - Limited English Proficient,Grade 3 ELA 4s - Economically Disadvantaged,Grade 3 Math - All Students tested,Grade 3 Math 4s - All Students,Grade 3 Math 4s - American Indian or Alaska Native,Grade 3 Math 4s - Black or African American,Grade 3 Math 4s - Hispanic or Latino,Grade 3 Math 4s - Asian or Pacific Islander,Grade 3 Math 4s - White,Grade 3 Math 4s - Multiracial,Grade 3 Math 4s - Limited English Proficient,Grade 3 Math 4s - Economically Disadvantaged,Grade 4 ELA - All Students Tested,Grade 4 ELA 4s - All Students,Grade 4 ELA 4s - American Indian or Alaska Native,Grade 4 ELA 4s - Black or African American,Grade 4 ELA 4s - Hispanic or Latino,Grade 4 ELA 4s - Asian or Pacific Islander,Grade 4 ELA 4s - White,Grade 4 ELA 4s - Multiracial,Grade 4 ELA 4s - Limited English Proficient,Grade 4 ELA 4s - Economically Disadvantaged,Grade 4 Math - All Students Tested,Grade 4 Math 4s - All Students,Grade 4 Math 4s - American Indian or Alaska Native,Grade 4 Math 4s - Black or African American,Grade 
4 Math 4s - Hispanic or Latino,Grade 4 Math 4s - Asian or Pacific Islander,Grade 4 Math 4s - White,Grade 4 Math 4s - Multiracial,Grade 4 Math 4s - Limited English Proficient,Grade 4 Math 4s - Economically Disadvantaged,Grade 5 ELA - All Students Tested,Grade 5 ELA 4s - All Students,Grade 5 ELA 4s - American Indian or Alaska Native,Grade 5 ELA 4s - Black or African American,Grade 5 ELA 4s - Hispanic or Latino,Grade 5 ELA 4s - Asian or Pacific Islander,Grade 5 ELA 4s - White,Grade 5 ELA 4s - Multiracial,Grade 5 ELA 4s - Limited English Proficient,Grade 5 ELA 4s - Economically Disadvantaged,Grade 5 Math - All Students Tested,Grade 5 Math 4s - All Students,Grade 5 Math 4s - American Indian or Alaska Native,Grade 5 Math 4s - Black or African American,Grade 5 Math 4s - Hispanic or Latino,Grade 5 Math 4s - Asian or Pacific Islander,Grade 5 Math 4s - White,Grade 5 Math 4s - Multiracial,Grade 5 Math 4s - Limited English Proficient,Grade 5 Math 4s - Economically Disadvantaged,Grade 6 ELA - All Students Tested,Grade 6 ELA 4s - All Students,Grade 6 ELA 4s - American Indian or Alaska Native,Grade 6 ELA 4s - Black or African American,Grade 6 ELA 4s - Hispanic or Latino,Grade 6 ELA 4s - Asian or Pacific Islander,Grade 6 ELA 4s - White,Grade 6 ELA 4s - Multiracial,Grade 6 ELA 4s - Limited English Proficient,Grade 6 ELA 4s - Economically Disadvantaged,Grade 6 Math - All Students Tested,Grade 6 Math 4s - All Students,Grade 6 Math 4s - American Indian or Alaska Native,Grade 6 Math 4s - Black or African American,Grade 6 Math 4s - Hispanic or Latino,Grade 6 Math 4s - Asian or Pacific Islander,Grade 6 Math 4s - White,Grade 6 Math 4s - Multiracial,Grade 6 Math 4s - Limited English Proficient,Grade 6 Math 4s - Economically Disadvantaged,Grade 7 ELA - All Students Tested,Grade 7 ELA 4s - All Students,Grade 7 ELA 4s - American Indian or Alaska Native,Grade 7 ELA 4s - Black or African American,Grade 7 ELA 4s - Hispanic or Latino,Grade 7 ELA 4s - Asian or Pacific Islander,Grade 7 ELA 4s - 
White,Grade 7 ELA 4s - Multiracial,Grade 7 ELA 4s - Limited English Proficient,Grade 7 ELA 4s - Economically Disadvantaged,Grade 7 Math - All Students Tested,Grade 7 Math 4s - All Students,Grade 7 Math 4s - American Indian or Alaska Native,Grade 7 Math 4s - Black or African American,Grade 7 Math 4s - Hispanic or Latino,Grade 7 Math 4s - Asian or Pacific Islander,Grade 7 Math 4s - White,Grade 7 Math 4s - Multiracial,Grade 7 Math 4s - Limited English Proficient,Grade 7 Math 4s - Economically Disadvantaged,Grade 8 ELA - All Students Tested,Grade 8 ELA 4s - All Students,Grade 8 ELA 4s - American Indian or Alaska Native,Grade 8 ELA 4s - Black or African American,Grade 8 ELA 4s - Hispanic or Latino,Grade 8 ELA 4s - Asian or Pacific Islander,Grade 8 ELA 4s - White,Grade 8 ELA 4s - Multiracial,Grade 8 ELA 4s - Limited English Proficient,Grade 8 ELA 4s - Economically Disadvantaged,Grade 8 Math - All Students Tested,Grade 8 Math 4s - All Students,Grade 8 Math 4s - American Indian or Alaska Native,Grade 8 Math 4s - Black or African American,Grade 8 Math 4s - Hispanic or Latino,Grade 8 Math 4s - Asian or Pacific Islander,Grade 8 Math 4s - White,Grade 8 Math 4s - Multiracial,Grade 8 Math 4s - Limited English Proficient,Grade 8 Math 4s - Economically Disadvantaged,School Income Estimate
P.S. 015 ROBERTO CLEMENTE,310000000000.0,01M015,1,40.7218,-73.9788,"333 E 4TH ST NEW YORK, NY 10009",NEW YORK,10009,"PK,0K,01,02,03,04,05",PK,5,Yes,0.919,0.09,0.05,0.32,0.6,0.92,0.01,0.94,0.18,0.89,Meeting Target,0.94,Meeting Target,0.86,Exceeding Target,0.91,Exceeding Target,0.85,Meeting Target,0.94,Exceeding Target,Approaching Target,2.14,2.17,20,0,0,0,0,0,0,0,0,0,21,0,0,0,0,0,0,0,0,0,15,0,0,0,0,0,0,0,0,0,15,2,0,0,0,0,0,0,0,0,16,0,0,0,0,0,0,0,0,0,16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,31141.7
P.S. 019 ASHER LEVY,310000000000.0,01M019,1,40.7299,-73.9842,"185 1ST AVE NEW YORK, NY 10003",NEW YORK,10003,"PK,0K,01,02,03,04,05",PK,5,No,0.641,0.05,0.1,0.2,0.63,0.83,0.06,0.92,0.3,0.96,,0.96,,0.97,,0.9,Exceeding Target,0.86,Meeting Target,0.94,Meeting Target,Exceeding Target,2.63,2.98,33,2,0,1,1,0,0,0,0,0,33,6,0,2,1,0,0,0,0,4,29,5,0,0,3,0,0,0,0,3,28,10,0,0,6,0,0,0,0,8,32,7,0,3,1,2,0,0,0,6,32,4,0,0,1,2,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,56462.9
P.S. 020 ANNA SILVER,310000000000.0,01M020,1,40.7213,-73.9863,"166 ESSEX ST NEW YORK, NY 10002",NEW YORK,10002,"PK,0K,01,02,03,04,05",PK,5,No,0.744,0.15,0.35,0.08,0.49,0.57,0.04,0.94,0.2,0.87,Meeting Target,0.77,Meeting Target,0.82,Approaching Target,0.61,Not Meeting Target,0.8,Approaching Target,0.79,Not Meeting Target,Approaching Target,2.39,2.54,76,6,0,0,0,4,0,0,0,2,76,11,0,0,3,7,0,0,0,6,70,9,0,0,1,6,2,0,0,1,71,13,0,0,0,11,2,0,0,4,73,2,0,0,1,1,0,0,0,0,73,10,0,0,1,9,0,0,1,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,44342.6
P.S. 034 FRANKLIN D. ROOSEVELT,310000000000.0,01M034,1,40.7261,-73.975,"730 E 12TH ST NEW YORK, NY 10009",NEW YORK,10009,"PK,0K,01,02,03,04,05,06,07,08",PK,8,No,0.86,0.07,0.05,0.29,0.63,0.92,0.04,0.92,0.28,0.85,Approaching Target,0.78,Meeting Target,0.82,Meeting Target,0.73,Approaching Target,0.89,Meeting Target,0.88,Meeting Target,Exceeding Target,2.48,2.47,27,0,0,0,0,0,0,0,0,0,29,4,0,0,2,0,0,0,0,0,35,1,0,0,1,0,0,0,0,0,34,1,0,0,1,0,0,0,0,0,29,0,0,0,0,0,0,0,0,0,29,1,0,0,1,0,0,0,0,0,54,3,0,0,1,0,0,0,0,3,54,3,0,0,0,0,0,0,0,3,55,4,0,0,3,0,0,0,0,0,55,3,0,0,3,0,0,0,0,0,47,1,0,0,0,0,0,0,0,0,48,1,0,0,0,0,0,0,0,0,31454.0
THE STAR ACADEMY - P.S.63,310000000000.0,01M063,1,40.7244,-73.9864,"121 E 3RD ST NEW YORK, NY 10009",NEW YORK,10009,"PK,0K,01,02,03,04,05",PK,5,No,0.73,0.03,0.04,0.2,0.65,0.84,0.1,0.93,0.23,0.9,Meeting Target,0.88,Meeting Target,0.87,Meeting Target,0.81,Meeting Target,0.89,Meeting Target,0.93,Meeting Target,Meeting Target,2.38,2.54,21,2,0,0,2,0,0,0,0,0,21,5,0,0,2,0,0,0,0,2,15,2,0,1,0,0,0,0,0,0,15,3,0,1,0,0,0,0,0,0,12,1,0,0,0,0,0,0,0,1,12,2,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,46435.6
P.S. 064 ROBERT SIMON,310000000000.0,01M064,1,40.7237,-73.9816,"600 E 6TH ST NEW YORK, NY 10009",NEW YORK,10009,"PK,0K,01,02,03,04,05",PK,5,No,0.858,0.06,0.07,0.19,0.66,0.84,0.07,0.92,0.33,0.93,Meeting Target,0.99,Exceeding Target,0.95,Exceeding Target,0.91,Exceeding Target,0.88,Meeting Target,0.97,Exceeding Target,Meeting Target,2.29,2.48,29,1,0,0,0,0,0,0,0,0,31,4,0,0,1,0,0,0,0,0,40,2,0,0,2,0,0,0,0,2,40,4,0,0,1,0,0,0,0,3,42,0,0,0,0,0,0,0,0,0,44,5,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,39415.4
P.S. 110 FLORENCE NIGHTINGALE,310000000000.0,01M110,1,40.7153,-73.9799,"285 DELANCEY ST NEW YORK, NY 10002",NEW YORK,10002,"PK,0K,01,02,03,04,05",PK,5,No,0.499,0.01,0.16,0.1,0.43,0.53,0.27,0.95,0.13,0.88,Exceeding Target,0.78,Meeting Target,0.95,Exceeding Target,0.69,Approaching Target,0.87,Meeting Target,0.78,Not Meeting Target,Exceeding Target,2.8,3.2,81,12,0,1,1,3,7,0,0,6,81,37,0,0,11,12,10,0,0,20,49,15,0,0,4,0,9,0,0,4,49,23,0,0,6,0,12,0,0,6,65,16,0,1,5,0,6,0,0,4,65,18,0,1,5,0,6,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,43706.7
P.S. 134 HENRIETTA SZOLD,310000000000.0,01M134,1,40.7143,-73.983,"293 E BROADWAY NEW YORK, NY 10002",NEW YORK,10002,"PK,0K,01,02,03,04,05",PK,5,No,0.833,0.12,0.21,0.2,0.55,0.75,0.03,0.91,0.36,0.87,Approaching Target,0.89,Meeting Target,0.88,Approaching Target,0.88,Meeting Target,0.79,Approaching Target,0.94,Exceeding Target,Meeting Target,2.28,2.73,35,0,0,0,0,0,0,0,0,0,36,6,0,0,1,4,0,0,1,0,43,3,0,0,1,2,0,0,0,0,43,6,0,0,0,6,0,0,0,0,34,5,0,0,5,0,0,0,0,0,34,7,0,0,6,0,0,0,0,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28820.7
P.S. 140 NATHAN STRAUS,310000000000.0,01M140,1,40.7191,-73.9833,"123 RIDGE ST NEW YORK, NY 10002",NEW YORK,10002,"PK,0K,01,02,03,04,05,06,07,08",PK,8,No,0.849,0.14,0.05,0.13,0.78,0.9,0.03,0.93,0.27,0.94,Meeting Target,0.91,Approaching Target,0.85,Meeting Target,0.87,Meeting Target,0.83,Meeting Target,0.93,Meeting Target,Meeting Target,2.21,2.27,28,0,0,0,0,0,0,0,0,0,28,0,0,0,0,0,0,0,0,0,31,0,0,0,0,0,0,0,0,0,32,1,0,0,1,0,0,0,0,0,21,0,0,0,0,0,0,0,0,0,21,0,0,0,0,0,0,0,0,0,54,0,0,0,0,0,0,0,0,0,56,5,0,1,2,0,0,0,0,3,47,2,0,0,2,0,0,0,0,0,46,1,0,0,0,1,0,0,0,1,61,3,0,0,1,0,0,0,0,1,61,2,0,0,1,0,0,0,0,1,34889.2
P.S. 142 AMALIA CASTRO,310000000000.0,01M142,1,40.7182,-73.9841,"100 ATTORNEY ST NEW YORK, NY 10002",NEW YORK,10002,"PK,0K,01,02,03,04,05",PK,5,No,0.861,0.08,0.06,0.11,0.78,0.9,0.02,0.92,0.27,0.92,Meeting Target,0.89,Meeting Target,0.9,Meeting Target,0.83,Meeting Target,0.89,Meeting Target,0.95,Exceeding Target,Meeting Target,2.16,2.31,40,1,0,0,1,0,0,0,0,0,38,1,0,0,1,0,0,0,0,0,58,4,0,0,3,0,0,0,0,0,55,2,0,0,0,0,0,0,0,0,46,1,0,0,1,0,0,0,0,0,47,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,35545.1
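
With `target` left as `None`, a natural choice for the response is the last column, `School Income Estimate`, and the predictors are then picked from the remaining columns by type. The sketch below illustrates that selection on a plain dictionary of column types, in the format of `df.types`, using a handful of columns from this data set. It is an assumption, not part of the starter code, that `string` columns such as `School Name` should be excluded; `split_columns` is a hypothetical name.

```python
def split_columns(types, target=None, skip_types=("string",)):
    """Return (predictors, target) from a {column: type} mapping.

    If target is None the last column is used, as check_all_variables does.
    Columns whose type is in skip_types are left out of the predictors.
    """
    columns = list(types)
    if target is None:
        target = columns[-1]
    predictors = [c for c in columns
                  if c != target and types[c] not in skip_types]
    return predictors, target


# A few columns from this data set with the types H2O reports for them.
types_example = {
    "School Name": "string",
    "District": "int",
    "Latitude": "real",
    "City": "enum",
    "Economic Need Index": "real",
    "School Income Estimate": "real",
}
x_example, y_example = split_columns(types_example)
print(y_example, x_example)
```

By contrast, `get_independent_variables` places every column that is neither `enum` nor `int` into its `reals` list, so `string` columns would be kept there.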




In [14]:
df.describe()

Rows:1272
Cols:158




Unnamed: 0,School Name,SED Code,Location Code,District,Latitude,Longitude,Address,City,Zip,Grades,Grade Low,Grade High,Community School?,Economic Need Index,Percent ELL,Percent Asian,Percent Black,Percent Hispanic,Percent Black / Hispanic,Percent White,Student Attendance Rate,Percent of Students Chronically Absent,Rigorous Instruction %,Rigorous Instruction Rating,Collaborative Teachers %,Collaborative Teachers Rating,Supportive Environment %,Supportive Environment Rating,Effective School Leadership %,Effective School Leadership Rating,Strong Family-Community Ties %,Strong Family-Community Ties Rating,Trust %,Trust Rating,Student Achievement Rating,Average ELA Proficiency,Average Math Proficiency,Grade 3 ELA - All Students Tested,Grade 3 ELA 4s - All Students,Grade 3 ELA 4s - American Indian or Alaska Native,Grade 3 ELA 4s - Black or African American,Grade 3 ELA 4s - Hispanic or Latino,Grade 3 ELA 4s - Asian or Pacific Islander,Grade 3 ELA 4s - White,Grade 3 ELA 4s - Multiracial,Grade 3 ELA 4s - Limited English Proficient,Grade 3 ELA 4s - Economically Disadvantaged,Grade 3 Math - All Students tested,Grade 3 Math 4s - All Students,Grade 3 Math 4s - American Indian or Alaska Native,Grade 3 Math 4s - Black or African American,Grade 3 Math 4s - Hispanic or Latino,Grade 3 Math 4s - Asian or Pacific Islander,Grade 3 Math 4s - White,Grade 3 Math 4s - Multiracial,Grade 3 Math 4s - Limited English Proficient,Grade 3 Math 4s - Economically Disadvantaged,Grade 4 ELA - All Students Tested,Grade 4 ELA 4s - All Students,Grade 4 ELA 4s - American Indian or Alaska Native,Grade 4 ELA 4s - Black or African American,Grade 4 ELA 4s - Hispanic or Latino,Grade 4 ELA 4s - Asian or Pacific Islander,Grade 4 ELA 4s - White,Grade 4 ELA 4s - Multiracial,Grade 4 ELA 4s - Limited English Proficient,Grade 4 ELA 4s - Economically Disadvantaged,Grade 4 Math - All Students Tested,Grade 4 Math 4s - All Students,Grade 4 Math 4s - American Indian or Alaska Native,Grade 4 Math 4s - Black or African 
American,Grade 4 Math 4s - Hispanic or Latino,Grade 4 Math 4s - Asian or Pacific Islander,Grade 4 Math 4s - White,Grade 4 Math 4s - Multiracial,Grade 4 Math 4s - Limited English Proficient,Grade 4 Math 4s - Economically Disadvantaged,Grade 5 ELA - All Students Tested,Grade 5 ELA 4s - All Students,Grade 5 ELA 4s - American Indian or Alaska Native,Grade 5 ELA 4s - Black or African American,Grade 5 ELA 4s - Hispanic or Latino,Grade 5 ELA 4s - Asian or Pacific Islander,Grade 5 ELA 4s - White,Grade 5 ELA 4s - Multiracial,Grade 5 ELA 4s - Limited English Proficient,Grade 5 ELA 4s - Economically Disadvantaged,Grade 5 Math - All Students Tested,Grade 5 Math 4s - All Students,Grade 5 Math 4s - American Indian or Alaska Native,Grade 5 Math 4s - Black or African American,Grade 5 Math 4s - Hispanic or Latino,Grade 5 Math 4s - Asian or Pacific Islander,Grade 5 Math 4s - White,Grade 5 Math 4s - Multiracial,Grade 5 Math 4s - Limited English Proficient,Grade 5 Math 4s - Economically Disadvantaged,Grade 6 ELA - All Students Tested,Grade 6 ELA 4s - All Students,Grade 6 ELA 4s - American Indian or Alaska Native,Grade 6 ELA 4s - Black or African American,Grade 6 ELA 4s - Hispanic or Latino,Grade 6 ELA 4s - Asian or Pacific Islander,Grade 6 ELA 4s - White,Grade 6 ELA 4s - Multiracial,Grade 6 ELA 4s - Limited English Proficient,Grade 6 ELA 4s - Economically Disadvantaged,Grade 6 Math - All Students Tested,Grade 6 Math 4s - All Students,Grade 6 Math 4s - American Indian or Alaska Native,Grade 6 Math 4s - Black or African American,Grade 6 Math 4s - Hispanic or Latino,Grade 6 Math 4s - Asian or Pacific Islander,Grade 6 Math 4s - White,Grade 6 Math 4s - Multiracial,Grade 6 Math 4s - Limited English Proficient,Grade 6 Math 4s - Economically Disadvantaged,Grade 7 ELA - All Students Tested,Grade 7 ELA 4s - All Students,Grade 7 ELA 4s - American Indian or Alaska Native,Grade 7 ELA 4s - Black or African American,Grade 7 ELA 4s - Hispanic or Latino,Grade 7 ELA 4s - Asian or Pacific Islander,Grade 
7 ELA 4s - White,Grade 7 ELA 4s - Multiracial,Grade 7 ELA 4s - Limited English Proficient,Grade 7 ELA 4s - Economically Disadvantaged,Grade 7 Math - All Students Tested,Grade 7 Math 4s - All Students,Grade 7 Math 4s - American Indian or Alaska Native,Grade 7 Math 4s - Black or African American,Grade 7 Math 4s - Hispanic or Latino,Grade 7 Math 4s - Asian or Pacific Islander,Grade 7 Math 4s - White,Grade 7 Math 4s - Multiracial,Grade 7 Math 4s - Limited English Proficient,Grade 7 Math 4s - Economically Disadvantaged,Grade 8 ELA - All Students Tested,Grade 8 ELA 4s - All Students,Grade 8 ELA 4s - American Indian or Alaska Native,Grade 8 ELA 4s - Black or African American,Grade 8 ELA 4s - Hispanic or Latino,Grade 8 ELA 4s - Asian or Pacific Islander,Grade 8 ELA 4s - White,Grade 8 ELA 4s - Multiracial,Grade 8 ELA 4s - Limited English Proficient,Grade 8 ELA 4s - Economically Disadvantaged,Grade 8 Math - All Students Tested,Grade 8 Math 4s - All Students,Grade 8 Math 4s - American Indian or Alaska Native,Grade 8 Math 4s - Black or African American,Grade 8 Math 4s - Hispanic or Latino,Grade 8 Math 4s - Asian or Pacific Islander,Grade 8 Math 4s - White,Grade 8 Math 4s - Multiracial,Grade 8 Math 4s - Limited English Proficient,Grade 8 Math 4s - Economically Disadvantaged,School Income Estimate
type,string,int,string,int,real,real,enum,enum,int,enum,enum,int,enum,real,real,real,real,real,real,real,real,real,real,enum,real,enum,real,enum,real,enum,real,enum,real,enum,enum,real,real,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,real
mins,,308000000000.0,,1.0,40.507803,-74.244025,,,10001.0,,,2.0,,0.049,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.0,,0.0,,0.0,,0.0,,0.0,,0.0,,,1.81,1.83,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
mean,,328709905660.3774,,16.135220125786162,40.73453669418239,-73.91834700157233,,,10815.720911949686,,,6.811023622047244,,0.6722806736166801,0.1248427672955975,0.11647798742138363,0.31996069182389936,0.41153301886792454,0.731438679245283,0.13163522012578616,0.9272493985565357,0.21574979951884524,0.8947714514835605,,0.8843624699278267,,0.8874819566960707,,0.8161507618283881,,0.8309141940657578,,0.9042261427425822,,,2.5342152834839773,2.668956450287592,60.569182389937104,4.952830188679245,0.009433962264150943,0.7138364779874213,0.8286163522012578,1.3474842767295598,1.3702830188679243,0.03930817610062893,0.02437106918238994,1.9827044025157232,61.651729559748425,13.87814465408805,0.027515723270440245,2.255503144654088,2.815251572327044,4.025943396226415,3.0102201257861636,0.0660377358490566,0.5306603773584905,6.677672955974843,58.007861635220124,9.904088050314465,0.02358490566037736,1.349056603773585,2.029874213836478,2.525157232704402,2.5400943396226414,0.05110062893081761,0.04716981132075472,4.569182389937106,59.00393081761006,13.192610062893081,0.036949685534591194,1.7610062893081762,2.717767295597484,3.949685534591195,2.911949685534591,0.06289308176100629,0.3867924528301886,6.514937106918239,56.33647798742138,6.540880503144654,0.01729559748427673,0.75,1.2075471698113207,1.8018867924528301,1.878930817610063,0.026729559748427674,0.010220125786163523,2.845125786163522,57.263364779874216,9.362421383647797,0.02830188679245283,0.8805031446540881,1.6627358490566038,3.2492138364779874,2.352201257861635,0.02908805031446541,0.17531446540880502,4.461477987421383,54.54638364779874,8.35377358490566,0.02279874213836478,1.121069182389937,1.5361635220125787,2.690251572327044,2.1713836477987423,0.07311320754716981,0.018867924528301886,4.364779874213837,55.4559748427673,12.169025157232705,0.02358490566037736,1.4591194968553463,2.285377358490566,4.289308176100629,2.9905660377358485,0.07547169811320754,0.27908805031446543,6.678459119496855,53.65880503144654,6.540880503144654,0.0188679
24528301886,0.6808176100628931,1.2122641509433962,1.9992138364779874,1.9842767295597483,0.025943396226415096,0.0023584905660377358,3.1705974842767297,54.286949685534594,8.559748427672956,0.018867924528301886,0.7476415094339622,1.360062893081761,3.3663522012578615,2.3168238993710695,0.03380503144654088,0.12421383647798742,4.551886792452829,52.1627358490566,7.322327044025157,0.013364779874213837,0.9166666666666666,1.520440251572327,2.255503144654088,1.931603773584906,0.015723270440251572,0.0015723270440251573,3.914308176100629,43.84119496855346,4.911949685534591,0.0031446540880503146,0.610062893081761,0.9473270440251572,1.9842767295597483,0.9709119496855346,0.002358490566037736,0.15959119496855345,2.992138364779874,33493.43374901342
maxs,,353000000000.0,,32.0,40.903455,-73.70892,,,11694.0,,,12.0,,0.957,0.99,0.95,0.97,1.0,1.0,0.92,1.0,1.0,1.0,,1.0,,1.0,,0.99,,0.99,,1.0,,,3.93,4.2,356.0,55.0,3.0,24.0,20.0,39.0,37.0,5.0,3.0,29.0,365.0,130.0,4.0,67.0,41.0,110.0,65.0,8.0,39.0,98.0,330.0,101.0,6.0,52.0,43.0,64.0,67.0,9.0,6.0,68.0,332.0,125.0,9.0,54.0,58.0,121.0,75.0,8.0,27.0,101.0,333.0,76.0,9.0,20.0,21.0,48.0,50.0,7.0,3.0,41.0,337.0,148.0,12.0,50.0,38.0,145.0,69.0,7.0,17.0,96.0,631.0,311.0,6.0,56.0,32.0,167.0,191.0,16.0,2.0,176.0,646.0,370.0,7.0,62.0,55.0,200.0,226.0,26.0,25.0,209.0,698.0,238.0,7.0,27.0,37.0,154.0,148.0,11.0,1.0,126.0,715.0,304.0,7.0,45.0,51.0,206.0,176.0,11.0,28.0,166.0,743.0,261.0,5.0,59.0,62.0,203.0,116.0,9.0,1.0,159.0,652.0,312.0,2.0,107.0,71.0,246.0,126.0,3.0,33.0,196.0,181382.06
sigma,,12246547874.627287,,9.245269789690184,0.08660233504334809,0.08057649217602468,,,529.5888745261054,,,2.2469041251441295,,0.2109593993866925,0.1136310389988787,0.17654170039311673,0.28819672945043184,0.2615350054920455,0.29377446459482975,0.20035844893173518,0.0874044556875452,0.14071610576515176,0.06995131225656014,,0.07469503628689782,,0.06585097568309414,,0.09849641549717199,,0.06278551362631553,,0.061227598371688896,,,0.36358941969604563,0.47046989274220613,57.87249551535864,8.300567781823467,0.14812436683748972,2.2728577515459962,1.8390463707371223,4.113688453267606,4.096072527190418,0.3526170464228647,0.18231075754695697,3.73825140954801,59.21195519684897,20.003778408263546,0.2661568275903549,7.041182749132261,5.408096628390776,11.84406100839878,8.294484870028898,0.5570892305983715,2.1815340182819374,11.261355695923555,57.18632102056875,15.164254076129035,0.3037821552046172,3.9166190388466378,4.108373056792661,7.496718136852251,7.276161739047364,0.521510892155569,0.306203641068976,7.7847675483986505,58.481579879188814,20.40085716050271,0.46367953743737567,5.883697483342552,5.564702632554983,11.973967956960408,8.295114274198642,0.6240451736880901,1.6852435202157658,11.771483906271152,55.25507682455075,10.58790613227203,0.304205375023246,2.067792811794701,2.3573724766865647,5.330322052178046,5.59429259467753,0.34705821695329747,0.1218372932019407,5.050530643368284,56.47912917263968,15.933968873987846,0.46173920527332857,3.1201782915714737,3.4842924682246315,10.454276422891665,6.802381556185938,0.3699190186522358,0.9192002451306134,9.022661932100663,92.69076386158294,24.411738316901275,0.29463980928110856,3.54840145810437,3.9117262512027695,12.579064167342445,10.517702224461337,0.8340701827700618,0.14177466249755225,12.658336043545287,94.4446417209382,34.59975327235193,0.31397110426234154,5.1167236497409005,5.990512300548883,19.0971991922262,14.420838111124617,1.054291263567708,1.4989961871039876,18.758632731388513,95.48780686527432,20.417589226306585,0.2741
80231275205,2.381150068527346,3.4242032823705246,9.448064972162468,10.050763581522993,0.4273717713418066,0.04852606856009667,9.705335281360087,97.10736173777025,28.065471660801826,0.2935809055120767,3.0882627124421944,4.380333991965815,15.693947373602107,12.118978749190497,0.49509504737825105,1.0099326592738163,15.338985660563187,94.99270465043656,21.501245932316913,0.21866629340047597,3.3085691696215855,4.241107562231021,11.116575664595539,9.205967753173162,0.3243167903160076,0.03963697724741776,11.27636030830299,82.8787798474595,20.792370944846287,0.06863523685769031,3.9660827162938865,4.056007473641898,12.84133292390099,6.880223070948333,0.08411582311380664,1.3211953041858797,12.6941235276426,28555.61706502771
zeros,0,0,0,0,0,0,,,0,,,0,,0,35,153,31,0,0,103,10,12,3,,1,,1,,1,,1,,1,,,0,0,394,545,1266,1002,877,1025,1043,1252,1246,718,394,441,1255,919,715,985,1012,1250,1040,605,419,473,1261,941,742,1019,1030,1258,1229,644,419,469,1260,941,733,1010,1017,1258,1092,642,422,508,1264,984,813,1021,1028,1263,1261,673,423,519,1262,976,795,1014,1025,1262,1166,668,665,724,1261,987,902,1117,1118,1259,1249,809,664,739,1262,1006,916,1106,1113,1261,1163,812,687,764,1262,1042,958,1125,1128,1263,1269,843,688,816,1264,1077,979,1118,1129,1263,1219,879,712,767,1265,1031,940,1126,1127,1268,1270,839,749,901,1269,1136,1056,1162,1179,1271,1212,971,391
missing,0,0,0,0,0,0,0,0,0,0,0,2,0,25,0,0,0,0,0,0,25,25,25,0,25,0,25,0,25,0,25,0,25,0,0,55,55,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5
0,P.S. 015 ROBERTO CLEMENTE,310000000000.0,01M015,1.0,40.721834,-73.978766,"333 E 4TH ST NEW YORK, NY 10009",NEW YORK,10009.0,"PK,0K,01,02,03,04,05",PK,5.0,Yes,0.919,0.09,0.05,0.32,0.6,0.92,0.01,0.94,0.18,0.89,Meeting Target,0.94,Meeting Target,0.86,Exceeding Target,0.91,Exceeding Target,0.85,Meeting Target,0.94,Exceeding Target,Approaching Target,2.14,2.17,20.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,31141.72
1,P.S. 019 ASHER LEVY,310000000000.0,01M019,1.0,40.729892,-73.984231,"185 1ST AVE NEW YORK, NY 10003",NEW YORK,10003.0,"PK,0K,01,02,03,04,05",PK,5.0,No,0.641,0.05,0.1,0.2,0.63,0.83,0.06,0.92,0.3,0.96,,0.96,,0.97,,0.9,Exceeding Target,0.86,Meeting Target,0.94,Meeting Target,Exceeding Target,2.63,2.98,33.0,2.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,33.0,6.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,4.0,29.0,5.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,3.0,28.0,10.0,0.0,0.0,6.0,0.0,0.0,0.0,0.0,8.0,32.0,7.0,0.0,3.0,1.0,2.0,0.0,0.0,0.0,6.0,32.0,4.0,0.0,0.0,1.0,2.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,56462.88
2,P.S. 020 ANNA SILVER,310000000000.0,01M020,1.0,40.721274,-73.986315,"166 ESSEX ST NEW YORK, NY 10002",NEW YORK,10002.0,"PK,0K,01,02,03,04,05",PK,5.0,No,0.744,0.15,0.35,0.08,0.49,0.57,0.04,0.94,0.2,0.87,Meeting Target,0.77,Meeting Target,0.82,Approaching Target,0.61,Not Meeting Target,0.8,Approaching Target,0.79,Not Meeting Target,Approaching Target,2.39,2.54,76.0,6.0,0.0,0.0,0.0,4.0,0.0,0.0,0.0,2.0,76.0,11.0,0.0,0.0,3.0,7.0,0.0,0.0,0.0,6.0,70.0,9.0,0.0,0.0,1.0,6.0,2.0,0.0,0.0,1.0,71.0,13.0,0.0,0.0,0.0,11.0,2.0,0.0,0.0,4.0,73.0,2.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,73.0,10.0,0.0,0.0,1.0,9.0,0.0,0.0,1.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,44342.61


describe()
Generate an in-depth description of this H2OFrame.

The description is a tabular print of the type, min, max, sigma, number of zeros, and number of missing elements for each H2OVec in this H2OFrame.

Returns: None (prints to stdout)
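
To make the rows of that summary concrete, here is a minimal plain-Python sketch of the same per-column statistics (min, mean, max, sigma, zeros, missing). It is not H2O's implementation: the helper name `summarize_column` and the use of `None` for missing values are assumptions made for illustration, and it uses the sample standard deviation, which may not match H2O's exact convention.

```python
import math

def summarize_column(values):
    """Summarize one numeric column; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    n = len(present)
    mean = sum(present) / n if n else float("nan")
    # Sample standard deviation (n - 1 in the denominator).
    if n > 1:
        sigma = math.sqrt(sum((v - mean) ** 2 for v in present) / (n - 1))
    else:
        sigma = float("nan")
    return {
        "mins": min(present) if present else float("nan"),
        "mean": mean,
        "maxs": max(present) if present else float("nan"),
        "sigma": sigma,
        "zeros": sum(1 for v in present if v == 0),
        "missing": len(values) - n,
    }

# Hypothetical column with two missing entries and one zero.
summary = summarize_column([5.0, None, 0.0, 12.0, 2.0, None])
print(summary)
```

Missing entries are excluded from the mean and sigma, which is why the `missing` row matters when reading the other rows.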

In [15]:
# dependent variable
# assign target and inputs for classification or regression
if target is None:
  target=df.columns[-1]   
y = target

In [16]:
print(y)

School Income Estimate
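
The default used in the cell above (fall back to the last column when no target is given) can be sketched without H2O. The helper name `pick_target` and the short column list are hypothetical; in the notebook the columns come from `df.columns`. The extra membership check is an addition for illustration, not something the notebook does.

```python
def pick_target(columns, target=None):
    """Return the requested target, or the last column if none was given."""
    if target is None:
        return columns[-1]
    if target not in columns:
        # Extra safety check added for this sketch.
        raise ValueError(f"target {target!r} is not a column")
    return target

# Hypothetical subset of the data set's column names.
columns = ["School Name", "District", "School Income Estimate"]
default_target = pick_target(columns)
explicit_target = pick_target(columns, "District")
print(default_target, "|", explicit_target)
```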


In [17]:
print(all_variables)

None


In [18]:
if all_variables is not None:
  ivd=get_all_variables_csv(all_variables)
  print(ivd)    
  X=check_all_variables(df,ivd,y)
  print(X)

In [19]:
df.describe()

Rows:1272
Cols:158






In [20]:
# independent variables

X = []  
if all_variables is None:
  X=get_independent_variables(df, target)  
else: 
  ivd=get_all_variables_csv(all_variables)    
  X=check_all_variables(df, ivd)


X=check_X(X,df)


# Add independent variables

meta_data['X']=X  


# impute missing values

_=impute_missing_values(df,X, scale)
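
This cell only calls `impute_missing_values`; its body is not repeated here. As a rough illustration of what mean imputation does to a single numeric column, the following plain-Python sketch replaces missing entries with the mean of the observed ones. The function name `impute_mean` and the `None` convention are assumptions, not the notebook's code, and the notebook's helper may handle categorical columns or scaling differently.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to impute from
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

# Hypothetical column with two missing entries.
grade_high = [5.0, 8.0, None, 12.0, None]
filled = impute_mean(grade_high)
print(filled)
```

Filling with the mean leaves the column mean unchanged and shrinks its spread slightly. The later `df.describe()` output is consistent with this: `Grade High` keeps its mean of about 6.81 while its sigma drops from about 2.247 to 2.245.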

In [21]:
if analysis == 3:
  classification=False
elif analysis == 2:
  classification=True
elif analysis == 1:
  classification=True

In [22]:
print(classification)

False
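
The mapping from the `analysis` code to the `classification` flag can also be written as a lookup table. This is only a sketch: the helper name `resolve_classification` is hypothetical, and it encodes just what the cell above shows (codes 1 and 2 select classification, code 3 selects regression, and any other code leaves the flag unchanged).

```python
# Analysis codes taken from the if/elif chain above.
ANALYSIS_IS_CLASSIFICATION = {1: True, 2: True, 3: False}

def resolve_classification(analysis, current):
    """Return the classification flag implied by `analysis`, else `current`."""
    return ANALYSIS_IS_CLASSIFICATION.get(analysis, current)

# With analysis=0 (the notebook's default) the flag is left as it was.
print(resolve_classification(0, False))
print(resolve_classification(2, False))
```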


In [23]:
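# Force the target to be a factor (categorical) when doing classification.
# Note: asfactor() accepts only 'int' or 'string' columns; a 'real' target fails with an error such as
# "Only 'int' or 'string' are allowed for asfactor(), got Target (Total orders):real"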

if classification:
    df[y] = df[y].asfactor()

In [24]:
def check_y(y,df):
  # Return (ok, val): val is the H2O type of column y (None if y is not a column),
  # and ok is True when that type can be used as a target.
  ok=False
  val=df.types.get(y)
  if val in ['real','int','enum']:
    ok=True
  return ok, val

In [25]:
ok,val=check_y(y,df)

In [26]:
print(val)

real


In [27]:
print(ok)

True
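
Because `df.types` is used above as a mapping from column name to type string, the target's type can be looked up directly by name. The sketch below uses a plain dict as a stand-in for `df.types`; the name `target_type_ok` and the sample columns are hypothetical.

```python
ALLOWED_TARGET_TYPES = ("real", "int", "enum")

def target_type_ok(y, types):
    """Return (ok, val) for column y, where val is None if y is absent."""
    val = types.get(y)
    return val in ALLOWED_TARGET_TYPES, val

# Plain-dict stand-in for df.types (assumption for this sketch).
types = {
    "School Name": "string",
    "District": "int",
    "School Income Estimate": "real",
}
print(target_type_ok("School Income Estimate", types))
print(target_type_ok("School Name", types))
```

Looking the column up by name means the reported type always belongs to the target, regardless of where it sits in the frame.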


In [28]:
if val=='enum':
    print(df[y].levels())

In [29]:
df.describe()

Rows:1272
Cols:158




Unnamed: 0,School Name,SED Code,Location Code,District,Latitude,Longitude,Address,City,Zip,Grades,Grade Low,Grade High,Community School?,Economic Need Index,Percent ELL,Percent Asian,Percent Black,Percent Hispanic,Percent Black / Hispanic,Percent White,Student Attendance Rate,Percent of Students Chronically Absent,Rigorous Instruction %,Rigorous Instruction Rating,Collaborative Teachers %,Collaborative Teachers Rating,Supportive Environment %,Supportive Environment Rating,Effective School Leadership %,Effective School Leadership Rating,Strong Family-Community Ties %,Strong Family-Community Ties Rating,Trust %,Trust Rating,Student Achievement Rating,Average ELA Proficiency,Average Math Proficiency,Grade 3 ELA - All Students Tested,Grade 3 ELA 4s - All Students,Grade 3 ELA 4s - American Indian or Alaska Native,Grade 3 ELA 4s - Black or African American,Grade 3 ELA 4s - Hispanic or Latino,Grade 3 ELA 4s - Asian or Pacific Islander,Grade 3 ELA 4s - White,Grade 3 ELA 4s - Multiracial,Grade 3 ELA 4s - Limited English Proficient,Grade 3 ELA 4s - Economically Disadvantaged,Grade 3 Math - All Students tested,Grade 3 Math 4s - All Students,Grade 3 Math 4s - American Indian or Alaska Native,Grade 3 Math 4s - Black or African American,Grade 3 Math 4s - Hispanic or Latino,Grade 3 Math 4s - Asian or Pacific Islander,Grade 3 Math 4s - White,Grade 3 Math 4s - Multiracial,Grade 3 Math 4s - Limited English Proficient,Grade 3 Math 4s - Economically Disadvantaged,Grade 4 ELA - All Students Tested,Grade 4 ELA 4s - All Students,Grade 4 ELA 4s - American Indian or Alaska Native,Grade 4 ELA 4s - Black or African American,Grade 4 ELA 4s - Hispanic or Latino,Grade 4 ELA 4s - Asian or Pacific Islander,Grade 4 ELA 4s - White,Grade 4 ELA 4s - Multiracial,Grade 4 ELA 4s - Limited English Proficient,Grade 4 ELA 4s - Economically Disadvantaged,Grade 4 Math - All Students Tested,Grade 4 Math 4s - All Students,Grade 4 Math 4s - American Indian or Alaska Native,Grade 4 Math 4s - Black or African 
American,Grade 4 Math 4s - Hispanic or Latino,Grade 4 Math 4s - Asian or Pacific Islander,Grade 4 Math 4s - White,Grade 4 Math 4s - Multiracial,Grade 4 Math 4s - Limited English Proficient,Grade 4 Math 4s - Economically Disadvantaged,Grade 5 ELA - All Students Tested,Grade 5 ELA 4s - All Students,Grade 5 ELA 4s - American Indian or Alaska Native,Grade 5 ELA 4s - Black or African American,Grade 5 ELA 4s - Hispanic or Latino,Grade 5 ELA 4s - Asian or Pacific Islander,Grade 5 ELA 4s - White,Grade 5 ELA 4s - Multiracial,Grade 5 ELA 4s - Limited English Proficient,Grade 5 ELA 4s - Economically Disadvantaged,Grade 5 Math - All Students Tested,Grade 5 Math 4s - All Students,Grade 5 Math 4s - American Indian or Alaska Native,Grade 5 Math 4s - Black or African American,Grade 5 Math 4s - Hispanic or Latino,Grade 5 Math 4s - Asian or Pacific Islander,Grade 5 Math 4s - White,Grade 5 Math 4s - Multiracial,Grade 5 Math 4s - Limited English Proficient,Grade 5 Math 4s - Economically Disadvantaged,Grade 6 ELA - All Students Tested,Grade 6 ELA 4s - All Students,Grade 6 ELA 4s - American Indian or Alaska Native,Grade 6 ELA 4s - Black or African American,Grade 6 ELA 4s - Hispanic or Latino,Grade 6 ELA 4s - Asian or Pacific Islander,Grade 6 ELA 4s - White,Grade 6 ELA 4s - Multiracial,Grade 6 ELA 4s - Limited English Proficient,Grade 6 ELA 4s - Economically Disadvantaged,Grade 6 Math - All Students Tested,Grade 6 Math 4s - All Students,Grade 6 Math 4s - American Indian or Alaska Native,Grade 6 Math 4s - Black or African American,Grade 6 Math 4s - Hispanic or Latino,Grade 6 Math 4s - Asian or Pacific Islander,Grade 6 Math 4s - White,Grade 6 Math 4s - Multiracial,Grade 6 Math 4s - Limited English Proficient,Grade 6 Math 4s - Economically Disadvantaged,Grade 7 ELA - All Students Tested,Grade 7 ELA 4s - All Students,Grade 7 ELA 4s - American Indian or Alaska Native,Grade 7 ELA 4s - Black or African American,Grade 7 ELA 4s - Hispanic or Latino,Grade 7 ELA 4s - Asian or Pacific Islander,Grade 
7 ELA 4s - White,Grade 7 ELA 4s - Multiracial,Grade 7 ELA 4s - Limited English Proficient,Grade 7 ELA 4s - Economically Disadvantaged,Grade 7 Math - All Students Tested,Grade 7 Math 4s - All Students,Grade 7 Math 4s - American Indian or Alaska Native,Grade 7 Math 4s - Black or African American,Grade 7 Math 4s - Hispanic or Latino,Grade 7 Math 4s - Asian or Pacific Islander,Grade 7 Math 4s - White,Grade 7 Math 4s - Multiracial,Grade 7 Math 4s - Limited English Proficient,Grade 7 Math 4s - Economically Disadvantaged,Grade 8 ELA - All Students Tested,Grade 8 ELA 4s - All Students,Grade 8 ELA 4s - American Indian or Alaska Native,Grade 8 ELA 4s - Black or African American,Grade 8 ELA 4s - Hispanic or Latino,Grade 8 ELA 4s - Asian or Pacific Islander,Grade 8 ELA 4s - White,Grade 8 ELA 4s - Multiracial,Grade 8 ELA 4s - Limited English Proficient,Grade 8 ELA 4s - Economically Disadvantaged,Grade 8 Math - All Students Tested,Grade 8 Math 4s - All Students,Grade 8 Math 4s - American Indian or Alaska Native,Grade 8 Math 4s - Black or African American,Grade 8 Math 4s - Hispanic or Latino,Grade 8 Math 4s - Asian or Pacific Islander,Grade 8 Math 4s - White,Grade 8 Math 4s - Multiracial,Grade 8 Math 4s - Limited English Proficient,Grade 8 Math 4s - Economically Disadvantaged,School Income Estimate
type,string,int,string,int,real,real,enum,enum,int,enum,enum,real,enum,real,real,real,real,real,real,real,real,real,real,enum,real,enum,real,enum,real,enum,real,enum,real,enum,enum,real,real,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,real
mins,,308000000000.0,,1.0,40.507803,-74.244025,,,10001.0,,,2.0,,0.049,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.0,,0.0,,0.0,,0.0,,0.0,,0.0,,,1.81,1.83,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
mean,,328709905660.3774,,16.135220125786162,40.73453669418239,-73.91834700157233,,,10815.720911949686,,,6.811023622047244,,0.6722806736166801,0.1248427672955975,0.11647798742138363,0.31996069182389936,0.41153301886792454,0.731438679245283,0.13163522012578616,0.9272493985565357,0.21574979951884524,0.8947714514835606,,0.8843624699278267,,0.8874819566960706,,0.8161507618283881,,0.8309141940657578,,0.9042261427425822,,,2.534215283483977,2.668956450287592,60.569182389937104,4.952830188679245,0.009433962264150943,0.7138364779874213,0.8286163522012578,1.3474842767295598,1.3702830188679243,0.03930817610062893,0.02437106918238994,1.9827044025157232,61.651729559748425,13.87814465408805,0.027515723270440245,2.255503144654088,2.815251572327044,4.025943396226415,3.0102201257861636,0.0660377358490566,0.5306603773584905,6.677672955974843,58.007861635220124,9.904088050314465,0.02358490566037736,1.349056603773585,2.029874213836478,2.525157232704402,2.5400943396226414,0.05110062893081761,0.04716981132075472,4.569182389937106,59.00393081761006,13.192610062893081,0.036949685534591194,1.7610062893081762,2.717767295597484,3.949685534591195,2.911949685534591,0.06289308176100629,0.3867924528301886,6.514937106918239,56.33647798742138,6.540880503144654,0.01729559748427673,0.75,1.2075471698113207,1.8018867924528301,1.878930817610063,0.026729559748427674,0.010220125786163523,2.845125786163522,57.263364779874216,9.362421383647797,0.02830188679245283,0.8805031446540881,1.6627358490566038,3.2492138364779874,2.352201257861635,0.02908805031446541,0.17531446540880502,4.461477987421383,54.54638364779874,8.35377358490566,0.02279874213836478,1.121069182389937,1.5361635220125787,2.690251572327044,2.1713836477987423,0.07311320754716981,0.018867924528301886,4.364779874213837,55.4559748427673,12.169025157232705,0.02358490566037736,1.4591194968553463,2.285377358490566,4.289308176100629,2.9905660377358485,0.07547169811320754,0.27908805031446543,6.678459119496855,53.65880503144654,6.540880503144654,0.01886792
4528301886,0.6808176100628931,1.2122641509433962,1.9992138364779874,1.9842767295597483,0.025943396226415096,0.0023584905660377358,3.1705974842767297,54.286949685534594,8.559748427672956,0.018867924528301886,0.7476415094339622,1.360062893081761,3.3663522012578615,2.3168238993710695,0.03380503144654088,0.12421383647798742,4.551886792452829,52.1627358490566,7.322327044025157,0.013364779874213837,0.9166666666666666,1.520440251572327,2.255503144654088,1.931603773584906,0.015723270440251572,0.0015723270440251573,3.914308176100629,43.84119496855346,4.911949685534591,0.0031446540880503146,0.610062893081761,0.9473270440251572,1.9842767295597483,0.9709119496855346,0.002358490566037736,0.15959119496855345,2.992138364779874,33493.43374901342
maxs,,353000000000.0,,32.0,40.903455,-73.70892,,,11694.0,,,12.0,,0.957,0.99,0.95,0.97,1.0,1.0,0.92,1.0,1.0,1.0,,1.0,,1.0,,0.99,,0.99,,1.0,,,3.93,4.2,356.0,55.0,3.0,24.0,20.0,39.0,37.0,5.0,3.0,29.0,365.0,130.0,4.0,67.0,41.0,110.0,65.0,8.0,39.0,98.0,330.0,101.0,6.0,52.0,43.0,64.0,67.0,9.0,6.0,68.0,332.0,125.0,9.0,54.0,58.0,121.0,75.0,8.0,27.0,101.0,333.0,76.0,9.0,20.0,21.0,48.0,50.0,7.0,3.0,41.0,337.0,148.0,12.0,50.0,38.0,145.0,69.0,7.0,17.0,96.0,631.0,311.0,6.0,56.0,32.0,167.0,191.0,16.0,2.0,176.0,646.0,370.0,7.0,62.0,55.0,200.0,226.0,26.0,25.0,209.0,698.0,238.0,7.0,27.0,37.0,154.0,148.0,11.0,1.0,126.0,715.0,304.0,7.0,45.0,51.0,206.0,176.0,11.0,28.0,166.0,743.0,261.0,5.0,59.0,62.0,203.0,116.0,9.0,1.0,159.0,652.0,312.0,2.0,107.0,71.0,246.0,126.0,3.0,33.0,196.0,181382.06
sigma,,12246547874.627287,,9.245269789690184,0.08660233504334809,0.08057649217602468,,,529.5888745261054,,,2.245135605291097,,0.20887435711634506,0.1136310389988787,0.17654170039311673,0.28819672945043184,0.2615350054920455,0.29377446459482975,0.20035844893173518,0.08654058337251651,0.13932532142706586,0.06925994015678537,,0.07395677902746262,,0.06520012974666259,,0.09752291448045333,,0.06216496554807429,,0.06062244813383284,,,0.3556356169594651,0.4601779961201868,57.87249551535864,8.300567781823467,0.14812436683748972,2.2728577515459962,1.8390463707371223,4.113688453267606,4.096072527190418,0.3526170464228647,0.18231075754695697,3.73825140954801,59.21195519684897,20.003778408263546,0.2661568275903549,7.041182749132261,5.408096628390776,11.84406100839878,8.294484870028898,0.5570892305983715,2.1815340182819374,11.261355695923555,57.18632102056875,15.164254076129035,0.3037821552046172,3.9166190388466378,4.108373056792661,7.496718136852251,7.276161739047364,0.521510892155569,0.306203641068976,7.7847675483986505,58.481579879188814,20.40085716050271,0.46367953743737567,5.883697483342552,5.564702632554983,11.973967956960408,8.295114274198642,0.6240451736880901,1.6852435202157658,11.771483906271152,55.25507682455075,10.58790613227203,0.304205375023246,2.067792811794701,2.3573724766865647,5.330322052178046,5.59429259467753,0.34705821695329747,0.1218372932019407,5.050530643368284,56.47912917263968,15.933968873987846,0.46173920527332857,3.1201782915714737,3.4842924682246315,10.454276422891665,6.802381556185938,0.3699190186522358,0.9192002451306134,9.022661932100663,92.69076386158294,24.411738316901275,0.29463980928110856,3.54840145810437,3.9117262512027695,12.579064167342445,10.517702224461337,0.8340701827700618,0.14177466249755225,12.658336043545287,94.4446417209382,34.59975327235193,0.31397110426234154,5.1167236497409005,5.990512300548883,19.0971991922262,14.420838111124617,1.054291263567708,1.4989961871039876,18.758632731388513,95.48780686527432,20.417589226306585,0.274180
231275205,2.381150068527346,3.4242032823705246,9.448064972162468,10.050763581522993,0.4273717713418066,0.04852606856009667,9.705335281360087,97.10736173777025,28.065471660801826,0.2935809055120767,3.0882627124421944,4.380333991965815,15.693947373602107,12.118978749190497,0.49509504737825105,1.0099326592738163,15.338985660563187,94.99270465043656,21.501245932316913,0.21866629340047597,3.3085691696215855,4.241107562231021,11.116575664595539,9.205967753173162,0.3243167903160076,0.03963697724741776,11.27636030830299,82.8787798474595,20.792370944846287,0.06863523685769031,3.9660827162938865,4.056007473641898,12.84133292390099,6.880223070948333,0.08411582311380664,1.3211953041858797,12.6941235276426,28555.61706502771
zeros,0,0,0,0,0,0,,,0,,,0,,0,35,153,31,0,0,103,10,12,3,,1,,1,,1,,1,,1,,,0,0,394,545,1266,1002,877,1025,1043,1252,1246,718,394,441,1255,919,715,985,1012,1250,1040,605,419,473,1261,941,742,1019,1030,1258,1229,644,419,469,1260,941,733,1010,1017,1258,1092,642,422,508,1264,984,813,1021,1028,1263,1261,673,423,519,1262,976,795,1014,1025,1262,1166,668,665,724,1261,987,902,1117,1118,1259,1249,809,664,739,1262,1006,916,1106,1113,1261,1163,812,687,764,1262,1042,958,1125,1128,1263,1269,843,688,816,1264,1077,979,1118,1129,1263,1219,879,712,767,1265,1031,940,1126,1127,1268,1270,839,749,901,1269,1136,1056,1162,1179,1271,1212,971,391
missing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5
0,P.S. 015 ROBERTO CLEMENTE,310000000000.0,01M015,1.0,40.721834,-73.978766,"333 E 4TH ST NEW YORK, NY 10009",NEW YORK,10009.0,"PK,0K,01,02,03,04,05",PK,5.0,Yes,0.919,0.09,0.05,0.32,0.6,0.92,0.01,0.94,0.18,0.89,Meeting Target,0.94,Meeting Target,0.86,Exceeding Target,0.91,Exceeding Target,0.85,Meeting Target,0.94,Exceeding Target,Approaching Target,2.14,2.17,20.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,31141.72
1,P.S. 019 ASHER LEVY,310000000000.0,01M019,1.0,40.729892,-73.984231,"185 1ST AVE NEW YORK, NY 10003",NEW YORK,10003.0,"PK,0K,01,02,03,04,05",PK,5.0,No,0.641,0.05,0.1,0.2,0.63,0.83,0.06,0.92,0.3,0.96,,0.96,,0.97,,0.9,Exceeding Target,0.86,Meeting Target,0.94,Meeting Target,Exceeding Target,2.63,2.98,33.0,2.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,33.0,6.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,4.0,29.0,5.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,3.0,28.0,10.0,0.0,0.0,6.0,0.0,0.0,0.0,0.0,8.0,32.0,7.0,0.0,3.0,1.0,2.0,0.0,0.0,0.0,6.0,32.0,4.0,0.0,0.0,1.0,2.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,56462.88
2,P.S. 020 ANNA SILVER,310000000000.0,01M020,1.0,40.721274,-73.986315,"166 ESSEX ST NEW YORK, NY 10002",NEW YORK,10002.0,"PK,0K,01,02,03,04,05",PK,5.0,No,0.744,0.15,0.35,0.08,0.49,0.57,0.04,0.94,0.2,0.87,Meeting Target,0.77,Meeting Target,0.82,Approaching Target,0.61,Not Meeting Target,0.8,Approaching Target,0.79,Not Meeting Target,Approaching Target,2.39,2.54,76.0,6.0,0.0,0.0,0.0,4.0,0.0,0.0,0.0,2.0,76.0,11.0,0.0,0.0,3.0,7.0,0.0,0.0,0.0,6.0,70.0,9.0,0.0,0.0,1.0,6.0,2.0,0.0,0.0,1.0,71.0,13.0,0.0,0.0,0.0,11.0,2.0,0.0,0.0,4.0,73.0,2.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,73.0,10.0,0.0,0.0,1.0,9.0,0.0,0.0,1.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,44342.61


In [30]:
allV=get_variables_types(df)
allV

{'Address': 'enum',
 'Average ELA Proficiency': 'real',
 'Average Math Proficiency': 'real',
 'City': 'enum',
 'Collaborative Teachers %': 'real',
 'Collaborative Teachers Rating': 'enum',
 'Community School?': 'enum',
 'District': 'int',
 'Economic Need Index': 'real',
 'Effective School Leadership %': 'real',
 'Effective School Leadership Rating': 'enum',
 'Grade 3 ELA - All Students Tested': 'int',
 'Grade 3 ELA 4s - All Students': 'int',
 'Grade 3 ELA 4s - American Indian or Alaska Native': 'int',
 'Grade 3 ELA 4s - Asian or Pacific Islander': 'int',
 'Grade 3 ELA 4s - Black or African American': 'int',
 'Grade 3 ELA 4s - Economically Disadvantaged': 'int',
 'Grade 3 ELA 4s - Hispanic or Latino': 'int',
 'Grade 3 ELA 4s - Limited English Proficient': 'int',
 'Grade 3 ELA 4s - Multiracial': 'int',
 'Grade 3 ELA 4s - White': 'int',
 'Grade 3 Math - All Students tested': 'int',
 'Grade 3 Math 4s - All Students': 'int',
 'Grade 3 Math 4s - American Indian or Alaska Native': 'int',
 'Gr

In [31]:
meta_data['variables']=allV

In [32]:
# split into training and test for showing how to predict
train, test = df.split_frame([0.9])

In [33]:
# Set up AutoML

aml = H2OAutoML(max_runtime_secs=run_time,project_name = name)

In [34]:
model_start_time = time.time()

In [35]:
aml.train(x=X,y=y,training_frame=train)

AutoML progress: |████████████████████████████████████████████████████████| 100%


In [36]:
meta_data['model_execution_time'] = time.time() - model_start_time

In [37]:
# get leaderboard
aml_leaderboard_df=aml.leaderboard.as_data_frame()

In [38]:
aml_leaderboard_df

Unnamed: 0,model_id,mean_residual_deviance,rmse,mse,mae,rmsle
0,GBM_grid_0_AutoML_20181003_010045_model_2,54573820.0,7387.409471,54573820.0,4131.771195,
1,GBM_grid_0_AutoML_20181003_010045_model_3,54661130.0,7393.316483,54661130.0,4159.142172,
2,GBM_grid_0_AutoML_20181003_010045_model_1,55934230.0,7478.919181,55934230.0,4145.526918,
3,GBM_grid_0_AutoML_20181003_010045_model_0,56928090.0,7545.070603,56928090.0,4156.135535,
4,GBM_grid_0_AutoML_20181003_010045_model_8,58703210.0,7661.802233,58703210.0,4376.487551,
5,XRT_0_AutoML_20181003_010045,61487940.0,7841.424877,61487940.0,4263.158954,2.071108
6,DRF_0_AutoML_20181003_010045,61751820.0,7858.232479,61751820.0,4331.268802,2.044908
7,GBM_grid_0_AutoML_20181003_010045_model_5,71338600.0,8446.217797,71338600.0,5138.914246,
8,GBM_grid_0_AutoML_20181003_010045_model_11,73396430.0,8567.171821,73396430.0,5620.32372,4.712404
9,GBM_grid_0_AutoML_20181003_010045_model_4,83906820.0,9160.066742,83906820.0,5553.964856,


In [39]:
# Start with the best model (top of the leaderboard) as the first model

model_set=aml_leaderboard_df['model_id']
mod_best=h2o.get_model(model_set[0])

In [40]:
mod_best._id

'GBM_grid_0_AutoML_20181003_010045_model_2'

In [41]:
# Get stacked ensemble  
se=get_stacked_ensemble(model_set)

In [42]:
print(se)

StackedEnsemble_BestOfFamily_0_AutoML_20181003_010045


In [43]:
if se is not None:
  mod_best=h2o.get_model(se)

In [44]:
dir(mod_best)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_bc',
 '_bcin',
 '_check_targets',
 '_compute_algo',
 '_estimator_type',
 '_future',
 '_get_metrics',
 '_have_mojo',
 '_have_pojo',
 '_id',
 '_is_xvalidated',
 '_job',
 '_keyify_if_h2oframe',
 '_make_model',
 '_metrics_class',
 '_model_json',
 '_parms',
 '_plot',
 '_requires_training_frame',
 '_resolve_model',
 '_verify_training_frame_params',
 '_xval_keys',
 'actual_params',
 'aic',
 'algo',
 'auc',
 'base_models',
 'biases',
 'catoffsets',
 'coef',
 'coef_norm',
 'cross_validation_fold_assignment',
 'cross_validation_holdout_predictions',
 'cross_validation_metrics_summary',
 'cross_validation_models',


In [45]:
mod_best._id

'StackedEnsemble_BestOfFamily_0_AutoML_20181003_010045'

In [46]:
mod_best._get_metrics

<function h2o.model.model_base.ModelBase._get_metrics>

In [47]:
type(mod_best)

h2o.estimators.stackedensemble.H2OStackedEnsembleEstimator

In [48]:
mods=mod_best.coef_norm
print(mods)

Model Details
H2OStackedEnsembleEstimator :  Stacked Ensemble
Model Key:  StackedEnsemble_BestOfFamily_0_AutoML_20181003_010045
No model summary for this model


ModelMetricsRegressionGLM: stackedensemble
** Reported on train data. **

MSE: 798239944.7588885
RMSE: 28253.14044064639
MAE: 22634.091551931684
RMSLE: 5.820001479726165
R^2: 2.42920152395687e-05
Mean Residual Deviance: 798239944.7588885
Null degrees of freedom: 921
Residual degrees of freedom: 917
Null deviance: 735995107872.072
Residual deviance: 735977229067.6952
AIC: 21527.604704258654

ModelMetricsRegressionGLM: stackedensemble
** Reported on validation data. **

MSE: 949977989.7867891
RMSE: 30821.712959970104
MAE: 23900.40496528627
RMSLE: 5.619061826242455
R^2: -0.010180483378021465
Mean Residual Deviance: 949977989.7867891
Null degrees of freedom: 213
Residual degrees of freedom: 209
Null deviance: 203299852235.3675
Residual deviance: 203295289814.37286
AIC: 5043.102858171625

ModelMetricsRegressionGLM: stackedensemble


In [49]:
bm=stackedensemble_df(aml_leaderboard_df)

In [50]:
bm

['GBM_grid_0_AutoML_20181003_010045_model_2',
 'GLM_grid_0_AutoML_20181003_010045_model_0',
 'DRF_0_AutoML_20181003_010045',
 'XRT_0_AutoML_20181003_010045',
 'DeepLearning_0_AutoML_20181003_010045']

In [51]:
aml_leaderboard_df

Unnamed: 0,model_id,mean_residual_deviance,rmse,mse,mae,rmsle
0,GBM_grid_0_AutoML_20181003_010045_model_2,54573820.0,7387.409471,54573820.0,4131.771195,
1,GBM_grid_0_AutoML_20181003_010045_model_3,54661130.0,7393.316483,54661130.0,4159.142172,
2,GBM_grid_0_AutoML_20181003_010045_model_1,55934230.0,7478.919181,55934230.0,4145.526918,
3,GBM_grid_0_AutoML_20181003_010045_model_0,56928090.0,7545.070603,56928090.0,4156.135535,
4,GBM_grid_0_AutoML_20181003_010045_model_8,58703210.0,7661.802233,58703210.0,4376.487551,
5,XRT_0_AutoML_20181003_010045,61487940.0,7841.424877,61487940.0,4263.158954,2.071108
6,DRF_0_AutoML_20181003_010045,61751820.0,7858.232479,61751820.0,4331.268802,2.044908
7,GBM_grid_0_AutoML_20181003_010045_model_5,71338600.0,8446.217797,71338600.0,5138.914246,
8,GBM_grid_0_AutoML_20181003_010045_model_11,73396430.0,8567.171821,73396430.0,5620.32372,4.712404
9,GBM_grid_0_AutoML_20181003_010045_model_4,83906820.0,9160.066742,83906820.0,5553.964856,


In [52]:
#  Get best_models and coef_norm()
best_models={}
best_models=stackedensemble(mod_best)
bm=[]
if best_models is not None: 
  if 'Intercept' in best_models.keys():
    del best_models['Intercept']
  bm=list(best_models.keys())
else:
  best_models={}
  bm=stackedensemble_df(aml_leaderboard_df)   
  for b in bm:   
    best_models[b]=None

if mod_best.model_id not in bm:
    bm.append(mod_best.model_id)

In [53]:
bm

['GBM_grid_0_AutoML_20181003_010045_model_2',
 'XRT_0_AutoML_20181003_010045',
 'DRF_0_AutoML_20181003_010045',
 'DeepLearning_0_AutoML_20181003_010045',
 'GLM_grid_0_AutoML_20181003_010045_model_0',
 'StackedEnsemble_BestOfFamily_0_AutoML_20181003_010045']

In [54]:
# Best of Family leaderboard

aml_leaderboard_df=aml_leaderboard_df.loc[aml_leaderboard_df['model_id'].isin(bm)]


In [55]:
aml_leaderboard_df

Unnamed: 0,model_id,mean_residual_deviance,rmse,mse,mae,rmsle
0,GBM_grid_0_AutoML_20181003_010045_model_2,54573820.0,7387.409471,54573820.0,4131.771195,
5,XRT_0_AutoML_20181003_010045,61487940.0,7841.424877,61487940.0,4263.158954,2.071108
6,DRF_0_AutoML_20181003_010045,61751820.0,7858.232479,61751820.0,4331.268802,2.044908
11,DeepLearning_0_AutoML_20181003_010045,113427400.0,10650.232136,113427400.0,6823.14207,
16,StackedEnsemble_BestOfFamily_0_AutoML_20181003...,798259300.0,28253.48361,798259300.0,22634.368158,5.820009
18,GLM_grid_0_AutoML_20181003_010045_model_0,800278400.0,28289.192532,800278400.0,22660.437027,5.820572


In [56]:
# save leaderboard
leaderboard_stats=run_id+'_leaderboard.csv'
aml_leaderboard_df.to_csv(leaderboard_stats)

In [57]:
top=aml_leaderboard_df.iloc[0]['model_id']
print(top)

GBM_grid_0_AutoML_20181003_010045_model_2


In [58]:
mod_best=h2o.get_model(top)
print(mod_best._id)
print(mod_best.algo)

GBM_grid_0_AutoML_20181003_010045_model_2
gbm


In [59]:
meta_data['mod_best']=mod_best._id
meta_data['mod_best_algo']=mod_best.algo

In [60]:
meta_data['models']=bm

In [61]:
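models_path=os.path.join(run_dir,'models')
for mod in bm:
  try:   
    m=h2o.get_model(mod) 
    h2o.save_model(m, path = models_path)
  except Exception as e:
    # Report the failure instead of silently skipping the model
    print('Could not save', mod, e)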

In [62]:
print(models_path)

/Users/bear/Downloads/AutoML/pNYSLwZc7/models


In [63]:
# GBM
 
mod,mod_id=get_model_by_algo("GBM",best_models)
if mod is not None:
    try:     
        sh_df=mod.scoring_history()
        sh_df.to_csv(run_id+'_gbm_scoring_history.csv') 
    except:
        pass   
    try:     
        stats_gbm={}
        stats_gbm=gbm_stats(mod)
        n=run_id+'_gbm_stats.json'
        dict_to_json(stats_gbm,n)
        print(stats_gbm)
    except:
        pass        

{'algo': 'gbm', 'model_id': 'GBM_grid_0_AutoML_20181003_010045_model_2', 'varimp': [('Grade 3 Math - All Students tested', 1200304029696.0, 1.0, 0.3335231248683164), ('Grade 3 ELA - All Students Tested', 653129547776.0, 0.5441367617014645, 0.18148219311839892), ('Economic Need Index', 531167772672.0, 0.4425276925934577, 0.14759321887453575), ('Grades', 422396198912.0, 0.351907673774102, 0.11736934702227858), ('Percent White', 247670702080.0, 0.20633997383373556, 0.06881915285827414), ('City', 109596106752.0, 0.09130695560503727, 0.03045298115560467), ('Percent Black / Hispanic', 76606210048.0, 0.06382233846819792, 0.02128622576231673), ('Grade Low', 44641665024.0, 0.037191964635248584, 0.012404380265140022), ('Percent of Students Chronically Absent', 28514639872.0, 0.02375618107290857, 0.00792323574637402), ('Longitude', 27097735168.0, 0.022575726230680923, 0.00752952675862832), ('Grade 3 Math 4s - White', 25084768256.0, 0.020898678697557318, 0.0069701926248282335), ('Address', 2394618




In [64]:
# DeepLearning

mod,mod_id=get_model_by_algo("Dee",best_models)


In [65]:
if mod is not None:
    try:    
        sh_df=mod.scoring_history()
        sh_df.to_csv(run_id+'_dl_scoring_history.csv') 
    except:
        pass 
    try:
        stats_dl={}
        stats_dl=dl_stats(mod)
        n=run_id+'_dl_stats.json'
        dict_to_json(stats_dl,n)
        print(stats_dl)
    except:
        pass    
    try:
        # Confusion matrices only exist for classification models
        cf=mod.confusion_matrix()    
        cf_df=cf.table.as_data_frame()
        cf_df.to_csv(run_id+'_dl_confusion_matrix.csv')
    except:
        pass       

{'algo': 'deeplearning', 'model_id': 'DeepLearning_0_AutoML_20181003_010045', 'varimp': [('Grade Low.6', 1.0, 1.0, 0.0030656480927871528), ('Economic Need Index', 0.8579245209693909, 0.8579245209693909, 0.0026300946714651447), ('Grades.06,07,08', 0.572059690952301, 0.572059690952301, 0.0017537337005283297), ('City.BRONX', 0.5641229152679443, 0.5641229152679443, 0.001729402339288702), ('Grade High', 0.43213173747062683, 0.43213173747062683, 0.0013247638368096258), ('Strong Family-Community Ties Rating.Meeting Target', 0.4191891849040985, 0.4191891849040985, 0.0012850865252182508), ('Grade 6 Math - All Students Tested', 0.41764408349990845, 0.41764408349990845, 0.0012803497880453328), ('Latitude', 0.39417752623558044, 0.39417752623558044, 0.001208409581523665), ('Grade Low.5', 0.3891947865486145, 0.3891947865486145, 0.001193134255105463), ('Strong Family-Community Ties %', 0.3875519335269928, 0.3875519335269928, 0.001188097845872999), ('City.FLUSHING', 0.3800198435783386, 0.3800198435783




In [66]:
# DRF

mod,mod_id=get_model_by_algo("DRF",best_models)
if mod is not None:
    try:     
         sh_df=mod.scoring_history()
         sh_df.to_csv(run_id+'_drf_scoring_history.csv') 
    except:
         pass  
    try: 
         stats_drf={}
         stats_drf=drf_stats(mod)
         n=run_id+'_drf_stats.json'
         dict_to_json(stats_drf,n)
         print(stats_drf)
    except:
         pass     

In [67]:
# XRT

mod,mod_id=get_model_by_algo("XRT",best_models)
if mod is not None:
    try:     
         sh_df=mod.scoring_history()
         sh_df.to_csv(run_id+'_xrt_scoring_history.csv')
    except:
         pass     
    try:        
         stats_xrt={}
         stats_xrt=xrt_stats(mod)
         n=run_id+'_xrt_stats.json'
         dict_to_json(stats_xrt,n)
         print(stats_xrt)
    except:
         pass     

In [68]:
# GLM

mod,mod_id=get_model_by_algo("GLM",best_models)
if mod is not None:
    try:     
         stats_glm={}
         stats_glm=glm_stats(mod)
         n=run_id+'_glm_stats.json'
         dict_to_json(stats_glm,n)
         print(stats_glm)
    except:
         pass     

{'algo': 'glm', 'model_id': 'GLM_grid_0_AutoML_20181003_010045_model_0', 'coef': {'Intercept': 33044.16808616913, 'Address.1 ALBEMARLE RD BROOKLYN, NY 11218': 0.0, 'Address.1 CORPORATE COMMONS-1 TELEPORT DR STATEN ISLAND, NY 10311': -2.0865164134549933e-05, 'Address.1 PECK SLIP NEW YORK, NY 10038': 5.384891544277357e-05, 'Address.1-50 51ST AVE LONG ISLAND CITY, NY 11101': 0.0, 'Address.1-90 BEACH 110TH ST ROCKAWAY PARK, NY 11694': -7.158011614831473e-06, 'Address.10 E 15TH ST NEW YORK, NY 10003': -2.086521023010438e-05, 'Address.10-45 NAMEOKE ST FAR ROCKAWAY, NY 11691': -2.086519022317714e-05, 'Address.100 ATTORNEY ST NEW YORK, NY 10002': 1.001043143421921e-06, 'Address.100 CLERMONT AVE BROOKLYN, NY 11205': -1.7967746199727906e-05, 'Address.100 ESSEX DR STATEN ISLAND, NY 10314': -2.086517165319168e-05, 'Address.100 HESTER ST NEW YORK, NY 10002': -2.0865135187452755e-05, 'Address.100 IRVING AVE BROOKLYN, NY 11237': 7.460089343688985e-06, 'Address.100 NOLL ST BROOKLYN, NY 11206': 6.60406




In [69]:
predictions_df=predictions_test(mod_best,test,run_id)

gbm prediction progress: |████████████████████████████████████████████████| 100%


In [70]:
predictions_df.head()

predict
32807.3
36146.8
92039.0
59711.3
48040.5
80417.9
1214.51
6621.11
-205.955
34782.7




In [71]:
predictions_df.describe()

Rows:131
Cols:1




Unnamed: 0,predict
type,real
mins,-1465.5927400073779
mean,32897.270676007356
maxs,92039.02070885088
sigma,26131.357246003794
zeros,0
missing,0
0,32807.32305156376
1,36146.794481030964
2,92039.02070885088


In [72]:
# Update and save meta data

meta_data['end_time'] = time.time()
meta_data['execution_time'] = meta_data['end_time'] - meta_data['start_time']
  
n=run_id+'_meta_data.json'
dict_to_json(meta_data,n)    


In [73]:
meta_data

{'X': ['SED Code',
  'District',
  'Zip',
  'Grade High',
  'Grade 3 ELA - All Students Tested',
  'Grade 3 ELA 4s - All Students',
  'Grade 3 ELA 4s - American Indian or Alaska Native',
  'Grade 3 ELA 4s - Black or African American',
  'Grade 3 ELA 4s - Hispanic or Latino',
  'Grade 3 ELA 4s - Asian or Pacific Islander',
  'Grade 3 ELA 4s - White',
  'Grade 3 ELA 4s - Multiracial',
  'Grade 3 ELA 4s - Limited English Proficient',
  'Grade 3 ELA 4s - Economically Disadvantaged',
  'Grade 3 Math - All Students tested',
  'Grade 3 Math 4s - All Students',
  'Grade 3 Math 4s - American Indian or Alaska Native',
  'Grade 3 Math 4s - Black or African American',
  'Grade 3 Math 4s - Hispanic or Latino',
  'Grade 3 Math 4s - Asian or Pacific Islander',
  'Grade 3 Math 4s - White',
  'Grade 3 Math 4s - Multiracial',
  'Grade 3 Math 4s - Limited English Proficient',
  'Grade 3 Math 4s - Economically Disadvantaged',
  'Grade 4 ELA - All Students Tested',
  'Grade 4 ELA 4s - All Students',
  'Gra

In [74]:
# Clean up
os.chdir(server_path)

In [75]:
h2o.cluster().shutdown()

H2O session _sid_ae4f closed.


**When available for your models, you will need to make the following plots:**

## Variable Importance Plots

The Variable Importance Plot graphs the VIP values for each X variable. 

https://www.researchgate.net/figure/Random-forest-variable-importance-plot-Variables-are-ranked-in-terms-of-importance-on_fig2_295097543
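
As a rough sketch of the numbers behind such a plot: the `varimp` entries printed in the GBM stats above list each variable with a relative importance, a scaled importance, and a percentage. The helper below recomputes the last two from the first, assuming scaled = relative / max and percentage = relative / sum, which is consistent with the printed values. The names `scale_importances` and `text_bar_chart` are hypothetical and not part of H2O. Only the first three variables are used, so the percentages will differ from the notebook output.

```python
def scale_importances(relative):
    """Return (name, relative, scaled, percentage) rows, most important first.

    Assumes scaled = relative / max(relative) and
    percentage = relative / sum(relative).
    """
    top = max(relative.values())
    total = sum(relative.values())
    rows = [(name, value, value / top, value / total)
            for name, value in relative.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def text_bar_chart(rows, width=40):
    """Draw one bar of '#' characters per variable, scaled to `width`."""
    lines = []
    for name, _, scaled, _ in rows:
        bar = "#" * round(scaled * width)
        lines.append(f"{name:<36} {bar}")
    return "\n".join(lines)


# First three relative importances copied from the GBM output above
relative = {
    "Grade 3 Math - All Students tested": 1200304029696.0,
    "Grade 3 ELA - All Students Tested": 653129547776.0,
    "Economic Need Index": 531167772672.0,
}
rows = scale_importances(relative)
print(text_bar_chart(rows))
```

For a graphical version, H2O models that support variable importance also provide a `varimp_plot()` method.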



## Partial Dependence Plots

For models that include only numerical values, you can view a Partial Dependence Plot (PDP) for that model. This provides a graphical representation of the marginal effect of a variable on the class probability (classification) or response (regression).

https://www.kaggle.com/dansbecker/partial-dependence-plots
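
The idea behind a partial dependence curve can be sketched in a few lines of plain Python: force one variable to a fixed value in every row, average the model's predictions, and repeat over a grid of values. This is a minimal illustration, not H2O's implementation; `partial_dependence` and `toy_model` are hypothetical names, and the toy model stands in for a fitted regression model.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # override only the feature of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_model(row):
    # Hypothetical stand-in for a fitted regression model
    return 1000.0 * row["x1"] + 5.0 * row["x2"] ** 2


rows = [
    {"x1": 1.0, "x2": 0.0},
    {"x1": 2.0, "x2": 2.0},
    {"x1": 3.0, "x2": 4.0},
]
pdp = partial_dependence(toy_model, rows, "x1", [0.0, 1.0, 2.0])
for value, mean_prediction in pdp:
    print(value, round(mean_prediction, 2))
```

Because the toy model is linear in `x1`, the curve rises by 1000 per unit of `x1`. With a real H2O model, the `partial_plot()` method on the model produces the equivalent plot.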



## Gains/Lift Charts

[Score: Gains/Lift Table](http://h2o-release.s3.amazonaws.com/h2o/master/1648/docs-website/userguide/scoregainslift.html)  

The Gains/Lift Table page uses predicted data to evaluate model performance. The accuracy of the classification model for a random sample is evaluated according to the results when the model is and is not used.

The Gains/Lift Table is particularly useful for direct marketing applications, for example. The gains/lift chart shows the effectiveness of the current model(s) compared to a baseline, allowing users to quickly identify the most useful model.

To create a Gains/Lift table, H2O applies the model to each entry in the original dataset to find the response probability ($\hat{P}_i$), then orders the entries according to their predicted response probabilities. Finally, H2O divides the dataset into equal groups and calculates the average response rate for each group.

H2O uses the response rate of the top ten groups to evaluate the model performance; the highest response and greatest variation rates indicate the best model.

The lift is calculated from the gains. H2O uses the following formula to calculate the lift:

$$\lambda_k = \frac{r_k}{\bar{r}}$$

where $\lambda_k$ is the lift for group $k$, $r_k$ is the response rate for group $k$, and $\bar{r}$ is the average response rate over the whole dataset. In other words, $\lambda_k$ describes how much more likely the customers in group $k$ are to respond compared with the average response rate.

Requirements:

The vactual column must contain actual binary class labels.
The vpredict column must contain probabilities.
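
The procedure described above can be sketched in plain Python: sort the rows by predicted probability, split them into equal groups, and compute each group's lift as its response rate divided by the overall response rate. This is a simplified illustration under those assumptions, not H2O's implementation; `gains_lift_table` is a hypothetical name and the data is made up.

```python
def gains_lift_table(actual, predicted, groups=10):
    """Build a gains/lift table.

    actual    -- binary class labels (0 or 1)
    predicted -- predicted probabilities of class 1
    """
    n = len(actual)
    overall_rate = sum(actual) / n
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no positive labels")
    # Highest predicted probability first
    ordered = sorted(zip(predicted, actual), key=lambda pair: pair[0], reverse=True)
    table = []
    for k in range(groups):
        chunk = ordered[k * n // groups:(k + 1) * n // groups]
        if not chunk:
            continue
        rate = sum(label for _, label in chunk) / len(chunk)
        table.append({
            "group": k + 1,
            "response_rate": rate,
            "lift": rate / overall_rate,
        })
    return table


# Made-up example: 10 rows, 4 positives, split into 5 groups of 2
actual = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
predicted = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
table = gains_lift_table(actual, predicted, groups=5)
for row in table:
    print(row)
```

A lift above 1 in the top groups means the model ranks responders ahead of non-responders better than random ordering would.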

[Evaluating Classifiers: Gains and Lift Charts](https://youtu.be/1dYOcDaDJLY)  
[Understanding And Interpreting Gain And Lift Charts](https://www.datasciencecentral.com/profiles/blogs/understanding-and-interpreting-gain-and-lift-charts)  
   
   

**Regression Metrics**

MSE

Mean Squared Error. The “squared” bit means the bigger the error, the more it is punished. 

deviance

Short for mean residual deviance. If the distribution is Gaussian, it is equal to MSE; otherwise it usually gives a more useful estimate of error, which is why it is the default. It needs to be specified as “residual_deviance” when sorting grids.


RMSE

The square root of MSE. If your response variable is in dollars, the units of MSE are dollars squared, but RMSE is back in dollars.

MAE

Mean Absolute Error.

R2

R-squared, also written as R², and also known as the coefficient of determination. 

RMSLE

Root Mean Squared Logarithmic Error. Prefer this to RMSE if an under-prediction is worse than an over-prediction.
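
To make these definitions concrete, the sketch below computes the metrics from lists of actual and predicted values using only the standard library. `regression_metrics` is a hypothetical helper, not an H2O function, and returning NaN for RMSLE when a value is at or below -1 is a convention chosen for this sketch.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MSE, RMSE, MAE, R^2 and RMSLE for paired values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R^2 = 1 - SS_res / SS_tot; undefined when the actual values are constant
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    # log1p(x) = log(1 + x) requires x > -1 (assumed convention: NaN otherwise)
    if min(actual) > -1 and min(predicted) > -1:
        log_errors = [math.log1p(p) - math.log1p(a)
                      for a, p in zip(actual, predicted)]
        rmsle = math.sqrt(sum(e * e for e in log_errors) / n)
    else:
        rmsle = float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae,
            "r2": r2, "rmsle": rmsle}


metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(metrics)

# Same absolute error (50), but under-prediction costs more under RMSLE
under = regression_metrics([100.0], [50.0])
over = regression_metrics([100.0], [150.0])
print(under["rmse"], over["rmse"])    # identical RMSE
print(under["rmsle"], over["rmsle"])  # under-prediction has the larger RMSLE
```

The last two lines illustrate why RMSLE is preferred when under-prediction is worse than over-prediction: RMSE treats both errors the same, while RMSLE penalizes the under-prediction more.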
