# SEDA Data

##### Data file: GCS (grade within cohort) scale metric

In [3]:
import pandas as pd
import geopandas as gpd 
import matplotlib.pyplot as plt 
import contextily as ctx
import numpy as np
import quilt3
from geopandas_view import view

In [4]:
# Read sedasch as a string, then left-pad it with zeros to 12 characters.
seda = pd.read_csv("https://stacks.stanford.edu/file/druid:db586ns4974/seda_school_pool_gcs_4.0.csv", converters={"sedasch": str})
seda["sedasch"] = seda["sedasch"].str.rjust(12, "0")

seda.head()

Unnamed: 0,sedasch,sedaschname,fips,stateabb,subcat,subgroup,gradecenter,gap,tot_asmts,cellcount,...,gcs_mn_grd_ol_se,gcs_mn_mth_ol_se,gcs_mn_avg_eb,gcs_mn_coh_eb,gcs_mn_grd_eb,gcs_mn_mth_eb,gcs_mn_avg_eb_se,gcs_mn_coh_eb_se,gcs_mn_grd_eb_se,gcs_mn_mth_eb_se
0,10000201667,Camps,1,AL,all,all,7.5,0,13,2,...,,,,,,,,,,
1,10000201670,Det Ctr,1,AL,all,all,7.5,0,2,1,...,,,,,,,,,,
2,10000201705,Wallace Sch - Mt Meigs Campus,1,AL,all,all,7.0,0,98,12,...,,,,,,,,,,
3,10000201706,McNeel Sch - Vacca Campus,1,AL,all,all,7.0,0,118,12,...,,,2.632403,,,,0.469368,,,
4,10000500870,Albertville Middle School,1,AL,all,all,7.5,0,12520,39,...,,0.16536,6.363105,-0.026981,,-0.256272,0.082592,0.027416,,0.155361


In [6]:
seda

Unnamed: 0,sedasch,sedaschname,fips,stateabb,subcat,subgroup,gradecenter,gap,tot_asmts,cellcount,...,gcs_mn_grd_ol_se,gcs_mn_mth_ol_se,gcs_mn_avg_eb,gcs_mn_coh_eb,gcs_mn_grd_eb,gcs_mn_mth_eb,gcs_mn_avg_eb_se,gcs_mn_coh_eb_se,gcs_mn_grd_eb_se,gcs_mn_mth_eb_se
0,010000201667,Camps,1,AL,all,all,7.5,0,13,2,...,,,,,,,,,,
1,010000201670,Det Ctr,1,AL,all,all,7.5,0,2,1,...,,,,,,,,,,
2,010000201705,Wallace Sch - Mt Meigs Campus,1,AL,all,all,7.0,0,98,12,...,,,,,,,,,,
3,010000201706,McNeel Sch - Vacca Campus,1,AL,all,all,7.0,0,118,12,...,,,2.632403,,,,0.469368,,,
4,010000500870,Albertville Middle School,1,AL,all,all,7.5,0,12520,39,...,,0.16536,6.363105,-0.026981,,-0.256272,0.082592,0.027416,,0.155361
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
82006,729999200586,TOMAS ALBA EDISON,72,PR,all,all,5.0,0,36,3,...,,,,,,,,,,
82007,729999200776,CENTRO VOCACIONAL ESPECIAL,72,PR,all,all,7.0,0,20,7,...,,,,,,,,,,
82008,729999201270,TOMAS CARRION MADURO,72,PR,all,all,5.5,0,140,27,...,,,,,,,,,,
82009,729999201511,JOSE M. TORRES,72,PR,all,all,5.5,0,1505,48,...,,,,,,,,,,
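
The `sedasch` IDs in the table above are 12 characters wide. A minimal sketch of the padding step, run on two IDs copied from the outputs above rather than on the full file. `str.zfill` is used here as an assumed alternative to `str.rjust(12, "0")`; the two give the same result for digit-only strings.

```python
import pandas as pd

# Two IDs copied from the outputs above: one with 11 digits, one with 12.
ids = pd.Series(["10000201667", "729999200586"])

# Left-pad with zeros to a width of 12 characters.
padded = ids.str.zfill(12)

# Every ID should now be exactly 12 characters long.
assert (padded.str.len() == 12).all()
print(padded.tolist())  # ['010000201667', '729999200586']
```

A fixed-width string ID matters mainly when joining to other school-level tables keyed on the same 12-character identifier.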


In [7]:
seda.columns

Index(['sedasch', 'sedaschname', 'fips', 'stateabb', 'subcat', 'subgroup',
       'gradecenter', 'gap', 'tot_asmts', 'cellcount', 'mn_asmts',
       'gcs_mn_avg_ol', 'gcs_mn_coh_ol', 'gcs_mn_grd_ol', 'gcs_mn_mth_ol',
       'gcs_mn_avg_ol_se', 'gcs_mn_coh_ol_se', 'gcs_mn_grd_ol_se',
       'gcs_mn_mth_ol_se', 'gcs_mn_avg_eb', 'gcs_mn_coh_eb', 'gcs_mn_grd_eb',
       'gcs_mn_mth_eb', 'gcs_mn_avg_eb_se', 'gcs_mn_coh_eb_se',
       'gcs_mn_grd_eb_se', 'gcs_mn_mth_eb_se'],
      dtype='object')
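
The codebook describes `mn_asmts` as `tot_asmts/cellcount`. A small sketch of that ratio, using the `tot_asmts` and `cellcount` values from the first five rows shown earlier. The column name `mn_asmts_check` is hypothetical, and the real `mn_asmts` values are not displayed in this notebook, so the comparison against the file is left as a comment.

```python
import pandas as pd

# tot_asmts and cellcount from the first five rows of the table shown earlier.
toy = pd.DataFrame({
    "tot_asmts": [13, 2, 98, 118, 12520],
    "cellcount": [2, 1, 12, 12, 39],
})

# Recompute the ratio described in the codebook (hypothetical column name).
toy["mn_asmts_check"] = toy["tot_asmts"] / toy["cellcount"]
print(toy)

# On the full data, one possible check (assuming the ratio holds exactly):
# np.isclose(seda["mn_asmts"], seda["tot_asmts"] / seda["cellcount"]).all()
```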

## Variables Defined (from SEDA Codebook CSV)

**sedasch**: SEDA School ID

**sedaschname**: School Name

**fips**: State FIPS Code

**stateabb**: State Abbreviation

**subcat**: Subgroup Category

**subgroup**: Subgroup Case

**gradecenter**: Grade used for pooled centering

**gap**: Gap Estimate Indicator

**tot_asmts**: Total number of math + RLA tests for pooled estimates

**cellcount**: Total number of math + RLA cells for pooled estimates

**mn_asmts**: Per grade number of math + RLA cells for pooled estimates (tot_asmts/cellcount)

**gcs_mn_avg_ol**: School Mean SEDA EDFacts Test-Based Achievement Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_coh_ol**: School Cohort Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_grd_ol**: School Grade Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_mth_ol**: School Math-RLA Diff in Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_avg_ol_se**: School Standard Error (SE) of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_coh_ol_se**: School Standard Error (SE) of Cohort Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_grd_ol_se**: School Standard Error (SE) of Grade Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_mth_ol_se**: School Standard Error (SE) of Math-RLA Diff in Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Ordinary Least Squares (OLS) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_avg_eb**: School Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_coh_eb**: School Cohort Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_grd_eb**: School Grade Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_mth_eb**: School Math-RLA Diff in Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_avg_eb_se**: School Standard Error (SE) of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_coh_eb_se**: School Standard Error (SE) of Cohort Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_grd_eb_se**: School Standard Error (SE) of Grade Slope of Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)

**gcs_mn_mth_eb_se**: School Standard Error (SE) of Math-RLA Diff in Mean SEDA EDFacts Test-Based Achievement  Math&RLA, Empirical Bayes (EB) estimate,  Grade-Cohort Scale (GCS)
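
The sixteen `gcs_mn_*` names above follow one pattern: a statistic (`avg`, `coh`, `grd`, `mth`), an estimator (`ol`, `eb`), and an optional `_se` suffix. The sketch below decodes a column name into a short label. The helper `describe` and the wording of the labels are illustrative and not part of the SEDA codebook.

```python
import re

# Name fragments and their meanings, taken from the definitions above.
STAT = {
    "avg": "mean",
    "coh": "cohort slope",
    "grd": "grade slope",
    "mth": "math-RLA difference",
}
EST = {"ol": "OLS", "eb": "Empirical Bayes"}

PATTERN = re.compile(r"^gcs_mn_(avg|coh|grd|mth)_(ol|eb)(_se)?$")


def describe(col):
    """Return a short label for a gcs_mn_* column, or None for other columns."""
    m = PATTERN.match(col)
    if m is None:
        return None
    stat, est, se = m.groups()
    prefix = "SE of " if se else ""
    return f"{prefix}{STAT[stat]}, {EST[est]} estimate, GCS"


print(describe("gcs_mn_coh_eb_se"))  # SE of cohort slope, Empirical Bayes estimate, GCS
print(describe("sedasch"))           # None
```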

## Interpreting the Variables

### CS Scale vs. GCS Scale
The CS scale:
 - can describe aggregate change in test scores over time
 - does not enable absolute comparisons across grades

The GCS scale standardizes scores relative to the average difference in NAEP scores between students one grade level apart in a given cohort. One unit on this scale is interpreted as the average difference in skills between students one grade level apart in school.
 - useful for descriptive research aimed at broad audiences unfamiliar with standard deviation units, but may not be appropriate in all statistical analyses
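
A worked example of reading a GCS value in grade-level units. It rests on two assumptions that are not stated in this notebook: that a school's mean is centered at its `gradecenter`, and that a GCS value equal to the grade number corresponds to the reference average for that grade. The function name is hypothetical; the input values are the `gcs_mn_avg_eb` and `gradecenter` of Albertville Middle School from the table shown earlier.

```python
def grade_levels_from_reference(gcs_mean, grade):
    """Grade levels above (+) or below (-) the grade the mean is compared with.

    Assumes a GCS value equal to `grade` marks the reference average for that grade.
    """
    return gcs_mean - grade


# Albertville Middle School: gcs_mn_avg_eb = 6.363105, gradecenter = 7.5
gap = grade_levels_from_reference(6.363105, 7.5)
print(round(gap, 2))  # -1.14
```

Under those assumptions, the result would read as roughly 1.1 grade levels below the reference average.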

Open question: should the analysis use the CS scale or the GCS scale?
 - both scales provide the same variables

### Variables to Consider for Analysis

Consider using the "gcs_mn_avg_ol" variable for the analysis
 - the school mean of SEDA math and RLA test-based achievement (OLS estimate) on the GCS scale
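
Because many rows have missing estimates, one way to work with "gcs_mn_avg_ol" is to drop the missing rows and pair each estimate with its standard error. The sketch below uses made-up values for three hypothetical schools and a normal-approximation 95% interval (estimate ± 1.96 × SE), which is an assumption here and not a method prescribed by SEDA.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only; not taken from the SEDA file.
toy = pd.DataFrame({
    "sedaschname": ["School A", "School B", "School C"],
    "gcs_mn_avg_ol": [6.4, np.nan, 5.1],
    "gcs_mn_avg_ol_se": [0.08, np.nan, 0.25],
})

# Keep only schools with both an estimate and a standard error.
usable = toy.dropna(subset=["gcs_mn_avg_ol", "gcs_mn_avg_ol_se"]).copy()

# Normal-approximation 95% interval around the OLS estimate.
z = 1.96
usable["ci_low"] = usable["gcs_mn_avg_ol"] - z * usable["gcs_mn_avg_ol_se"]
usable["ci_high"] = usable["gcs_mn_avg_ol"] + z * usable["gcs_mn_avg_ol_se"]

print(usable[["sedaschname", "ci_low", "ci_high"]])
```

The same steps would apply to `seda` directly by replacing `toy` with the loaded data frame.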