NCES public schools data 2020-21 upload #45941
Merged
126 changes: 126 additions & 0 deletions
bin/oneoff/nces_data/import_school_stats_by_years_2020_2021_ccd
@@ -0,0 +1,126 @@

```ruby
#!/usr/bin/env ruby

require_relative '../../../dashboard/config/environment'

CDO.log = Logger.new(STDOUT)

SURVEY_YEAR = '2020-2021'.freeze

DRY_RUN = true

VIRTUAL_SCHOOL_MAP = {
  'Full Virtual' => 'Yes',
  'Not Virtual' => 'No',
  'Supplemental Virtual' => 'No',
  'Virtual with face to face options' => 'Yes',
  '–' => nil,
  '†' => nil
}.freeze

TITLE_I_MAP = {
  '1-Title I targeted assistance eligible school-No program' => '1',
  '2-Title I targeted assistance school' => '2',
  '3-Title I schoolwide eligible-Title I targeted assistance program' => '3',
  '4-Title I schoolwide eligible school-No program' => '4',
  '5-Title I schoolwide school' => '5',
  '6-Not a Title I school' => '6',
  '–' => nil,
  '†' => nil
}.freeze

COMMUNITY_TYPE_MAP = {
  '11' => 'city_large',
  '12' => 'city_midsize',
  '13' => 'city_small',
  '21' => 'suburban_large',
  '22' => 'suburban_midsize',
  '23' => 'suburban_small',
  '31' => 'town_fringe',
  '32' => 'town_distant',
  '33' => 'town_remote',
  '41' => 'rural_fringe',
  '42' => 'rural_distant',
  '43' => 'rural_remote'
}.freeze

GRADES_MAP = {
  'Prekindergarten' => 'PK',
  'Kindergarten' => 'KG',
  '1st Grade' => '01',
  '2nd Grade' => '02',
  '3rd Grade' => '03',
  '4th Grade' => '04',
  '5th Grade' => '05',
  '6th Grade' => '06',
  '7th Grade' => '07',
  '8th Grade' => '08',
  '9th Grade' => '09',
  '10th Grade' => '10',
  '11th Grade' => '11',
  '12th Grade' => '12',
  '13th Grade' => '13',
  'Adult Education' => 'AE',
  'Ungraded' => 'UG',
  '–' => 'M',
  '†' => 'N'
}.freeze

# @param unsanitized [String, nil] the unsanitized string
# @return [String, nil] the sanitized version of the string, with equal signs and double
#   quotation marks removed. Returns nil on nil input.
def sanitize_string_for_db(unsanitized)
  unsanitized&.tr('="', '')
end

# Note: calling .to_i on the placeholder symbols '–' and '†' returns 0.

AWS::S3.process_file('cdo-nces', "#{SURVEY_YEAR}/ccd/schools_public.csv") do |filename|
  SchoolStatsByYear.merge_from_csv(filename, {col_sep: ",", headers: true, quote_char: "\x00", encoding: 'UTF-8'}, dry_run: DRY_RUN) do |row|
    # Remove the quotes and equals sign from values such as ="12345"
    row = row.to_h.map {|k, v| [k, sanitize_string_for_db(v)]}.to_h

    {
      school_id: row['School ID - NCES Assigned [Public School] Latest available year'].to_i.to_s,
      school_year: SURVEY_YEAR,
      grades_offered_lo: GRADES_MAP[row['Lowest Grade Offered [Public School] 2020-21']],
      grades_offered_hi: GRADES_MAP[row['Highest Grade Offered [Public School] 2020-21']],
      grade_pk_offered: row['Prekindergarten offered [Public School] 2020-21'] == '1-Yes',
      grade_kg_offered: row['Kindergarten offered [Public School] 2020-21'] == '1-Yes',
      grade_01_offered: row['Grade 1 offered [Public School] 2020-21'] == '1-Yes',
      grade_02_offered: row['Grade 2 offered [Public School] 2020-21'] == '1-Yes',
      grade_03_offered: row['Grade 3 offered [Public School] 2020-21'] == '1-Yes',
      grade_04_offered: row['Grade 4 offered [Public School] 2020-21'] == '1-Yes',
      grade_05_offered: row['Grade 5 offered [Public School] 2020-21'] == '1-Yes',
      grade_06_offered: row['Grade 6 offered [Public School] 2020-21'] == '1-Yes',
      grade_07_offered: row['Grade 7 offered [Public School] 2020-21'] == '1-Yes',
      grade_08_offered: row['Grade 8 offered [Public School] 2020-21'] == '1-Yes',
      grade_09_offered: row['Grade 9 offered [Public School] 2020-21'] == '1-Yes',
      grade_10_offered: row['Grade 10 offered [Public School] 2020-21'] == '1-Yes',
      grade_11_offered: row['Grade 11 offered [Public School] 2020-21'] == '1-Yes',
      grade_12_offered: row['Grade 12 offered [Public School] 2020-21'] == '1-Yes',
      grade_13_offered: row['Grade 13 offered [Public School] 2020-21'] == '1-Yes',

      virtual_status: VIRTUAL_SCHOOL_MAP[row['Virtual School Status (SY 2016-17 onward) [Public School] 2020-21']],
      # A total of 0 (which includes the placeholder symbols) is stored as nil
      students_total: row['Total Students All Grades (Excludes AE) [Public School] 2020-21'].presence.try {|v| v.to_i == 0 ? nil : v.to_i},
      student_am_count: row['American Indian/Alaska Native Students [Public School] 2020-21'].to_i,
      student_as_count: row['Asian or Asian/Pacific Islander Students [Public School] 2020-21'].to_i,
      student_hi_count: row['Hispanic Students [Public School] 2020-21'].to_i,
      student_bl_count: row['Black or African American Students [Public School] 2020-21'].to_i,
      student_wh_count: row['White Students [Public School] 2020-21'].to_i,
      student_hp_count: row['Nat. Hawaiian or Other Pacific Isl. Students [Public School] 2020-21'].to_i,
      student_tr_count: row['Two or More Races Students [Public School] 2020-21'].to_i,
      title_i_status: TITLE_I_MAP[row['Title I School Status [Public School] 2020-21']],
      frl_eligible_total: row['Free and Reduced Lunch Students [Public School] 2020-21'].to_i
    }
  end
end

AWS::S3.process_file('cdo-nces', "#{SURVEY_YEAR}/ccd/locale_public.csv") do |filename|
  SchoolStatsByYear.merge_from_csv(filename, {col_sep: ",", headers: true, quote_char: "\x00", encoding: 'UTF-8'}, dry_run: DRY_RUN) do |row|
    {
      school_id: row['NCESSCH'].to_i.to_s,
      school_year: SURVEY_YEAR,
      community_type: COMMUNITY_TYPE_MAP[row['LOCALE']]
    }
  end
end
```
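The script's handling of values such as `="12345"` can be tried in isolation. The sketch below copies `sanitize_string_for_db` from the diff and runs it on a made-up school ID (the sample value is an assumption, not taken from the NCES file). It also shows that the `.to_i.to_s` step used for `school_id` drops leading zeros.

```ruby
# Copied from the script above so this snippet runs on its own.
def sanitize_string_for_db(unsanitized)
  unsanitized&.tr('="', '')
end

# Hypothetical raw cell value, in the ="..." form the script's comment describes.
raw_id = '="010000500870"'

# String#tr with an empty replacement deletes every '=' and '"' character.
clean_id = sanitize_string_for_db(raw_id)

# Same normalization the script applies to school_id: leading zeros are dropped.
normalized_id = clean_id.to_i.to_s

puts clean_id        # prints 010000500870
puts normalized_id   # prints 10000500870
p sanitize_string_for_db(nil) # prints nil (safe navigation skips the call)
```

Because the IDs are normalized through `to_i.to_s`, any lookup against the stored `school_id` would presumably need the same normalization.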
Why do we need these characters in the hashes?
They're in the raw data as their own values. Do you have an example file, Clare?
Yes! Here are the files in Drive that explain the symbols. (I removed the extra rows at the top and bottom before uploading them to S3.)
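To illustrate how the `–` and `†` symbols behave in this thread's script, here is a small plain-Ruby sketch. `TITLE_I_SAMPLE` and `count_or_nil` are hypothetical names introduced only for this example. `count_or_nil` approximates the `students_total` expression without ActiveSupport's `presence` and `try`, so treat it as an assumption rather than the script's exact code.

```ruby
# Hypothetical excerpt of a lookup table that lists the placeholder symbols.
TITLE_I_SAMPLE = {
  '5-Title I schoolwide school' => '5',
  '–' => nil,
  '†' => nil
}.freeze

# Plain-Ruby approximation of the students_total logic: blank or zero becomes nil.
def count_or_nil(value)
  return nil if value.nil? || value.strip.empty?
  n = value.to_i
  n.zero? ? nil : n
end

# The symbols contain no leading digits, so String#to_i returns 0 for both.
p '–'.to_i                 # prints 0
p '†'.to_i                 # prints 0
p count_or_nil('†')        # prints nil
p count_or_nil('412')      # prints 412

# Hash#[] returns nil for a listed symbol and for an unlisted value alike;
# key? (or fetch) is what tells a known placeholder from unexpected data.
p TITLE_I_SAMPLE['†']              # prints nil
p TITLE_I_SAMPLE['unlisted']       # prints nil
p TITLE_I_SAMPLE.key?('†')         # prints true
p TITLE_I_SAMPLE.key?('unlisted')  # prints false

begin
  TITLE_I_SAMPLE.fetch('unlisted')
rescue KeyError => e
  puts "unexpected value: #{e.message}"
end
```

With `Hash#[]`, the `nil` entries mainly document that the symbols are expected input. In `GRADES_MAP` the entries do change the result, since the symbols map to `'M'` and `'N'` rather than `nil`.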