Census: Add 29 2019-2020 state CS files #35199

Merged
hacodeorg merged 1 commit into staging from ha/census-2020-data on Jun 8, 2020

@hacodeorg hacodeorg commented Jun 8, 2020

Adds the state CS offering files for the 2019-2020 school year to Git.

Links

Files in Google Drive:

How to import a state CS offering CSV file

Step 1. Verify data

  • The file must have 3 columns: state_school_id, nces_id, course_id.
  • The file must not have a teacher_name_or_id column, and no other column may contain a teacher's first name, last name, or ID.
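The two checks above can be sketched as a small header check to run before importing. This is a minimal sketch, not code from this repository: header_problems and TEACHER_COLUMN_PATTERN are hypothetical names, and the pattern is only a guess at which header names might indicate teacher data. It checks that the required columns are present and inspects header names only, so cell contents still need a manual look.

```ruby
require 'csv'

# Columns the description above says every state CS offering file must have.
REQUIRED_COLUMNS = %w[state_school_id nces_id course_id].freeze

# Hypothetical pattern for header names that suggest teacher data.
# The real set of sensitive column names may differ.
TEACHER_COLUMN_PATTERN = /teacher|first_?name|last_?name/i

# Returns a list of human-readable problems found in the CSV header row.
# An empty list means the header passed both checks.
def header_problems(csv_text)
  headers = CSV.parse_line(csv_text).to_a.compact.map(&:strip)
  problems = []

  missing = REQUIRED_COLUMNS - headers
  problems << "missing columns: #{missing.join(', ')}" unless missing.empty?

  suspicious = headers.grep(TEACHER_COLUMN_PATTERN)
  problems << "possible teacher data in: #{suspicious.join(', ')}" unless suspicious.empty?

  problems
end
```

Calling header_problems(File.read(path)) on each new file and refusing to continue unless the result is empty would catch the most obvious mistakes early.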

Step 2. Test importing data

Option 1: Test locally

  • Seed SchoolDistrict and School data from S3 to the local machine by running SchoolDistrict.seed_from_s3; School.seed_from_s3. (State CS offering data requires valid matching School ids, which in turn require SchoolDistrict ids.)
  • If seeding fails because of mismatched data or missing foreign keys, an alternative is to dump the data from staging or production to the local machine.
  • Run Census::StateCsOffering.seed_from_csv. (It has a dry_run option, which parses and matches the data in the input file but writes nothing to the database.)
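To illustrate what a dry run buys you, here is a minimal standalone sketch of the pattern. It is not the actual Census::StateCsOffering.seed_from_csv implementation, whose signature is not shown here. import_offerings, known_school_ids, and store are hypothetical stand-ins for the importer, the seeded School table, and the database table, and matching on nces_id is an assumption.

```ruby
require 'csv'
require 'set'

# Hypothetical stand-in for an importer with a dry_run switch.
# known_school_ids plays the role of the seeded School table;
# store plays the role of the database table being written to.
def import_offerings(csv_text, known_school_ids, store, dry_run: false)
  matched = []
  unmatched = []

  CSV.parse(csv_text, headers: true).each do |row|
    offering = row.to_h.slice('state_school_id', 'nces_id', 'course_id')
    # Assumption: a row matches when its nces_id is a known School id.
    if known_school_ids.include?(offering['nces_id'])
      matched << offering
    else
      unmatched << offering
    end
  end

  # A dry run reports what would happen but skips the write.
  store.concat(matched) unless dry_run

  {matched: matched.size, unmatched: unmatched.size, written: dry_run ? 0 : matched.size}
end
```

Running with dry_run: true first shows how many rows fail to match a school before anything is written, which is the point at which a data or code fix is cheapest.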

Option 2: Test in production-console using a file in S3

This is the easiest way to import new data if you believe the data is clean and do not expect to make any code changes. See details in #34925.

Step 3. Deploy

  • Deploy any code changes to production first.
  • Upload the CSV file to the S3 cdo-census bucket after the code changes are in production. The new file will be imported automatically to staging and production.
  • Add the CSV file to the Git folder dashboard/config/state_cs_offerings. Double-check that the file doesn't contain any personal data!

Step 4. Verify results after import

From a Rails console in production-console:

# Count offerings for one school year whose state_school_id starts with the given state code.
count_offers = lambda { |state, year| Census::StateCsOffering.where(school_year: year).where('state_school_id LIKE ?', "#{state}%").count }
count_offers.call('CA', 2018)
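To have something to compare those database counts against, one option is to tally the rows in the source CSV per state before importing. The sketch below is a hypothetical helper, not part of the codebase. It assumes, as the 'CA' prefix query suggests, that state_school_id begins with a two-letter state code.

```ruby
require 'csv'

# Tallies CSV rows per state, assuming the first two characters of
# state_school_id are the state code (as the LIKE 'CA%' query implies).
def rows_per_state(csv_text)
  CSV.parse(csv_text, headers: true).each_with_object(Hash.new(0)) do |row, counts|
    state = row['state_school_id'].to_s[0, 2].upcase
    counts[state] += 1
  end
end
```

The database count can legitimately be lower than the file count if some rows fail to match a school, so a gap is a prompt to investigate and not necessarily an error.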

What's next

Tracked by PLC-908.
Currently, access report data (state AP offerings and AP CS offerings) lives in S3, Google Drive, and Git. We would prefer to make Git the source of truth, because it tracks versions, and to stop using S3. (Google Drive can still serve as a staging area for sharing and editing data.)

Several items to tackle are:

  • Move all School, SchoolDistrict, ApSchoolCode, ApCSOffering, and StateCSOffering CSV files from S3 to Git. Ensure that they contain no PII.
  • Seed efficiently: don't re-seed files when there is no need to. (A full seed can take at least several minutes.)
  • Decide which environments need fully seeded data and which do not.
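One possible way to avoid needless re-seeding, offered only as a sketch of the idea and not as a decided design, is to remember a content digest per file and skip files whose digest has not changed. seed_if_changed and seen_digests are hypothetical names, and the block stands in for whatever performs the real seeding.

```ruby
require 'digest'

# Hypothetical helper: seeds a file only when its content has changed
# since the last run. seen_digests maps file name => last seeded SHA-256.
# Returns true if the block ran, false if the file was skipped.
def seed_if_changed(name, content, seen_digests)
  digest = Digest::SHA256.hexdigest(content)
  return false if seen_digests[name] == digest

  yield content
  seen_digests[name] = digest
  true
end
```

In practice the digests would need to be persisted somewhere that survives between runs. Where to keep them is left open here.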

@hacodeorg hacodeorg requested review from a team, bencodeorg and breville June 8, 2020 20:10
@hacodeorg hacodeorg marked this pull request as ready for review June 8, 2020 20:10
@hacodeorg hacodeorg merged commit 6775850 into staging Jun 8, 2020
@hacodeorg hacodeorg deleted the ha/census-2020-data branch June 8, 2020 20:41