Scripts

Benjamin Shen edited this page Oct 28, 2022 · 4 revisions

Background

For CoursePlan, scripts are functions that are not used directly by the main web app. Rather, they are invoked by a developer/maintainer or scheduled to run (via a cloud function or GitHub workflow).

The first scripts on CoursePlan were used for the pre-computation part of the requirements algorithm. They lived directly in the src/ folder, and were invoked by developers whenever requirements fulfillment data changed or new course roster data was released.

Now, there are many more use cases for scripts, including scheduled functions and migration scripts. This document serves as a source of truth for best practices when creating and running scripts.

Structure

We use ts-node to run scripts, with tsconfig.node.json as the configuration file. This configuration file should be kept up to date (e.g., when updating Node versions). There is an npm script, npm run ts-node, which runs ts-node with the appropriate configuration file. Scripts that are called frequently or by external services can be added to package.json as new npm scripts.

Scripts that don’t depend on other code in src/ should be put into the scripts/ folder rather than the main source code. One-off scripts should always be committed and pushed to the codebase, so that other developers can verify their correctness and use them as examples for future scripts, and so that the TPM can run them on the production database if necessary. The author should clean up these scripts once they have been run everywhere they are needed.

Database scripts run at an elevated level, since they often read and write data indiscriminately across collections and users. Therefore, we use an admin Firebase configuration (via Firebase service accounts), which is different from the app-level Firebase configuration. There are different service accounts for different environments. To run a database script on production, the environment variable PROD must have the value 'true'. A dev can set this by running export PROD=true in their terminal. In GitHub workflows, a custom service account must be set, since there are no service account files in the workflow environment. To do this, set the environment variable SERVICE_ACCOUNT to an environment-specific service account stored in the repo’s GitHub secrets.
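The selection logic described above can be sketched as a small pure function. This is only an illustration, not the code CoursePlan uses: the function name, the result shape, and both file paths are hypothetical, and the sketch assumes that SERVICE_ACCOUNT, when set, takes precedence over any local file.

```typescript
// Hypothetical sketch: none of these names or paths come from the CoursePlan codebase.
type ServiceAccountSource =
  | { kind: 'inline'; json: string } // e.g. injected from GitHub secrets in a workflow
  | { kind: 'file'; path: string }; // e.g. a local service account file on a dev machine

interface ScriptEnv {
  PROD?: string;
  SERVICE_ACCOUNT?: string;
}

function resolveServiceAccount(env: ScriptEnv): {
  isProd: boolean;
  source: ServiceAccountSource;
} {
  // Only the exact string 'true' selects production, matching `export PROD=true`.
  const isProd = env.PROD === 'true';

  // Assumption: an explicitly provided service account wins over local files.
  if (env.SERVICE_ACCOUNT) {
    return { isProd, source: { kind: 'inline', json: env.SERVICE_ACCOUNT } };
  }

  // Hypothetical file names for the per-environment service accounts.
  const path = isProd ? './serviceAccountProd.json' : './serviceAccountDev.json';
  return { isProd, source: { kind: 'file', path } };
}
```

A script would call this once at startup (for example with process.env) and pass the result to the Firebase admin initialization. Treating anything other than 'true' as non-production means a mistyped value falls back to the safer environment.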

Scripts that use command line arguments should use the minimist package.
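As a rough illustration of that convention, a script's entry point might parse its arguments as follows. The option names (--user, --dry-run) and the parseScriptArgs helper are hypothetical, and the default-import form assumes esModuleInterop is enabled in the ts-node configuration.

```typescript
import minimist from 'minimist';

// Hypothetical options for an imaginary script; not taken from any CoursePlan script.
interface ScriptArgs {
  user?: string;
  dryRun: boolean;
  positional: string[];
}

function parseScriptArgs(argv: string[]): ScriptArgs {
  const parsed = minimist(argv, {
    string: ['user'], // always treat --user as a string, even if it looks numeric
    boolean: ['dry-run'], // --dry-run takes no value
    default: { 'dry-run': false },
  });
  return {
    user: parsed.user,
    dryRun: parsed['dry-run'],
    positional: parsed._.map(String), // arguments without a flag
  };
}

// Typical use inside a script: skip the node binary and the script path.
// const args = parseScriptArgs(process.argv.slice(2));
```

Wrapping the minimist call in a typed helper keeps the rest of the script free of untyped lookups, and a dry-run flag is a common safeguard for scripts that write to a database.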

Migration Scripts

The most common kind of database script is a database migration script. Common use cases include mapping old values to new values, and changing schemas. Database migration scripts should be kept in scripts/migration, and removed after the migration is completed if they are one-off scripts.

How to Run a Migration Script with Breaking Changes

Migration scripts with breaking changes (e.g., a requirement id change or schema change) are the trickiest to run, since they can break production. There are three options for running these scripts.

The first is to run the script soon after deployment to staging or production. This is almost never recommended, since there is a chance of causing database inconsistencies with any active users.

The second is to temporarily shut down CoursePlan (put it in “maintenance mode”). Then, we can run the script and commit code changes during this time frame. This option is not ideal, since it causes app downtime and puts developers under time pressure (which increases the risk of mistakes). However, it may be necessary for certain breaking changes, such as requirement id changes. If possible, these scripts should be postponed to periods of low user activity. It would be best to avoid these kinds of scripts altogether.

The third is to follow a series of steps that always keeps CoursePlan in a safe state. The downside is that this can take a long time (e.g., over the course of a semester), and the steps can vary depending on the kind of script.

As an example, the safe steps for a schema change are as follows.

Part 1, update schema with safety in lower environments

  • [code] start writing to and reading from the database with the new schema, while still handling reads with the old schema (this likely involves updating the Firestore types)
  • [deploy] merge code to main branch, deploy to staging, test staging environment

Part 2, migrate database in lower environments

  • [script] write and execute script in dev/staging for a mass migration from the old schema to the new schema, test staging environment

Part 3, update schema with safety and migrate database in production

  • [deploy] release code to production, test production environment
  • [script] execute the same script in production, test production environment

Part 4, remove safety in lower environments

  • [code] stop handling reads with the old schema (this likely involves cleaning up the Firestore types)
  • [deploy] merge code to main branch, deploy to staging, test staging environment

Part 5, cleanup database in lower environments

  • [script] write and execute script in dev/staging for cleanup (e.g., remove any old fields), test staging environment

Part 6, remove safety and cleanup database in production

  • [deploy] release code to production, test production environment
  • [script] execute the same script in production, test production environment
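The six parts above can be illustrated with three pure functions: a tolerant read (Parts 1 to 3), a mass-migration transform (Parts 2 and 3), and a cleanup transform (Parts 5 and 6). This is a minimal sketch under assumed names: the document shape, with an old type field being replaced by a new season field, is hypothetical and only loosely modeled on the "semester season" example linked in the Examples section. Database reads and writes are deliberately left out.

```typescript
// Hypothetical document shapes; the real Firestore types may differ.
interface TransitionalSemester {
  year: number;
  type?: string; // old field, present until cleanup
  season?: string; // new field, present after migration
}

interface NewSemester {
  year: number;
  season: string;
}

// Part 1: read with the new schema, but still handle documents in the old schema.
function readSeason(doc: TransitionalSemester): string {
  const season = doc.season ?? doc.type;
  if (season === undefined) {
    throw new Error('semester has neither season nor type');
  }
  return season;
}

// Parts 2 and 3: the per-document transform a mass-migration script would apply.
// It is idempotent, so running the script twice is harmless.
function migrateSemester(doc: TransitionalSemester): TransitionalSemester {
  if (doc.season !== undefined) {
    return doc; // already migrated
  }
  return { ...doc, season: readSeason(doc) };
}

// Parts 5 and 6: the per-document transform a cleanup script would apply,
// run only after the code no longer reads the old field (Part 4).
function cleanupSemester(doc: TransitionalSemester): NewSemester {
  return { year: doc.year, season: readSeason(doc) };
}
```

Between migration and cleanup every document carries both fields, so code that understands either schema keeps working no matter which step has been deployed.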

Scheduled Functions

Due to issues with Firebase functions in the past, we currently use GitHub workflows to run scheduled functions. For more information, see “Events that trigger workflows” in the GitHub Docs.

These scripts may or may not also read from or write to the database. The track-users script is an example of a scheduled function that is also a database script.

Other Scripts

Other kinds of scripts are likely to be added to CoursePlan. For example, there may soon be a script that updates the courses json automatically and is run manually via GitHub workflow dispatch.

Examples

Courses json generator script: course-plan/courses-json-generator.ts at master · cornell-dti/course-plan · GitHub

Requirements json generator script: course-plan/requirement-json-generator.ts at master · cornell-dti/course-plan · GitHub

Example of scheduled function: course-plan/track-users.ts at master · cornell-dti/course-plan · GitHub, course-plan/track-users.yml at master · cornell-dti/course-plan · GitHub

Example of database non-migration script: course-plan/copy-user-data.ts at master · cornell-dti/course-plan · GitHub

Example of database migration script without breaking changes: Add script for migrating old subject colors to new subject colors by bungaepyo · Pull Request #614 · cornell-dti/course-plan · GitHub

Example of database migration script with breaking changes: Add semester season property from semester type by benjamin-shen · Pull Request #593 · cornell-dti/course-plan · GitHub