add infrastructure for clearing repository and running seeds #5472

Merged
merged 3 commits into from
Mar 24, 2022

Contributor

@elrayle elrayle commented Feb 28, 2022

This puts in place a series of classes that support generation of seed data.

Pattern

The pattern handles 3 scenarios. NOTE: all paths to files start from app/utils.

  • clear existing repository metadata and data that is tightly coupled with the repository model (see data_maintenance.rb and files under data_destroyers)
  • generate required repository objects (e.g. default collection types, default admin set) (see required_data_seeder.rb and files under required_data_seeders)
  • generate repository metadata for release and development testing (see test_data_seeder.rb and files under test_data_seeders)
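
The three scenarios above can be sketched in plain Ruby as one small module per scenario. This is only an illustration of the shape of the pattern: the module names, the `new_store` helper, and the in-memory Hash standing in for the datastore and Solr are all hypothetical and are not taken from the files named in the list.

```ruby
# Hypothetical sketch; module names and data are illustrative, not Hyrax's.
# A Hash of arrays stands in for the datastore and Solr.
def new_store
  { collection_types: [], admin_sets: [], works: [] }
end

# Scenario 1: clear existing repository data.
module DataDestroyerSketch
  def self.call(store)
    store.each_value(&:clear)
    store
  end
end

# Scenario 2: generate the objects a repository needs in order to function.
module RequiredDataSeederSketch
  def self.call(store)
    add_once(store[:collection_types], "default collection type")
    add_once(store[:admin_sets], "default admin set")
    store
  end

  # Append the item only if it is not already present.
  def self.add_once(list, item)
    list << item unless list.include?(item)
  end
end

# Scenario 3: generate sample data for release and development testing.
module TestDataSeederSketch
  def self.call(store)
    RequiredDataSeederSketch.add_once(store[:works], "sample work")
    store
  end
end

store = new_store
RequiredDataSeederSketch.call(store)
TestDataSeederSketch.call(store)
puts store.inspect
DataDestroyerSketch.call(store)
puts store.inspect
```

Keeping each scenario behind its own entry point is what lets them run together or separately.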

Benefits

Benefits of this approach:

  • each scenario can be run together or separately
  • allows applications to easily override the details of the seeding process using normal customization approaches
  • establishes a pattern that applications can follow to create their own seed data
  • provides an easy means of adding additional scenarios (e.g. current scenarios are for testing; a site can also have seeds that drive static production data, like collection types and collections)

To run:

NOTE: These examples set environment variables from the command line. They can also be set in whatever way the app normally sets environment variables. Because seeding is generally not something you want to run all the time, setting the variables from the command line is likely to be the more common approach.

To list options without running anything...

$ bundle exec rails db:seed

To wipe data and generate required data...

$ bundle exec rails db:seed WIPE_DATA=true
####################################################################################

WARNING: You are about to clear all repository metadata from the datastore and solr.
Are you sure? [YES|n]
YES

To generate release data...

$ bundle exec rails db:seed SEED_RELEASE_TESTING=true

To run both...

$ bundle exec rails db:seed WIPE_AND_SEED_RELEASE_TESTING=true
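
As a rough illustration of how a seeds file could turn these environment variables into a list of steps, here is a minimal sketch. Only the three variable names come from the commands above; the `enabled?` and `seed_plan` helpers, the step names, and the rule that no flags means "list options" are assumptions made for the example, not the PR's implementation.

```ruby
# Hypothetical sketch; helper and step names are assumptions.
# Only the three environment variable names come from the commands above.

# True when the variable is set to "true" (case-insensitive).
def enabled?(env, key)
  env.fetch(key, "").to_s.strip.casecmp?("true")
end

# Build the ordered list of steps implied by the flags in env.
def seed_plan(env)
  both = enabled?(env, "WIPE_AND_SEED_RELEASE_TESTING")
  wipe = both || enabled?(env, "WIPE_DATA")
  seed = both || enabled?(env, "SEED_RELEASE_TESTING")

  plan = []
  plan.concat(%i[wipe_data seed_required_data]) if wipe
  plan << :seed_release_test_data if seed
  plan.empty? ? [:list_options] : plan
end

puts seed_plan(ENV).inspect
```

In this sketch the combined variable produces the same plan as setting `WIPE_DATA` and `SEED_RELEASE_TESTING` together.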

Related Work

Issue #5351
PR #5056

@samvera/hyrax-code-reviewers

@dlpierce
Contributor

I like the approach and that it allows seed specs to be written.

Could the combined option be replaced with running bundle exec rails db:seed WIPE_DATA=true SEED_RELEASE_TESTING=true?

@elrayle elrayle force-pushed the dev-data-seeds branch 2 times, most recently from 024bcd8 to 65abef5 Compare March 1, 2022 14:34
@dlpierce dlpierce mentioned this pull request Mar 2, 2022
@elrayle elrayle force-pushed the dev-data-seeds branch 2 times, most recently from ef1b0cc to a90b7cd Compare March 23, 2022 22:41
Also makes sure that collections are not created twice with the same title and collection type. This prevents duplicates if the seeds are run multiple times.
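
A duplicate guard of this kind can be sketched as a find-or-create keyed on title and collection type. The `CollectionSketch` struct and `CollectionRegistrySketch` class are hypothetical stand-ins for illustration; the commit's actual lookup is not shown in this conversation.

```ruby
# Hypothetical sketch; CollectionSketch and the registry are illustrative only.
CollectionSketch = Struct.new(:title, :collection_type)

class CollectionRegistrySketch
  attr_reader :collections

  def initialize
    @collections = []
  end

  # Return the existing collection with this title and type, or create one.
  def find_or_create(title:, collection_type:)
    existing = @collections.find do |c|
      c.title == title && c.collection_type == collection_type
    end
    return existing if existing

    CollectionSketch.new(title, collection_type).tap { |c| @collections << c }
  end
end

registry = CollectionRegistrySketch.new
2.times { registry.find_or_create(title: "Sample", collection_type: "sample type") }
puts registry.collections.size
```

Running the seeder a second time then returns the existing entry instead of adding a new one.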
@elrayle elrayle merged commit b9c8de8 into main Mar 24, 2022
@elrayle elrayle deleted the dev-data-seeds branch March 24, 2022 13:40
@dlpierce dlpierce added the notes-minor Release Notes: Non-breaking features label Mar 24, 2022
Labels
feedback needed notes-minor Release Notes: Non-breaking features