feat!(backend): add column in backend for raw original data that should never be modified #6254

Merged
anna-parker merged 4 commits into main from add_new_column
Apr 9, 2026

@anna-parker anna-parker commented Apr 9, 2026

#3262

Adds a new field to the database called unprocessed_data. For now it is populated with a copy of the original_data jsonb column. unprocessed_data will be used instead of original_data for calls to preprocessing and for requests for original data. original_data, on the other hand, will no longer be used and serves only as a backup of the initial metadata we received. This will simplify reprocessing, as we can migrate unprocessed_data in a db migration which is not stored in the code, but still keep a record of the data originally submitted to us.
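
A minimal sketch of the backfill described above, written against Python's built-in sqlite3 so that it runs anywhere. The real backend stores these as jsonb columns and its actual migration is not shown in this conversation. The simplified table layout (an `accession` key plus the data columns, with JSON kept as text) and the accession value are assumptions made for illustration.

```python
import json
import sqlite3


def add_unprocessed_data_column(conn: sqlite3.Connection) -> None:
    """Add unprocessed_data and backfill it with a copy of original_data."""
    conn.execute("ALTER TABLE sequence_entries ADD COLUMN unprocessed_data TEXT")
    conn.execute("UPDATE sequence_entries SET unprocessed_data = original_data")
    conn.commit()


# Assumed, simplified schema: the real table has more columns and uses jsonb.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequence_entries (accession TEXT PRIMARY KEY, original_data TEXT)"
)
conn.execute(
    "INSERT INTO sequence_entries VALUES (?, ?)",
    ("LOC_TEST_1", json.dumps({"metadata": {"country": "Switzerland"}})),
)

add_unprocessed_data_column(conn)

# A later reprocessing migration would rewrite only unprocessed_data,
# leaving original_data as the untouched record of the submission.
rows = conn.execute(
    "SELECT original_data, unprocessed_data FROM sequence_entries"
).fetchall()
```

After the backfill both columns hold the same value for every row, which is the state the review further down checks for.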

The biggest code change is the subsequent field renaming that I did to make the distinction between original and unprocessed data clearer in the code.

Breaking change

The get-original-data endpoint is now called get-unprocessed-data.
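
For clients that still have the old endpoint name hard-coded, a rough sketch of a rename helper follows. Only the two endpoint names come from this PR. The host, the path prefix and the helper name `migrate_endpoint_url` are hypothetical.

```python
OLD_ENDPOINT = "get-original-data"
NEW_ENDPOINT = "get-unprocessed-data"


def migrate_endpoint_url(url: str) -> str:
    """Rewrite a URL that still targets the old endpoint name."""
    return url.replace(OLD_ENDPOINT, NEW_ENDPOINT)


# Hypothetical host and path prefix, for illustration only.
old_url = "https://backend.example.org/some-organism/get-original-data"
new_url = migrate_endpoint_url(old_url)
```

The helper leaves URLs that already use the new name unchanged, so it is safe to apply more than once.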

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by appropriate, automated tests.
  • Any manual testing that has been done is documented (i.e. what exactly was tested?)
    Preview comes up; revision (which uses the affected endpoint) via the web page works as intended: LOC_00001V8.2

🚀 Preview: https://add-new-column.loculus.org

@anna-parker anna-parker added the preview label (Triggers a deployment to argocd) Apr 9, 2026
@anna-parker anna-parker marked this pull request as ready for review April 9, 2026 10:26
@anna-parker anna-parker changed the title from "feat: add column in backend for raw data" to "feat(backend): add column in backend for raw data" Apr 9, 2026
claude bot commented Apr 9, 2026

Claude encountered an error.

I'll analyze this and get back to you.

@anna-parker anna-parker changed the title from "feat(backend): add column in backend for raw data" to "feat(backend): add column in backend for raw original data that should never be modified" Apr 9, 2026
…he backend now exposes unprocessed metadata and stores original metadata internally)
@anna-parker anna-parker changed the title from "feat(backend): add column in backend for raw original data that should never be modified" to "feat!(backend): add column in backend for raw original data that should never be modified" Apr 9, 2026
@maverbiest maverbiest left a comment

Looks good!

Connected to the DB via port forwarding. Checked that this query returns all rows:

SELECT * FROM public.sequence_entries
WHERE original_data = unprocessed_data;

And that this query returns no rows:

SELECT * FROM public.sequence_entries
WHERE original_data != unprocessed_data;

I also did spot checks on the preview web page and data seems to be coming up fine, with all the data as expected in the sequence details view I checked.
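
One general SQL caveat for checks of this kind: `=` and `!=` both evaluate to NULL when either side is NULL, so a row with a missing copy would show up in neither query unless the first result is compared against the total row count. The sketch below illustrates this with Python's built-in sqlite3 on a made-up two-row table. It is not a rerun of the check above, and the simplified schema is an assumption. SQLite spells the NULL-safe comparison `IS NOT`; in PostgreSQL the equivalent is `IS DISTINCT FROM`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema with JSON stored as text.
conn.execute(
    "CREATE TABLE sequence_entries ("
    "id INTEGER PRIMARY KEY, original_data TEXT, unprocessed_data TEXT)"
)
conn.executemany(
    "INSERT INTO sequence_entries (original_data, unprocessed_data) VALUES (?, ?)",
    [
        ('{"a": 1}', '{"a": 1}'),  # copied correctly
        ('{"b": 2}', None),        # copy missing: NULL on one side
    ],
)

# Plain != yields NULL when either side is NULL, so the broken row is skipped.
plain_mismatches = conn.execute(
    "SELECT COUNT(*) FROM sequence_entries WHERE original_data != unprocessed_data"
).fetchone()[0]

# IS NOT treats NULL as a comparable value, so the broken row is reported.
null_safe_mismatches = conn.execute(
    "SELECT COUNT(*) FROM sequence_entries "
    "WHERE original_data IS NOT unprocessed_data"
).fetchone()[0]
```

Here the plain comparison counts no mismatches while the NULL-safe one counts the row with the missing copy.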

@anna-parker

I guess it could now be a tad confusing that the prepro pipeline calls extract-unprocessed-data to get unprocessed data (previously original data), which updates the status to IN_PROCESSING, while get-unprocessed-data (previously get-original-data) does the same thing, just without setting the status to IN_PROCESSING.
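
To make the difference concrete, here is a toy in-memory model of the two calls. It is only a sketch of the behaviour described in this comment, not the backend's implementation: the `ToyBackend` class, the `limit` parameter, the initial status name `RECEIVED` and the rule that extraction only picks up entries in that status are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    accession: str
    unprocessed_data: dict
    status: str = "RECEIVED"  # assumed initial status name


class ToyBackend:
    """Hypothetical stand-in for the backend, for illustration only."""

    def __init__(self, entries):
        self.entries = {e.accession: e for e in entries}

    def get_unprocessed_data(self):
        """Read-only: return the data without touching any status."""
        return [e.unprocessed_data for e in self.entries.values()]

    def extract_unprocessed_data(self, limit):
        """Hand out up to `limit` waiting entries and mark them IN_PROCESSING."""
        waiting = [e for e in self.entries.values() if e.status == "RECEIVED"]
        batch = waiting[:limit]
        for entry in batch:
            entry.status = "IN_PROCESSING"
        return [e.unprocessed_data for e in batch]


backend = ToyBackend([Entry("A", {"x": 1}), Entry("B", {"x": 2})])

read_only = backend.get_unprocessed_data()
statuses_after_get = [e.status for e in backend.entries.values()]

batch = backend.extract_unprocessed_data(limit=1)
statuses_after_extract = [e.status for e in backend.entries.values()]
```

In this model the read-only call can be repeated freely, while each extraction claims entries so that a second pipeline run would not receive them again.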

This is a preexisting naming issue, but it might need a second thought. @theosanderson, what do you think?

@anna-parker anna-parker merged commit 907cd5d into main Apr 9, 2026
53 of 54 checks passed
@anna-parker anna-parker deleted the add_new_column branch April 9, 2026 14:11
Labels

preview (Triggers a deployment to argocd), update_db_schema

3 participants