Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sources.filesystem_id non-nullable #6350

Merged
merged 1 commit into from Apr 19, 2022

Conversation

legoktm
Copy link
Member

@legoktm legoktm commented Mar 18, 2022

Status

Ready for review

Description of Changes

We already set this field all the time and in all places expect it to be
a string, not checking if its None first.

Discovered by the mypy work I'm doing in #6346.

Refs #6226.

Testing

  • Before applying this patch, create a source, log in with it. Then check out this branch, in the dev container run alembic upgrade head to apply this schema change, and try to log in with the source. Verify you can log in properly.
  • CI passes

Deployment

Any special considerations for deployment? No, just a normal schema change.

Checklist

  • Linting (make lint) and tests (make test) pass in the development container
  • I have written a test plan and validated it for this PR
  • These changes do not require documentation

legoktm added a commit that referenced this pull request Mar 18, 2022
* Upgrade mypy to 0.941 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
@codecov-commenter
Copy link

codecov-commenter commented Mar 21, 2022

Codecov Report

Merging #6350 (fc72191) into develop (3e39d73) will decrease coverage by 0.04%.
The diff coverage is 69.23%.

@@             Coverage Diff             @@
##           develop    #6350      +/-   ##
===========================================
- Coverage    84.00%   83.95%   -0.05%     
===========================================
  Files           61       62       +1     
  Lines         4295     4307      +12     
  Branches       522      522              
===========================================
+ Hits          3608     3616       +8     
- Misses         564      568       +4     
  Partials       123      123              
Impacted Files Coverage Δ
...ns/b7f98cfd6a70_make_filesystem_id_non_nullable.py 66.66% <66.66%> (ø)
securedrop/models.py 89.57% <100.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

legoktm added a commit that referenced this pull request Mar 21, 2022
* Upgrade mypy to 0.941 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
legoktm added a commit that referenced this pull request Mar 22, 2022
* Upgrade mypy to 0.941 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
legoktm added a commit that referenced this pull request Mar 23, 2022
* Upgrade mypy to 0.941 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
@legoktm
Copy link
Member Author

legoktm commented Mar 24, 2022

I tried to git blame on whether we were ever intentionally writing NULLs to filesystem_id and didn't really find anything (but also the filesystem_id column dates back to the initial SQLAlchemy commit, so it's super old!).

I am relatively confident that there aren't NULLs lying around because SD would blow up pretty quickly in the app code if filesystem_id was NULL, so moving this to ready for review.

@legoktm legoktm marked this pull request as ready for review March 24, 2022 20:56
@legoktm legoktm requested a review from a team as a code owner March 24, 2022 20:57
@legoktm legoktm added this to Ready for Review in SecureDrop Team Board Mar 24, 2022
legoktm added a commit that referenced this pull request Mar 31, 2022
* Upgrade mypy to 0.942 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* Since we need to install extra packages and need other dependencies
  like Flask, move mypy into test-requirements.txt and run it via the
  dev container.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
Fixes #6358.
legoktm added a commit that referenced this pull request Mar 31, 2022
* Upgrade mypy to 0.942 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* Since we need to install extra packages and need other dependencies
  like Flask, move mypy into test-requirements.txt and run it via the
  dev container.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
Fixes #6358.
@zenmonkeykstop
Copy link
Contributor

Poked around a bit as well (including manually setting a source's filesystem_id to null and seeing what happens) - returning users are authenticated by generating the expected filesystem_id and querying the database for it. So, if a source's filesystem_id field is ever set to null, they can't log back in.

Probably what this means for this PR is that just deleting sources with a null filesystem_id value is a valid migration strategy, as their submissions are already effectively orphaned and will get swept up in the nightly jobs to detect disconnected files/entries.

@legoktm
Copy link
Member Author

legoktm commented Mar 31, 2022

Deleting a source is not the easiest thing in alembic because we need to clean up referential tables, but it seems doable.

@legoktm legoktm marked this pull request as draft March 31, 2022 21:41
@legoktm
Copy link
Member Author

legoktm commented Mar 31, 2022

Moving back to draft state, I pushed my work from today but there's still a bit more to do in the migration, we delete stuff out of sources, replies, submissions, but aren't then deleting stuff that refer to those replies and submissions...

@legoktm legoktm force-pushed the filesystem-id-non-null branch 2 times, most recently from f5d52e6 to 27515a2 Compare April 1, 2022 04:29
@legoktm legoktm marked this pull request as ready for review April 1, 2022 04:39
@legoktm legoktm force-pushed the filesystem-id-non-null branch 2 times, most recently from a160a6b to 0fb2dd9 Compare April 7, 2022 22:12
legoktm added a commit that referenced this pull request Apr 7, 2022
* Upgrade mypy to 0.942 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* Since we need to install extra packages and need other dependencies
  like Flask, move mypy into test-requirements.txt and run it via the
  dev container.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
Fixes #6358.
We already set this field all the time and in all places expect it to be
a string, not checking if its None first.

The database migration will delete any sources that have a NULL filesystem_id,
plus anything that references those sources.

Discovered by the mypy work I'm doing in #6346.

Refs #6226.
legoktm added a commit that referenced this pull request Apr 12, 2022
* Upgrade mypy to 0.942 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* Since we need to install extra packages and need other dependencies
  like Flask, move mypy into test-requirements.txt and run it via the
  dev container.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
Fixes #6358.
Copy link
Contributor

@zenmonkeykstop zenmonkeykstop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested as follows:

  • set up 2.3.1 prod VM environment, created multiple sources, nulled out the filesystem ID for one of them.
  • Confirmed that a 500 error is thrown on the All Sources page due to the missing filesystem_id (so this unlikely scenario would have been immediately noticeable)
  • used the upgrade scenario procedure to update the prod instance with debs built from this branch
  • confirmed that the All Sources page no longer 500s out and displays only the non-nulled sources.
  • confirmed that replies and message from those sources are still available for download
  • confirmed that the nulled source was deleted from the database
  • confirmed that the filesystem_id field has nullable=false applied.

LGTM!

@zenmonkeykstop zenmonkeykstop merged commit b5810d3 into develop Apr 19, 2022
SecureDrop Team Board automation moved this from Ready for Review to Done Apr 19, 2022
@legoktm legoktm deleted the filesystem-id-non-null branch April 19, 2022 15:16
legoktm added a commit that referenced this pull request Apr 19, 2022
* Upgrade mypy to 0.942 and sqlalchemy-stubs to 0.4. Type stubs are no
  longer bundled with mypy, so we need to explicitly install various
  types-{name} packages.
* Since we need to install extra packages and need other dependencies
  like Flask, move mypy into test-requirements.txt and run it via the
  dev container.
* The main issue spotted by the SQLAlchemy plugin, that
  sources.filesystem_id is nullable, is being fixed in a separate PR:
  #6350.
* Drop the explicit type comments now that mypy can infer them.
* Use `--explicit-package-bases` when running mypy, so it treats the
  securedrop/admin directories as modules regardless of whether
  `__init__.py` files exist. This is needed so the two tests/
  directories don't conflict with each other.
* Add missing return types to all schema changes and the templates that
  generate them.
* In the `create_source_uuid_column` migration, drop the use of
  `quoted_name()`. For some reason mypy doesn't like it, but more
  importantly we don't need it in SQLite and we don't use it in any
  other schema change.
* Make SecureTemporaryFile.write()'s return type match its parent by
  returning the number of bytes that were written.

Fixes #6346.
Fixes #6358.
@zenmonkeykstop zenmonkeykstop mentioned this pull request May 10, 2022
35 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants