Add new metadata queue. #715

Merged (8 commits into freedomofpress:master on Jan 27, 2020)

Conversation

ntoll (Contributor) commented Jan 21, 2020

Description

Towards #652.

This does exactly what it says on the tin. Ignore my rather naive questions on the original ticket. Once I dug deeper I realised the suggested solution was relatively simple and followed an existing precedent with the file download queue.

Test Plan

Updated unit tests as per changes to the code.

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs, or network traffic (via the RPC service), Qubes testing in the staging environment is required. For fine-tuning of the graphical user interface, testing in any Qubes environment is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

If these changes add or remove files other than client code, packaging logic (e.g., the AppArmor profile) may need to be updated. Please check as applicable:

  • I have submitted a separate PR to the packaging repo
  • No update to the packaging logic (e.g., AppArmor profile) is required for these changes
  • I don't know and would appreciate guidance

ntoll (Contributor, Author) commented Jan 21, 2020

NOTE: the CI failure is down to an error (missing keys?) in the Debian packaging process.

emkll (Contributor) commented Jan 21, 2020

@ntoll I think you will need to rebase on latest master, due to changes introduced in #701

ntoll (Contributor, Author) commented Jan 21, 2020

Aha... thanks for the heads up.

eloquence (Member) commented

(Edited PR body to not resolve #652, as the scope of that issue is a bit larger, including an increase in the frequency of syncs, but it's fine to split that into multiple PRs.)

sssoleileraaa (Contributor) left a comment

The scope of the issue says:

What's in scope for this issue?

So the scope of this issue is to run MetadataSyncJob outside of the main queue (until we add async job support) and to sync more frequently to make up for removing syncs from other areas of the code. What still needs to be decided is:

* How often to sync (idea: every 15 seconds or until a MetadataSyncJob is successful, whichever one is longer)

* Whether or not to create another queue just for MetadataSyncJob processing (using a queue fits within our current architecture of using queues for processing jobs, however, it would also make a lot of sense to _not_ use a queue since we don't ever need to line up more than one MetadataSyncJob at a time)

So this PR needs to increase the frequency of background syncs.

sssoleileraaa (Contributor) commented

  • Whether or not to create another queue just for MetadataSyncJob processing (using a queue fits within our current architecture of using queues for processing jobs, however, it would also make a lot of sense to not use a queue since we don't ever need to line up more than one MetadataSyncJob at a time)

I personally think not using a queue is the right approach, but if you choose to use a queue then limiting it to one job at a time might work.
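
For illustration, a minimal sketch of the "only one job at a time" option discussed here, assuming a hypothetical MetadataQueue wrapper around a single-slot PriorityQueue (the real securedrop-client queue classes and signatures differ):

from queue import Full, PriorityQueue


class MetadataQueue:
    """Hypothetical single-slot queue: never holds more than one MetadataSyncJob."""

    def __init__(self) -> None:
        # maxsize=1 means a second enqueue attempt while a job is already
        # waiting is refused, so redundant metadata syncs never pile up.
        self.queue: PriorityQueue = PriorityQueue(maxsize=1)

    def enqueue(self, priority: int, job: object) -> bool:
        try:
            self.queue.put_nowait((priority, job))
            return True
        except Full:
            # A MetadataSyncJob is already queued; dropping the duplicate is fine.
            return False

Limiting the queue to a single slot gives the same practical guarantee as not using a queue at all: at most one metadata sync is ever pending.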

ntoll (Contributor, Author) commented Jan 22, 2020

@creviera yeah... this was discussed in yesterday's stand-up, but not yet reflected here. 👍 (Apologies for the opacity).

sssoleileraaa (Contributor) left a comment

Since we need to ensure that only one job runs at a time, you can either try redshiftzero's idea: #652 (comment) or just run this job outside of a queue (not everything needs to be in a queue)

sssoleileraaa (Contributor) commented

> @creviera yeah... this was discussed in yesterday's stand-up, but not yet reflected here. +1 (Apologies for the opacity).

Sorry, just to be clear: if you want to increase the frequency of syncs in a separate PR, that sounds good to me (smaller PRs ftw!). I updated my review comment with only one requested change.

sssoleileraaa (Contributor) commented

Just discussed pausing and resuming queues with @ntoll: once we call resume_queues on a successful metadata sync instead of on failure (relying on the fact that the metadata sync has the highest priority and will keep running until it succeeds), this will close #672.

self.sync_update = QTimer()
self.sync_update.timeout.connect(self.sync_api)
- self.sync_update.start(1000 * 60 * 5)  # every 5 minutes.
+ self.sync_update.start(1000 * 60)  # every minute.
Contributor commented

This time seems good for now; we can increase sync frequency once all sync_api calls have been removed.

Contributor commented

Also: you need to remove the line of code that resumes the queues in on_sync_failure, now that the sync runs outside of the main queue and we don't need to unpause that queue in order to run a MetadataSyncJob. We should never pause the metadata sync queue.

Contributor commented

One more note: while offline you can stop the thread and restart it on login, which will take care of #671, since you can remove the sync_api call on login and instead just start the metadata sync thread.
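
As a rough sketch of that suggestion (class and method names here are hypothetical; the actual Controller wiring in securedrop-client differs), the metadata sync timer could be stopped on logout and restarted on login, replacing the explicit sync_api call:

from PyQt5.QtCore import QObject, QTimer


class SyncScheduler(QObject):
    """Hypothetical helper: runs the metadata sync on a timer only while logged in."""

    def __init__(self, sync_fn, interval_ms: int = 1000 * 60) -> None:
        super().__init__()
        self._interval_ms = interval_ms
        self._timer = QTimer(self)
        self._timer.timeout.connect(sync_fn)

    def on_login(self) -> None:
        # Starting the timer replaces the explicit sync_api call on login;
        # the first sync then fires after one interval.
        self._timer.start(self._interval_ms)

    def on_logout(self) -> None:
        # While offline there is nothing useful to sync, so stop the timer.
        self._timer.stop()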

ntoll (Contributor, Author) commented Jan 23, 2020

I've removed the queue resumption code as requested. The QTimer will kick in (after a minute) and try another MetadataSyncJob.

See my comment in #671.


self.main_queue.pinged.connect(self.resume_queues)
self.download_file_queue.pinged.connect(self.resume_queues)
self.metadata_queue.pinged.connect(self.resume_queues)
Contributor commented

This queue doesn't need to pause or resume, so you can remove this as well as the pinged signal. Instead, you can just call resume_queues from the Controller's on_sync_success if the queues are paused.
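
A hedged sketch of what that could look like on the Controller side (an excerpt only; the real securedrop-client methods take more arguments, and resume_queues on the queue manager is assumed from the snippet quoted further down):

class Controller:
    """Hypothetical excerpt: unpause the other queues only after a sync succeeds."""

    def __init__(self, api_job_queue) -> None:
        self.api_job_queue = api_job_queue

    def on_sync_success(self) -> None:
        # A successful MetadataSyncJob proves the server is reachable again,
        # so this is the moment to resume the main and file-download queues.
        self.api_job_queue.resume_queues()

    def on_sync_failure(self, exception: Exception) -> None:
        # No resume here: the QTimer will enqueue another MetadataSyncJob in a
        # minute, and the other queues stay paused until a sync succeeds.
        pass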

ntoll (Contributor, Author) commented

I'm not sure I follow.

The queue decides for itself (see the exception handling in the process method of the RunnableQueue class). Therefore, the metadata queue could be paused if it encounters a problem (which is a good thing). However, the update timer will restart the queue after X period of time, right...?

In which case, I'd argue these signals still need to be there. IYSWIM.

sssoleileraaa (Contributor) commented Jan 23, 2020

So currently we never stop metadata syncs when they fail because of a request timeout; we want them to run in the background until they succeed, which is why it doesn't make sense to pause the metadata sync queue. It looks like your code will try to enqueue another MetadataSyncJob after 60 seconds, but it will be dropped because the queue will be full and paused. The right behavior is to not pause the metadata queue, and that also makes the code simpler: you can remove the pinged signal and the complicated logic around unpausing the metadata sync queue when we want to keep reaching out to the server to see if we can unpause the other queues. See #671 (comment) for an idea on how to make this much simpler.

@sssoleileraaa sssoleileraaa moved this from Ready for Review to In Development in SecureDrop Team Board Jan 22, 2020

self.main_queue.paused.connect(self.on_queue_paused)
self.download_file_queue.paused.connect(self.on_queue_paused)
self.metadata_queue.paused.connect(self.on_queue_paused)

self.main_queue.pinged.connect(self.resume_queues)
Contributor commented

You can also remove the main_queue and download_file_queue pinged signals, because there is already a resume signal emitted by the queue manager; see:

def resume_queues(self) -> None:
    logger.info("Resuming queues")
    self.start_queues()
    self.main_queue.resume.emit()
    self.download_file_queue.resume.emit()

So once you remove these three lines of code, add self.resume_queues() in on_sync_success, and update the queue manager's resume function to check whether a queue is paused before emitting the resume signal, this PR should be good to go.
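
For illustration, one way the queue manager's resume function could add that check (a sketch only; the paused flag on each queue is an assumption, not necessarily the real attribute name):

import logging

logger = logging.getLogger(__name__)


class ApiJobQueue:
    """Hypothetical excerpt of the queue manager's resume logic."""

    def resume_queues(self) -> None:
        logger.info("Resuming queues")
        self.start_queues()
        # Only wake a queue that is actually paused, so a resume triggered by a
        # successful sync is a no-op for queues that never stopped.
        if self.main_queue.paused:
            self.main_queue.resume.emit()
        if self.download_file_queue.paused:
            self.download_file_queue.resume.emit()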

Contributor commented

Again, this is what closes #672; see #715 (comment).

ntoll (Contributor, Author) commented

OK... I've made the changes you've requested, updated the unit tests, and rebased with latest master. 👍

sssoleileraaa (Contributor) left a comment

The metadata sync queue should never pause, so you should remove the resume signal, but otherwise this LGTM and WFM.

@sssoleileraaa sssoleileraaa merged commit eb9ce9c into freedomofpress:master Jan 27, 2020
SecureDrop Team Board automation moved this from In Development to Done Jan 27, 2020