Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

app: add pending reply status, persist replies in the database #578

Merged
merged 12 commits into from Nov 7, 2019

Conversation

redshiftzero
Copy link
Contributor

@redshiftzero redshiftzero commented Oct 21, 2019

Description

Fixes #350.
Fixes #294.

What it looks like now when you send replies (the failed replies will persist between application restarts and clicking between sources):

pending_status

For a followup:

Test Plan

To generate lots of failed replies and introduce delay I'm testing using the following server diff:

diff --git a/securedrop/journalist_app/api.py b/securedrop/journalist_app/api.py
index aa776d520..8eb8f26c8 100644
--- a/securedrop/journalist_app/api.py
+++ b/securedrop/journalist_app/api.py
@@ -230,6 +230,13 @@ def make_blueprint(config):
                 {'replies': [reply.to_json() for
                              reply in source.replies]}), 200
         elif request.method == 'POST':
+            import time
+            time.sleep(2)
+
+            import random
+            fail_early = random.choice([True, False])
+            if fail_early:
+                abort(409, 'That UUID is already in use.')
             source = get_or_404(Source, source_uuid,
                                 column=Source.uuid)
             if request.json is None:

You can use that or you can just use staging and rely on regular ol tor network issues to:

  1. Send a reply (or a few) to a source (source A) until one fails.
  2. Click to another source (source B).
  3. Click back to the source A. Confirm that the failed reply is still there.
  4. Close the application.
  5. Restart the application with the same sdc homedir. Confirm the failed reply is still there.

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

@ninavizz
Copy link
Member

Question: is the above video just looping, or does it ever go from purple to urgent-coral to blue in a seconds-long window? The urgent-coral (visible error/stalled state) should not show unless the client has tried and tried and tried, and the user needs to likely intervene to troubleshoot.

@redshiftzero
Copy link
Contributor Author

oh yeah the gif is just looping. i can confirm it is the case that the urgent-coral bar will only show up if there is some problem that requires user intervention. for example, if the reply send fails the first time due to a network error, then the reply gets resent again transparently to the user without turning the color bar urgent-coral.

@redshiftzero
Copy link
Contributor Author

redshiftzero commented Oct 23, 2019

Today I thought a bit about how to handle the server/client reply order state synchronization (the remaining issue with this branch, and the underlying issue behind #489), here are my thoughts on how to proceed, comment if you think there's an error in my logic or you think I'm overlooking a simpler/more elegant solution.

First here are some test cases that we need to consider and handle. For all cases we only consider a single source. Other requirements/constraints are:

  • file_counter is used for ordering the conversation view and cannot be duplicated for any conversation item with a single source: i.e. the ordering for a given source’s conversation view is unique.
  • Pending and failed replies can only be stored locally.
  • There may be multiple journalists interacting with a given source at a given time.

Case 1: Single journalist, multiple successful replies

  1. Source sends message M with file_counter=1.
  2. Journalist A submits reply X with local file_counter=2 to queue.
  3. Journalist A submits reply Y with local file_counter=3 to queue.
  4. Reply with file_counter=2 (X) is saved on the server.
  5. Reply with file_counter=3 (Y)is saved on the server.

Expected ordering (server and client): M, X, Y

Case 2: Single user, multiple replies, some that fail

  1. Source sends message M with file_counter=1.
  2. Journalist A submits reply X with local file_counter=2 to queue.
  3. Reply with local file_counter=2 (X) fails to send.
  4. Journalist A submits reply Y with local file_counter=3 to queue.
  5. Reply with local file_counter=3 (Y) is saved on the server as file_counter=2.

Expected ordering (server): M, Y
Expected ordering (client): M, X, Y

Case 3: Multiple clients, multiple successful replies

  1. Source sends message M with file_counter=1.
  2. Journalist A submits reply X with local file_counter=2 to their queue.
  3. Journalist B submits reply Y with local file_counter=2 to their queue.
  4. Reply from A (X) with local file_counter=2 is saved on the server.
  5. Reply from B (Y) with local file_counter=2 is saved on the server with file_counter=3.

Expected ordering (server, journalist A, journalist B): M, X, Y

Case 4: Single journalist, multiple successful replies, source messaging at the same time

  1. Source sends message M1 with file_counter=1.
  2. Journalist A submits reply X with local file_counter=2 to their queue.
  3. Source sends message M2 with file_counter=2.
  4. Client downloads message M2 with file_counter=2.
  5. Reply from A (X) with local file_counter=2 is saved on the server with file_counter=3.

Expected ordering (server, journalist A, journalist B): M1, M2, X

Current behavior

New behavior

Here’s my current thinking on the best way to handle this:

  • In the response from the POST reply endpoint on the server side, we return the actual file_counter of the saved reply. This way we know where in the conversation this item is (without having to do another round trip which could fail). When we get a successful reply upload, we update the file_counter and filename in the database. This handles case 3 and case 4. (Edited: we currently return the filename so we can extract file_counter already 🎉)
  • We make a new ORM object called DraftReply:
    • it has all the rows the reply table does except it does not have the file_counter field.
    • it also will add a timestamp field: this is the local timestamp that the reply was sent and this will be used for ordering local replies when there are multiple attempted replies in between conversation items from the server.
    • it also will add a field called prev_file_counter, which points to the file_counter of the message/reply/file after which the draft reply was sent. We need to do this because we only have the timestamp of the most recent conversation item server side (so we can’t order by timestamp for the conversation items from the server).
    • when we construct the conversation view, we include all messages, replies, and files as we were doing before ordered by file_counter. Then we interleave in the pending and failed local replies, placed after the corresponding prev_file_counter, ordered by timestamp. This handles case 2.

@redshiftzero redshiftzero changed the title [wip] app: add pending reply status, persist replies in the database app: add pending reply status, persist replies in the database Oct 24, 2019
@redshiftzero redshiftzero marked this pull request as ready for review October 24, 2019 22:06
@redshiftzero redshiftzero added this to Ready for Review in SecureDrop Team Board Oct 24, 2019
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick feedback while I'm testing this:

The client won't open now after I closed the application when the reply fails to send. I'm seeing:

  File "/home/creviera/workspace/freedomofpress/securedrop-client/securedrop_client/storage.py", line 166, in update_sources
    delete_single_submission_or_reply_on_disk(document, data_dir)
  File "/home/creviera/workspace/freedomofpress/securedrop-client/securedrop_client/storage.py", line 437, in delete_single_submission_or_reply_on_disk
    filename_without_extensions = obj_db.filename.split('.')[0]
AttributeError: 'DraftReply' object has no attribute 'filename'

@sssoleileraaa sssoleileraaa moved this from Ready for Review to Under Review in SecureDrop Team Board Oct 25, 2019
@redshiftzero
Copy link
Contributor Author

ahh thanks, forgot the source sync deletion iterates through the source.collection (which now includes DraftReplys) - added a fix and regression test coverage in two small commits (not force pushing in case you are mid-review)

@sssoleileraaa
Copy link
Contributor

sssoleileraaa commented Oct 29, 2019

not sure what's happening but all my replies are now saved in draftreplies with send_status_id as 2 (FAILED). when i switch to master it replies send. i'll have to review the code more tomorrow and try to figure out what's wrong with my setup or maybe there's a bug in the code, but this does feel like a setup issue.

Update: i was wrong about replies sending on master, could have sworn they were

@redshiftzero redshiftzero moved this from Under Review to In Development in SecureDrop Team Board Oct 29, 2019
@redshiftzero
Copy link
Contributor Author

For the interested observer @creviera's report is correct - replies now fail to send - however this is due to #598 on master (to reproduce on master you must first add a new source since the issue appears to be something related to gpg key imports).

@redshiftzero
Copy link
Contributor Author

I can confirm that after deleting the lock file cited in freedomofpress/securedrop#4909 (which from my testing just now appears to be the cause of #598) then testing this branch in Qubes works without issue for me

@redshiftzero redshiftzero moved this from In Development to Ready for Review in SecureDrop Team Board Oct 29, 2019
@sssoleileraaa sssoleileraaa self-requested a review October 29, 2019 22:14
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STR:

  1. Send a reply that you know will fail (cut connection to server)
  2. Send a couple other replies (you'll see them stay in the pending state)
  3. Click retry (nothing should change)
  4. Fix connection to server and refresh

Expected

The red reply and pending replies that follow are sent

Actual

The red reply shows up once in red as failed, followed by a copy of itself once in green as successful. The pending replies that follow are sent and green.

securedrop_client/gui/widgets.py Show resolved Hide resolved
@redshiftzero redshiftzero moved this from Ready for Review to In Development in SecureDrop Team Board Oct 30, 2019
@redshiftzero
Copy link
Contributor Author

Interesting! I think what is happening here is that the reply send does time out: the reply status is failed and is stored as a draft on the client side. However, the reply actually did get saved on the server, we just didn't get the response due to network issues. I think the right place to handle this is during the sync action for new replies: we should check if there is a draft locally with the same uuid as the new reply from the server: if yes, then we simply delete the local draft. I'll implement this and then retest your scenario.

during the sync, we don't attempt to delete draft replies in
the source.collection they aren't stored on disk, but will get
deleted by the cascade delete when the source is deleted.

however we also ensure that duplicate drafts are cleaned up on sync
to handle the scenario where a ReplySendJob "fails" but the reply
_was_ actually saved properly on the server.
@redshiftzero
Copy link
Contributor Author

So - there's a remaining issue with the queue pausing which will complicate testing (basically as is the queue isn't pausing when the server is completely down, not timing out, which requires proxy changes): that's freedomofpress/securedrop-proxy#128 - we'll need to handle this outside of this PR.

@redshiftzero redshiftzero moved this from In Development to Ready for Review in SecureDrop Team Board Nov 4, 2019
@sssoleileraaa sssoleileraaa moved this from Ready for Review to Under Review in SecureDrop Team Board Nov 5, 2019
@sssoleileraaa sssoleileraaa self-requested a review November 5, 2019 21:00
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far, just made a follow-up issue that we can discuss on Monday.

Switching to code review now.

if a user sends multiple replies A, B, C, the order should always be
A, B, C, even if:

* A fails
* B sends successfully
* C is pending

we ensure that once a reply sends successfully - or we find that
a reply _did_ send but we have it marked as failed as we never got
the response (h/t @creviera for testing this case) - that we ensure
that C appears after B. this is done by updating the file_counter.
@redshiftzero
Copy link
Contributor Author

added a commit 95db9b4 to handle a case @creviera and I just discussed synchronously:

  1. User sends Replies A and B
  2. Reply A sends
  3. Reply A fails, and the queue pauses. Reply B is still pending.

At this point reply A has been successfully sent on the server in this scenario - we just never got the response. That reply will appears as failed locally until we sync and find out otherwise. In real world use this can happen, and we now have logic in this PR to handle this situation on sync: if we find that a draft exists locally matching a successfully sent reply on the server, we remove the local draft and update to reflect the server state locally.

However, in this case we need to also ensure that pending reply B will appear after now successfully reply A in the conversation view for the user. To do so, we run the logic that runs after a reply sends successfully updating the ordering of draft replies to move the drafts to after the file_counter corresponding to the successfully sent reply.

@sssoleileraaa
Copy link
Contributor

sssoleileraaa commented Nov 6, 2019

That reply will appears as failed locally until we sync and find out otherwise.

Yeah, so what I'm seeing is:

0. User sends Replies A and B
1. Reply A sends
2. Reply A fails, and the queue pauses. Reply B is still pending.
3. When queue starts again, Reply B sends, Reply A still appears as failed in the client
4. ~10 seconds later, Reply A shows up as sent

This is because, as you say, Reply A actually sent and made it to the server in step 2 but the client doesn't know about it yet. It would be nice to be able to see Reply A as successful before Reply B shows up as successful because there's about 10 seconds of it looking like Reply A was never sent.

You might have mentioned the reason to not sync the client with the server before starting the queue again, but I can't remember it so it'll be helpful to have an explanation in writing somewhere.

Update

Eventually we will move Metadata sync to a job that'll be added to a queue at a set interval but could also be added to the main queue when the sync icon is clicked or when we do things like unpausing a queue. I think it might make sense to make what I'm describing above into an issue and mention that we could prioritize a Metadata sync job before any other job on the queue.

@sssoleileraaa sssoleileraaa self-requested a review November 6, 2019 01:10
@sssoleileraaa
Copy link
Contributor

However, in this case we need to also ensure that pending reply B will appear after now successfully reply A in the conversation view for the user. To do so, we run the logic that runs after a reply sends successfully updating the ordering of draft replies to move the drafts to after the file_counter corresponding to the successfully sent reply.

Reply A is always showing up before Reply B now, but instead of Reply B showing up as successful it now shows up as failed (in the case where we close the client, see steps to repro below):

  1. User sends Replies A and B
  2. Reply A sends
  3. Reply A fails, and the queue pauses. Reply B is still pending.
  4. User closes client, reauthenticates when logs back in, the queue starts again
  5. After a few or more seconds, Reply A shows up as sent, but Reply B remains failed.

Before the latest commit, I'm pretty sure Reply B would send. I'm super curious to see what's going on here, but might have to wait until tomorrow morning.

@sssoleileraaa
Copy link
Contributor

but instead of Reply B showing up as successful it now shows up as failed

Oh wait, actually, this is what we want! We don't want to automatically send pending replies between different client sessions.

Okay, nvm, ignore latest comment.

Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only thing I haven't reviewed yet are tests. I have some comments, a few questions, nothing really blocking the PR, but it would be nice to have another day to let all these changes sink in and chat with you more about design etc.

@@ -6,7 +6,8 @@

from securedrop_client.api_jobs.base import ApiJob
from securedrop_client.crypto import GpgHelper
from securedrop_client.db import Reply, Source
from securedrop_client.db import DraftReply, Reply, ReplySendStatus, ReplySendStatusCodes, Source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so organized!

securedrop_client/api_jobs/uploads.py Show resolved Hide resolved
session.commit()

return reply_db_object.uuid
except RequestTimeoutError as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of the scope of this PR but shouldn't we also be catching AuthError and ApiInaccessibleError and raising a custom exception to include reply_uuid and message like we do SendReplyJobTimeoutError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:o yes... yes we should

@@ -54,7 +56,11 @@ def collection(self) -> List:
collection.extend(self.messages)
collection.extend(self.files)
collection.extend(self.replies)
collection.sort(key=lambda x: x.file_counter)
collection.extend(self.draftreplies)
# Sort first by the file_counter, then by timestamp (used only for draft replies).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's a more high-level visible place we should mention how we use timestamps for saved drafts. I can't think of one other than in a docstring for source or in our client architecture doc. I was just hoping to find more information about why we use the timestamp somewhere. I did find your PR comment:

this is the local timestamp that the reply was sent and this will be used for ordering local replies when there are multiple attempted replies in between conversation items from the server.

So we could use a local_file_counter instead of timestamp right? I don't feel strongly about this but maybe it makes it clearer that we don't actually care about time, we just care about order in which a reply was drafted so that we can display the drafts in the correct order in the client?

Or perhaps we'll want to show the timestamp next to the draft to help the journalist remember when they drafted it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm yeah good point - how about I add a description of the ordering situation here to the wiki architecture page?

i.e. saying something like

"draft replies store:

  • a file_counter which points to the file_counter of the previously sent item. this enables us to interleave the drafts with the items from the source conversation fetched from the server, which do not have timestamps associated with them.
  • a timestamp which contains the timestamp the draft reply was saved locally: this is used to order drafts in the case where there are multiple drafts sent after a given reply (i.e. when file_counter is the same for multiple drafts)"

with an example

I actually did call the DraftReply.file_counter field local_file_counter field (😇) but then renamed it back to file_counter to simplify the source.collection.sort key. You're right that we could ditch timestamp and have two fields file_counter and local_file_counter. imho I figure is slightly more useful to have the actual timestamp locally for if we ever do want to expose the draft timestamp to users (I could imagine that being useful).



class ReplySendStatusCodes(Enum):
"""In progress (sending) replies can currently have the following statuses"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a SAVED status in the future 🔮

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha yeah so at some point in this PR before I created the DraftReply object this status object was stored on the Reply and had three codes: FAILED, SUCCESSFUL, PENDING. but I removed SUCCESSFUL when I created the DraftReply as only drafts can be FAILED or PENDING, all replies are by construction SUCCESSFUL

I could imagine this is a place to store more detailed status codes about the failures in the future like ENCRYPTION_FAILED_NO_SOURCE_KEY, SEND_FAILED_SOURCE_DELETED. We could also imagine some more granular pending statuses like PENDING_ENCRYPTION_IN_PROGRESS, PENDING_AWAITING_SERVER_RESPONSE.

securedrop_client/gui/widgets.py Show resolved Hide resolved
Send reply and emit a signal so that the gui can be updated immediately, even before the
the reply is saved locally.
Send reply and emit a signal so that the gui can be updated immediately indicating
that it is a pending reply.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@@ -241,7 +244,7 @@ def update_replies(remote_replies: List[SDKReply], local_replies: List[Reply],
* Existing replies are updated in the local database.
* New replies have an entry created in the local database.
* Local replies not returned in the remote replies are deleted from the
local database.
local database unless they are pending or failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

When we confirm a sent reply, if there are drafts that were sent after it,
we need to reposition them to ensure that they appear _after_ the confirmed
replies.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it super helpful for you to add a quick description for each parameter. For instance I was confused how the new_file_counter was found until I looked at where this function is used. It looks like new_file_counter is from the reply (that is causing the draft replies to be reordered) that the server has recorded? And old_file_counter is from the reply (that is causing the draft replies to be reordered) that the client has recorded?

Why do we need to set all the draft replies to have the same file_counter as the new_file_counter?

It looks like this works but just not sure why you decided to approach it this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your understanding here is correct - I've added some more explanation and an example in 44c4394, let me know if that makes sense!

@sssoleileraaa sssoleileraaa self-requested a review November 7, 2019 21:47
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring change is really helpful. This change is significant and not only fixes a very important issue of a journalist potentially losing what they write in the ReplyBox, it also sets us up nicely for adding a "Save" feature and allowing users to selectively send a draft or failed reply.

This looks ready to be framed 🖼️

Let's shipit!

@sssoleileraaa sssoleileraaa merged commit 2acfaea into master Nov 7, 2019
SecureDrop Team Board automation moved this from Under Review to Done Nov 7, 2019
@sssoleileraaa sssoleileraaa deleted the persist-replies-db branch November 7, 2019 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Reply pending state Failed replies may disappear from UI
3 participants