Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Allow users to export their messages from the UI #8834

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

punchagan
Copy link
Member

Testing Plan:

GIFs or Screenshots:

@punchagan punchagan force-pushed the issue/2047-export-my-data branch 2 times, most recently from 6b60e18 to d7819d9 Compare March 27, 2018 17:07
e.preventDefault();
e.stopPropagation();
channel.get({
url: "/json/users/me/export-my-data",
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think /export would be a more RESTful UI.

$('#user_export_data_message').text(response.msg).addClass('show');
},
error: function(response){
// FIXME: Does error have response.msg??!
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it does. But see the ui_report library.

</button>
</div>
</form>
</div>
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to move this into the below section and title it was "Account options."

<form class="grid">
<div class="input-group">
<button type="submit" class="button rounded w-200 input-size" id="user_export_data">
{{t 'Export my messages' }}
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Export my Zulip message history"?

@@ -229,3 +230,13 @@ def change_enter_sends(request: HttpRequest, user_profile: UserProfile,
enter_sends: bool=REQ(validator=check_bool)) -> HttpResponse:
do_change_enter_sends(user_profile, enter_sends)
return json_success()

# FIXME: Do we plan to allow exports of all messages sent by bots?
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think human_users_only is reasonable for this.

user_profile = get_user_profile_by_email(email)

# FIXME: Can we check if an export is currently running, or has been
# started in the past hour, and deny exports?
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to do something here.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should just manage this with a new UserDataExport table.

@@ -527,3 +535,37 @@ def consume(self, event: Mapping[str, Any]) -> None:
(stream, recipient, sub) = access_stream_by_id(user_profile, stream_id,
require_active=False)
do_mark_stream_messages_as_read(user_profile, stream)


@assign_queue('export_data')
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the deferred_work queue; this seems like a rare enough thing that we don't need several.

zproject/urls.py Outdated
@@ -258,6 +258,10 @@
url(r'^users/me/avatar$', rest_dispatch,
{'POST': 'zerver.views.user_settings.set_avatar_backend',
'DELETE': 'zerver.views.user_settings.delete_avatar_backend'}),
url(r'^users/me/export-my-data$', rest_dispatch,
# FIXME: Use a POST?
{'GET': 'zerver.views.user_settings.export_my_data'}),
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be a POST; we don't want browsers to auto-retry this sort of request.

size = getsize(tarball_path)
name = basename(tarball_path)
print(content_type, name, size)
upload_path = upload_message_image(name, size, content_type, f.read(), user_profile)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is that function really called upload_message_image? Sigh, that name is super confusing. Can you do a separate PR (or commit at the base of this PR) to rename that to upload_message_file? That seems like a clear thing to merge first. Be sure to use good commit discipline for that refactor; check out the Zulip commit message guidelines for more details: https://zulip.readthedocs.io/en/latest/contributing/version-control.html

def consume(self, email: str) -> None:
output_dir = tempfile.mkdtemp(prefix="/tmp/zulip-export-")
if os.path.exists(output_dir):
shutil.rmtree(output_dir)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably just assert that it doesn't exist; that should never happen given that mkdtemp is supposed to make us a secure, randomly generated path. Also, does mkdtemp create the directory, in which case we should be asserting it does exist?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this code is something I just copied over from the export command, and your suggestions make sense. 👍

print("Finished exporting to %s; tarring" % (output_dir,))
tarball_path = output_dir.rstrip('/') + '.tar.gz'
subprocess.check_call(["tar", "--strip-components=1", "-czf", tarball_path, output_dir])
print("Tarball written to %s" % (tarball_path,))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be sure to delete the file after uploading.

@punchagan punchagan force-pushed the issue/2047-export-my-data branch 4 times, most recently from 5db4cf6 to ecb8fbc Compare March 28, 2018 17:11
@zulipbot zulipbot added size: XL and removed size: L labels Mar 28, 2018
@punchagan punchagan force-pushed the issue/2047-export-my-data branch 3 times, most recently from e796050 to 9d8f28a Compare March 29, 2018 05:06
@punchagan
Copy link
Member Author

@timabbott I pushed a few changes based on your initial feedback. I also implemented some tracking for a running export, but I'm not sure what the best way to handle failures during an export would be.

Also, like you said, the export part could definitely use some more tests and error handling. Any advice on that would be helpful too.

@punchagan punchagan force-pushed the issue/2047-export-my-data branch 6 times, most recently from 49a5294 to 15985bb Compare March 30, 2018 00:38
super(DeferredWorker, self).consume_wrapper(data)
if data['type'] == 'export_user_messages':
user_id = data['user_profile_id']
UserDataExport.export_finished(user_id)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be easier done as a try/except block in the handler after extracting the function.

@punchagan punchagan force-pushed the issue/2047-export-my-data branch 4 times, most recently from 2fe5990 to 90cf68f Compare April 2, 2018 05:12
@punchagan punchagan force-pushed the issue/2047-export-my-data branch 4 times, most recently from 04ffd11 to 1ebb0a7 Compare May 3, 2018 05:20
@punchagan punchagan force-pushed the issue/2047-export-my-data branch 4 times, most recently from dc923a9 to 227f310 Compare July 12, 2018 07:39
@punchagan
Copy link
Member Author

@timabbott I've finally been able to address the suggestion to update the status of the export for the user using an event. This PR is ready for another review.

@zulipbot
Copy link
Member

Heads up @punchagan, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/master branch and resolve your pull request's merge conflicts accordingly.

@brainwane
Copy link
Contributor

@punchagan It looks like Tim made several other suggestions/requests in his review - did you address those already?

It would be great to get this feature! :-)

@punchagan
Copy link
Member Author

Hey @brainwane thanks for the nudge/ping on this. This PR has become quite stale, and I will try to take a look at this again. (I seem to have some time on my hands owing to the pandemic situation).

Meanwhile, this PR also doesn't really implement the exact feature you requested - dump of all the messages you sent. This PR tries to create an export of all sent and received messages. I remember @timabbott mentioned on CZO that just getting the messages you sent would be easy to do using the API. Would it be useful to look into that and write a script that fetches this data, before I jump back into this PR? 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants