New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Exporter, part deux #5499

Merged
merged 1 commit into from Jan 16, 2015

Conversation

Projects
None yet
5 participants
@gdpelican
Contributor

gdpelican commented Dec 28, 2014

Thought I'd share this for a little feedback if you'd care to give it a look @jhass @jaywink

This currently:

  • Boots the current export off to Sidekiq
  • Uploads a json file through carrierwave
  • Sends an email notifying the user when the job's complete, which includes a link to the upload
  • Should pass tests, although we'll see when Travis gets its mitts on it.

It doesn't:

  • Have a good UI solution yet (you just click on 'download my profile' and it takes you to a blank screen with status 200; we'll have to do a flash notice or something to tell the user what's happening)
  • Include any of the heavy lifting associations we outscoped from the last PR
Show outdated Hide outdated app/models/user.rb
@@ -505,6 +518,6 @@ def clearable_fields
"created_at", "updated_at", "locked_at",
"serialized_private_key", "getting_started",
"disable_mail", "show_community_spotlight_in_stream",
"email", "remove_after"]
"email", "remove_after", "export"]

This comment has been minimized.

@gdpelican

gdpelican Dec 28, 2014

Contributor

Should export get cleared when we do user.clear_account!? I'd imagine so...

@gdpelican

gdpelican Dec 28, 2014

Contributor

Should export get cleared when we do user.clear_account!? I'd imagine so...

This comment has been minimized.

@jhass

jhass Dec 28, 2014

Member

Sounds right.

@jhass

jhass Dec 28, 2014

Member

Sounds right.

######### Data export ##################
mount_uploader :export, ExportedUser
def queue_export

This comment has been minimized.

@gdpelican

gdpelican Dec 28, 2014

Contributor

I could probably be convinced this perform_async call should just live in the controller action, but I do like having the queue/perform methods together on the model.

@gdpelican

gdpelican Dec 28, 2014

Contributor

I could probably be convinced this perform_async call should just live in the controller action, but I do like having the queue/perform methods together on the model.

Show outdated Hide outdated app/workers/export_user.rb
def perform(user_id)
@user = User.find(user_id)
@user.perform_export! && ExportMailer.export_complete_for(@user)

This comment has been minimized.

@gdpelican

gdpelican Dec 28, 2014

Contributor

There's some somewhat half-hearted error handling going on here; the email won't get sent unless perform_export! returns true, which only happens when we successfully perform the Carrierwave upload, ie. a file successfully gets uploaded.

I'm thinking there's not a ton of exceptions that'll occur along this workflow, but if it doesn't work for some reason maybe we want to send out a separate email saying 'oops that didn't work for some reason; tell us about it and/or try again.'

@gdpelican

gdpelican Dec 28, 2014

Contributor

There's some somewhat half-hearted error handling going on here; the email won't get sent unless perform_export! returns true, which only happens when we successfully perform the Carrierwave upload, ie. a file successfully gets uploaded.

I'm thinking there's not a ton of exceptions that'll occur along this workflow, but if it doesn't work for some reason maybe we want to send out a separate email saying 'oops that didn't work for some reason; tell us about it and/or try again.'

This comment has been minimized.

@jhass

jhass Dec 28, 2014

Member

Separate email sounds good, style wise please avoid doing control flow with logic expressions.

@jhass

jhass Dec 28, 2014

Member

Separate email sounds good, style wise please avoid doing control flow with logic expressions.

This comment has been minimized.

@gdpelican

gdpelican Dec 29, 2014

Contributor

Aw, such elegant! But I understand why not.

I'll just do an if/else and chuck in an error email.

@gdpelican

gdpelican Dec 29, 2014

Contributor

Aw, such elegant! But I understand why not.

I'll just do an if/else and chuck in an error email.

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Dec 28, 2014

Member

Speaking of the mails, we should keep in mind that sending mails is entirely optional and not configured by default.

Member

jhass commented Dec 28, 2014

Speaking of the mails, we should keep in mind that sending mails is entirely optional and not configured by default.

@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Dec 29, 2014

Contributor

Hm, I don't think sending emails should be a hard dependency for this feature, it's more of a nice-to-have.

How about something like this:

When you visit your profile page

If you have not requested an export
There is a link to request an export

If you have requested an export and it is not complete
There is a message to the effect of 'We're still working on your export, please check back soon'

If you have requested an export and it is complete
There is a link to download your export
And a link which says 'Update my export', with a timestamp of the last created export

Contributor

gdpelican commented Dec 29, 2014

Hm, I don't think sending emails should be a hard dependency for this feature, it's more of a nice-to-have.

How about something like this:

When you visit your profile page

If you have not requested an export
There is a link to request an export

If you have requested an export and it is not complete
There is a message to the effect of 'We're still working on your export, please check back soon'

If you have requested an export and it is complete
There is a link to download your export
And a link which says 'Update my export', with a timestamp of the last created export

@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Dec 29, 2014

Member

That looks like a nice solution ;)

Member

Flaburgan commented Dec 29, 2014

That looks like a nice solution ;)

@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Dec 29, 2014

Contributor

So I've added

  • Sending out a failure email for if export fails to upload for some reason
  • A frontend workflow which looks like this:

User who hasn't requested an export:
screenshot from 2014-12-30 01 30 53

User who has an export pending:
screenshot from 2014-12-30 01 32 15

User who has requested a completed export:
screenshot from 2014-12-30 01 31 20

Next up would be throwing in the big json chunks (posts and comments) to the serializer, and having a look at gzipping the file we generate.

Contributor

gdpelican commented Dec 29, 2014

So I've added

  • Sending out a failure email for if export fails to upload for some reason
  • A frontend workflow which looks like this:

User who hasn't requested an export:
screenshot from 2014-12-30 01 30 53

User who has an export pending:
screenshot from 2014-12-30 01 32 15

User who has requested a completed export:
screenshot from 2014-12-30 01 31 20

Next up would be throwing in the big json chunks (posts and comments) to the serializer, and having a look at gzipping the file we generate.

@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Jan 6, 2015

Contributor

Alright, I've changed it so the data is GZipped and added in posts and comments to the serializers (might need some feedback on which fields want to be there)

It'd be cool to get your take on this, @jaywink ?

Contributor

gdpelican commented Jan 6, 2015

Alright, I've changed it so the data is GZipped and added in posts and comments to the serializers (might need some feedback on which fields want to be there)

It'd be cool to get your take on this, @jaywink ?

Show outdated Hide outdated app/models/user.rb
end
def perform_export!
export = Tempfile.new([username, '.json'], encoding: 'ascii-8bit')

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Hmm why is the encoding ascii-8bit?

@jaywink

jaywink Jan 10, 2015

Contributor

Hmm why is the encoding ascii-8bit?

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

ASCII-8bit is Rubys clunky way of saying BINARY (which is an alias for ASCII-8bit in fact). Remember, ASCII is a 7bit encoding ;) Since we write gzipped data here, that's correct.

@jhass

jhass Jan 10, 2015

Member

ASCII-8bit is Rubys clunky way of saying BINARY (which is an alias for ASCII-8bit in fact). Remember, ASCII is a 7bit encoding ;) Since we write gzipped data here, that's correct.

Show outdated Hide outdated app/controllers/users_controller.rb
end
def download_profile
send_data decompressed_export, type: :json, filename: current_user.export.filename

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Is there any reason we can't just send over a compressed file? IMHO as it's text data, this would make sense to decrease downloads. I'm guessing it will be 1 week after this is released that someone will offer an utility to automate exporting of profiles.. ;)

@jaywink

jaywink Jan 10, 2015

Contributor

Is there any reason we can't just send over a compressed file? IMHO as it's text data, this would make sense to decrease downloads. I'm guessing it will be 1 week after this is released that someone will offer an utility to automate exporting of profiles.. ;)

This comment has been minimized.

@gdpelican

gdpelican Jan 10, 2015

Contributor

Just send it over as application/gzip with a .gz file extension, and let the user handle inflating it?

I worry about people on windows machines needing to download a utility to open that file, although I guess we could look at providing a .zip format as well.

@gdpelican

gdpelican Jan 10, 2015

Contributor

Just send it over as application/gzip with a .gz file extension, and let the user handle inflating it?

I worry about people on windows machines needing to download a utility to open that file, although I guess we could look at providing a .zip format as well.

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

Or maybe make it a tarball, even most Windows users should have a tool for that, I think 7-zip is quite common and can handle those.

@jhass

jhass Jan 10, 2015

Member

Or maybe make it a tarball, even most Windows users should have a tool for that, I think 7-zip is quite common and can handle those.

Show outdated Hide outdated app/serializers/export/comment_serializer.rb
module Export
class CommentSerializer < ActiveModel::Serializer
attributes :text,
:author_id,

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

author_id and author_signature are probably irrelevant for a user, even in a pod migration situation (I think). Could we instead add here author diaspora handle? That is the network wide unique identifier that is actually usable. GUID is of course too but that doesn't say anything to users looking at the export.

@jaywink

jaywink Jan 10, 2015

Contributor

author_id and author_signature are probably irrelevant for a user, even in a pod migration situation (I think). Could we instead add here author diaspora handle? That is the network wide unique identifier that is actually usable. GUID is of course too but that doesn't say anything to users looking at the export.

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Heh no wait, this is of course the users own comments, so no need to have a handle. BUT we need to say what post it relates to. How about building an url to the post?

@jaywink

jaywink Jan 10, 2015

Contributor

Heh no wait, this is of course the users own comments, so no need to have a handle. BUT we need to say what post it relates to. How about building an url to the post?

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

I'd just export the post guid, URLs may change.

@jhass

jhass Jan 10, 2015

Member

I'd just export the post guid, URLs may change.

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Post guid is good, agreed. It can be used to make a link anyway :)

@jaywink

jaywink Jan 10, 2015

Contributor

Post guid is good, agreed. It can be used to make a link anyway :)

Show outdated Hide outdated app/serializers/export/post_serializer.rb
module Export
class PostSerializer < ActiveModel::Serializer
attributes :text,
:author_id,

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Again, author_id is probably irrelevant.

@jaywink

jaywink Jan 10, 2015

Contributor

Again, author_id is probably irrelevant.

Show outdated Hide outdated app/serializers/export/post_serializer.rb
:likes_count,
:comments_count,
:reshares_count,
:favorite

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Interesting, first time seen this and doesn't to me look like it's used. There is a toggleFavorite method in the JS but it's not called :P

Some legacy idea that never got implemented, @jhass ? I'd personally remove it if we don't use the attribute.

@jaywink

jaywink Jan 10, 2015

Contributor

Interesting, first time seen this and doesn't to me look like it's used. There is a toggleFavorite method in the JS but it's not called :P

Some legacy idea that never got implemented, @jhass ? I'd personally remove it if we don't use the attribute.

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

Eh, I don't remember that one at all, must be from the time I didn't watch the project. It was introduced in 75d25e9

So yeah, we should just drop that.

@jhass

jhass Jan 10, 2015

Member

Eh, I don't remember that one at all, must be from the time I didn't watch the project. It was introduced in 75d25e9

So yeah, we should just drop that.

Show outdated Hide outdated app/serializers/export/comment_serializer.rb
attributes :text,
:author_id,
:author_signature,
:likes_count

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

We don't support likes_count at the moment so this could be removed.

@jaywink

jaywink Jan 10, 2015

Contributor

We don't support likes_count at the moment so this could be removed.

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

More importantly it's just a counter cache.

@jhass

jhass Jan 10, 2015

Member

More importantly it's just a counter cache.

@jaywink

This comment has been minimized.

Show comment
Hide comment
@jaywink

jaywink Jan 10, 2015

Contributor

Other than the comments relating to the attributes, super awesome work 🌟 A lot of people will be happy to have this working again, as it's a shame our export is currently worse than many proprietary networks ;)

🍶 for you ;)

Contributor

jaywink commented Jan 10, 2015

Other than the comments relating to the attributes, super awesome work 🌟 A lot of people will be happy to have this working again, as it's a shame our export is currently worse than many proprietary networks ;)

🍶 for you ;)

@jaywink

This comment has been minimized.

Show comment
Hide comment
@jaywink

jaywink Jan 10, 2015

Contributor

Oh also, I'm assuming this will work fine with pods using S3 for storage etc?

Contributor

jaywink commented Jan 10, 2015

Oh also, I'm assuming this will work fine with pods using S3 for storage etc?

Show outdated Hide outdated app/serializers/export/post_serializer.rb
:image_width,
:likes_count,
:comments_count,
:reshares_count,

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

Counter caches, shouldn't be included.

@jhass

jhass Jan 10, 2015

Member

Counter caches, shouldn't be included.

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

Doesn't it contain the current (when exported) count though? IMHO that is relevant information for the user :)

@jaywink

jaywink Jan 10, 2015

Contributor

Doesn't it contain the current (when exported) count though? IMHO that is relevant information for the user :)

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

It should, I don't see much value in that information though.

@jhass

jhass Jan 10, 2015

Member

It should, I don't see much value in that information though.

This comment has been minimized.

@jaywink

jaywink Jan 10, 2015

Contributor

I think it could be valuable for some - imagine you want to look at some old export of your data from an old pod for example. Maybe someone will make an app to analyse exports and rank posts with most likes etc.

@jaywink

jaywink Jan 10, 2015

Contributor

I think it could be valuable for some - imagine you want to look at some old export of your data from an old pod for example. Maybe someone will make an app to analyse exports and rank posts with most likes etc.

Show outdated Hide outdated app/uploaders/exported_user.rb
end
def filename
:"#{model.username}_diaspora_data.json"

This comment has been minimized.

@jhass

jhass Jan 10, 2015

Member

Why does this need to be a Symbol?

@jhass

jhass Jan 10, 2015

Member

Why does this need to be a Symbol?

This comment has been minimized.

@gdpelican

gdpelican Jan 10, 2015

Contributor

It doesn't, I'll change it.

@gdpelican

gdpelican Jan 10, 2015

Contributor

It doesn't, I'll change it.

@jaywink

This comment has been minimized.

Show comment
Hide comment
@jaywink

jaywink Jan 10, 2015

Contributor

Also, before merging, the changelog entry for the export should probably be refreshed. I guess the export version does not need to be bumped since the current export version hasn't gone stable yet.

Contributor

jaywink commented Jan 10, 2015

Also, before merging, the changelog entry for the export should probably be refreshed. I guess the export version does not need to be bumped since the current export version hasn't gone stable yet.

@jaywink jaywink referenced this pull request Jan 10, 2015

Open

User data export / (partial) import metaissue #5343

5 of 7 tasks complete
@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Jan 10, 2015

Contributor

I haven't been able to test using an S3 instance, but I just aped the Carrierwave uploaders for images, so it shouldn't behave any differently.

Contributor

gdpelican commented Jan 10, 2015

I haven't been able to test using an S3 instance, but I just aped the Carrierwave uploaders for images, so it shouldn't behave any differently.

@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Jan 12, 2015

Contributor

Well, I b0rked the rebase a little, but merged in develop, updated the serializers & changelog, and changed that one symbol that wasn't supposed to be there.

Also made it so we export files in a .json.gz file, which 7zip should handle.

Contributor

gdpelican commented Jan 12, 2015

Well, I b0rked the rebase a little, but merged in develop, updated the serializers & changelog, and changed that one symbol that wasn't supposed to be there.

Also made it so we export files in a .json.gz file, which 7zip should handle.

@jaywink

This comment has been minimized.

Show comment
Hide comment
@jaywink

jaywink Jan 16, 2015

Contributor

Alright, no more comments to this and looks awesome to me - merging! Nice to get this in to the next release, thanks @gdpelican !

Contributor

jaywink commented Jan 16, 2015

Alright, no more comments to this and looks awesome to me - merging! Nice to get this in to the next release, thanks @gdpelican !

jaywink added a commit that referenced this pull request Jan 16, 2015

@jaywink jaywink merged commit 6513053 into diaspora:develop Jan 16, 2015

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@jaywink jaywink added this to the next-major milestone Jan 16, 2015

@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Jan 16, 2015

Member

Awesome!

Member

Flaburgan commented Jan 16, 2015

Awesome!

@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Jan 16, 2015

Member

Currently in test on diaspora-fr :)
sidekiq-export

Member

Flaburgan commented Jan 16, 2015

Currently in test on diaspora-fr :)
sidekiq-export

@marienfressinaud

This comment has been minimized.

Show comment
Hide comment
@marienfressinaud

marienfressinaud Jan 17, 2015

Contributor

@Flaburgan > I tried yesterday night but it still shows me "We are currently processing your data. Please check back in a few moments." this morning. I'm pretty sure there is a problem ;).

Contributor

marienfressinaud commented Jan 17, 2015

@Flaburgan > I tried yesterday night but it still shows me "We are currently processing your data. Please check back in a few moments." this morning. I'm pretty sure there is a problem ;).

@gdpelican

This comment has been minimized.

Show comment
Hide comment
@gdpelican

gdpelican Jan 17, 2015

Contributor

Oh noes! I'll have a poke at it tomorrow to see what's up. We're sure
sidekiq is running?

On Sat, Jan 17, 2015 at 11:22 PM, Marien Fressinaud <
notifications@github.com> wrote:

@Flaburgan https://github.com/Flaburgan > I tried yesterday night but
it still shows me "We are currently processing your data. Please check back
in a few moments." this morning. I'm pretty sure there is a problem ;).


Reply to this email directly or view it on GitHub
#5499 (comment).

Contributor

gdpelican commented Jan 17, 2015

Oh noes! I'll have a poke at it tomorrow to see what's up. We're sure
sidekiq is running?

On Sat, Jan 17, 2015 at 11:22 PM, Marien Fressinaud <
notifications@github.com> wrote:

@Flaburgan https://github.com/Flaburgan > I tried yesterday night but
it still shows me "We are currently processing your data. Please check back
in a few moments." this morning. I'm pretty sure there is a problem ;).


Reply to this email directly or view it on GitHub
#5499 (comment).

@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Jan 17, 2015

Member

Yeah sidekiq is running, Marien and my jobs are listed in "Enqueued". They are the only two jobs in the queue.

Member

Flaburgan commented Jan 17, 2015

Yeah sidekiq is running, Marien and my jobs are listed in "Enqueued". They are the only two jobs in the queue.

@jhass

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Jan 17, 2015

Member

Confirmed, adding export_user there immediately made the jobs processed. But the message on the panel is still We are currently processing your data. Please check back in a few moments.

Member

Flaburgan commented Jan 17, 2015

Confirmed, adding export_user there immediately made the jobs processed. But the message on the panel is still We are currently processing your data. Please check back in a few moments.

@marienfressinaud

This comment has been minimized.

Show comment
Hide comment
@marienfressinaud

marienfressinaud Jan 17, 2015

Contributor

Indeed but I have a different message: download my profile (Last updated at 2015-01-17 12:36:04 UTC) refresh my profile data so it seems ok for me. Nevertheless there are other bugs, should we open new specific tickets for each bug found?

Contributor

marienfressinaud commented Jan 17, 2015

Indeed but I have a different message: download my profile (Last updated at 2015-01-17 12:36:04 UTC) refresh my profile data so it seems ok for me. Nevertheless there are other bugs, should we open new specific tickets for each bug found?

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Jan 17, 2015

Member

should we open new specific tickets for each bug found?

Yes, sure.

Member

jhass commented Jan 17, 2015

should we open new specific tickets for each bug found?

Yes, sure.

@jaywink

This comment has been minimized.

Show comment
Hide comment
@jaywink

jaywink Jan 17, 2015

Contributor

@gdpelican the sidekiq queue thingy is fixed now, no need to check that :)

Contributor

jaywink commented Jan 17, 2015

@gdpelican the sidekiq queue thingy is fixed now, no need to check that :)

@Flaburgan Flaburgan referenced this pull request Feb 19, 2015

Merged

Export user photos #5685

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment