JSON Exporter, part deux #5499

Merged
jaywink merged 1 commit into diaspora:develop on Jan 16, 2015

gdpelican
Contributor

Thought I'd share this for a little feedback if you'd care to give it a look @jhass @jaywink

This currently:

  • Boots the current export off to Sidekiq
  • Uploads a json file through carrierwave
  • Sends an email notifying the user when the job's complete, which includes a link to the upload
  • Should pass tests, although we'll see when Travis gets its mitts on it.

It doesn't:

  • Have a good UI solution yet (you just click on 'download my profile' and it takes you to a blank screen with status 200; we'll have to do a flash notice or something to tell the user what's happening)
  • Include any of the heavy lifting associations we outscoped from the last PR

@@ -505,6 +518,6 @@ def clearable_fields
       "created_at", "updated_at", "locked_at",
       "serialized_private_key", "getting_started",
       "disable_mail", "show_community_spotlight_in_stream",
-      "email", "remove_after"]
+      "email", "remove_after", "export"]
Contributor Author

Should export get cleared when we do user.clear_account!? I'd imagine so...

Member

Sounds right.

@jhass
Member

jhass commented Dec 28, 2014

Speaking of the mails, we should keep in mind that sending mails is entirely optional and not configured by default.

@gdpelican
Contributor Author

Hm, I don't think sending emails should be a hard dependency for this feature, it's more of a nice-to-have.

How about something like this:

When you visit your profile page

If you have not requested an export
There is a link to request an export

If you have requested an export and it is not complete
There is a message to the effect of 'We're still working on your export, please check back soon'

If you have requested an export and it is complete
There is a link to download your export
And a link which says 'Update my export', with a timestamp of the last created export
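
The three cases above could be sketched roughly like this. It is only an illustration: `ExportState`, `exporting`, `exported_at`, `export_panel` and the link labels are hypothetical names, not the attributes or strings used in the PR.

```ruby
# Hypothetical model of the export state; names are illustrative only.
ExportState = Struct.new(:exporting, :exported_at) do
  # :pending while a job is running, otherwise decided by the timestamp.
  def status
    return :pending if exporting
    exported_at ? :complete : :not_requested
  end
end

# Returns the lines the profile page would show for a given state.
def export_panel(state)
  case state.status
  when :not_requested
    ['Request my export']
  when :pending
    ["We're still working on your export, please check back soon"]
  when :complete
    ['Download my export',
     "Update my export (last created #{state.exported_at.utc})"]
  end
end
```

Treating a running job as pending even when an older export exists is one possible choice; the thread does not say which way the PR resolves that case.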

@Flaburgan
Member

That looks like a nice solution ;)

@gdpelican
Contributor Author

So I've added

  • Sending out a failure email if the export fails to upload for some reason
  • A frontend workflow which looks like this:

User who hasn't requested an export:
[screenshot from 2014-12-30 01 30 53]

User who has an export pending:
[screenshot from 2014-12-30 01 32 15]

User who has requested a completed export:
[screenshot from 2014-12-30 01 31 20]

Next up would be throwing in the big json chunks (posts and comments) to the serializer, and having a look at gzipping the file we generate.
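
As a rough sketch of the success and failure notifications described in this comment: the class below uses plain-Ruby stand-ins, so `ExportJob`, `store` and `notifier` are hypothetical and do not reproduce the Sidekiq worker, CarrierWave uploader or mailer in the PR.

```ruby
# Plain-Ruby stand-ins only; not the PR's Sidekiq/CarrierWave/mailer code.
class ExportJob
  # store:    callable (filename, data) -> download link; may raise
  # notifier: callable (username, message)
  def initialize(store:, notifier:)
    @store = store
    @notifier = notifier
  end

  # Uploads the data, then notifies the user about success or failure.
  def perform(username, data)
    link = @store.call("#{username}.json", data)
  rescue StandardError => e
    # Upload failed: tell the user and return nil.
    @notifier.call(username, "Your export failed: #{e.message}")
    nil
  else
    # Upload succeeded: include the link in the notification.
    @notifier.call(username, "Your export is ready: #{link}")
    link
  end
end
```

Since sending mail is optional on a pod, the notifier could just as well be a no-op lambda; the job still returns the link for the profile page to use.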

@gdpelican
Contributor Author

Alright, I've changed it so the data is GZipped and added in posts and comments to the serializers (might need some feedback on which fields want to be there)

It'd be cool to get your take on this, @jaywink ?

end

def perform_export!
export = Tempfile.new([username, '.json'], encoding: 'ascii-8bit')
Contributor

Hmm why is the encoding ascii-8bit?

Member

ASCII-8BIT is Ruby's clunky way of saying BINARY (which is in fact an alias for ASCII-8BIT). Remember, ASCII is a 7-bit encoding ;) Since we write gzipped data here, that's correct.
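
To illustrate the point, here is a minimal sketch using only the standard library. The `Tempfile.new` call matches the line under review; `write_gzipped_export` and the payload shape are hypothetical, and this is not the PR's actual `perform_export!`.

```ruby
require 'json'
require 'tempfile'
require 'zlib'

# BINARY and ASCII-8BIT are the same Encoding object.
same_encoding = Encoding::BINARY.equal?(Encoding::ASCII_8BIT)

# Hypothetical helper: writes a gzipped JSON payload into a binary Tempfile
# and returns the file rewound to the start.
def write_gzipped_export(username, payload)
  file = Tempfile.new([username, '.json'], encoding: 'ascii-8bit')
  gz = Zlib::GzipWriter.new(file)
  gz.write(JSON.generate(payload))
  gz.finish    # writes the gzip trailer without closing the Tempfile
  file.flush
  file.rewind
  file
end
```

Reading the result back with `Zlib::GzipReader.new(file).read` and `JSON.parse` should give the original payload.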

@jaywink
Contributor

jaywink commented Jan 10, 2015

Other than the comments relating to the attributes, super awesome work 🌟 A lot of people will be happy to have this working again, as it's a shame our export is currently worse than many proprietary networks ;)

🍶 for you ;)

@jaywink
Contributor

jaywink commented Jan 10, 2015

Oh also, I'm assuming this will work fine with pods using S3 for storage etc?

:image_width,
:likes_count,
:comments_count,
:reshares_count,
Member

Counter caches, shouldn't be included.

Contributor

Doesn't it contain the current (when exported) count though? IMHO that is relevant information for the user :)

Member

It should, I don't see much value in that information though.

Contributor

I think it could be valuable for some - imagine you want to look at some old export of your data from an old pod for example. Maybe someone will make an app to analyse exports and rank posts with most likes etc.
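
A tool like that might look roughly as follows. The export layout here is an assumption: a top-level "posts" array whose entries carry "likes_count" is hypothetical, and `top_liked_posts` is an illustrative name; the real export may be structured differently.

```ruby
require 'json'

# Assumes a hypothetical export shape: { "posts": [ { "likes_count": n, ... } ] }.
# Returns the `limit` posts with the highest likes_count, most liked first.
def top_liked_posts(export_json, limit = 3)
  posts = JSON.parse(export_json).fetch('posts', [])
  posts.sort_by { |post| -post.fetch('likes_count', 0) }.first(limit)
end
```

Posts without a `likes_count` are treated as having zero likes, so an export that omits the counters would still parse but could not be ranked meaningfully.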

@jaywink
Contributor

jaywink commented Jan 10, 2015

Also, before merging, the changelog entry for the export should probably be refreshed. I guess the export version does not need to be bumped since the current export version hasn't gone stable yet.

@gdpelican
Contributor Author

I haven't been able to test using an S3 instance, but I just aped the Carrierwave uploaders for images, so it shouldn't behave any differently.

@gdpelican
Contributor Author

Well, I b0rked the rebase a little, but merged in develop, updated the serializers & changelog, and changed that one symbol that wasn't supposed to be there.

Also made it so we export to a .json.gz file, which 7zip should handle.

@jaywink
Contributor

jaywink commented Jan 16, 2015

Alright, no more comments on this and it looks awesome to me - merging! Nice to get this into the next release, thanks @gdpelican !

jaywink added a commit that referenced this pull request Jan 16, 2015
jaywink merged commit 6513053 into diaspora:develop Jan 16, 2015
jaywink added this to the next-major milestone Jan 16, 2015
@Flaburgan
Member

Awesome!

@Flaburgan
Member

Currently in test on diaspora-fr :)
[screenshot: sidekiq-export]

@marienfressinaud
Contributor

@Flaburgan > I tried yesterday night but it still shows me "We are currently processing your data. Please check back in a few moments." this morning. I'm pretty sure there is a problem ;).

@gdpelican
Contributor Author

Oh noes! I'll have a poke at it tomorrow to see what's up. We're sure sidekiq is running?

@Flaburgan
Member

Yeah, sidekiq is running; Marien's and my jobs are listed in "Enqueued". They are the only two jobs in the queue.

@jhass
Member

jhass commented Jan 17, 2015

@Flaburgan
Member

Confirmed, adding export_user there immediately got the jobs processed. But the message on the panel is still "We are currently processing your data. Please check back in a few moments."

@marienfressinaud
Contributor

Indeed, but I have a different message: "download my profile (Last updated at 2015-01-17 12:36:04 UTC) refresh my profile data", so it seems OK for me. Nevertheless there are other bugs; should we open new specific tickets for each bug found?

@jhass
Member

jhass commented Jan 17, 2015

> should we open new specific tickets for each bug found?

Yes, sure.

@jaywink
Contributor

jaywink commented Jan 17, 2015

@gdpelican the sidekiq queue thingy is fixed now, no need to check that :)

Flaburgan mentioned this pull request Feb 19, 2015