New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export user photos #5685

Merged
merged 1 commit into from Mar 4, 2015

Conversation

Projects
None yet
4 participants
@margori
Contributor

margori commented Feb 19, 2015

My solution for bug #1802

Implemented "Export photos" button in user settings, currently disabled.

I'm using 'zip' files instead of 'tar', 'cause it's widely known and easily handled.

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Feb 19, 2015

Member

I fear there's a lot more to consider to this. Let's start with some assorted points that immediately pop into my mind.

  • We shouldn't do such an expensive (and thus potentially lengthy) operation directly in the action, it should be a background job, else it will fail and run into timeouts.
    • This then raises the question of how we inform the user that the download is ready, when mail is enabled, we could just sent a mail with the download link.
    • However if mail is not available, we need a flow for that. One imaginable scenario is to pop a message to the user that the export will be prepared and they should check back in a few minutes. This then means we should keep the last export for a configurable amount of time.
      • That means we would need a cleanup job for the exports.
      • An existing export would need to be cleaned out if the user requests a new one.
      • A download link for an existing export would need to be displayed.
    • The download URL would need to contain a token, so they're not guessable.
    • We would need to store the URL somewhere in the DB
  • For the unlikely event that I can be convinced to not put this into a background job, we should consider and investigate X-Sendfile as delivery mechanism.
  • We probably should investigate and verify whether the chosen approach works on a PaaS platform such as Heroku and whether we want to support storage locations like Amazon S3.
Member

jhass commented Feb 19, 2015

I fear there's a lot more to consider to this. Let's start with some assorted points that immediately pop into my mind.

  • We shouldn't do such an expensive (and thus potentially lengthy) operation directly in the action, it should be a background job, else it will fail and run into timeouts.
    • This then raises the question of how we inform the user that the download is ready, when mail is enabled, we could just sent a mail with the download link.
    • However if mail is not available, we need a flow for that. One imaginable scenario is to pop a message to the user that the export will be prepared and they should check back in a few minutes. This then means we should keep the last export for a configurable amount of time.
      • That means we would need a cleanup job for the exports.
      • An existing export would need to be cleaned out if the user requests a new one.
      • A download link for an existing export would need to be displayed.
    • The download URL would need to contain a token, so they're not guessable.
    • We would need to store the URL somewhere in the DB
  • For the unlikely event that I can be convinced to not put this into a background job, we should consider and investigate X-Sendfile as delivery mechanism.
  • We probably should investigate and verify whether the chosen approach works on a PaaS platform such as Heroku and whether we want to support storage locations like Amazon S3.
@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Feb 19, 2015

Member

You can have a look at how this is handled by #5499

Member

Flaburgan commented Feb 19, 2015

You can have a look at how this is handled by #5499

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Feb 19, 2015

Member

#5499 at least backgrounds the work and sends a mail, it doesn't have an optimal UI flow either yet nor any cleanup.

Member

jhass commented Feb 19, 2015

#5499 at least backgrounds the work and sends a mail, it doesn't have an optimal UI flow either yet nor any cleanup.

@goobertron

This comment has been minimized.

Show comment
Hide comment
@goobertron

goobertron Feb 19, 2015

However if mail is not available, we need a flow for that. One imaginable scenario is to pop a message to the user that the export will be prepared and they should check back in a few minutes. This then means we should keep the last export for a configurable amount of time.

How about: an alert that the job may take some time, and then the user gets a notification in the usual way when the export is ready, with a download link. The export could be kept until downloaded (perhaps up to a maximum time).

However if mail is not available, we need a flow for that. One imaginable scenario is to pop a message to the user that the export will be prepared and they should check back in a few minutes. This then means we should keep the last export for a configurable amount of time.

How about: an alert that the job may take some time, and then the user gets a notification in the usual way when the export is ready, with a download link. The export could be kept until downloaded (perhaps up to a maximum time).

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 19, 2015

Contributor

I'm gonna try to create a output stream.

If can't, I'm gonna use #5499 method so user see same behavior for both exports.

Contributor

margori commented Feb 19, 2015

I'm gonna try to create a output stream.

If can't, I'm gonna use #5499 method so user see same behavior for both exports.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 19, 2015

Contributor

Best approach to stream a zip file is using https://github.com/fringd/zipline but it's not ported to rails 4 yet. :-(

Contributor

margori commented Feb 19, 2015

Best approach to stream a zip file is using https://github.com/fringd/zipline but it's not ported to rails 4 yet. :-(

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 20, 2015

Contributor

Done: used #5499 way of export data.

Comments are very welcome.

Contributor

margori commented Feb 20, 2015

Done: used #5499 way of export data.

Comments are very welcome.

Show outdated Hide outdated app/models/user.rb
Zip::ZipFile.open(export.path, Zip::File::CREATE) do |zipfile|
photos.each do |photo|
photo_path = "#{Rails.root}/public/uploads/images/#{photo.remote_photo_name}"

This comment has been minimized.

@jhass

jhass Feb 20, 2015

Member

We can't assume that, it might be hosted on S3.

@jhass

jhass Feb 20, 2015

Member

We can't assume that, it might be hosted on S3.

This comment has been minimized.

@margori

margori Feb 20, 2015

Contributor

If we can't then PR #5499 requires review as well.

@margori

margori Feb 20, 2015

Contributor

If we can't then PR #5499 requires review as well.

This comment has been minimized.

@jhass

jhass Feb 20, 2015

Member

How so?

@jhass

jhass Feb 20, 2015

Member

How so?

This comment has been minimized.

@margori

margori Feb 23, 2015

Contributor

Oh! You're talking about line 331. I thought you were talking about line 317.

I assume that files are stored in same machine as web server. :-(

@margori

margori Feb 23, 2015

Contributor

Oh! You're talking about line 331. I thought you were talking about line 317.

I assume that files are stored in same machine as web server. :-(

Show outdated Hide outdated app/models/user.rb
def perform_export_photos!
export = Tempfile.new([username, '_photos.zip'])
Zip::OutputStream.open(export) { |zos| }

This comment has been minimized.

@jhass

jhass Feb 20, 2015

Member

Huh?

@jhass

jhass Feb 20, 2015

Member

Huh?

This comment has been minimized.

@margori

margori Feb 20, 2015

Contributor

Strange code seen here

@margori

margori Feb 20, 2015

Contributor

Strange code seen here

This comment has been minimized.

@jhass

jhass Feb 20, 2015

Member

Strange indeed.

@jhass

jhass Feb 20, 2015

Member

Strange indeed.

Show outdated Hide outdated config/locales/diaspora/en.yml
We’ve encountered an issue while processing your photos for download.
Please try again!
Cheers,

This comment has been minimized.

@goobertron

goobertron Feb 20, 2015

A purely style comment: I think saying 'cheers' here might seem a bit dismissive, because it's giving bad news, and 'cheers' is more of an upbeat/happy sign-off. Perhaps remove 'we're sorry' from the email title and change this line from 'Cheers,' to 'Sorry,'.

@goobertron

goobertron Feb 20, 2015

A purely style comment: I think saying 'cheers' here might seem a bit dismissive, because it's giving bad news, and 'cheers' is more of an upbeat/happy sign-off. Perhaps remove 'we're sorry' from the email title and change this line from 'Cheers,' to 'Sorry,'.

Show outdated Hide outdated config/locales/diaspora/en.yml
The diaspora* email robot!
export_photos_failure_email:
subject: "We’re sorry, there was an issue with your photos, %{name}"

This comment has been minimized.

@goobertron

goobertron Feb 20, 2015

I'd suggest rewording this title to 'There was a problem exporting your photos', move the 'sorry' bit to the end of the email and remove the user name from the title.

I realise you've followed the style of the data export emails, but I think they could also usefully be changed.

@goobertron

goobertron Feb 20, 2015

I'd suggest rewording this title to 'There was a problem exporting your photos', move the 'sorry' bit to the end of the email and remove the user name from the title.

I realise you've followed the style of the data export emails, but I think they could also usefully be changed.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 23, 2015

Contributor

New try. In this opportunity I'm assuming files could not be in local file system.

Contributor

margori commented Feb 23, 2015

New try. In this opportunity I'm assuming files could not be in local file system.

Show outdated Hide outdated app/models/user.rb
photos.each do |photo|
temp_photo = Tempfile.new(photo.remote_photo_name, encoding: 'ascii-8bit')
begin
photo_path = "#{Rails.root}/public/uploads/images/#{photo.remote_photo_name}"

This comment has been minimized.

@jhass

jhass Feb 23, 2015

Member

Sorry, but Rails.root does not give you the upload location or even the sites domain or anything, it gives you the absolute path on the filesystem where the application is located, so this still doesn't work for stuff stored on S3.

I would give you more pointers, but I'm myself very unsure about what the best way to handle that case would be.

@jhass

jhass Feb 23, 2015

Member

Sorry, but Rails.root does not give you the upload location or even the sites domain or anything, it gives you the absolute path on the filesystem where the application is located, so this still doesn't work for stuff stored on S3.

I would give you more pointers, but I'm myself very unsure about what the best way to handle that case would be.

Show outdated Hide outdated app/models/user.rb
if File.file?(photo_path)
temp_photo.write(IO.read(photo_path))
else
temp_photo.write(URI.parse(photo.url).read)

This comment has been minimized.

@jhass

jhass Feb 23, 2015

Member

Your tests never ran into this, there's no URI#read afaik

@jhass

jhass Feb 23, 2015

Member

Your tests never ran into this, there's no URI#read afaik

This comment has been minimized.

@margori

margori Feb 24, 2015

Contributor

You're right. Some photos in test fixture are coming...

@margori

margori Feb 24, 2015

Contributor

You're right. Some photos in test fixture are coming...

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 25, 2015

Contributor

New try.

In this opportunity, photo file is gotten through unprocessed photo uploader.

Contributor

margori commented Feb 25, 2015

New try.

In this opportunity, photo file is gotten through unprocessed photo uploader.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 25, 2015

Contributor

This PR is related to #5343

Contributor

margori commented Feb 25, 2015

This PR is related to #5343

Show outdated Hide outdated app/models/user.rb
begin
zos.put_next_entry(photo.remote_photo_name)
zos.print(photo.unprocessed_image.file.read)
rescue

This comment has been minimized.

@jhass

jhass Feb 25, 2015

Member

Do you have a more specific exception that you expect? Or even a better, a way to guard the code raising to from running in that case?

@jhass

jhass Feb 25, 2015

Member

Do you have a more specific exception that you expect? Or even a better, a way to guard the code raising to from running in that case?

This comment has been minimized.

@margori

margori Feb 25, 2015

Contributor

If file doesn't exists it returns Errno::ENOENT.

@margori

margori Feb 25, 2015

Contributor

If file doesn't exists it returns Errno::ENOENT.

Show outdated Hide outdated app/controllers/users_controller.rb
end
def download_photos
send_data File.open(current_user.exported_photos_file.path).read, filename: current_user.exported_photos_file.filename

This comment has been minimized.

@jhass

jhass Feb 25, 2015

Member

File.open.read is very bad, it doesn't ensure the file is closed and thus leaks FDs. File.read would ensure that, however it reads the entire file into memory, which is bad too. This should probably use X-Sendfile or a streaming API.

@jhass

jhass Feb 25, 2015

Member

File.open.read is very bad, it doesn't ensure the file is closed and thus leaks FDs. File.read would ensure that, however it reads the entire file into memory, which is bad too. This should probably use X-Sendfile or a streaming API.

This comment has been minimized.

@margori

margori Feb 25, 2015

Contributor

I thought about using send_file but works for local files, fails on remote.

If remote then it's better to redirect.

What do you think?

@margori

margori Feb 25, 2015

Contributor

I thought about using send_file but works for local files, fails on remote.

If remote then it's better to redirect.

What do you think?

This comment has been minimized.

@jhass

jhass Feb 26, 2015

Member

Actually, why not redirect in both cases?

@jhass

jhass Feb 26, 2015

Member

Actually, why not redirect in both cases?

This comment has been minimized.

@margori

margori Feb 26, 2015

Contributor

Good idea! Let's try it.

@margori

margori Feb 26, 2015

Contributor

Good idea! Let's try it.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 26, 2015

Contributor

Done: redirecting when download_photos is called.

In this case, filename should be unguessable. Then new request are stored with random filename.

Contributor

margori commented Feb 26, 2015

Done: redirecting when download_photos is called.

In this case, filename should be unguessable. Then new request are stored with random filename.

Show outdated Hide outdated app/uploaders/exported_photos.rb
def secure_token(length=16)
var = :"@#{mounted_as}_secure_token"
model.instance_variable_get(var) or model.instance_variable_set(var, SecureRandom.hex(length/2))
end

This comment has been minimized.

@jhass

jhass Feb 26, 2015

Member

That's just 16**6 combinations, I see no reason to limit it that much.

@jhass

jhass Feb 26, 2015

Member

That's just 16**6 combinations, I see no reason to limit it that much.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Feb 27, 2015

Contributor

Filename token increased to 128bits.

Contributor

margori commented Feb 27, 2015

Filename token increased to 128bits.

@margori margori changed the title from Exports user photos as zip file to Export user photos Feb 28, 2015

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Mar 2, 2015

Contributor

Re-based and code updated for new gems.

Need review. Thanks!

Contributor

margori commented Mar 2, 2015

Re-based and code updated for new gems.

Need review. Thanks!

Show outdated Hide outdated app/models/user.rb
Zip::ZipOutputStream.open(temp_zip.path) do |zos|
photos.each do |photo|
begin
photo_data = photo.unprocessed_image.file.read

This comment has been minimized.

@jhass

jhass Mar 2, 2015

Member

Looking good so far, I guess we just have to see how it behaves on S3 (+ Heroku) setup. I still don't understand how this call pulls the photo from S3 tbh. If you're able to clarify that... :)

Another last small remark I have is moving the temp_zip.close below to an ensure block.

@jhass

jhass Mar 2, 2015

Member

Looking good so far, I guess we just have to see how it behaves on S3 (+ Heroku) setup. I still don't understand how this call pulls the photo from S3 tbh. If you're able to clarify that... :)

Another last small remark I have is moving the temp_zip.close below to an ensure block.

This comment has been minimized.

@margori

margori Mar 2, 2015

Contributor

This is what a thought:

  • If I delete a post with a photo in it, photo file is deleted from my disc.
  • I don't found code that delete (unlink) the file. So there's an object that knows what to do.
  • The uploader field may know where to upload and to delete it.
  • If so, it may also know where to read it.
  • I tried it and worked, at least in my local file system.
  • I found that admin can choose where to store files by setting carrierwave config file. I didn't try over S3. Amazon wants my credit card number for a free account and... no, darling.

What I really want to know is how to create a test to simulate a S3 storage. Do you have a sample code?

@margori

margori Mar 2, 2015

Contributor

This is what a thought:

  • If I delete a post with a photo in it, photo file is deleted from my disc.
  • I don't found code that delete (unlink) the file. So there's an object that knows what to do.
  • The uploader field may know where to upload and to delete it.
  • If so, it may also know where to read it.
  • I tried it and worked, at least in my local file system.
  • I found that admin can choose where to store files by setting carrierwave config file. I didn't try over S3. Amazon wants my credit card number for a free account and... no, darling.

What I really want to know is how to create a test to simulate a S3 storage. Do you have a sample code?

This comment has been minimized.

@jhass

jhass Mar 2, 2015

Member

No, I don't have an example test for that, I'm actually not sure whether it's testable, I guess one could enable hosting on S3 in the test and then webmock should catch the network call.

Hit me up on IRC if you want credentials to a S3 bucket for your tests.

@jhass

jhass Mar 2, 2015

Member

No, I don't have an example test for that, I'm actually not sure whether it's testable, I guess one could enable hosting on S3 in the test and then webmock should catch the network call.

Hit me up on IRC if you want credentials to a S3 bucket for your tests.

This comment has been minimized.

@jhass

jhass Mar 2, 2015

Member

I think I fixed the S3 access ;)

@jhass

jhass Mar 2, 2015

Member

I think I fixed the S3 access ;)

This comment has been minimized.

@margori

margori Mar 3, 2015

Contributor

Yes, you did. :)

Downloads work as expected, but having problem with upload. CarrierWave use same file name of downloaded image for the new uploaded zip. So upload fails saying "You can't upload a jpg. A zip expected" :(

@margori

margori Mar 3, 2015

Contributor

Yes, you did. :)

Downloads work as expected, but having problem with upload. CarrierWave use same file name of downloaded image for the new uploaded zip. So upload fails saying "You can't upload a jpg. A zip expected" :(

This comment has been minimized.

@jhass

jhass Mar 3, 2015

Member

Mmh, it did upload a zip though:

@jhass

jhass Mar 3, 2015

Member

Mmh, it did upload a zip though:

This comment has been minimized.

@margori

margori Mar 3, 2015

Contributor

Good. This capture confirms that one export file is maintained.

@margori

margori Mar 3, 2015

Contributor

Good. This capture confirms that one export file is maintained.

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Mar 3, 2015

Contributor

New changes

  • Removed "zip" extension validation in 'exported_photos.rb'.
  • Ensured "update exporting_photos: false" in case of errors.

These changed were successfully tested in AWS S3.

Contributor

margori commented Mar 3, 2015

New changes

  • Removed "zip" extension validation in 'exported_photos.rb'.
  • Ensured "update exporting_photos: false" in case of errors.

These changed were successfully tested in AWS S3.

@jhass jhass added this to the next-major milestone Mar 4, 2015

@jhass jhass merged commit b154d87 into diaspora:develop Mar 4, 2015

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
hound Hound has reviewed all the changes!

jhass added a commit that referenced this pull request Mar 4, 2015

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Mar 4, 2015

Member

Alright, thank you!

Member

jhass commented Mar 4, 2015

Alright, thank you!

@Flaburgan

This comment has been minimized.

Show comment
Hide comment
@Flaburgan

Flaburgan Mar 4, 2015

Member

Nice, thank you!

Member

Flaburgan commented Mar 4, 2015

Nice, thank you!

@margori

This comment has been minimized.

Show comment
Hide comment
@margori

margori Mar 4, 2015

Contributor

UHU! Thank you too!

Contributor

margori commented Mar 4, 2015

UHU! Thank you too!

@goobertron

This comment has been minimized.

Show comment
Hide comment
@goobertron

goobertron Mar 4, 2015

Hey, thanks very much, @margori!

Hey, thanks very much, @margori!

@jaywink jaywink referenced this pull request Mar 15, 2015

Open

User data export / (partial) import metaissue #5343

5 of 7 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment