-
-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Implements streaming CSVs. #3038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements streaming CSVs. #3038
Conversation
Allows unlimited downloads of CSV data using ActiveRecord batch APIs and ActionController response streaming. Works on Rails 3.2+
👍 Looks great! |
Ok, I'm finished with this pull for real. I just downloaded 650,000+ records from one of my production systems. The stream starts right away now that I have the right response headers in place. |
One problem I noticed with this: if I manually stop the download, Rails just keeps chugging away generating that CSV. |
Do you know how to detect in Rails if a client disconnects? For the sake of enabling business analysts to work with ActiveAdmin that's a risk that I think is reasonable. I'm going to see what can be done about that, but is that the only barrier to merging? |
After reading rack and rails I don't know that there is a way to detect a cancelled download and stop the response. I could add some config options to enforce a max_csv_result if you're worried about creating a URL of death. |
Another trade-off to note is Rails' batch finders, like |
Much related to #2449 But wow, thanks so much for implementing this. |
I still get timed out in Heroku for large batches (>30k). Is Edit: Just tried that, and I got an empty response. |
@joallard Is your query eager loading things? I was able to get a response of 650k records with no problem. What web server are you running on? Is there some kind of response buffering on? |
I found it: it wasn't an H12 Heroku timeout, but rather the timeout in Unicorn. I added Works like a charm! |
What is the status? Is the issue that's been raised solvable? |
I'm not clear on what needs to be done to get this merged. |
Would be great if this gets merged. |
@seanlinsley Is there something I can do to make this more acceptable? Perhaps a setting to limit max_records? |
This can't be cleanly merged because of #3097. I resolved the conflict locally with these changes: $ git diff
diff --git a/lib/active_admin/resource_controller/streaming.rb b/lib/active_admin/resource_controller/streaming.rb
index c07dffa..ba99b0f 100644
--- a/lib/active_admin/resource_controller/streaming.rb
+++ b/lib/active_admin/resource_controller/streaming.rb
@@ -20,9 +20,15 @@ module ActiveAdmin
protected
- def csv_line(resource, columns)
+ def csv_line(resource, columns, options)
columns.map do |column|
- call_method_or_proc_on resource, column.data
+ s = call_method_or_proc_on resource, column.data
+
+ if options[:encoding] && s.respond_to?(:encode!)
+ s.encode! options[:encoding], options[:encoding_options]
+ else
+ s
+ end
end
end
@@ -37,7 +43,7 @@ module ActiveAdmin
self.response_body = Enumerator.new do |csv|
csv << CSV.generate_line(columns.map(&:name), options)
collection.find_each do |resource|
- csv << CSV.generate_line(csv_line(resource, columns), options)
+ csv << CSV.generate_line(csv_line(resource, columns, options), options)
end
end
end |
Conflicts: app/views/active_admin/resource/index.csv.erb
It would be nice to be able to both set a hard limit as a developer, & choose to download only a certain number of records as a user (#2626), but this PR is a great improvement on its own. Thanks so much for this ❤️ |
+1 thank you! |
🍻 |
+1, when to release on RubyGems? |
I just migrated from ActiveAdmin 0.6 to master branch. Although this CSV export is really good, but it assumes that the primary key will be included in the collection. In my current setup I have customised the collection in a way that primary key cannot be included. Is there a way we can use pagination scopes to get the same effect as |
@kushkella off the top of my head I'm not sure what might work. Would you mind sharing the code / explaining why you can't include the primary key? |
Also instead of commenting on a merged pull request it would be better to create a new ticket. |
@seanlinsley sorry for commenting on the old pull request. Please look at the new issue and let me know if there is a way to solve this problem. #3269 |
Loving this commit! It works great locally; cool idea. Starts streaming the file instead of serving the csv from the web server. For production on heroku I get H12 errors trying to download around 10K records with children. Are there any reasons this would not work on Heroku currently? I'll be happy to do any research and provide any other info about my implementation. activeadmin (1.0.0.pre) |
@jhblacklock Sounds like the timeout value in Unicorn. That, or stop having children download your CSVs and have it done by an adult! |
@joallard That is definitely a problem of mine (children) but they don't seem to be the issue I'm having with streaming CSVs ^_^. My unicorn timeout value is hardcoded at 600. I also run unicorn local and do not experience any issues. The initial web request hits in 10ms but the server doesn't initiate the download before the server times out in 30 seconds.
My csv block is lengthy but I wouldn't think the timeout would occur on the first iteration.
|
Yar. I feel sheepish. 😊 My Procfile was not registering with the heroku deploy so it was defaulting to webrick. One process = not working. Thanks for everything! |
Methinks we should document this issue seeing how frequent it is |
It's also important to mention that if an exception occurs during the stream, the stack trace will be appended to the resulting file, but the request will only end after your web server times out. That's not necessarily a problem in production (and it might be very hard to fix), but in development it can make debugging much more difficult. Which is why I created #3535 BTW @jhblacklock you can use syntax highlighting to make your code easier to read: https://guides.github.com/features/mastering-markdown/#GitHub-flavored-markdown |
Actually, it is a problem in production, because the exception doesn't bubble up. So the only way you'll find out is by the users of your app reporting that their CSV exports don't include all the records they should. |
@seanlinsley , I faced with this problem, have no logs to find out why csv was not downloaded. |
This PR might be old, but we are running into the same issue as @seanlinsley , so is there any approch to it? |
I downloaded active admin 1.0.0.pre4, but would like to know how to implement the export CSV streaming? Tks |
@seanlinsley Thanks for your implementing. but I have some questions. |
Allows unlimited downloads of CSV data using ActiveRecord batch APIs and ActionController response streaming.
Works on Rails 3.2+