Stream files from the network #45
Some preliminary testing shows that this is not too difficult. Fortunately, both the FileReader API and AJAX requests are asynchronous, so the groundwork is already laid to parse chunks of a CSV file asynchronously. We just need to build in support for using something like this:
    $.ajax("/plu_codes.csv", {
        type: "GET",
        headers: {
            "Range": "bytes=0-1024"
        }
    }).done(function(data) {
        // ... treat data just as if it was a chunk read from the FileReader
    });
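To make the chunk arithmetic concrete, here is a minimal sketch of how the Range header for the n-th chunk could be computed. HTTP byte ranges are inclusive at both ends, so `bytes=0-1024` in the snippet above actually requests 1025 bytes. The names `rangeHeader` and `fetchChunk` are hypothetical and not part of Papa Parse, and the sketch assumes an environment that provides `fetch` rather than jQuery.

```javascript
// Hypothetical helper (not Papa Parse API): build the Range header value
// for the n-th chunk. Both ends of an HTTP byte range are inclusive, so a
// 1024-byte chunk starting at offset 0 is "bytes=0-1023".
function rangeHeader(chunkIndex, chunkSize) {
  const start = chunkIndex * chunkSize;
  const end = start + chunkSize - 1;
  return "bytes=" + start + "-" + end;
}

// Hypothetical helper: request one chunk and return its body as text.
// Assumes a global fetch (modern browsers, recent Node.js).
async function fetchChunk(url, chunkIndex, chunkSize) {
  const response = await fetch(url, {
    headers: { Range: rangeHeader(chunkIndex, chunkSize) }
  });
  return response.text();
}
```

Note that a chunk boundary can fall in the middle of a row (or of a multi-byte character), so the text of one chunk is not necessarily parseable on its own.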
Thinking out loud here... Already in the current version of Papa Parse, downloading a file and parsing it (assuming it is not too huge and fits comfortably in memory, say under 100 MB) is as easy as this:
    $.get("some_file.csv", function(data) {
        var results = $.parse(data);
    });
But that doesn't "stream" the file: if the file is too big for the browser tab to handle, say 1 GB, this would just cause the browser to lock up. In order to download and parse huge files while keeping Papa easy to use, what about invoking Papa so that it uses the Range header as described above, like this:
    $.parse("some_file.csv", {
        ajax: true,
        step: function(data, jqxhr) {
            console.log(data.results);
        }
    });
So you specify
Two things to work out still:
Since this is for 3.0, I'm willing to make big breaking changes to keep Papa easy to use.
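One detail any Range-based streaming has to handle is that a chunk rarely ends exactly on a row boundary. A minimal sketch of one way to deal with that, assuming rows are separated by `\n` and ignoring newlines inside quoted fields: parse everything up to the last newline and carry the remainder over to the next chunk. The names `splitAtLastNewline`, `streamRows`, and `readChunk` are hypothetical and not taken from Papa Parse.

```javascript
// Split buffered text at the last newline. Everything up to and including
// it is safe to parse now; the rest is an incomplete row to carry over.
function splitAtLastNewline(text) {
  const cut = text.lastIndexOf("\n");
  if (cut === -1) return { complete: "", remainder: text };
  return { complete: text.slice(0, cut + 1), remainder: text.slice(cut + 1) };
}

// Drive the chunk loop. `readChunk(index)` is a hypothetical async function
// that returns the text of chunk `index`, or "" once the file is exhausted.
// `onRows` receives only text that ends on a row boundary (except possibly
// the final call, if the file has no trailing newline).
async function streamRows(readChunk, onRows) {
  let carry = "";
  for (let index = 0; ; index++) {
    const chunk = await readChunk(index);
    if (chunk === "") break;
    const { complete, remainder } = splitAtLastNewline(carry + chunk);
    carry = remainder;
    if (complete !== "") onRows(complete);
  }
  if (carry !== "") onRows(carry); // last row without a trailing newline
}
```

Only one chunk plus the carried remainder is held in memory at a time, which is the point of streaming. A real CSV parser would also need to track whether the cut falls inside a quoted field.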
Okay, I think I've resolved both those things.
    $.get.parse("files/asdf.csv", {
        config: {
            step: function(data, handle) {
                console.log(data, handle);
                // handle gives access to pause(), resume(), jqxhr, etc.
            }
        },
        complete: function(data) {
            console.log("Done!");
        }
    });
Calling Number (2) above is resolved because I've decided that the AJAX request done by Papa will be a simple GET request. However, the internal functions that perform the network requests, read files, and do the parsing will be exposed so the user can use them at a lower level if desired.
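The comment in the step callback mentions that the handle exposes `pause()` and `resume()`. As an illustration only, here is a minimal sketch of how such a handle could gate a chunk loop. This is an assumption about one possible design, not Papa Parse's actual implementation; `createHandle`, `whenRunning`, and `runChunks` are hypothetical names.

```javascript
// Hypothetical handle: pause() stops the loop before the next chunk is
// processed, resume() lets it continue. Supports a single waiting loop.
function createHandle() {
  let paused = false;
  let wake = null; // resolver for the promise the loop is waiting on
  return {
    pause() { paused = true; },
    resume() {
      paused = false;
      if (wake) { wake(); wake = null; }
    },
    isPaused() { return paused; },
    // Resolves immediately while running, otherwise on the next resume().
    whenRunning() {
      if (!paused) return Promise.resolve();
      return new Promise((resolve) => { wake = resolve; });
    }
  };
}

// Process chunks one by one, passing the handle to each step call and
// waiting before every chunk if the step callback paused the run.
async function runChunks(chunks, step) {
  const handle = createHandle();
  for (const chunk of chunks) {
    await handle.whenRunning();
    step(chunk, handle);
  }
}
```

A step callback could call `handle.pause()`, do some slow work such as writing rows to a database, and call `handle.resume()` when it is ready for more data.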
Still have some tweaking and optimizing to do, but this is now done.
We can stream from file input elements, but how about the network? Would the server hosting the file have to support the Range HTTP header?
This could be awesome...
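On the question of server support: in HTTP, a server that honors a single-range request answers with status 206 (Partial Content) and a `Content-Range` header, while a server that ignores the `Range` header answers with a plain 200 and the whole body. A rough sketch of how a client could tell these cases apart follows; the function names are hypothetical and not part of any library.

```javascript
// Classify the response to a request that carried a single Range header.
// `status` is the HTTP status code; `contentRange` is the value of the
// Content-Range response header, or null if it is absent.
function classifyRangeResponse(status, contentRange) {
  if (status === 206 && contentRange) return "partial";  // range honored
  if (status === 200) return "full";                     // range ignored
  if (status === 416) return "out-of-range";             // past end of file
  return "unknown";
}

// Parse a header value such as "bytes 0-1023/5000".
// `total` is null when the server reports the size as "*".
function parseContentRange(value) {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value);
  if (!match) return null;
  return {
    start: Number(match[1]),
    end: Number(match[2]),
    total: match[3] === "*" ? null : Number(match[3])
  };
}
```

When the first response comes back as "full", the client has already received the entire file and can only fall back to parsing it in one piece; when it is "partial", the `total` value tells the client how many more chunks to request.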