
Allow mechanism to "stream" data to the widget in chunks #90

Closed
rodryquintero opened this issue Dec 4, 2013 · 9 comments

@rodryquintero

What I am asking for is a way for PivotTable to be "fed" data continuously, in a streaming fashion, and have the UI update itself as the data is processed.

The reason I am asking for this is, again, to work around browser limits on the size of the JSON objects received from data sources (usually AJAX responses).

I managed to mitigate this by sending the AJAX requests in chunks. However, the resulting data object (the one that gathers all the data from the source) is still huge, and the browser crashes.

By feeding data to the widget in this fashion, I assume there would be no need to store the data in a single large array/JSON object.

@nicolaskruchten
Owner

I see what you're trying to do, but unfortunately this will not be possible: the entire dataset must be scanned for every re-render. How much data are you trying to put through here, in megabytes and in number of records?

@rodryquintero
Author

It is about 230MB. I ended up breaking the data into separate JSON text files residing on the server but, as you said, the data must be consolidated into a single object before passing it to the widget.

This sounds like it would take a major rewrite, but I had to ask :)

Thanks.


@nicolaskruchten
Owner

Yeah, sorry... You may have some luck in transmitting the data in CSV format to the browser and having the browser decode it into JS objects, similar to http://nicolaskruchten.github.io/pivottable/examples/local.html
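
A minimal sketch of what that decoding step might look like, assuming the CSV arrives as plain text over AJAX. This naive split-based parser is illustrative only; a real implementation should use a proper CSV parser that handles quoting and embedded commas:

```js
// Convert a simple CSV payload into the array-of-objects form that
// PivotTable.js consumes. Ignores quoting and escaping.
function csvToRecords(text) {
  var lines = text.trim().split("\n");
  var headers = lines[0].split(",");
  return lines.slice(1).map(function(line) {
    var fields = line.split(",");
    var record = {};
    headers.forEach(function(header, i) { record[header] = fields[i]; });
    return record;
  });
}

// Usage sketch: fetch the CSV and hand the decoded records to pivotUI.
// $.get("data.csv", function(text) {
//   $("#output").pivotUI(csvToRecords(text));
// });
```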

@fastcatch
Contributor

A couple of ideas FWIW -- they may or may not be applicable to your situation:

  • In some cases I also have larger-than-I'd-like data sets (around 10MB of JSON). I have been thinking of creating a mechanism whereby the data is aggregated on the server and this component is front-end only: send the parameters for the aggregation via AJAX, receive the response, and rebuild the pivot from that small amount of data (see the first sketch below). When you have a lot of data it may even be faster, though that's only a hunch; I haven't tested it. It should work reasonably well provided that your aggregation mechanism is backed by a well-optimized engine (such as summing values in a properly indexed relational database). (In fact, you can take this a step further: an OLAP database's bread and butter is slicing and dicing through large amounts of data, and they usually have decent pivot components.)
  • There is a completely different approach, though: writing your own aggregators that are capable of incremental processing (see the second sketch below). I haven't tried this, but many aggregation operations lend themselves to incremental algorithms, such as count or sum. Whatever has already been processed can be stored separately or fetched back from the pivot table; it can then be combined with the newly received chunk when rebuilding the pivot for the combined data set. It may not be simple, but my gut feeling is that it is doable. (Be warned: I haven't tried it, and some aggregations may be hard or impossible without the whole data set.)
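
To make the first idea concrete, here is a hypothetical sketch of that round-trip. The /api/aggregate endpoint, the payload shape, and the renderPivotFromCells hook are all invented for illustration; PivotTable.js has no built-in support for server-side aggregation:

```js
// Hypothetical: send only the pivot configuration to the server and
// render the small, pre-aggregated result it returns.
$.ajax({
  url: "/api/aggregate",  // invented endpoint
  type: "POST",
  contentType: "application/json",
  data: JSON.stringify({
    rows: ["country"],
    cols: ["year"],
    aggregator: "sum",
    valueAttr: "sales"
  })
}).done(function(cells) {
  // `cells` is a small array of already-aggregated records, so the
  // browser never has to hold the raw dataset in memory.
  renderPivotFromCells(cells);  // illustrative rendering hook
});
```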

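And a minimal sketch of the second idea, assuming the records arrive in chunks; the names (partials, aggregateChunk) are illustrative and independent of PivotTable.js's own aggregator API:

```js
// Illustrative: fold each chunk into per-cell partial aggregates so the
// raw records can be discarded after processing. Count and sum compose
// incrementally; an average can be derived from them at render time.
var partials = {};  // cellKey -> {count: ..., sum: ...}

function aggregateChunk(records, rowAttr, colAttr, valueAttr) {
  records.forEach(function(record) {
    var key = record[rowAttr] + "\u0000" + record[colAttr];
    var cell = partials[key] || (partials[key] = {count: 0, sum: 0});
    cell.count += 1;
    cell.sum += parseFloat(record[valueAttr]) || 0;
  });
}

// After each AJAX chunk arrives, fold it in and re-render from
// `partials` instead of from the accumulated raw records, e.g.:
// aggregateChunk(chunk, "country", "year", "sales");
```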
@rodryquintero
Author

FYI: in my particular problem with large JSON datasets, I found that Firefox (latest version) renders the data without much trouble and relatively fast.

Chrome and Safari crash when performing the same data fetch and processing.

I have not tried Internet Explorer.

By the way, I fetch the data from the server's JSON text files using the JavaScript promises pattern.
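
For reference, a sketch of that pattern using jQuery's promise-returning $.getJSON (the chunk file names are hypothetical). Note that this still consolidates everything into one array, which is exactly where the memory pressure comes from:

```js
// Fetch several JSON chunks in parallel and combine them once all
// requests resolve.
var chunkUrls = ["data-part1.json", "data-part2.json", "data-part3.json"];

$.when.apply($, chunkUrls.map(function(url) { return $.getJSON(url); }))
  .done(function() {
    // With multiple deferreds, each argument is [data, statusText, jqXHR].
    var allRecords = [];
    Array.prototype.forEach.call(arguments, function(result) {
      allRecords = allRecords.concat(result[0]);
    });
    $("#output").pivotUI(allRecords);  // the consolidated array is still huge
  });
```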


@nicolaskruchten
Owner

What you say about browsers is interesting... I've found bigger performance issues in Firefox than in Chrome, personally, and I've happily loaded more than 50 megs of data without an issue. (Edit, sorry, I misread my benchmark: this should read 50k records)

@rodryquintero
Author

I am also surprised by this. I have always done my testing in Chrome, but it choked when loading large amounts of JSON data. Then I tried Firefox and it worked.

Go figure.


@nicolaskruchten
Owner

How many records are in your 230MB? And how many attributes per record? Also please note the edit in my last comment above: I don't think I've tried loading 50-meg CSVs; it was 50k-record CSVs I was working with.

You may also find the discussion in issue #36 to be enlightening.

@rodryquintero
Author

Hi Nicolas. There are about 500k JSON records, but the records have a considerable number of attributes (around 25).

Cheers,

