Provide a method to return the filtered data set #95
Note that you can already retrieve the filtered records using dimension.top (sorted by the dimension’s natural order). Is there a particular reason why that wouldn’t be useful in your scenario? |
This scenario is actually related to my request to be able to retrieve only the filtered records. We are using DC.js, and thus Crossfilter, to provide a dynamic reporting environment to our end users. Once the users have filtered down their views, we want to offer them a CSV download of only the records they are looking at. If I can get to only the filtered records, then I can extract the record IDs, send them back to the server, and have it serve up the corresponding CSV file. Using dimension.top doesn't work in this case as the filters come from different locations. The crossfilter groupAll() method returns a proper group and the .all() method returns the correct number of records affected by all filters, but I can't find a dimension that would then return only those records. Another solution would be to add an each operator to the groupAll() output group, allowing iteration over the affected records and thus providing a paged way of getting to each record. This might reduce the overhead, but I don't know. If I'm wrong, please point me in the right direction :) |
Can you explain what you mean by the filters coming from different locations? |
Perhaps you’re confusing the behaviour of dimension.top with groups, which exclude their associated dimension’s filter for computing reduce values. |
I thought the dimension only included a subset of the data associated with the root. Using something similar to the following structure:
I.e., if I have a dimension on "date", then I would only see the date column in the dimension. So performing a top(Infinity) on that dimension wouldn't get me back the _id column, for example. In the above I would have a group all to get the top. And say I had a dimension on date and one on state. Now if I filter on a state of MO || KS and a date of > 3 days ago, my all count goes down to whatever... Say my root crossfilter is called ndx; then my root ndx records value will have everything, including the filtered-out records. My goal would be to get only the records that match the filters MO || KS and > 3 days ago. Of course, I don't really know what the user has filtered to :) |
dimension.top is my current workaround, although it seems odd to grab an arbitrary dimension and call .top(Infinity) on it just to retrieve all the filtered records. The semantic link between the operation and the output feels weak, but it certainly works. |
@jdarling Read the documentation for dimension.top (or try it for yourself); you’ll see that it returns the records, not the computed dimension values. |
Just tried it with our live data, and of course you are correct :). I think it would be useful to have this documented explicitly, though, as I searched for an answer for quite some time before chasing this on GitHub. Hopefully others will find this thread if they run into the same problem. Great work all around. |
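For readers landing here with the same CSV-export use case, the workaround confirmed above can be sketched roughly as follows. This is a minimal sketch, not part of Crossfilter itself: the helper name filteredRecordIds and the idField parameter are hypothetical, and it assumes each record carries an ID property such as the _id column mentioned earlier in the thread.

```javascript
// Hypothetical helper (not a Crossfilter API): collect the IDs of the
// records that pass all current filters, via any one dimension.
// `dimension` is a Crossfilter dimension; `idField` is an assumed
// property name on each record (e.g. "_id").
function filteredRecordIds(dimension, idField) {
  // dimension.top(Infinity) returns whole records (not just the
  // dimension's values), sorted by this dimension in descending order.
  return dimension.top(Infinity).map(function (record) {
    return record[idField];
  });
}

// Assumed usage with Crossfilter loaded and `records` as the raw data:
//   var ndx = crossfilter(records);
//   var byDate = ndx.dimension(function (d) { return d.date; });
//   var ids = filteredRecordIds(byDate, "_id"); // send these to the server
```

Which dimension is passed in only affects the order of the returned IDs, since the filters from every dimension are applied to the result.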
Is there any way to get all filtered values while maintaining the original data order? |
Would it be possible to have |
I'm also interested in that. What I'm actually looking for is a way to return the filtered data while keeping the initial index of the dataset. So instead of removing the filtered-out rows, they would be set to blank (in CSV: ,,,,). Any help or workaround would be appreciated. Thanks |
+1 |
So I found a solution, but I imagine there are much better, simpler, faster ways of doing this. Any improvements are welcome. First, I created an index for the loaded data before passing the data into crossfilter (I'm working with CSV data):
Now I'm calling this function each time crossfilter is used:
For each position, this either creates an empty object (if the record is filtered out) or puts the data in the right spot using the .splice method. It works well for me (I'm dealing with 25,000 objects), but I'm sure there are better methods. Best, Kim |
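The snippets from the comment above did not survive in this copy of the thread, so the following is only a rough sketch of the idea as described, not Kim's actual code. The property name _index and both function names are assumptions, and it uses direct index assignment instead of .splice.

```javascript
// Tag each record with its original position before handing the data
// to crossfilter. `_index` is an assumed property name.
function addOriginalIndex(records) {
  records.forEach(function (record, i) {
    record._index = i;
  });
  return records;
}

// Build an array as long as the original data set: filtered records sit
// at their original positions, every other slot holds an empty object.
// `filteredRecords` is what dimension.top(Infinity) would return.
function alignToOriginalOrder(filteredRecords, totalCount) {
  var aligned = [];
  for (var i = 0; i < totalCount; i++) {
    aligned.push({});
  }
  filteredRecords.forEach(function (record) {
    aligned[record._index] = record;
  });
  return aligned;
}
```

Dropping the empty objects from the result would give just the filtered records in their original order, which is what the earlier question in this thread asked for.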
Thanks for your contributions and sorry for silence on this side. As discussed in #151 an active fork is being developed in a new Crossfilter Organization. Please consider rebasing and opening your PR there (if you haven't already) where it should be warmly welcomed by the new maintainers. Cheers! |
Also reference in package.json
This PR exposes crossfilter.all, which will return all the original records in the dataset with any filters applied.