Provide a method to return the filtered data set #95

Closed
wants to merge 1 commit into from

jrideout

This PR exposes crossfilter.all which will return all the original records in the dataset with any filters applied.

@jasondavies
Collaborator

Note that you can already retrieve the filtered records using dimension.top (sorted by the dimension’s natural order). Is there a particular reason why that wouldn’t be useful in your scenario?

@jdarling

This scenario is actually related to my request to be able to retrieve only the filtered records. We are using dc.js, and thus Crossfilter, to provide a dynamic reporting environment to our end users. Once the users have filtered down their views, we want to offer them a CSV download of only the records they are looking at.

If I can get to only the filtered records, then I can extract the record IDs, send them back to the server, and have it serve up the corresponding CSV file.

Using dimension.top doesn't work in this case as the filters come from different locations. The crossfilter groupAll() method returns a proper group and the .all() method returns the correct # of records affected by all filters, but I can't find a dimension that would then return only those records.

Another solution would be to add an each operator to the groupAll() output group, allowing iteration over the affected records and thus providing a paged way of getting to each record. This might reduce the overhead, though I don't know.

If I'm wrong, please point me in the right direction :)

@jasondavies
Collaborator

> Using dimension.top doesn't work in this case as the filters come from different locations.

Can you explain what you mean by the filters coming from different locations? dimension.top(Infinity) does exactly the same as your patch, except that the records would be ordered by the dimension value (in descending order; dimension.bottom(Infinity) orders them in ascending order). In other words, all filters are applied (including any on the dimension you’re sorting by) before returning the records.
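
To make the semantics described above concrete, here is a plain-JavaScript emulation of what dimension.top(Infinity) and dimension.bottom(Infinity) return. It is only a sketch of the observable behaviour, not crossfilter's implementation; `emulateTop`, `emulateBottom`, the sample records, and the filter predicates are all made up for illustration.

```javascript
// Emulates the *result* of dimension.top(Infinity): whole records that pass
// every active filter, sorted in descending order of the dimension's value.
// (Hypothetical helper; crossfilter itself works differently internally.)
function emulateTop(records, filters, value) {
  return records
    .filter(function (r) {
      return filters.every(function (f) { return f(r); });
    })
    .sort(function (a, b) { return value(b) - value(a); });
}

// Same records in ascending order, like dimension.bottom(Infinity).
function emulateBottom(records, filters, value) {
  return emulateTop(records, filters, value).reverse();
}

// Made-up sample data.
var records = [
  { _id: 'a', state: 'MO', amount: 10 },
  { _id: 'b', state: 'KS', amount: 30 },
  { _id: 'c', state: 'TX', amount: 20 },
  { _id: 'd', state: 'MO', amount: 5 }
];

// Filters that would live on two different dimensions (state and amount).
var filters = [
  function (r) { return r.state === 'MO' || r.state === 'KS'; },
  function (r) { return r.amount >= 10; }
];

var byAmount = function (r) { return r.amount; };
var topRecords = emulateTop(records, filters, byAmount);
var bottomRecords = emulateBottom(records, filters, byAmount);

console.log(topRecords.map(function (r) { return r._id; }));    // [ 'b', 'a' ]
console.log(bottomRecords.map(function (r) { return r._id; })); // [ 'a', 'b' ]
```

Both filters are applied regardless of which value the result is sorted by, and each element is a complete record rather than just the sort value.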

@jasondavies
Collaborator

Perhaps you’re confusing the behaviour of dimension.top with groups, which exclude their associated dimension’s filter for computing reduce values.
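
The difference can be illustrated with a small plain-JavaScript emulation. This is a sketch of the behaviour only, not crossfilter's code; `emulateGroupCounts`, `emulateFilteredRecords`, the sample records, and the filter map are hypothetical.

```javascript
// Emulates group counts for one dimension. `filters` maps a dimension name to
// its active predicate; the group skips the predicate of its own dimension.
// (Hypothetical helper, not crossfilter's implementation.)
function emulateGroupCounts(records, filters, ownDimension, key) {
  var counts = {};
  // Every key appears in the result, even if its count ends up at zero.
  records.forEach(function (r) { counts[key(r)] = 0; });
  records.forEach(function (r) {
    var passes = Object.keys(filters).every(function (name) {
      return name === ownDimension || filters[name](r);
    });
    if (passes) counts[key(r)] += 1;
  });
  return counts;
}

// Emulates which records dimension.top(Infinity) returns: every filter applies.
function emulateFilteredRecords(records, filters) {
  return records.filter(function (r) {
    return Object.keys(filters).every(function (name) {
      return filters[name](r);
    });
  });
}

// Made-up sample data and filters.
var records = [
  { _id: 'a', state: 'MO', amount: 10 },
  { _id: 'b', state: 'KS', amount: 30 },
  { _id: 'c', state: 'TX', amount: 20 },
  { _id: 'd', state: 'MO', amount: 5 }
];
var filters = {
  state: function (r) { return r.state === 'MO' || r.state === 'KS'; },
  amount: function (r) { return r.amount >= 10; }
};

var stateCounts = emulateGroupCounts(records, filters, 'state', function (r) {
  return r.state;
});
var visible = emulateFilteredRecords(records, filters);

console.log(stateCounts);    // { MO: 1, KS: 1, TX: 1 }
console.log(visible.length); // 2
```

The group on state still counts the TX record because it ignores the state filter, whereas the record list excludes it.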

@jdarling

I thought the dimension only included a subset of the data associated with the root.

Using something similar to the following structure:

{
  _id: <Identity>,
  date: Date(),
  state: <state text>,
  ...
}

I.e., if I have a dimension on "date", then I would only see the date column in the dimension. So performing a top(Infinity) on that dimension wouldn't get me back the _id column, for example.

In the above I would have a groupAll to get the top. And say I had a dimension on date and another on state. Now if I filter on a state of MO || KS and a date of > 3 days ago, my groupAll count goes down to whatever...

Say my root crossfilter is called ndx; then the records in the root ndx will include everything, including the filtered-out records.

My goal would be to get only the records that match the filters (state of MO || KS, date > 3 days ago). Of course, I don't really know in advance what the user has filtered on :)

@jrideout
Author

dimension.top is my current workaround, although it seems odd to grab an arbitrary dimension and call .top(Infinity) on it just to retrieve all the filtered records. The semantic link between the operation and the output feels weak, but it certainly works.

@jasondavies
Collaborator

@jdarling Read the documentation for dimension.top (or try it for yourself); you’ll see that it returns the records, not the computed dimension values.
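
For the CSV scenario described earlier, that means the record IDs can be read straight off the result. A minimal sketch, assuming each record carries an `_id` field as in the structure above; `collectFilteredIds` is a hypothetical helper, and `ndx` and `anyDimension` in the usage comment stand for whatever crossfilter instance and dimension the application already holds.

```javascript
// Returns the _id of every record that passes all current filters.
// `dimension` is any crossfilter dimension: top(Infinity) yields whole
// records, so fields other than the dimension's own value are available.
function collectFilteredIds(dimension) {
  return dimension.top(Infinity).map(function (record) {
    return record._id;
  });
}

// Intended usage (assumes a crossfilter instance `ndx` already exists):
//   var anyDimension = ndx.dimension(function (d) { return d.date; });
//   var ids = collectFilteredIds(anyDimension);
//   // send `ids` to the server to request the matching CSV
```

The IDs come back in the dimension's descending order, which should not matter if the server only uses them to select rows.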

@jdarling

Just tried it with our live data, and of course you are correct :). I think it would be useful to have this documented explicitly though as I searched for an answer for quite some time before chasing this on GitHub. Hopefully others find the thread if they run into the same problem. Great work all over.

@pedroteixeira

Is there any way to get all filtered values while maintaining the original data order?

@wilzbach

Would it be possible to have all as just an alias for top(Infinity)?
It is a very common need and costs just one line.

@kimalbrecht

> Is there any way to get all filtered values while maintaining the original data order?

I'm also interested in that.

What I'm actually looking for is a way to return the filtered data while keeping the initial index of the dataset. So instead of removing the filtered rows, set them to blank (in CSV: ,,,,).

Any help or workaround would be appreciated.

Thanks

@robertlevy

+1

@kimalbrecht

So I found a solution, but I imagine there are much better, simpler, faster ways of doing this. Any improvements are welcome.

First I created an index for the loaded data before passing the data into crossfilter (I'm working with CSV data):

  data.forEach(function(d, i) {
    d.index = i;   // remember the original row position
    d.x = +d.x;    // coerce CSV strings to numbers
    d.y = +d.y;
    // .....
  });

Now I'm calling this function each time the crossfilter filters change:

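  function passValues() {
    // `index` is assumed to be a crossfilter dimension keyed on d.index.
    // top(Infinity) returns the filtered records in descending index order.
    var infos = index.top(Infinity);
    var len = data.length;
    var nodes = [];

    while (len--) {
      if (typeof infos[len] !== 'undefined') {
        // Walking infos backwards visits records in ascending index order,
        // so each one can be spliced in at its original position.
        nodes.splice(infos[len].index, 0, infos[len]);
      } else {
        // One blank placeholder per filtered-out record.
        nodes.push('');
      }
    }

    raw_data_function(nodes);
  }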

This leaves an empty string in the slot of each filtered-out record and puts every remaining record in its original position through the .splice method.

It works well for me (I'm dealing with 25,000 objects), but I'm sure there are better methods.

Best, Kim
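
A possible simplification of the approach above, offered as a sketch rather than a tested replacement: pre-fill an array with blanks and drop each filtered record into its original slot, which avoids the repeated splice calls. It assumes every record was given an `index` property before loading, as in the first snippet; `alignToOriginalOrder` and its parameters are hypothetical names.

```javascript
// Builds an array of length `total` in which slot i holds the record whose
// original index is i, or '' if that record is currently filtered out.
// `filtered` is the array returned by someDimension.top(Infinity).
function alignToOriginalOrder(filtered, total) {
  var nodes = [];
  for (var i = 0; i < total; i++) {
    nodes.push('');
  }
  filtered.forEach(function (record) {
    nodes[record.index] = record;
  });
  return nodes;
}

// Example with hand-written records standing in for a filtered result.
var filtered = [{ index: 3, x: 7 }, { index: 0, x: 2 }];
var nodes = alignToOriginalOrder(filtered, 5);
console.log(nodes.map(function (n) { return n === '' ? '' : n.x; }));
// [ 2, '', '', 7, '' ]
```

If only the original ordering is needed, without blank placeholders, a dimension on `index` should already provide it: index.bottom(Infinity) returns the filtered records in ascending index order.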

@RandomEtc
Collaborator

Thanks for your contributions, and sorry for the silence on this side. As discussed in #151, an active fork is being developed in a new Crossfilter Organization. Please consider rebasing and opening your PR there (if you haven't already), where it should be warmly welcomed by the new maintainers. Cheers!

@RandomEtc RandomEtc closed this Mar 14, 2016
jdar pushed a commit to jdar/crossfilter that referenced this pull request Nov 19, 2019
Also reference in package.json