predicates and filters #251

Merged
merged 8 commits into from
Mar 27, 2023
vijithassar
Owner

Implements predicates and then uses the predicates to implement filter transforms. This is logically equivalent to running specification.data.values.filter(), but it's all controlled through JSON configuration.

Don't assume all transforms will be calculate.
Rename the existing transform() function to transformDatum() to make it clearer that it operates on a single data point and clear the way for transforms that operate on an entire data set.
Generate predicate functions for use with array filters based on JSON definitions in the input specification.

This is defined in a separate module from the filter transforms because the predicates can also be used for other purposes, such as conditional encodings.
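As a rough illustration of generating a predicate from a JSON definition, here is a minimal sketch. The `createPredicate` name and the `comparisons` table are hypothetical, and the comparison keys are assumed to mirror Vega-Lite field predicates; the library's actual module may support a different set.

```javascript
// Hypothetical comparison table; the keys are assumed to mirror
// Vega-Lite field predicates and may not match the library's own list.
const comparisons = {
  equal: (value, target) => value === target,
  lt: (value, target) => value < target,
  lte: (value, target) => value <= target,
  gt: (value, target) => value > target,
  gte: (value, target) => value >= target,
  oneOf: (value, targets) => targets.includes(value),
  range: (value, [min, max]) => value >= min && value <= max
};

// Hypothetical helper: turn a JSON predicate definition into a
// function that can be passed straight to Array.prototype.filter().
const createPredicate = (definition) => {
  const key = Object.keys(comparisons).find((name) => name in definition);
  if (!key) {
    throw new Error(`unsupported predicate: ${JSON.stringify(definition)}`);
  }
  return (datum) => comparisons[key](datum[definition.field], definition[key]);
};

const specification = {
  data: { values: [{ a: 1 }, { a: 5 }, { a: 9 }] },
  transform: [{ filter: { field: 'a', gt: 3 } }]
};

const predicate = createPredicate(specification.transform[0].filter);
const filtered = specification.data.values.filter(predicate);
```

Because the predicate is a plain function of a datum, the same helper could also back other features such as conditional encodings.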
Filter out data points from the data set based on predicate definitions from the specification.
Despite the new wrapper layer in the data() function, it's not possible to run the transforms there. Transforms operate on the raw data, whereas by the end of the data() function the specification data has already been converted by layout generators. Since those layout generators use static keys, their output can't be mapped back to the source fields in which a transform is defined.

The alternative is to insert transformValues() into all uses of the values() function. This injects a lot of complexity into what was previously a simple lookup, but two things should help keep it fast and lightweight: memoizing the values() helper, and bypassing transformValues() entirely with the identity function when there are no filter transforms to apply.
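The shape of that approach might look something like the sketch below. The function names `values()` and `transformValues()` come from the description above, but their bodies here are assumptions, as are the `memoize` and `predicate` stand-ins; the real helpers are not shown in this pull request description.

```javascript
const identity = (x) => x;

// Assumed memoizer: caches by object reference, one entry per specification.
const memoize = (fn) => {
  const cache = new WeakMap();
  return (key) => {
    if (!cache.has(key)) {
      cache.set(key, fn(key));
    }
    return cache.get(key);
  };
};

// Stand-in predicate builder supporting only "gt" and "equal".
const predicate = ({ field, equal, gt }) => (datum) =>
  gt !== undefined ? datum[field] > gt : datum[field] === equal;

// Returns a function from raw values to filtered values. With no filter
// transforms in the specification, this is just the identity function.
const transformValues = (s) => {
  const filters = (s.transform || [])
    .filter((transform) => transform.filter)
    .map((transform) => predicate(transform.filter));
  if (filters.length === 0) {
    return identity;
  }
  return (data) => data.filter((datum) => filters.every((test) => test(datum)));
};

// Memoized lookup, so the filtering cost is paid once per specification.
const values = memoize((s) => transformValues(s)(s.data.values));
```

With this shape, a specification without filters gets back the very same array it supplied, and repeated calls for a filtered specification return the same cached array.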
Apply calculate transforms before running filter transforms, so the fields created by the calculate transform can be used as targets for the predicate functions in the filter.
…points

The previous implementation of this mapped over the input data with transformDatum() before running the filter. This has the unfortunate effect of effectively mutating the data set: the calculate transform results are spread onto a new object when the transform function returns, so every data point is replaced by a new reference. That matters because memoization is applied by reference throughout the library, and datum references are used to look up series values from the category helper.

To get similar functionality without mutation, we can change the functions instead of changing the data. Instead of mapping over the data points, map over the predicate functions and replace each with a wrapper that runs the original predicate against the original datum, then runs it a second time against an altered copy of the datum created by the transformDatum() function.

This is logically comparable to the way encodingValue() falls back to checking against the output of the calculate transform if no matching fields are found on the datum object.
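A minimal sketch of the wrapper idea follows. It makes several assumptions: the `transformDatum()` shown here is a simplified stand-in that takes calculations as plain functions rather than expression strings, `withCalculations` is a hypothetical name, and the two predicate results are combined with a logical OR to mirror the fallback behavior described for encodingValue().

```javascript
// Simplified stand-in for transformDatum(): returns a NEW object carrying
// the calculated fields and leaves the original datum untouched.
const transformDatum = (calculations) => (datum) => {
  const calculated = {};
  for (const { as, calculate } of calculations) {
    calculated[as] = calculate(datum);
  }
  return { ...datum, ...calculated };
};

// Hypothetical wrapper: try the predicate on the original datum first,
// then fall back to a transformed copy. Combining with OR is an assumption.
const withCalculations = (predicate, calculations) => {
  const transform = transformDatum(calculations);
  return (datum) => predicate(datum) || predicate(transform(datum));
};

const data = [{ a: 1 }, { a: 4 }];
const calculations = [{ as: 'doubled', calculate: (datum) => datum.a * 2 }];

// This predicate targets a field that only exists after the calculate step.
const isLarge = (datum) => datum.doubled > 5;

const result = data.filter(withCalculations(isLarge, calculations));
```

Since only the predicate is wrapped, the objects that survive the filter are the original data points, so lookups and memoization keyed on datum references keep working.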