Add dataset split and percentage selection for local explanations, update KernelShap sampling to use fractions #313
Merged
…ion and improve parameter descriptions
…ction functionality
cristian-tamblay
approved these changes
Oct 2, 2025
This pull request introduces support for selecting a dataset split (train, test, validation, or all) and a percentage of that split to use when generating local explanations in the DashAI platform. This allows users to control which subset of the data is used for explainability, improving flexibility and efficiency, especially for large datasets. The changes span both the backend and frontend, including schema updates, API modifications, UI enhancements, and logic for handling splits and percentages in the explanation process.
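To illustrate the split and percentage selection described above, here is a minimal sketch in plain Python. It is not the DashAI implementation: the function name `select_scope_rows`, the `scope` shape (`{"split": ..., "percentage": ...}`), the `seed` parameter, and the rounding rule are all assumptions made for illustration.

```python
import math
import random

# Assumed set of allowed split names, taken from the PR description.
VALID_SPLITS = ("train", "test", "validation", "all")


def select_scope_rows(splits, scope, seed=0):
    """Pick a percentage of rows from one split (hypothetical helper).

    splits: dict mapping "train"/"test"/"validation" to a list of rows.
    scope:  dict such as {"split": "test", "percentage": 25} (assumed shape).
    """
    split = scope.get("split")
    percentage = scope.get("percentage")

    # Validate the scope before touching any data.
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    if not isinstance(percentage, (int, float)) or not 0 < percentage <= 100:
        raise ValueError("percentage must be a number in (0, 100]")

    # "all" concatenates every split; otherwise use the named split only.
    if split == "all":
        rows = [
            row
            for name in ("train", "test", "validation")
            for row in splits.get(name, [])
        ]
    else:
        rows = list(splits.get(split, []))

    if not rows:
        return []

    # Round down, but keep at least one row so the explainer has input.
    n_selected = max(1, math.floor(len(rows) * percentage / 100))

    # Seeded sampling without replacement; original row order is preserved.
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(len(rows)), n_selected))
    return [rows[i] for i in chosen]
```

With a 40-row test split, a scope of `{"split": "test", "percentage": 25}` selects 10 rows under these assumptions. A row count like this is what a split selector could display to the user before the explainer runs.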
Backend changes:
- Added a `scope` field (as a JSON object) to the local explainer schema, database model, API endpoint, and job logic to specify which split and percentage of the dataset to use. [1] [2] [3]
- Added handling of `scope`, including validation and sample selection. [1] [2] [3]

Frontend changes:
- Added a `SplitSelector` UI component that allows users to choose the dataset split and percentage, displaying the number of rows selected. [1] [2]
- Sends the `scope` field when creating a local explainer. [1] [2]

KernelShap explainer improvements:
- Changed background sampling from a fixed number of samples (`n_background_samples`) to a fraction (`background_fraction`), making the sampling proportional to the dataset size and more intuitive for users. [1] [2] [3] [4] [5] [6] [7] [8]
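The move from a fixed background sample count to a fraction can be sketched as follows. This is an illustrative sketch only, not the explainer's actual code: the helper name `sample_background`, the `seed` parameter, and the rounding and minimum-of-one rules are assumptions. Only the parameter name `background_fraction` comes from the PR.

```python
import random


def sample_background(rows, background_fraction, seed=0):
    """Draw a background sample sized as a fraction of the data (hypothetical helper)."""
    # Assumed valid range: a fraction greater than 0 and at most 1.
    if not 0 < background_fraction <= 1:
        raise ValueError("background_fraction must be in (0, 1]")
    if not rows:
        return []

    # The sample size scales with the dataset instead of being a fixed count.
    n_samples = max(1, round(len(rows) * background_fraction))

    # Seeded sampling without replacement for reproducibility.
    rng = random.Random(seed)
    return rng.sample(rows, n_samples)
```

With `background_fraction=0.1`, a 1,000-row dataset yields 100 background rows and a 50-row dataset yields 5, so the same setting stays sensible across dataset sizes.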