Add dataset split and percentage selection for local explanations, update KernelShap sampling to use fractions #313

Merged: cristian-tamblay merged 7 commits into develop from fix/explainability-module on Oct 2, 2025
@Irozuku (Collaborator) commented on Sep 29, 2025

This pull request introduces support for selecting a dataset split (train, test, validation, or all) and a percentage of that split to use when generating local explanations in the DashAI platform. This allows users to control which subset of the data is used for explainability, improving flexibility and efficiency, especially for large datasets. The changes span both the backend and frontend, including schema updates, API modifications, UI enhancements, and logic for handling splits and percentages in the explanation process.

Backend changes:

  • Added a new scope field (as a JSON object) to the local explainer schema, database model, API endpoint, and job logic to specify which split and percentage of the dataset to use.
  • Updated the local explainer job logic to use the specified split and percentage from scope, including validation and sample selection.
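
For illustration, the validation and sample selection described above could look roughly like the sketch below. This is not the code from this PR: the key names `split` and `percentage`, the helper `select_scope_rows`, and the choice of seeded random sampling are all assumptions, since the description only states that scope is a JSON object carrying a split and a percentage.

```python
import math
import random

# Split names taken from the PR description; everything else is hypothetical.
VALID_SPLITS = ("train", "test", "validation", "all")


def select_scope_rows(split_sizes, scope, seed=0):
    """Pick row indices for a hypothetical scope dict.

    split_sizes maps each split name to its row count, e.g.
    {"train": 80, "test": 10, "validation": 10}.
    scope is assumed to look like {"split": "test", "percentage": 50}.
    Indices are relative to the chosen split (or to the whole dataset
    when the split is "all").
    """
    split = scope.get("split", "all")
    percentage = scope.get("percentage", 100)

    # Validate the scope before touching any data.
    if split not in VALID_SPLITS:
        raise ValueError(f"Unknown split: {split!r}")
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in the range (0, 100]")

    n_total = sum(split_sizes.values()) if split == "all" else split_sizes[split]
    if n_total == 0:
        raise ValueError(f"Split {split!r} has no rows")

    # Round up so that a small percentage still selects at least one row.
    n_selected = max(1, math.ceil(n_total * percentage / 100))

    rng = random.Random(seed)  # seeded so the selection is reproducible
    return sorted(rng.sample(range(n_total), n_selected))


sizes = {"train": 80, "test": 10, "validation": 10}
rows = select_scope_rows(sizes, {"split": "test", "percentage": 50})
print(len(rows))  # number of rows selected from the test split
```

Rounding up is one possible design choice here; the actual job logic may truncate or take the first rows of the split instead.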

Frontend changes:

  • Added a SplitSelector UI component that allows users to choose the dataset split and percentage, displaying the number of rows selected.
  • Integrated the split/percentage selection into the local explainer creation flow, including default values and state management.
  • Updated the API calls and modal logic to send the scope field when creating a local explainer.

KernelShap explainer improvements:

  • Changed background data sampling parameters from an absolute number (n_background_samples) to a fraction (background_fraction), making the sampling proportional to the dataset size and more intuitive for users.
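
As a rough sketch of what fraction-based background sampling means in practice, the snippet below converts a fraction into a row count and draws that many rows. Only the parameter name `background_fraction` comes from this PR; the helper `sample_background`, the rounding rule, and the one-row minimum are assumptions made for illustration.

```python
import random


def sample_background(rows, background_fraction, seed=0):
    """Sample a fraction of rows to use as KernelSHAP background data.

    rows is any list of feature vectors. background_fraction replaces an
    absolute count such as n_background_samples, so the sample grows and
    shrinks with the dataset.
    """
    if not 0.0 < background_fraction <= 1.0:
        raise ValueError("background_fraction must be in the range (0, 1]")
    if not rows:
        raise ValueError("cannot sample background data from an empty dataset")

    # Convert the fraction to a row count, keeping at least one row
    # and never asking for more rows than exist.
    n_background = max(1, round(len(rows) * background_fraction))
    n_background = min(n_background, len(rows))

    rng = random.Random(seed)  # seeded so the background set is reproducible
    return rng.sample(rows, n_background)


data = [[i, i * 2] for i in range(200)]
background = sample_background(data, background_fraction=0.1)
print(len(background))  # 10% of 200 rows
```

The sampled rows would then be handed to a KernelSHAP implementation as its background set, for example as the data argument of `shap.KernelExplainer`, assuming that library is the one in use.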

@cristian-tamblay cristian-tamblay merged commit d9f25ea into develop Oct 2, 2025
5 checks passed
@cristian-tamblay cristian-tamblay deleted the fix/explainability-module branch October 2, 2025 14:55