Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Fixes incorrect feature importance visualization for Data Frame Analytics classification #150816

Merged
merged 7 commits into from
Feb 14, 2023

Conversation

qn895
Copy link
Member

@qn895 qn895 commented Feb 9, 2023

Summary

This PR fixes #146122 where the decision graph could briefly exceed 1.0 in for prediction probability. The issue was with the 'true' or 'false' string values not matching correctly with title-cased 'True'/'False'.

Before:

image

After:

image

Checklist

@qn895 qn895 requested a review from a team as a code owner February 9, 2023 21:52
@qn895 qn895 self-assigned this Feb 9, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@@ -42,12 +43,20 @@ export const ClassificationDecisionPath: FC<ClassificationDecisionPathProps> = (
const [currentClass, setCurrentClass] = useState<string>(
getStringBasedClassName(topClasses[0].class_name)
);
const selectedClass = topClasses.find(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of the fix refers to True and False base classes, but is this change also the cause of the change in behavior here with the bank_classification_1 MLQA bootstrap job where before I was seeing values greater than 1 for yes and no classes:

Before:

image

With this fix:

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS, the fix makes total sense here: for the second row, the model predicts the class "no" with the probability of 0.986. This means that the prediction probability of "yes" is 0.014. This is precisely what the screenshot above shows now 🚀

The screenshot before this fix shows that the decision graph briefly exceeds 1.0 in for prediction probability, which is nonsense.

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and LGTM.

When testing this, looks like there is another unrelated issue where the popover only populates on page 1 of the results. On any other page it is empty for me:

image

@qn895
Copy link
Member Author

qn895 commented Feb 10, 2023

When testing this, looks like there is another unrelated issue where the popover only populates on page 1 of the results. On any other page it is empty for me:

Thanks for catching that. Looks like this was a regression that was introduced in #125023. I've added a fix as well as functional tests to check that the popover shows up correctly after table pagination is updated.

@qn895 qn895 requested a review from pheyos February 10, 2023 18:07
@qn895
Copy link
Member Author

qn895 commented Feb 13, 2023

@elasticmachine merge upstream

@qn895 qn895 mentioned this pull request Feb 13, 2023
14 tasks
@qn895
Copy link
Member Author

qn895 commented Feb 13, 2023

@elasticmachine merge upstream

Copy link
Contributor

@alvarezmelissa87 alvarezmelissa87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ⚡

@qn895
Copy link
Member Author

qn895 commented Feb 14, 2023

@elasticmachine merge upstream

@qn895 qn895 enabled auto-merge (squash) February 14, 2023 02:35
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
ml 3.5MB 3.5MB +130.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @qn895

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 14, 2023
…Analytics classification (elastic#150816)

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit c2476d2)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.7

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 14, 2023
…Frame Analytics classification (#150816) (#151094)

# Backport

This will backport the following commits from `main` to `8.7`:
- [[ML] Fixes incorrect feature importance visualization for Data Frame
Analytics classification
(#150816)](#150816)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Quynh Nguyen
(Quinn)","email":"43350163+qn895@users.noreply.github.com"},"sourceCommit":{"committedDate":"2023-02-14T02:43:46Z","message":"[ML]
Fixes incorrect feature importance visualization for Data Frame
Analytics classification (#150816)\n\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c2476d240e5a5a979af215057bb7f2bd40b9f6fe","branchLabelMapping":{"^v8.8.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix",":ml","Feature:Data
Frame
Analytics","v8.7.0","v8.8.0"],"number":150816,"url":"#150816
Fixes incorrect feature importance visualization for Data Frame
Analytics classification (#150816)\n\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c2476d240e5a5a979af215057bb7f2bd40b9f6fe"}},"sourceBranch":"main","suggestedTargetBranches":["8.7"],"targetPullRequestStates":[{"branch":"8.7","label":"v8.7.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.8.0","labelRegex":"^v8.8.0$","isSourceBranch":true,"state":"MERGED","url":"#150816
Fixes incorrect feature importance visualization for Data Frame
Analytics classification (#150816)\n\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c2476d240e5a5a979af215057bb7f2bd40b9f6fe"}}]}]
BACKPORT-->

Co-authored-by: Quynh Nguyen (Quinn) <43350163+qn895@users.noreply.github.com>
@qn895 qn895 deleted the dv-fix-classification-path branch February 15, 2023 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] Incorrect feature importance visualization for classification
7 participants