optimizing naive_bayes_classifier #1223

afinit · 2023-07-06T22:21:01Z

Describe your changes

This PR optimizes naive_bayes_classifier. I think the code is a little more readable as well.

The part of this function that takes the most time is the calculation of joint probabilities, so I focused on changes there. After several iterations, I settled on the approach here, where we grab the indices for the PDFs outside the loops. We now do this once per function call instead of once for every class, which allows us to use numpy index slicing and array multiplication.
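To illustrate the idea, here is a minimal sketch and not the code in this PR. It assumes each class has one 256-element PDF per HSV channel, stored in a hypothetical `pdfs` dictionary keyed by class name and then by `"hue"`, `"saturation"`, and `"value"`. The helper names `joint_probabilities` and `joint_probabilities_loop` are made up for this example. The channel arrays are pulled out once and then used as integer indices into each PDF, so the per-pixel loop is replaced by array lookups and element-wise multiplication.

```python
import numpy as np


def joint_probabilities(hsv, pdfs):
    """Vectorized sketch: per-class joint probability for every pixel.

    hsv  : (rows, cols, 3) uint8 array (hypothetical input layout)
    pdfs : {class_name: {"hue": seq, "saturation": seq, "value": seq}},
           each sequence holding 256 probabilities (assumed layout)
    """
    # Split the channels once per call; these arrays serve as lookup indices.
    h, s, v = hsv[:, :, 0], hsv[:, :, 1], hsv[:, :, 2]
    px = {}
    for class_name, channels in pdfs.items():
        hue = np.asarray(channels["hue"], dtype=float)
        sat = np.asarray(channels["saturation"], dtype=float)
        val = np.asarray(channels["value"], dtype=float)
        # Integer-array indexing looks up the PDF value for every pixel at once.
        px[class_name] = hue[h] * sat[s] * val[v]
    return px


def joint_probabilities_loop(hsv, pdfs):
    """Per-pixel reference version, kept only to check the sketch above."""
    rows, cols = hsv.shape[:2]
    px = {name: np.zeros((rows, cols)) for name in pdfs}
    for name, channels in pdfs.items():
        for i in range(rows):
            for j in range(cols):
                h, s, v = hsv[i, j]
                px[name][i, j] = (channels["hue"][h]
                                  * channels["saturation"][s]
                                  * channels["value"][v])
    return px
```

In this sketch the only remaining Python loop runs over classes, so the per-pixel work is done inside numpy.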

I did all the testing using the Naive Bayes Tutorial notebook in the plantcv-binder repo.
After getting all of the changes in, I ran the original code 30 times and the new code 30 times and recorded the runtimes in milliseconds. The screenshot, taken from that notebook, shows a speedup of about 4x.
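A comparison like this could be set up along the following lines. This is a hedged sketch and not the notebook code: the helper names `time_runs_ms` and `speedup` are hypothetical, and summarizing with the median is an assumption, since the description does not say how the 30 runtimes were aggregated.

```python
import statistics
import timeit


def time_runs_ms(func, repeats=30):
    """Call func `repeats` times and return each runtime in milliseconds."""
    # number=1 times a single call per repeat, giving one sample per run.
    seconds = timeit.repeat(func, repeat=repeats, number=1)
    return [s * 1000.0 for s in seconds]


def speedup(old_ms, new_ms):
    """Ratio of median runtimes (assumed summary statistic)."""
    return statistics.median(old_ms) / statistics.median(new_ms)
```

Each implementation would be wrapped in a zero-argument callable (for example a `lambda` that calls it on the tutorial image) and passed to `time_runs_ms`.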

This example contains 4 classes, so I imagine the size of the speedup depends on the number of classes.

Type of update
feature enhancement

Associated issues
#1158

For the reviewer
See this page for instructions on how to review the pull request.

PR functionality reviewed in a Jupyter Notebook
All tests pass
Test coverage remains 100%
Documentation tested
New documentation pages added to plantcv/mkdocs.yml
Changes to function input/output signatures added to updating.md
Code reviewed
PR approved

plantcv/plantcv/naive_bayes_classifier.py

codecov · 2023-07-07T17:35:09Z

Codecov Report

Merging #1223 (5cad126) into release-4.0 (17d8c37) will not change coverage.
The diff coverage is 100.00%.

@@              Coverage Diff              @@
##           release-4.0     #1223   +/-   ##
=============================================
  Coverage       100.00%   100.00%           
=============================================
  Files              161       161           
  Lines             6957      6952    -5     
=============================================
- Hits              6957      6952    -5

Flag	Coverage Δ
unittests	`100.00% <100.00%> (ø)`

Impacted Files	Coverage Δ
plantcv/plantcv/naive_bayes_classifier.py	`100.00% <100.00%> (ø)`

optimizing naive_bayes_classifier

9bde078

afinit added the ready to review label Jul 6, 2023

afinit requested review from nfahlgren and HaleySchuhl July 6, 2023 22:21

afinit commented Jul 6, 2023

plantcv/plantcv/naive_bayes_classifier.py (outdated, resolved)

Merge branch 'release-4.0' into release-4.0-speedup-naive-bayes

987d584

nfahlgren self-assigned this Jul 11, 2023

Merge branch 'release-4.0' into release-4.0-speedup-naive-bayes

faae07d

nfahlgren reviewed Jul 11, 2023

plantcv/plantcv/naive_bayes_classifier.py (outdated, resolved)

nfahlgren added 2 commits July 11, 2023 16:39

Linearize arrays

617105c

Vectorize class masking

5cad126

nfahlgren removed their assignment Jul 11, 2023

nfahlgren added this to Pull Requests in PlantCV4 via automation Jul 13, 2023

nfahlgren added this to the PlantCV v4.0 milestone Jul 13, 2023

nfahlgren approved these changes Jul 13, 2023

nfahlgren merged commit 08bfb96 into release-4.0 Jul 13, 2023
5 of 6 checks passed

PlantCV4 automation moved this from Pull Requests to Done Jul 13, 2023

nfahlgren deleted the release-4.0-speedup-naive-bayes branch July 13, 2023 01:35
