Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switch plotting code to makie #29

Merged
merged 41 commits into from
May 20, 2021
Merged

switch plotting code to makie #29

merged 41 commits into from
May 20, 2021

Conversation

SimonDanisch
Copy link
Contributor

@SimonDanisch SimonDanisch commented May 10, 2021

  • port all recipes & tests
  • documentation
  • percy visual regression tests
  • improved themeabiltity will do this in another PR

Visual changes from this PR (first image -> this PR, second image -> master)

All together 1

all_together_makie
all_together_plots

All together 2

all_together_2_makie
all_together_2_plots

Confusion matrix :Column

confusion_col_makie
confusion_col_plots

Confusion matrix :Row

confusion_row_makie
confusion_row_plots

Kappas IRA

kappas_ira_makie
kappas_ira_plots

Kappas no IRA

kappas_no_ira_makie
kappas_no_ira_plots

PR Curves

pr_makie
pr_plots

PRG curves

prg_makie
prg_plots

Reliability calibration

reliability_calibration_makie
reliability_calibration_plots

ROC Curves

roc_makie
roc_plots

@SimonDanisch
Copy link
Contributor Author

Alright, a preview of the docs for the plotting functions is up at:
https://beacon-biosignals.github.io/Lighthouse.jl/previews/PR29/plotting/

@hannahilea, maybe you can push to this PR to fill in some more details about what those plotting functions do (if desired^^) ? Those details should go into the docstrings here:

plot_confusion_matrix(args...; kw...) = axisplot(plot_confusion_matrix!, args; kw...)

src/plotting.jl Outdated Show resolved Hide resolved
@hannahilea
Copy link
Contributor

@hannahilea, maybe you can push to this PR to fill in some more details about what those plotting functions do (if desired^^)

Let's not block this PR on docs, since we already know those are not in great shape in this repo and adding them isn't top priority (yet).

@SimonDanisch
Copy link
Contributor Author

I tried a few things to inline the plots somewhere, but it wasn't that easy without uploading the images anywhere...
So I guess for the comparison of Plots.jl + Makie.jl plots, one has to download this HTML file:
https://gist.github.com/SimonDanisch/1eec17096c9bef5637b103761c29edd1#file-plots_vs_makie-html

@ericphanson
Copy link
Member

that comparison looks good! My thoughts:

  • The text in the Makie ones seems a lot more legible, with the white text when the color is dark, and also on the bars in the barplots in a much more readable position.
  • I think the legend in all_together_2 in the Makie would might work better as having 2 or 3 columns instead of 1, but that's kind of a minor thing (and it's definitely better than the weird empty axis in the Plots one!).
  • most of the lines in the plots between the Plots vs Makie seem the same, but there are some differences in the numbers in the legends, like the AUC values in many of the plots, and in kappas_no_ira the "Multiclass" one is much larger for Makie (0.047 vs 0.011). Is that expected or is there a bug?

@SimonDanisch
Copy link
Contributor Author

SimonDanisch commented May 18, 2021

Hm, I let this run multiple times and got pretty different values every time, which is indeed surprising considering that the plots do look pretty much like a 1:1 match.
I also experimented with using StableRNGs to no avail.
This is the output after running ]test 2 times:
all_together_2
all_together_2

This will of course also make percy tests hard... @hannahilea do you think we are missing resetting an RNG somewhere, or is the training and metrics output hard to make fully deterministic?

@hannahilea
Copy link
Contributor

@hannahilea do you think we are missing resetting an RNG somewhere

Yep! We use the global rng in majority(...), which means we should reseed the global rng at the start of testing:

Random.seed!(0)

@hannahilea hannahilea self-requested a review May 18, 2021 14:14
@SimonDanisch
Copy link
Contributor Author

I checked that one explicitly and it looks like we use a local RNG in the tests:
https://github.com/beacon-biosignals/Lighthouse.jl/blob/sd/makie/test/learn.jl#L93

@hannahilea
Copy link
Contributor

majority(...) is called by the default elected param in learn(...), so tests that call learn(...) without passing in a non-default elected arg end up using that default rng. Looks like that is happening at least here [1], hence needing the global rng reseeded.

[1]

Lighthouse.learn!(model, logger, () -> train_batches, () -> test_batches, votes;

@SimonDanisch
Copy link
Contributor Author

Oh ha, of course you're right :) I thought I looked at all call sites of majority and made sure RNGs get passed through...


deploydocs(repo="github.com/beacon-biosignals/Lighthouse.jl.git",
devbranch="main")
devbranch="main", push_preview=true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay :)

@hannahilea
Copy link
Contributor

It's an easy one to miss...I was thrown by some nondeterminism in the LighthouseFlux tests that was due to this callsite, so I'm well familiar with it 😅

@hannahilea
Copy link
Contributor

hannahilea commented May 18, 2021

Ahhhh Simon this is looking so good! :D

Looking at the side-by-side plots, here are a few things I think need tweaking before we're good to go:

  • kappas_ira.png I think the plot classes are inverted---the labels vs the colors are switched, so previously the grey bars were the expert-expert and now they are the algorithm-vs-expert.
  • kappas_ira.png Could we make sure that the expert-expert bars are above the expert-algorithm bars (as in the original version)? This is intentional, to match the data formatting of publications that we want to easily compare against.
  • kappas_ira.png The title should be the same as the old title
  • kappas_ira.png Looks like we've lost the grid/minor x ticks; probably want those back pending intentional design choices around internal styling.
  • reliability_calibration.png Would it be easy to make the scatter plot point outlines be the color of the line rather than black? The black is a little distracting.
  • all_together.png/all_together_2.png The titles of some of the component plots are centered rather than left-justified; they were originally left-justified to make it easier to differentiate from x-axis labels.
  • Missing plot: binary_discrimination_calibration_curves Looks like this one just doesn't exist as a solo comparison plot or in the all_together.png plots

Also, style improvements that can absolutely wait until future PRs, unless they're easy and you want to:

  • all individual plots (ROC, PR, calibration, confusion) would ideally be square

Also, if at all possible, would be nice to have the side-by-side comparison images in this PR directly, rather than a link-out, so that we don't lose the history if you delete your gist at some point, and so that users can see the changes in this PR to know what they're getting when they bump versions. Makes sense to wait until it's the final version, though, if you prefer!

@hannahilea
Copy link
Contributor

rng doesn't broadcast on 1.3

So I think if you wanted you could instead bump the required Julia version to 1.6. We're already going to be bumping the Lighthouse minor version, and I don't think any internal Beacon users are using <1.6 at this point.... Your call!

@SimonDanisch
Copy link
Contributor Author

  • The missing decorations in kappas were intentional to remove noise, considering that we already have the labels. I added them back for now, but didn't update the x ticks yet (so its still just 0 + 1).

I will try to figure out how to best upload a report here, so its easier to view... Its somehow surprisingly hard to just upload an html file somewhere and be able to directly look at it.

test/learn.jl Outdated Show resolved Hide resolved
Co-authored-by: Hannah Robertson <hannahilea@users.noreply.github.com>
@SimonDanisch
Copy link
Contributor Author

Alright, new comparison: https://serene-jang-51d2d2.netlify.app/
Note, sharpness comparisons aren't very useful with this, since those are PNGs and they're not shown at their native resolution...
Before we merge, I can collect the comparison into the PR description, so that one can see the differences right away ;)

src/plotting.jl Outdated Show resolved Hide resolved
Copy link
Contributor

@hannahilea hannahilea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Barring the one tiny suggested change, this looks good to go!!

Before we merge, I can collect the comparison into the PR description, so that one can see the differences right away ;)

Sounds good to me!

Copy link
Member

@ericphanson ericphanson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

@SimonDanisch SimonDanisch merged commit afffc4b into main May 20, 2021
@SimonDanisch SimonDanisch deleted the sd/makie branch May 20, 2021 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants