Missing NonlinearENA #9

Open
snotskie opened this issue Aug 4, 2023 · 3 comments
Labels: consensus needed, enhancement
Milestone: 0.9.9

snotskie (Owner) commented Aug 4, 2023

NonlinearENA has been cut since 0.1.0 to give me time to refactor it to match the 0.5.0 codebase, and to put in the much-needed graphic design work to make the nonlinear plots actually look good.

snotskie added the "enhancement" label on Aug 4, 2023
snotskie added this to the 0.9.9 milestone on Sep 14, 2023
snotskie added the "consensus needed" label on Oct 8, 2023
snotskie (Owner, Author) commented Oct 8, 2023

one way I've imagined this: nonlinear is a layer that adds (1) some metadata, (2) some plots, and (3) some tests. mechanically, it isn't a replacement for a linear rotation, just an intermediate layer for doing other tasks, i.e., a "view" of an underlying linear model

it could look something like this:

# Not implemented
model = ENAModel(...)
clustered = NonlinearClusterView(
    model,
    reduceBy=X -> UMAP_(X, ...),
    reduceDimsIn=3, # use first three dims from the model, ignore the rest
    reduceExtraPredictors=[:Day, :Grade => 0.8], # but also include some extra predictors, with optional weighting
    clusterBy=X -> hdbscan(X, ...),
    clusterKey=:ClusterGroup,
    clusterValue=i -> "Group $i"
)
plot(clustered) # will include nonlinear plot in addition to all plots model does
rotated_on_clusters = ENAModel(clustered, rotateBy=MulticlassRotation(:ClusterGroup))
plot(rotated_on_clusters) # will not include nonlinear plot

But I'm not too happy with that format yet

In any case, my thinking is that a "view":

  • is a well-defined pipeline
  • is for a well-defined set of tasks
  • creates new metadata (the "view" of the model)
    • which is then used in the new plots and tests
      • which are in turn used to accomplish those downstream well-defined tasks
  • supports any operation that can be performed on a model
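
One way to read the last bullet is as plain method forwarding through a shared abstract type. This is only a minimal sketch under assumptions: `AbstractToyModel`, `ToyModel`, `ToyClusterView`, `metadata`, and `nrows` are hypothetical stand-ins made up for illustration, not names from the package.

```julia
# Hypothetical stand-ins for illustration only, not the package's real types.
abstract type AbstractToyModel end

struct ToyModel <: AbstractToyModel
    metadata::Dict{Symbol,Vector}
end

# A "view" wraps a model and layers extra metadata on top of it.
struct ToyClusterView{M<:AbstractToyModel} <: AbstractToyModel
    parent::M
    extra::Dict{Symbol,Vector}
end

# The model answers `metadata` directly...
metadata(m::ToyModel) = m.metadata
# ...and a view answers it by merging its own columns over its parent's.
metadata(v::ToyClusterView) = merge(metadata(v.parent), v.extra)

# Any function written against the abstract type works on both.
nrows(m::AbstractToyModel) = length(first(values(metadata(m))))

model = ToyModel(Dict{Symbol,Vector}(:Day => [1, 2, 3]))
clustered = ToyClusterView(
    model,
    Dict{Symbol,Vector}(:ClusterGroup => ["Group 1", "Group 2", "Group 1"]),
)
```

Here `nrows(clustered)` goes through the same code path as `nrows(model)`, while `metadata(clustered)` exposes the extra `:ClusterGroup` column that a downstream rotation could then group on.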

snotskie (Owner, Author) commented Oct 8, 2023

worth noting that this looks a lot like a job for https://github.com/alan-turing-institute/MLJ.jl, apart from the particular plotting that different views might require and the wild west that MLJ would create

snotskie (Owner, Author) commented Oct 9, 2023

after talking with folks at ICQE23, a simpler syntax would be preferred, e.g.:

pc = pointcloud(model, dims=1:4, format=:wide, metadata=[:Pretest, :Posttest], znorm=true)
um = UMAP_(pc.X, ...)
hdb = hdbscan(transform(um, pc.X), ...)
nl = NonlinearENAModel(model, pc, um, hdb)
plot(nl) # just the nl plot, grouped by :ClusterNumber
display(nl) # all the nl-specific config and goodness tests, e.g. silhouettes
mc = ENAModel(nl, rotateBy=MulticlassRotation(:ClusterNumber))
plot(mc) # just linear plots as-is

this looks a lot more like the Python approach, with Julia type wrapping to manage ENA integration
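
The "type wrapping" part could be as small as a struct that bundles the separately computed pieces and customizes how it prints. A rough sketch, assuming a hypothetical `ToyNonlinearModel` whose field names and placeholder values are invented here and do not reflect the package or any real UMAP/HDBSCAN result types:

```julia
# Hypothetical wrapper; field names and types are illustrative only.
struct ToyNonlinearModel{M,P,R,C}
    model::M        # the underlying linear model
    pointcloud::P   # matrix that was fed to the reducer
    reducer::R      # fitted dimension-reduction result
    clusters::C     # one cluster assignment per unit
end

# `display(nl)` falls back to `show`, so the summary text lives here.
function Base.show(io::IO, nl::ToyNonlinearModel)
    k = length(unique(nl.clusters))
    print(io, "ToyNonlinearModel with ", length(nl.clusters),
          " units in ", k, " clusters")
end

# Placeholder values stand in for the real model, reducer, and clustering.
nl = ToyNonlinearModel("linear model placeholder", rand(4, 3), nothing, [1, 2, 1, 2])
summary_text = sprint(show, nl)
```

Each step stays an ordinary function call the user runs themselves; the wrapper only records the results so that plotting, display, and a later rotation can find them in one place.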
