Baseline Required -- Information Preserved by Graph #27

Open
Micky774 opened this issue Sep 14, 2021 · 2 comments

Comments

@Micky774
Collaborator

One question worth asking is whether, in the current implementation, sufficient information is preserved through the construction of the graphs. That is to say, does the process of going from image to graph lose important information? One way to check is to test the potency of baseline models on both the raw images and the computed graphs. I propose that we choose some basic algorithms (e.g. SVM+kernel, Random Forest, MLP) and compare their performance on features derived both pre- and post-graph. Post-graph, these features can be as simple as the collection of means/variances, or more dynamic, such as features extracted from an out-of-the-box RNN/LSTM. The pre-graph features can be extracted via a simple convolutional network over the raw images, fully connected at the end so that it outputs a feature vector of the same length as the post-graph features.

The critical component here is controlling for complexity. Neither of these processing steps ought to be too complex, and both should be of similar complexity, so that the experiment doesn't become a test of the processing methods rather than of the image-to-graph step itself.
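As a rough sketch of what the comparison could look like (the feature matrices `X_pre`, `X_post` and labels `y` are just placeholders for however we end up extracting and storing the two feature sets):

```python
# Sketch: run the same baseline classifiers on pre-graph (CNN-derived)
# and post-graph (graph-derived) feature vectors and compare CV accuracy.
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

baselines = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "mlp": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)),
}

def score_feature_set(name, X, y):
    # 5-fold cross-validated accuracy for each baseline on one feature set
    for model_name, model in baselines.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name} / {model_name}: {scores.mean():.3f} +/- {scores.std():.3f}")

# score_feature_set("pre-graph (CNN features)", X_pre, y)
# score_feature_set("post-graph (graph features)", X_post, y)
```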

@rmattson1008
Contributor

Related to this, I would be curious to see whether the graphs are actually capturing any regional information that is more useful than local information. Hyper-fusion, fragmentation, and a normal cell cycle are all morphologies that show up in local protein movement (you can see it happening even if you zoom in). Could we just pit 50x50-pixel samples against the graphs?
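Something like this could generate the local samples, assuming the raw frames are available as a `(T, H, W)` NumPy array (name and shapes are placeholders):

```python
# Sketch of the 50x50 patch baseline: sample fixed-size local crops from
# each frame so a purely local model can be pitted against the graphs.
import numpy as np

def sample_patches(frames, patch_size=50, n_patches=16, seed=0):
    """Return an array of shape (T * n_patches, patch_size, patch_size)."""
    rng = np.random.default_rng(seed)
    t, h, w = frames.shape
    patches = []
    for frame in frames:
        for _ in range(n_patches):
            top = rng.integers(0, h - patch_size + 1)
            left = rng.integers(0, w - patch_size + 1)
            patches.append(frame[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)
```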

@Micky774
Collaborator Author

That's actually a wonderful point! If local and global information are highly correlated, then we should be able to just observe a local patch, like you mentioned, and viably infer global patterns (i.e. capturing global patterns would be redundant). Let's set up a simple CNN for that, then.
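A minimal version of that CNN could look like the sketch below; layer sizes are placeholders, and the three output classes correspond to the morphologies you listed (hyper-fusion, fragmentation, normal cycle):

```python
# Minimal PyTorch CNN sketch for classifying 50x50 local patches directly.
import torch.nn as nn

class PatchCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, 1, 50, 50)
        z = self.features(x).flatten(1)
        return self.classifier(z)
```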

Also, @magsol, I'm wondering if it would be better to have our post-graph features (whether that's the literal GMM parameters or the derived spectrum) embedded by an MLP into a fixed-dimensional space, so that we can also control for latent dimensionality. That way, if we decide the latent dimension ought to be, say, 10, we can embed the graph/post-graph features into that space and also embed whatever CNN-based spatial feature-map representation into the same space using some pooling/FC layers.
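To make the idea concrete, here's a rough PyTorch sketch of the two embedding heads mapping into a shared latent space (the module names, hidden width, and `LATENT_DIM = 10` are illustrative, not decided):

```python
# Sketch: embed post-graph features (GMM parameters or spectrum) and a CNN
# spatial feature map into the same fixed-size latent space, so downstream
# baselines are controlled for latent dimensionality.
import torch.nn as nn

LATENT_DIM = 10  # e.g. the agreed-on latent size

class GraphFeatureEmbedder(nn.Module):
    """MLP that maps a post-graph feature vector to the shared latent space."""
    def __init__(self, in_dim, latent_dim=LATENT_DIM):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x):  # x: (batch, in_dim)
        return self.mlp(x)

class SpatialMapEmbedder(nn.Module):
    """Pooling + FC head that maps a CNN feature map to the same latent space."""
    def __init__(self, in_channels, latent_dim=LATENT_DIM):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, latent_dim)

    def forward(self, fmap):  # fmap: (batch, C, H, W)
        return self.fc(self.pool(fmap).flatten(1))
```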

@rmattson1008 rmattson1008 removed their assignment May 20, 2024