Baseline Required -- Information Preserved by Graph #27

Open
Micky774 opened this issue Sep 14, 2021 · 2 comments

Comments

@Micky774
Collaborator

One question worth asking is whether, in the current implementation, sufficient information is preserved through the construction of the graphs. That is to say, does the process of going from image to graph lose important information? One way to check is to test the potency of baseline models on both the raw images and the computed graphs. I propose that we choose some basic algorithms (e.g. SVM+kernel, Random Forest, MLP) and compare their performance on features derived both pre- and post-graph. Post-graph, these features can be as simple as the collection of means/variances, or more dynamic, such as features extracted from an out-of-the-box RNN/LSTM. The pre-graph features can be extracted via a simple convolutional network over the raw images, fully connected at the end so that it outputs a feature vector of the same length as the post-graph features.

The critical component here is controlling for complexity. Neither of these processing steps ought to be too complex, and both should be of similar complexity, so that the experiment doesn't become a test of the processing methods rather than of the image-to-graph step itself.
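As a rough sketch of what the comparison could look like (the feature matrices `X_pre`, `X_post` and labels `y` are just placeholders for however we end up extracting and storing the two feature sets):

```python
# Sketch: run the same baseline classifiers on pre-graph (CNN-derived)
# and post-graph (graph-derived) feature vectors and compare CV accuracy.
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

baselines = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "mlp": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)),
}

def score_feature_set(name, X, y):
    # 5-fold cross-validated accuracy for each baseline on one feature set
    for model_name, model in baselines.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name} / {model_name}: {scores.mean():.3f} +/- {scores.std():.3f}")

# score_feature_set("pre-graph (CNN features)", X_pre, y)
# score_feature_set("post-graph (graph features)", X_post, y)
```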

@rmattson1008
Contributor

Related to this, I would be curious to see whether the graphs are actually capturing any regional information that is more useful than local information. Hyper-fusion, fragmentation, and a normal cell cycle are all morphologies that show up in local protein movement (you can see it happening even if you zoom in). Could we just pit 50x50-pixel samples against the graphs?
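Something like this could generate the local samples, assuming the raw frames are available as a `(T, H, W)` NumPy array (name and shapes are placeholders):

```python
# Sketch of the 50x50 patch baseline: sample fixed-size local crops from
# each frame so a purely local model can be pitted against the graphs.
import numpy as np

def sample_patches(frames, patch_size=50, n_patches=16, seed=0):
    """Return an array of shape (T * n_patches, patch_size, patch_size)."""
    rng = np.random.default_rng(seed)
    t, h, w = frames.shape
    patches = []
    for frame in frames:
        for _ in range(n_patches):
            top = rng.integers(0, h - patch_size + 1)
            left = rng.integers(0, w - patch_size + 1)
            patches.append(frame[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)
```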

@Micky774
Collaborator Author

That's actually a wonderful point! If local and global information are highly correlated, then we should be able to just observe a local patch, like you mentioned, and viably infer global patterns (i.e. capturing global patterns would be redundant). Let's set up a simple CNN for that, then.
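A minimal version of that CNN could look like the sketch below; layer sizes are placeholders, and the three output classes correspond to the morphologies you listed (hyper-fusion, fragmentation, normal cycle):

```python
# Minimal PyTorch CNN sketch for classifying 50x50 local patches directly.
import torch.nn as nn

class PatchCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, 1, 50, 50)
        z = self.features(x).flatten(1)
        return self.classifier(z)
```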

Also, @magsol, I'm wondering if it would be better to have our post-graph features (whether that's the literal GMM parameters or the derived spectrum) embedded by an MLP into a fixed-dimensional space, so that we can also control for latent dimensionality. That way, if we decide the latent dimension ought to be, say, 10, we can embed the graph/post-graph features into that space and also embed whatever CNN-based spatial feature-map representation into the same space using some pooling/FC layers.
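To make the idea concrete, here's a rough PyTorch sketch of the two embedding heads mapping into a shared latent space (the module names, hidden width, and `LATENT_DIM = 10` are illustrative, not decided):

```python
# Sketch: embed post-graph features (GMM parameters or spectrum) and a CNN
# spatial feature map into the same fixed-size latent space, so downstream
# baselines are controlled for latent dimensionality.
import torch.nn as nn

LATENT_DIM = 10  # e.g. the agreed-on latent size

class GraphFeatureEmbedder(nn.Module):
    """MLP that maps a post-graph feature vector to the shared latent space."""
    def __init__(self, in_dim, latent_dim=LATENT_DIM):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x):  # x: (batch, in_dim)
        return self.mlp(x)

class SpatialMapEmbedder(nn.Module):
    """Pooling + FC head that maps a CNN feature map to the same latent space."""
    def __init__(self, in_channels, latent_dim=LATENT_DIM):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, latent_dim)

    def forward(self, fmap):  # fmap: (batch, C, H, W)
        return self.fc(self.pool(fmap).flatten(1))
```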

@rmattson1008 rmattson1008 removed their assignment May 20, 2024