Tune figures #37

Closed
10 of 11 tasks
huddlej opened this issue Jun 12, 2023 · 5 comments

@huddlej
Collaborator

huddlej commented Jun 12, 2023

Considerations for Altair figures:

  • Remove grid lines from individual panels
  • Plot the divergence tree instead of the time tree (since genetic distance is what we care about most and not time), also push new Docker image
  • Draw branches on tree panels to clearly communicate that those panels reflect phylogenies and not just a scatterplot that happens to look like a tree
  • Plot the "stress" value for each MDS embedding as text on the panel or as text on the x-axis title
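
For the stress annotation, a minimal sketch of one way to compute a normalized stress value and fold it into an axis title. It assumes a stress definition in the style of Kruskal's stress-1, normalized by the sum of squared input distances; other tools normalize differently, so the numbers may not match what an MDS library reports. The function names and the input layout (dictionaries keyed by strain name) are hypothetical and not taken from pathogen-embed.

```python
import math
from itertools import combinations


def normalized_stress(genetic_distances, embedding):
    """Return sqrt(sum((d - e)^2) / sum(d^2)) over all strain pairs.

    genetic_distances: dict mapping frozenset({a, b}) -> genetic distance d
    embedding: dict mapping strain name -> (x, y) embedding coordinates
    (both layouts are assumptions made for this sketch)
    """
    squared_error = 0.0
    squared_total = 0.0
    for a, b in combinations(sorted(embedding), 2):
        observed = genetic_distances[frozenset((a, b))]
        embedded = math.dist(embedding[a], embedding[b])
        squared_error += (observed - embedded) ** 2
        squared_total += observed ** 2
    return math.sqrt(squared_error / squared_total)


def mds_axis_title(stress, axis_name="MDS 1"):
    """Build an x-axis title that carries the stress value as text."""
    return f"{axis_name} (stress = {stress:.3f})"
```

The resulting string could then be passed as the x-axis title of the MDS panel, which avoids adding a separate text layer to the chart.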

Considerations for Euclidean vs. genetic distance figures:

  • Remove the bootstrap-based standard deviation (sample counts are so high for these analyses that bootstrapping doesn't reveal any meaningful variation)
  • Fit linear model between genetic and Euclidean distance and plot the corresponding slope and intercept values on each panel as text alongside the R^2 values.
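
The linear fit in the last item could be sketched with ordinary least squares as below. This assumes genetic distance goes on the x-axis and Euclidean distance on the y-axis, which the issue does not state; `fit_line` and `panel_label` are hypothetical names used only for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


def panel_label(slope, intercept, r_squared):
    """Format the fit as a single line of text for a figure panel."""
    return (
        f"slope = {slope:.2f}, intercept = {intercept:.2f}, "
        f"R^2 = {r_squared:.2f}"
    )
```

For example, `panel_label(*fit_line(genetic, euclidean))` yields one string per panel that can be drawn next to the scatterplot.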

For the H3N2 HA/NA analysis:

  • Figure out a better figure layout than the current 5-row implementation, since this figure runs off the page now
    • Move current 5-row figure (flu-2016-2018-ha-na-embeddings-by-mcc.png) to the supplement and rename to flu-2016-2018-ha-only-vs-ha-na-embeddings-by-mcc.png
    • Create a 5-panel main figure (using the same name as the original main figure: flu-2016-2018-ha-na-embeddings-by-mcc.png) with tree and PCA, MDS, t-SNE, and UMAP embeddings from HA/NA sequences and VI distances annotated in the titles from both the HA-only and HA/NA clusters
  • Assign MCCs colors by first sorting MCCs in descending order of total samples, assigning colors from the standard color schemes TSV, and then sorting by MCC name again for clearer figure legends in Auspice
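
The color assignment described in the last item can be sketched as follows. It assumes the colors have already been read from the color schemes TSV into a plain list of hex strings; the function name, the tie-breaking rule, and the example labels are hypothetical.

```python
from collections import Counter


def assign_mcc_colors(mcc_per_sample, colors):
    """Assign colors to MCCs by size, then return them sorted by name.

    mcc_per_sample: one MCC label per sample
    colors: list of hex colors, assumed to be loaded from the color schemes TSV
    """
    counts = Counter(mcc_per_sample)
    if len(colors) < len(counts):
        raise ValueError("not enough colors for the number of MCCs")

    # Largest MCC first; break ties by name so the result is deterministic
    # (the tie-break is an assumption, the issue does not specify one).
    by_size = sorted(counts, key=lambda mcc: (-counts[mcc], mcc))
    color_by_mcc = dict(zip(by_size, colors))

    # Sort by MCC name again so the legend reads in a predictable order.
    return [(mcc, color_by_mcc[mcc]) for mcc in sorted(color_by_mcc)]
```

With this ordering, the largest MCC always receives the first color of the scheme, while the legend still lists MCCs alphabetically.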

For late SARS-CoV-2 analysis:

  • Remove the reference/root sequence (Wuhan-1/2019) from the tree after the TreeTime step, to avoid a large gap in the Altair tree plots
@nandsra21 nandsra21 self-assigned this Jun 30, 2023
@huddlej huddlej changed the title from "Tune figures for H3N2 HA" to "Tune figures" Aug 31, 2023
This was referenced Aug 31, 2023
@nandsra21
Collaborator

nandsra21 commented Sep 15, 2023

Questions:

  • Would it be best to add stress as possible metadata from the pathogen-embed command so people can use it, or are there other suggestions that would work?
  • Where do I get the parent_y information from, or what do I plot on the x-axis for divergence?
  • Where are the colors from the ncov color palette stored, so I can assign them for HA/NA?
  • How do I remove a strain after TreeTime? Do I run another augur refine, or just take it out manually?

@nandsra21
Collaborator

Leave annotated_embeddings at the tip level; table.tsv will have internal nodes for plotting the divergence tree. Add another parameter for the table.tsv dataframe (add a true/false column for internal nodes, and add a filter where necessary).
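
A minimal sketch of the true/false column and filter mentioned above, using only the standard library. The column names (`strain`, `divergence`, `is_internal_node`) and the example rows are assumptions made for illustration; the real table.tsv may use different names.

```python
import csv
import io


def read_rows(table_tsv, tips_only=False, flag_column="is_internal_node"):
    """Read a TSV and optionally drop rows flagged as internal nodes.

    table_tsv: an open text file (or any iterable of lines)
    flag_column: hypothetical name of the true/false column
    """
    reader = csv.DictReader(table_tsv, delimiter="\t")
    rows = list(reader)
    if tips_only:
        rows = [
            row for row in rows
            if row[flag_column].strip().lower() != "true"
        ]
    return rows


# Made-up example data: one internal node and two tips.
example_tsv = (
    "strain\tdivergence\tis_internal_node\n"
    "NODE_0000001\t0.001\tTrue\n"
    "tip_A\t0.004\tFalse\n"
    "tip_B\t0.006\tFalse\n"
)

all_rows = read_rows(io.StringIO(example_tsv))
tip_rows = read_rows(io.StringIO(example_tsv), tips_only=True)
```

Tree panels would draw branches from all rows, while embedding panels would use the tips-only view.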

@nandsra21
Collaborator

nandsra21 commented Sep 15, 2023

root_and_prune_tree (under the HA/NA scripts): use this for removing a strain (between augur tree and augur refine)

@nandsra21
Collaborator

filter by size, order, color by MCC after

@huddlej
Collaborator Author

huddlej commented Jan 25, 2024

Although we can generate stress values for MDS, I don't think including stress in the figures is a blocker to submission, so I'm closing this issue.

@huddlej huddlej closed this as completed Jan 25, 2024
2 participants