ORF9b addition #21

Closed
zach-hensel opened this issue Mar 15, 2023 · 4 comments · Fixed by #22

Comments

@zach-hensel

Great work! I made a small local hack to output fitness for ORF9b, as a quick check of XBB.1.16, which has a few ORF9b mutations. ORF9b might be an interesting addition.

covSpectrum query: https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?nucMutations=T12730A%2CT28297C%2CA28447G&

Changes made:

  1. Modified gtf file and replaced download in relevant rule with copying the local modified file:
NC_045512v2	ncbiGenes.genePred	transcript	28284	28577	.	+	.	gene_id "ORF9b.1"; transcript_id "ORF9b.1"; 
NC_045512v2	ncbiGenes.genePred	exon	28284	28577	.	+	.	gene_id "ORF9b.1"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";
NC_045512v2	ncbiGenes.genePred	CDS	28284	28577	.	+	0	gene_id "ORF9b"; transcript_id "1"; exon_number "1"; exon_id "ORF9b.1";
NC_045512v2	ncbiGenes.genePred	start_codon	28284	28286	.	+	0	gene_id "ORF9b"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";
NC_045512v2	ncbiGenes.genePred	stop_codon	28575	28577	.	+	0	gene_id "ORF9b"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";
  2. Modified the first cell in aamut_fitness.py.ipynb to not remove overlapping N and ORF9b mutations:
    .query("not is_overlapping or gene=='N;ORF9b'")
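For anyone reproducing this, here is a minimal sketch of what that query keeps and drops. Only the column names `gene` and `is_overlapping` and the query string come from the change above; the toy rows and the placeholder label `A;B` are made up for illustration and are not pipeline data.

```python
import pandas as pd

# Toy table. `gene` and `is_overlapping` mirror the names used in the query;
# the rows themselves are invented, and "A;B" is a placeholder for any other
# pair of overlapping genes.
df = pd.DataFrame(
    {
        "gene": ["N", "N;ORF9b", "A;B", "S"],
        "is_overlapping": [False, True, True, False],
    }
)

# Keep rows that are not in overlapping reading frames, plus the one
# overlapping pair we explicitly want to retain (N and ORF9b).
kept = df.query("not is_overlapping or gene == 'N;ORF9b'")

print(kept["gene"].tolist())  # ['N', 'N;ORF9b', 'S']
```

The other overlapping pair is still dropped, so the change only lets N/ORF9b sites through.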

Result: Only examined briefly. ORF9b I5T in XBB.1.16 ranks highly.


@jbloom
Member

jbloom commented Mar 15, 2023

Great, thanks @zach-hensel. Marc Johnson had also been asking about ORF9b.

I am going to make a pull request that adds this into the pipeline. Just adding a few more notes re ORF9b mostly for myself while doing this:

My pull request will automate the GTF modification that you did manually above, so that it runs as a step in the workflow of the larger pipeline.
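A rough sketch of how such an automated step could look, not the actual code in the pull request. The five records are copied as posted earlier in this issue; the function names `orf9b_gtf_lines` and `add_orf9b` are hypothetical.

```python
# Records copied as posted in this issue (GTF coordinates are 1-based, inclusive).
SEQNAME = "NC_045512v2"
SOURCE = "ncbiGenes.genePred"

ORF9B_RECORDS = [
    # (feature, start, end, frame, attributes)
    ("transcript", 28284, 28577, ".",
     'gene_id "ORF9b.1"; transcript_id "ORF9b.1";'),
    ("exon", 28284, 28577, ".",
     'gene_id "ORF9b.1"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";'),
    ("CDS", 28284, 28577, "0",
     'gene_id "ORF9b"; transcript_id "1"; exon_number "1"; exon_id "ORF9b.1";'),
    ("start_codon", 28284, 28286, "0",
     'gene_id "ORF9b"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";'),
    ("stop_codon", 28575, 28577, "0",
     'gene_id "ORF9b"; transcript_id "ORF9b.1"; exon_number "1"; exon_id "ORF9b.1";'),
]


def orf9b_gtf_lines():
    """Return the ORF9b records as tab-separated, nine-column GTF lines."""
    return [
        "\t".join([SEQNAME, SOURCE, feature, str(start), str(end), ".", "+", frame, attrs])
        for feature, start, end, frame, attrs in ORF9B_RECORDS
    ]


def add_orf9b(gtf_text):
    """Append the ORF9b records unless the annotation already mentions ORF9b."""
    if "ORF9b" in gtf_text:
        return gtf_text
    if gtf_text and not gtf_text.endswith("\n"):
        gtf_text += "\n"
    return gtf_text + "\n".join(orf9b_gtf_lines()) + "\n"
```

A workflow rule could read the downloaded GTF, pass its text through `add_orf9b`, and write the result back. Because of the check for an existing ORF9b entry, running the step twice leaves the file unchanged.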

@zach-hensel
Author

Awesome and thank you for double checking. The nsp numbers here also come indirectly from me hastily copying some things and I overlooked the frameshift.

I am not working with this anymore, so here are a couple of other observations. First, nsp6 L37 would be interesting to unmask. I suppose it's masked because of a combination of occurring early and artifacts in Orf1a sequencing. Second, I made a quick script to rank mutations from the CSV one can export from a cov-spectrum query and looked at XBB.1.16. A reversion of one of the three nucleotides mutated in BA.2 for Orf6 61 popped up with a major fitness increase, and you may want to mask this site as it appears to be an artifact. Lastly, it might be interesting to look at predicted fitness of RecCA, particularly for U-to-C mutations in that direction. Orf8 L84S is also interesting in that respect.
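The ranking script is not posted here, so the following is only a guess at its general shape: look up a fitness estimate for each mutation listed in an exported CSV and sort. The column name `mutation`, the function name `rank_mutations`, and the toy labels and numbers are all hypothetical, not the real cov-spectrum export format or real estimates.

```python
import csv
import io


def rank_mutations(csv_text, fitness, mutation_column="mutation"):
    """Return (mutation, fitness) pairs sorted from highest to lowest fitness.

    `csv_text` is the exported CSV as a string, and `fitness` maps mutation
    labels to fitness estimates. Mutations without an estimate are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    scored = [
        (row[mutation_column], fitness[row[mutation_column]])
        for row in reader
        if row[mutation_column] in fitness
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy input with made-up labels and numbers, for illustration only.
toy_csv = "mutation,count\ngeneA:X1Y,10\ngeneB:P2Q,5\ngeneC:R3S,7\n"
toy_fitness = {"geneA:X1Y": -0.5, "geneB:P2Q": 1.5}

print(rank_mutations(toy_csv, toy_fitness))
# [('geneB:P2Q', 1.5), ('geneA:X1Y', -0.5)]
```

A site that ranks unexpectedly high in such a list, like the Orf6 reversion mentioned above, is a candidate for checking as an artifact.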

@jbloom
Member

jbloom commented Mar 16, 2023

Great, thanks. I will look at the ORF6 site. For the other masked mutations, including L84S, I had to mask all mutations to the Wuhan-Hu-1 reference as nearly all of them have unrealistically high counts that likely indicate some sort of bioinformatics issue such as calling uncovered sites to reference.

@jbloom jbloom linked a pull request Mar 16, 2023 that will close this issue
@jbloom
Member

jbloom commented Mar 16, 2023

OK, I added estimates and summarized results here: https://twitter.com/jbloom_lab/status/1636470443493449728
