Analyze yeast cytosolic 8x dicodon screen data #3
Comments
|
@kychen37 Ok. I remember you showing some data in the short updates, but I don't remember us ever discussing this. Can you update the issue whenever you finish some analysis (I think you used to do this previously)? I can give more useful feedback if I have some time to think about your results, rather than looking at them for the first time during group meeting. |
Prelim analysis, just making a heatmap and starting to look at inserts that are still missing despite good representation: Code: https://github.com/rasilab/rqc_aggregation_aging/blob/master/analysis/deepseq/20220329_exp51_cyto_8x_dicodon/scripts/plot_insert_mrna_levels.ipynb |
Note to self: my branches are kind of messed up right now. My most recent analyses were done in master (I had switched to doing these analyses in deepseq, forgot, and then continued in master). Pulling from master didn't update the deepseq branch, so I may need to 'sync' or something. For now, master is the most up-to-date branch, but only for the scripts of this analysis. |
@rasi Update I will revisit this data next week to see what is pertinent for my committee meeting |
@kychen37 Post the figures as Issue comments so that we can come back to them easily (the commit links above are enough to figure out where the figure is). Why is the data not centered around 0 unlike here: https://github.com/rasilab/lab_analysis_code/blob/master/rasi/analysis/deepseq/20210703_pb_8xdicodon_resequence_grna_mrna/scripts/plot_human_8xdicodon_effects_files/figure-gfm/unnamed-chunk-6-1.png? |
@rasi I've been trying to figure that out but I'm not sure. Could it have to do with the fact that the mean of all the dipeptide lfcs is slightly negative? |
@kychen37 The mean and median cannot be this different. First, calculate simple log fold change without bootstrapping and make sure you understand what is going on before doing the bootstrap. The bootstrap is just for error bars. |
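A minimal sketch of the "simple log fold change first" check suggested above. The insert names, the read counts, the `dna` denominator, and the pseudocount of 1 are all assumptions made up for illustration; they are not taken from the notebook.

```python
import math
import statistics

# Hypothetical per-insert read counts (illustrative values, not real data).
counts = {
    "AAGAAG": {"mrna": 120, "dna": 100},
    "CGGCGA": {"mrna": 10, "dna": 160},
    "GTTAAA": {"mrna": 90, "dna": 110},
    "TTCAAG": {"mrna": 45, "dna": 95},
}


def log2_fold_change(mrna, dna, pseudocount=1):
    """Plain log2(mRNA / DNA); the pseudocount (an assumption) avoids log(0)."""
    return math.log2((mrna + pseudocount) / (dna + pseudocount))


# One LFC per insert, no resampling involved.
lfc = {insert: log2_fold_change(c["mrna"], c["dna"]) for insert, c in counts.items()}

# Compare the two summary statistics directly; a large gap between them
# points to a skewed distribution rather than a bootstrap artifact.
mean_lfc = statistics.mean(lfc.values())
median_lfc = statistics.median(lfc.values())
print(f"mean = {mean_lfc:.2f}, median = {median_lfc:.2f}")
```

In this toy example one strongly depleted insert pulls the mean well below the median, which is the kind of gap worth understanding before adding error bars.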
@rasi sorry I forgot that my |
After median-normalization, the median = 0 and mean = -0.19 |
Either way, the above plot should be approximately centered around 0 if everything is done correctly. |
Can you also add horizontal lines for each amino acid similar to what is in Phil's paper? |
@rasi I replotted the dipeptide heatmap using your code because I realized I wasn't looking at missing dipeptides correctly. It turns out there is a group of dipeptides that are missing (in grey): |
@rasi Does bootstrapping involve normalizing lfcs to the mean (I thought it didn't)? My data appears to skew negative (unnormalized lfcs range from -5 to 0.1), and the bootstrap function takes from these. Should I have it take from median-normalized lfcs instead (which are centered on zero)? It didn't look like you had done that in Phil's data so I didn't do it for mine |
@kychen37 Ok, your explanation makes sense. It looks like I did not median normalize because my mRNA and gRNA had almost equal counts. But it will be good to do it for yours since your read counts are skewed. |
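A rough sketch of the order of operations discussed here: median-normalize the LFCs first, then bootstrap from the normalized values. Resampling by itself does not recenter anything, so a skewed input gives skewed bootstrap means. The function names, the values, and the choice of 1000 resamples are hypothetical and not taken from the lab code.

```python
import random
import statistics


def median_normalize(lfcs):
    """Subtract the library-wide median so the typical insert sits at 0."""
    center = statistics.median(lfcs.values())
    return {key: value - center for key, value in lfcs.items()}


def bootstrap_mean_sd(values, n_boot=1000, seed=0):
    """Resample with replacement; return (mean, SD) of the bootstrap means."""
    rng = random.Random(seed)
    boot_means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    ]
    return statistics.mean(boot_means), statistics.stdev(boot_means)


# Hypothetical unnormalized insert-level LFCs, skewed negative.
raw_lfc = {"AAGAAG": -0.4, "GTTAAA": -0.9, "TTCAAG": -1.6, "CGGCGA": -4.2, "GCTGCT": -0.7}
norm_lfc = median_normalize(raw_lfc)

# Hypothetical normalized barcode-level LFCs for a single dipeptide.
barcode_lfcs = [-0.8, -1.1, -0.6, -1.4, -0.9]
boot_mean, boot_sd = bootstrap_mean_sd(barcode_lfcs)
```

The bootstrap SD is then only used for error bars; the point estimate comes from the normalized LFCs.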
@kychen37 Most of the missing dipeptides are hydrophobic or bulky. Maybe these are just toxic as you noticed in your plasmid library. Can you calculate the bootstrap error bars for the remaining dipeptides, so that you can highlight ones that are at least 2xSD below median value? Also, can you make the above plot for all codons? I assume Arg is so destabilizing primarily because it has two rare codons CGG and CGA, but this will be good to see. |
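One possible reading of the "at least 2xSD below median" criterion, sketched with made-up numbers: a dipeptide is flagged when its mean LFC plus twice its own bootstrap SD still falls at or below the library median. Both this interpretation and the values are assumptions.

```python
import statistics

# Hypothetical (mean LFC, bootstrap SD) per dipeptide; not real measurements.
dipeptide_stats = {
    "KK": (0.05, 0.10),
    "AA": (0.00, 0.08),
    "VK": (-0.90, 0.20),
    "FK": (-1.20, 0.25),
    "GS": (-0.15, 0.30),
}

# Median of the dipeptide means across the library.
library_median = statistics.median(mean for mean, _ in dipeptide_stats.values())

# Flag dipeptides whose mean sits at least 2 SD below the library median.
destabilized = sorted(
    dipeptide
    for dipeptide, (mean, sd) in dipeptide_stats.items()
    if mean + 2 * sd <= library_median
)
print(destabilized)
```

A noisy dipeptide such as the hypothetical `GS` entry is not flagged even though its mean equals the median, because its error bar is wide.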
The codon plot looks good! Will be useful to spread out the Y-axis a bit. |
I recommend making frame plots similar to 1E in Phil's paper and a schematic that mirrors 1A as closely as possible. |
@rasi I'm trying to figure out what the effect of The way I had plotted dipeptides before (which did not include |
@kychen37: It should not drop any dipeptides since you are grouping by dipeptides in the previous step. |
Looks good and better than what I might have guessed :-) It will be useful to carefully look at the off-diagonal elements of the above plot to see whether it is noise or something interesting. Try the same frame plot for codon pairs as well. Look through the language in Phil's paper and see whether you can place the observations above in the context of what is known about translation in yeast (lot more than in human). See some of my TODO's at the top comment. Add to them if you think of additional experiments to do (this can be a useful checklist to come back to as you dig deeper). |
@rasi bootstrapped error bars for dipeptide levels of just the VK/FK dipeptides + some controls. I'm going to show this one because the full dipeptide plot is too big |
Sounds good. Add end caps to the error bars to make it a bit easier to see. |
Breakdown of dicodons that are missing from this sequencing run, code
For the 180 dicodons that are missing from linkage sequencing, this is the dipeptide breakdown (what proportion of inserts for each dipeptide were missing in linkage sequencing, top ~20): For the 180 dicodons that are missing from linkage sequencing of the integrating plasmid library, 130 of these dicodons are also missing in the dual-barcode 2um plasmid library. The dipeptide breakdown of these 130 common missing dicodons (top ~20): |
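The breakdown described above could be computed along these lines. This is only a sketch: the dicodon lists are tiny placeholders rather than the 180 and 130 dicodons from the actual libraries, and the helper names are hypothetical. The codon table is the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def dipeptide(dicodon):
    """Translate a 6-nt dicodon into its two-letter dipeptide."""
    return CODON_TABLE[dicodon[:3]] + CODON_TABLE[dicodon[3:]]


def missing_fraction_by_dipeptide(all_dicodons, missing):
    """Fraction of each dipeptide's dicodons that are absent from sequencing."""
    total = Counter(dipeptide(d) for d in all_dicodons)
    absent = Counter(dipeptide(d) for d in missing)
    return {dp: absent[dp] / total[dp] for dp in absent}


# Placeholder library and missing sets (assumptions for illustration only).
library = ["GTTAAA", "GTTAAG", "GTCAAA", "TTTAAA", "TTCAAG", "GCTGCT"]
missing_integrating = {"GTTAAA", "GTCAAA", "TTTAAA"}
missing_2um = {"GTTAAA", "TTTAAA", "GCTGCT"}

fractions = missing_fraction_by_dipeptide(library, missing_integrating)

# Dicodons missing from both libraries are a plain set intersection.
shared_missing = missing_integrating & missing_2um
```

Sorting `fractions` by value in descending order would give the "top ~20" view.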
Update Next week I will look into endogenous inserts from this sequencing run and plan remaining experiments + paper |
Remake Fig1A from Presnyak 2015 to make sure the CSCs I calculated matched theirs.
Correlating the Coller lab CSCs with my LFCs
Plotting all data together |
@rasi there is a moderate correlation between Coller lab CSCs and my average codon LFCs, there is basically no correlation between Coller lab AASCs and my average amino acid effects though, see comments above |
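For reference, a small sketch of how such a codon-level comparison could be quantified with both Pearson and Spearman correlations. The CSC and LFC numbers below are invented placeholders, not values from Presnyak 2015 or from this screen, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Invented placeholder values keyed by codon (assumptions, not real data).
csc = {"GCT": 0.20, "AAG": 0.15, "CGA": -0.25, "CGG": -0.10, "ATA": -0.18}
codon_lfc = {"GCT": 0.10, "AAG": 0.30, "CGA": -2.00, "CGG": -1.50, "ATA": -0.40}

# Only compare codons present in both datasets, in a fixed order.
shared = sorted(set(csc) & set(codon_lfc))
x = [csc[c] for c in shared]
y = [codon_lfc[c] for c in shared]
r, rho = pearson(x, y), spearman(x, y)
```

Reporting both is useful here because a few extreme codons can dominate Pearson while leaving the rank-based Spearman value largely unchanged.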
@Katharine Just wanted to say, that colored codon plot looks really good. I wouldn't worry too much about the amino acid level effects not matching up well; AA-level effects in your library are probably driven more by localized repeats, whereas they're looking at transcriptome-wide aggregate effect measurements, so it's a very different thing. Also, if you're doing literature comparisons you should totally compare your data to Gamble2016, Adjacent Codons Act in Concert to Modulate Translation Efficiency in Yeast from Stan Fields' group. That's probably the closest published experimental measurement to what you've done. Also, they only look at protein levels, so even if things don't match up well it might still be interesting (e.g. some dicodons have an effect on mRNA but not protein level). |
Thanks @phiburke ! My memory of Gamble2016 is that they just found a bunch of rare codons but it's been a while so definitely worth a re-look! |
Update |
Update & TODO
|
@rasi
|
|
Summary of different types of data filtering
Read cutoffs based on insert & barcode-level plots
Read cutoffs using Burke2022 code
Minimal read cutoffs -> bootstrap by dipeptide -> filter by SD
@rasi let me know if you have any thoughts on the proper/best way to plot this |
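A sketch of the first step of the "minimal read cutoffs -> bootstrap by dipeptide -> filter by SD" option, assuming the cutoff is applied to DNA reads per barcode. The record layout, the cutoff of 10, and the pseudocount are assumptions and do not reflect the Burke2022 code or the actual thresholds used.

```python
import math
from collections import defaultdict

# Hypothetical barcode records: (barcode, dipeptide, mRNA reads, DNA reads).
barcodes = [
    ("bc1", "VK", 40, 80),
    ("bc2", "VK", 2, 3),
    ("bc3", "FK", 15, 60),
    ("bc4", "AA", 120, 110),
    ("bc5", "AA", 0, 4),
]


def apply_read_cutoff(records, min_dna_reads=10):
    """Keep barcodes with enough DNA reads to give a stable ratio."""
    return [record for record in records if record[3] >= min_dna_reads]


def group_lfc_by_dipeptide(records, pseudocount=1):
    """Collect barcode-level log2(mRNA / DNA) values under each dipeptide."""
    grouped = defaultdict(list)
    for _, dipeptide, mrna, dna in records:
        grouped[dipeptide].append(
            math.log2((mrna + pseudocount) / (dna + pseudocount))
        )
    return dict(grouped)


kept = apply_read_cutoff(barcodes)
lfc_by_dipeptide = group_lfc_by_dipeptide(kept)
```

Each list in `lfc_by_dipeptide` is what would then be resampled in the bootstrap step, with the SD filter applied to the result.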
Destabilized dipeptides vs frameshifts
|
GanttStart: 2022-03-29
Background
Transformed at high efficiency using Bloom lab's liquid recovery protocol: https://github.com/rasilab/rqc_aggregation_aging/issues/98
Library prepped: https://github.com/rasilab/rqc_aggregation_aging/issues/104
Ideas TODO
Analysis Links
Brief conclusion