where would I find the equation or lognorm fit parameters to the Ks distributions? #54
Comments
Hello tamsen, Thanks for reaching out! There are two output files where details of the mixture model fits are stored; you can refer to the docs for their description. See here for the exponential-lognormal mixture model, here for the lognormal mixture model and here under Concerning the number of species to set up your analysis, I'm not completely sure I understand your goal and what you mean with "Understood that this may affect the rate adjustments". Could you elaborate? :) Best,
Thank you so much for the prompt reply over the holidays! That's great. The "species_parameters" files look like what I will need. I'll try your suggestions and let you know if I have any follow-up questions. :-) Thanks again
Hi Cecilia, I do have a follow-up question! Thanks to your help, I am now able to generate the fit parameters I was looking for, but I would like to make sure that I am using them correctly to reconstruct the lognorm components. As a test, I used the "elaeis" example data and turned off the collinearity option to activate the exponential-lognormal mixture model. During the analysis, "model 2" was chosen by the software as the best fit. In the “elmm_elaeis_parameters.txt” file I find the Gaussian parameters "Normal_Mean, Normal_SD, Normal_Weight" for each initialization and model. Choosing the last initialization of model 2 (Initialization=10, model=2), I get -"1.2342795183, 0.2428038069, 0.2551137529" for the "Normal_Mean, Normal_SD, Normal_Weight" of the first component of the mixture model. The next row holds the parameters for the next component, and so on. [Please do correct me if I have misunderstood any of that.] With these values from the txt file, I can recreate the plot of the Gaussians for Model 2 shown in “elmm_elaeis_models_data_driven.pdf”. Thank you so much for your time! (Attached: elmm_elaeis_parameters.txt)
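For readers following along, here is a minimal sketch of how one such parameter row could be turned into a weighted lognormal curve in Ks space. It assumes that Normal_Mean and Normal_SD are the mean and standard deviation of the Gaussian fitted to ln(Ks), and that Normal_Weight is the mixture weight of that component; the function name and the numeric values are illustrative placeholders, not ksrates code or output.

```python
import math

def weighted_lognormal_pdf(ks, normal_mean, normal_sd, normal_weight):
    """Weighted lognormal density at `ks`, given the mean and SD of ln(Ks).

    Assumption: the three arguments correspond to the Normal_Mean,
    Normal_SD and Normal_Weight columns of the parameters file.
    """
    if ks <= 0:
        return 0.0
    z = (math.log(ks) - normal_mean) / normal_sd
    density = math.exp(-0.5 * z * z) / (ks * normal_sd * math.sqrt(2 * math.pi))
    return normal_weight * density

# Illustrative placeholder values, not taken from any ksrates output file.
mean, sd, weight = -0.5, 0.3, 0.25

step = 0.01
ks_grid = [i * step for i in range(1, 501)]  # Ks from 0.01 to 5.00
curve = [weighted_lognormal_pdf(k, mean, sd, weight) for k in ks_grid]

# The area under a weighted component should be close to its weight,
# and the peak in Ks space sits at exp(mean - sd**2), not at exp(mean).
area = sum(curve) * step
mode = math.exp(mean - sd ** 2)
peak_ks = ks_grid[curve.index(max(curve))]
```

This curve is a density: it integrates to the component weight. Overlaying it on a histogram of counts needs an additional scaling factor, which is what the replies below discuss.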
Hi tamsen, Sorry for my late reply! While How is the scaling factor computed? From docstrings of the function above:
After having computed the scaling factor, the next code line plots the components onto the final figure by using the scaling factor. Do you think you can try to reuse this information and these lines of code to recreate the plot yourself? Apologies for this rather long explanation!
Hi Cecilia, Thank you for your long explanation! :-) I appreciate it. And I hope you have a nice winter holiday. Yes, my hope would be to use the information from ksrates to recreate the fit plot myself in Ks space, in order to apply some mathematical models to the fitted lognorms. Given your explanation, I see two options to get the output I want: A) Given that ~ or ~ B) Locally modify your code to output/log the scaling factors I need, as they are calculated. Then I could use those scaling factors, combined with the parameters in the elmm_elaeis_parameters.txt file, to recreate the fitted lognorms in Ks space. (For simplicity, I'd prefer not to modify ksrates, if it can be avoided.) If you get a chance, please let me know if this sounds right, or if you have a better suggestion. Thank you again so much,
Hi tamsen, I've been looking at the source code to remind myself about how the deconvolution is implemented, and I have to correct my previous message:
The original Ks dataset is redundant, so it does not have the same length as the deconvoluted data. The length of the deconvoluted dataset is instead equal to the sum of the weights of the original Ks data (only considering Ks up to 5, as per Example:
Solution 1: if you're fine with the approximation "sum(weights)=len(deconv_data)", you can 1) compute the sum of weights from the
Solution 2: I assembled in a Python script the ksrates functions involved in the deconvolution (see attached file), which you can run by providing the path to your
Cheers,
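The approximation in Solution 1 can be sketched as follows. This is a toy illustration, not the ksrates implementation: the function names, the inclusive cutoff at Ks = 5, the bin width and the data are all assumptions made for the example. The second function uses the generic conversion from a probability density to expected histogram counts (density times number of points times bin width); whether this matches the ksrates scaling factor exactly should be checked against the source code.

```python
def approx_deconvoluted_length(ks_values, weights, max_ks=5.0):
    """Approximate len(deconv_data) as the sum of weights for Ks <= max_ks.

    Assumption: the cutoff is inclusive; the thread only says "Ks up to 5".
    """
    return sum(w for ks, w in zip(ks_values, weights) if ks <= max_ks)

def density_to_counts(density, n_points, bin_width):
    """Generic density-to-histogram conversion: count ~ density * N * bin width."""
    return density * n_points * bin_width

# Hypothetical toy data: redundant Ks values with their weights.
ks_values = [0.2, 0.2, 0.8, 1.5, 6.0]
weights = [0.5, 0.5, 1.0, 1.0, 1.0]

# The pair at Ks = 6.0 lies above the cutoff and is ignored.
n_approx = approx_deconvoluted_length(ks_values, weights)

# Expected count in a bin of width 0.1 where the fitted density equals 0.4.
expected_count = density_to_counts(0.4, n_approx, 0.1)
```

With a real dataset, `ks_values` and `weights` would come from the Ks table of the analysis, and the density would be the weighted lognormal component evaluated at each bin centre.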
Hi Tamsen and Cecilia, log_ks <- log(ks_list) I get values which have a mode close to what is expected (though not exactly), but the distribution has a different shape and even negative values. Could you please suggest what I am missing, or share your script to reproduce the aforementioned plot? Thank you very much in advance!
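One general property of the log transform used above may account for part of this: the natural log of any Ks value below 1 is negative, so negative values in log space are expected, and a shape that looks symmetric in log space becomes right-skewed once mapped back to Ks space. A small sketch in Python, with made-up Ks values:

```python
import math

# Hypothetical Ks values; any Ks below 1 has a negative natural log.
ks_list = [0.1, 0.5, 1.0, 2.0]
log_ks = [math.log(k) for k in ks_list]

# Exponentiating maps log-space values back to Ks space.
back_in_ks_space = [math.exp(v) for v in log_ks]

for k, v in zip(ks_list, log_ks):
    print(f"Ks = {k:.1f} -> ln(Ks) = {v:.3f}")
```

Curves fitted in log space therefore need to be transformed back (including the 1/Ks factor of the lognormal density) before they are compared with a histogram drawn in Ks space.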
Hi cvargas88, Thanks for reaching out and sorry about my late reply! I'm afraid I need some more "story" to understand what your goal is and how you are achieving it. For example, could you describe the reasoning behind your code and what Some general comments that might help.
Cheers,
Dear Cecilia, Thank you very much in advance!
Hi Cecilia,
This looks like a great application! I have a few questions:
Does the Ksrates output provide the equation and/or lognorm fit parameters for the mixture-model fits? I'd like to be able to see the actual fit parameters and be able to reproduce the distribution & fit.
This software can also be run on a single genome, correct? It does not require a species trio? (Understood that this may affect the rate adjustments)
Thanks so much,
Tamsen