Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

species QC to ChlRei #1067

Merged

Conversation

izabelcavassim
Copy link
Member

@izabelcavassim izabelcavassim commented Oct 27, 2021

QC for Chlamydomonas reinhardtii #890, #863, #950

  • fill out the test stubs
  • delete the pytest.mark.skip lines that make the tests not run
  • make sure they pass, talking to the original author to figure out discrepancies

@aays Can you explain where you got your generation time from? For now I don't know the source.
Thanks.

@izabelcavassim
Copy link
Member Author

izabelcavassim commented Oct 27, 2021

OK, so we are disagreeing on the mutation rate estimates @aays. The Ness 2015 et al, gives me a genome-wide mutation rate that is: 9.63 * 1e-10. Where did you find the per chromosome mutation rate? If not cited then I believe you should also upload it to figshare and change the citation source for the mutation rate (I can do that).

Thank you.

@petrelharp
Copy link
Contributor

Where did you find the per chromosome mutation rate? If not cited then I believe you should also upload it to figshare and change the citation source for the mutation rate.

I think that @aays has, at this link, and it's cited already? (ps nice job tracking that down, @aays!)

@izabelcavassim
Copy link
Member Author

ah my bad, I think I was reviewing the first version of his PR.

@codecov
Copy link

codecov bot commented Oct 28, 2021

Codecov Report

Merging #1067 (52dc0c2) into main (cfd3bf8) will increase coverage by 0.00%.
The diff coverage is n/a.

❗ Current head 52dc0c2 differs from pull request most recent head 5f93436. Consider uploading reports for the commit 5f93436 to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1067   +/-   ##
=======================================
  Coverage   99.52%   99.52%           
=======================================
  Files          92       91    -1     
  Lines        2934     2941    +7     
  Branches      350      354    +4     
=======================================
+ Hits         2920     2927    +7     
  Misses          6        6           
  Partials        8        8           
Impacted Files Coverage Δ
stdpopsim/catalog/ChlRei/species.py 100.00% <ø> (ø)
stdpopsim/__init__.py 94.73% <0.00%> (-0.27%) ⬇️
stdpopsim/slim_engine.py 99.73% <0.00%> (-0.01%) ⬇️
stdpopsim/genomes.py 100.00% <0.00%> (ø)
stdpopsim/catalog/AnaPla/species.py 100.00% <0.00%> (ø)
stdpopsim/dfe.py
stdpopsim/models.py 99.11% <0.00%> (+0.27%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfd3bf8...5f93436. Read the comment docs.

@izabelcavassim
Copy link
Member Author

OK I think I am done here, @petrelharp you can merge this if you will!

@izabelcavassim
Copy link
Member Author

izabelcavassim commented Oct 28, 2021

closes #890

def test_qc_population_size(self):
assert self.species.population_size == -1
assert self.species.population_size == 1.4 * 1e-7

@pytest.mark.skip("Generation time QC not done yet")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, you've still got a skipped QC test here?

Copy link
Contributor

@petrelharp petrelharp Oct 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see - no reference for this to check yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wait - I see

        stdpopsim.Citation(
            author="Vitova et al.",
            year=2011,
            doi="https://doi.org/10.1007/s00425-011-1427-7",
            reasons={stdpopsim.CiteReason.GEN_TIME},
        ),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, but I don't see a natural generation time in that paper on a quick skim anyhow

tests/test_ChlRei.py Outdated Show resolved Hide resolved
Comment on lines 87 to 88
print("This is the mutation rate")
print(self.genome.get_chromosome(name).mutation_rate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print("This is the mutation rate")
print(self.genome.get_chromosome(name).mutation_rate)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whats is the change here? spacing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry - I was suggesting you delete the print( )s, since they looked like they were just left over from debugging?

@petrelharp petrelharp mentioned this pull request Oct 28, 2021
14 tasks
@izabelcavassim
Copy link
Member Author

izabelcavassim commented Oct 29, 2021 via email

@izabelcavassim
Copy link
Member Author

OK @aays , I am almost there, I just need to find out where you got the generation time from. Once you do give me the citation I will finish my modifications.

@aays
Copy link
Contributor

aays commented Oct 30, 2021

Very sorry for not getting to this sooner, the past two days were a bit more hectic than I'd have liked! Thanks very much for your comments and review so far @izabelcavassim and @petrelharp.

The generation time was calculated from Vitova et al's estimate of the doubling time at 20 degrees Celsius of 0.10 mitoses / h (p. 604 in the paper, listed as 'doubling / h').

From that, I obtained 876 generations per year (0.10 doublings/h * 24 h * 365 days) hence 1/876 as the generation time. C. reinhardtii can also undergo meiosis, but is expected to have a meiotic cycle once every ~840 generations (e.g. only once a year) based on this work of ours.

This is the best data I know of on C. reinhardtii's generation time, I'm afraid. Please let me know if this sounds reasonable!

@petrelharp
Copy link
Contributor

Ok - so it should be clear somewhere that generation time is time between mitotic divisions. And, since mutation rate and recombination rate are in units of "per generation", are those correct here?

@izabelcavassim
Copy link
Member Author

also, don't we assume generation time to be in years?

@aays
Copy link
Contributor

aays commented Nov 1, 2021

Ok - so it should be clear somewhere that generation time is time between mitotic divisions. And, since mutation rate and recombination rate are in units of "per generation", are those correct here?

They are! Mutation rate was calculated directly from MA experiments, but the recombination rates listed here are population recombination rates (e.g. Ne * r calculated from genetic diversity) instead of estimates from lab crosses. I elaborated how I dealt with facultative sex a bit further in #863, reproduced here:

An important note re: the recombination rate - sex takes place once every ~840 generations in the species (it's facultatively sexual), and the recombination rate estimates in here reflect that (e.g. they are r = rho / 2Ne, where rho was estimated directly from natural populations). To get 'true' physical recombination rate estimates, each of the values would be divided by the rate of sex, 0.001194 (explained further in the results/discussion here) which I've left as a note in the script. I figure that if simulating wild populations it makes more sense to have effective recombination rates in there as opposed to the physical rates, since that would otherwise inflate the amount of recombination happening in the simulation.

So these are still technically per-generation per-bp recombination rates (e.g. the mean_r_realized column in the figshare dataset) where the infrequency of sex is 'baked into' the values via Ne and so they are accordingly lower. I figured this was the easiest way to do handle this based on the discussion in #857.

Re: generation time, given the above calculation of 876 doublings in a year I set the generation time to 1/876 years, following how the generation time was set for E. coli (26280 doublings per year yielding a 1/26280 generation time).

izabelcavassim added a commit to izabelcavassim/stdpopsim that referenced this pull request Nov 1, 2021
updated based on comments in popsim-consortium#1067
@izabelcavassim
Copy link
Member Author

ok @petrelharp made the edits you suggested and fixed a tiny thing on the generation time reference. I guess you can merge when you will.

@petrelharp petrelharp merged commit 0fc1dbe into popsim-consortium:main Nov 2, 2021
@petrelharp
Copy link
Contributor

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants