Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added pseudocount to raw codon counts in syn_selection_by_codon #40

Merged
merged 6 commits into from
Nov 11, 2019

Conversation

fwelsh
Copy link
Contributor

@fwelsh fwelsh commented Nov 8, 2019

No description provided.

@jbloom
Copy link
Member

jbloom commented Nov 9, 2019

@frances: You should make several changes to this:

  1. Make the value of the pseudocount an optional argument in the function signature, and set a reasonable default. Such as pseudocount=0.5. Then expand the documentation to explain what the pseudocount argument means.

  2. Do not use the pseudocount for the Fisher's exact test. That test should be performed on the raw values as the Fisher's test already works fine with zero counts. The pseudocounts should only be used for computing the odds ratios. Implementation-wise, this probably means doing the Fisher test and then adding the pseudocount. Clearly explain this in the docs.

  3. You'll have to fix so it passes the doctest (right now it fails, see here). Note that if you do point (2) above, you should be changing the odds ratios in the doctest output but not the P-values, since the P-value counts won't change.

Pseudocount is now added after calculating Fisher P-values, for manual calculation of odds ratios only.
Overlooked second pseudocount addition that led to failed doctest.
Copy link
Member

@jbloom jbloom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frances: Looks good to me.

@skhilton: Do you have any comments? Otherwise I'll go ahead and merge.

Copy link
Collaborator

@skhilton skhilton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbloom, @fwelsh: Looks good to me! You might consider setting a default value of the psuedocounts, especially if you think that 0.5 is going to be the best value for the majority of use cases. You can do this by changing https://github.com/jbloomlab/dms_tools2/blob/syn_selection_updates/dms_tools2/syn_selection.py#L16 to def syn_selection_by_codon(counts_pre, counts_post, pseudocount=0.5):

Also, very small note, but I don't think you need the parenthesis around df in the function return statement. https://github.com/jbloomlab/dms_tools2/blob/syn_selection_updates/dms_tools2/syn_selection.py#L157

@jbloom
Copy link
Member

jbloom commented Nov 11, 2019

OK, I agree with @skhilton's comments especially about default. @fwelsh, do you want to push that, and then I'll merge.

@jbloom
Copy link
Member

jbloom commented Nov 11, 2019

@fwelsh: Sorry, one more change I realized that is also needed:

  1. Can you update the version number here to 2.6.3.

  2. Can you add to the CHANGELOG a short explanation of the change. Then I can merge this as the new version.

@fwelsh
Copy link
Contributor Author

fwelsh commented Nov 11, 2019

Sorry I didn't notice that, but we should be set now!

@jbloom jbloom merged commit d658fed into master Nov 11, 2019
@jbloom jbloom deleted the syn_selection_updates branch November 11, 2019 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants