Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new additional_nodes API #2162

Merged
merged 1 commit into from Apr 19, 2023
Merged

Conversation

GertjanBisschop
Copy link
Member

@GertjanBisschop GertjanBisschop commented Mar 28, 2023

First pass at new additional_nodes API for Hudson, SMC, SMC' and multi-merger coalescents.
@benjeffery would you want to take a look at how I solved the asdict representation of NodeType as required for the provenance recording?
EDIT: ignore that Ben. I think I figured it out. Is this the way to do it?

def asdict(self):
        return {"__class__": "msprime.NodeType", "value": self.value}

@codecov
Copy link

codecov bot commented Mar 28, 2023

Codecov Report

Merging #2162 (d91fa85) into main (3d31227) will increase coverage by 0.00%.
The diff coverage is 91.66%.

❗ Current head d91fa85 differs from pull request most recent head 80cce06. Consider uploading reports for the commit 80cce06 to get more accurate results

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2162   +/-   ##
=======================================
  Coverage   91.45%   91.46%           
=======================================
  Files          20       20           
  Lines       11170    11214   +44     
  Branches     2268     2281   +13     
=======================================
+ Hits        10216    10257   +41     
  Misses        521      521           
- Partials      433      436    +3     
Flag Coverage Δ
C 91.46% <91.66%> (+<0.01%) ⬆️
python 98.68% <90.00%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
msprime/ancestry.py 98.63% <89.65%> (-0.47%) ⬇️
lib/msprime.c 86.11% <92.30%> (+0.04%) ⬆️
msprime/__init__.py 100.00% <100.00%> (ø)
msprime/_msprimemodule.c 88.18% <100.00%> (+0.01%) ⬆️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3d31227...80cce06. Read the comment docs.

@GertjanBisschop
Copy link
Member Author

GertjanBisschop commented Mar 29, 2023

We probably need a more elegant solution for specifying the full_arg option using the enum.Flag. Currently I have specified FULL_ARG as one of the possible NodeTypes. It is probably better to use something like this:

class NodeType(enum.Flag):
    RECOMBINANT = 1 << 17
    COMMON_ANCESTOR = 1 << 18
    MIGRANT = 1 << 19
    GENE_CONVERSION = 1 << 21
    PASS_THROUGH = 1 << 22
    
    @classmethod
    def full_arg(cls, model=None):
        if model is None:
            return cls.RECOMBINANT | cls.COMMON_ANCESTOR | cls.MIGRANT | cls.GENE_CONVERSION
        elif model.lower() == "hudson":
            return cls.full_arg()
        else:
            raise ValueError

@benjeffery
Copy link
Member

The asdict you have looks ok. Do you want me to do a full review here? One thing that jumped is that I think store_unary should be deprecated with a warning and not removed - otherwise isn't this a breaking change for users?

@GertjanBisschop
Copy link
Member Author

GertjanBisschop commented Mar 29, 2023

I don't think store_unary has been part of an actual msprime release. So should be ok on that front.
Would be great if you could do a code review!

@benjeffery
Copy link
Member

benjeffery commented Mar 29, 2023

I don't think store_unary has been part of an actual msprime release. So should be ok on that front.

Ahh great - I missed that point.

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @GertjanBisschop. Some quick comments on how to be type safe with fixed size types, which we need for the flags, because they go over 16 bits. Technically, int types are only guaranteed to be 16 bits long (but will usually be 32).

Just annoying legacy C stuff.

lib/msprime.h Outdated
@@ -430,7 +432,8 @@ int msp_set_simulation_model_sweep_genic_selection(msp_t *self, double position,
int msp_set_start_time(msp_t *self, double start_time);
int msp_set_store_migrations(msp_t *self, bool store_migrations);
int msp_set_store_full_arg(msp_t *self, bool store_full_arg);
int msp_set_store_unary(msp_t *self, bool store_unary);
int msp_set_additional_nodes(msp_t *self, int additional_nodes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the additional nodes argument is a flags value, we want it to be a fixed size and with an unsigned type (there can be confusion with the sign bit, otherwise). So, change the type here to uint32_t.

@@ -1784,7 +1784,8 @@ Simulator_init(Simulator *self, PyObject *args, PyObject *kwds)
Py_ssize_t num_populations = 1;
int store_migrations = false;
int store_full_arg = false;
int store_unary = false;
int additional_nodes = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly, the Python C API doesn't support fixed width types like uint32_t. So, the simplest thing to do then is to use a type that's guaranteed to be at least this size, and to cast to uint32_t later. So,

unsigned long additional_nodes

We use k for unsigned long in the Python C API

@@ -1926,8 +1927,9 @@ Simulator_init(Simulator *self, PyObject *args, PyObject *kwds)
goto out;
}
}
msp_set_additional_nodes(self->sim, additional_nodes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here then we do

msp_set_additional_nodes(self->sim, (uint32_t) additional_nodes);

@jeromekelleher
Copy link
Member

LMK when this is ready for a final look @GertjanBisschop

@GertjanBisschop
Copy link
Member Author

Ready for a final look @jeromekelleher. I hope the idling lint tests will run again after a squash and rebase.

@benjeffery
Copy link
Member

The lint failing needs a PR to fix. I'll get to it.

@benjeffery
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Apr 12, 2023

rebase

✅ Branch has been successfully rebased

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Just need to remove the new parameters (all of them) from simulate, and update the CHANGELOG and we're done.

Can you follow up with some issues to track what needs to be done next, and also scan to see if there's any we should close now as well after merging this PR?

msprime/ancestry.py Show resolved Hide resolved
@jeromekelleher
Copy link
Member

Ready to go once CI is fixed - great work @GertjanBisschop!

@mergify mergify bot merged commit e9b2ebe into tskit-dev:main Apr 19, 2023
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants