API for 'finite_sites'? #880
Comments
Do we want the resulting TS to "know" in some way that it is a "FiniteSitesTreeSequence"? This was mooted in tskit-dev/tskit#146 (comment) with no particular resolution. |
I don't see any value in FiniteSitesTreeSequence, it's just complexity for no gain. If a reference sequence exists, then tskit should use it and if not it should continue to behave as it does. That's a separate issue to generating mutations/recombinations under a finite sites model. |
Sorry - I may have put that wrongly (or the comment in the issue I linked to is not strictly relevant to the title of the issue). I'm not thinking at all about reference sequences. The question is whether the TS should know, internally, whether it is a finite sites tree sequence. For example if you generate a TS with finite recombinations, and then call |
No, I don't think it should. |
Right, "finite-sites" recombination + infinitely-many sites mutation is actually key to Hudson's original formulation of the process, and also to various methods for simulating odd things like microsatellites. |
So we've gotten off the track here with the last few comments above which are actually about tskit. The question is, should msprime have an option |
On Wed, Feb 5, 2020 at 10:01 AM Jerome Kelleher ***@***.***> wrote:
So we've gotten off the track here with the last few comments above which
are actually about tskit.
The question is, should msprime have an option finite_sites=True which is
a short cut for specifying a finite sites recombination model *and* a
finite sites mutation model, or whether there should be a single
finite_sites_recombination option. Or, we can keep it as it is now where
finite sites recombination functionality is accessed via specifying the
recombination map.
I was trying to address this second point--the mutation model and the
recombination model have no necessary relationship.
|
True enough, but I'm just trying to make it easy to do what most people would want. By default, we have infinite sites both, and we can have either as we like (well, when we actually implement finite sites in mutations). It seems to me that most people, who are running simulations of (say) a human chromosome will want finite sites mutations. It's then pretty tedious writing |
Fine, but then I don't think that we should raise a warning or error when specifying On the original point, if we are using keyword-only arguments, could we have [1] Does anyone else find it confusing that the parameter name in |
I vote for the
I hear you and wouldn't be opposed to there being some way for the user to have a finite sites recombination and infinite sites mutation model together, but I suspect that both of these uses are a bit more niche, so I don't think it should be the default behaviour. |
I like this. We should make it as easy for users to simulate in discrete coordinates as in continuous ones. I propose that the option is called |
So we can't change the default behaviour @gtsambos, but you're right, we'll still support all combinations of discrete and continuous coordinates. I like the |
With #946 merged we should do this. |
NOTE: This commit changes simulate to make all arguments after the first keyword only. This is a good change to get in for 1.0. Closes tskit-dev#880
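For readers unfamiliar with keyword-only arguments, here is a minimal sketch of what that change means in Python. The function name simulate_sketch and its parameter list are hypothetical stand-ins chosen for illustration; this is not the actual msprime signature.

```python
def simulate_sketch(sample_size, *, Ne=1, length=None, recombination_rate=None):
    """Hypothetical stand-in for simulate: only sample_size may be positional.

    The bare * in the signature makes every later parameter keyword-only.
    """
    return {
        "sample_size": sample_size,
        "Ne": Ne,
        "length": length,
        "recombination_rate": recombination_rate,
    }


# Keyword arguments work as before.
params = simulate_sketch(10, Ne=1000, length=100)

# A second positional argument is rejected with a TypeError.
try:
    simulate_sketch(10, 1000)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

The practical effect is that new options can be added later without breaking callers who relied on argument order.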
Should we change this to |
Yeah, that's a good option too. |
Sounds good. I think I like |
OK, |
Following discussion in tskit-dev#880.
With #862 merged it's now possible to simulate recombination in discrete or continuous coordinates by specifying the discrete parameter to RecombinationMap. This means we can have a "finite sites" or "infinite sites" model of recombination (although strictly, we would have to check and enforce the infinite sites recombination thing). It's probably a good idea to have a high-level API for this in simulate so we don't force users to specify a RecombinationMap if they want to simulate in discrete coordinates.

One idea is to have a new finite_sites=True argument which would apply to both recombination and mutation. We haven't implemented the mutation bit yet, but we could just ValueError for now if finite_sites is true and mutation_rate > 0. It seems to me that users would probably want finite sites semantics in both cases, and that this would be a user friendly option in the long run.

pinging @tskit-dev/all for opinions on this one.
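A rough sketch of the proposed semantics, to make the discussion concrete. Everything here is an assumption for illustration: the helper resolve_finite_sites and the CoordinateModel record are hypothetical names, not part of msprime, and the real implementation may look quite different.

```python
from dataclasses import dataclass


@dataclass
class CoordinateModel:
    """Hypothetical record of which processes use discrete coordinates."""

    discrete_recombination: bool
    discrete_mutation: bool


def resolve_finite_sites(*, finite_sites=False, mutation_rate=0.0):
    """Hypothetical helper: apply the proposed finite_sites flag to both processes."""
    if finite_sites and mutation_rate > 0:
        # Finite sites mutation is not implemented yet, so fail loudly for now.
        raise ValueError(
            "finite_sites=True with mutation_rate > 0 is not supported yet"
        )
    return CoordinateModel(
        discrete_recombination=finite_sites,
        discrete_mutation=finite_sites,
    )


# Default: infinite sites (continuous coordinates) for both processes.
default_model = resolve_finite_sites()

# The shortcut switches both processes to discrete coordinates at once.
finite_model = resolve_finite_sites(finite_sites=True)
```

With this shape, users who want discrete coordinates would not need to build a RecombinationMap themselves, and the ValueError could be dropped once finite sites mutation exists.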