Update table sorting API to allow for callback #616
Comments
The way it works at the moment is that table_collection_sort is a thin wrapper for the table_sorter private "class". My suggestion would be that we keep the existing API for In terms of making hooks available, one thing we could do is have tsk_table_sorter_t sorter;
ret = tsk_table_sorter_init(&sorter, 0);
error_check(ret);
sorter.sort_edges = my_fancy_edge_sort_func;
ret = table_sorter_run(&sorter, tables, bookmark, TSK_NO_INTEGRITY_CHECK);
error_check(ret); It's a bit more long-winded, but seems quite flexible and extensible. Will this work with C++ ABI/calling convention issues @molpopgen? |
BTW, we'd change around the implementation of the |
Should. You can set the sorter to do nothing like I was thinking, and then run your own separately. The issue is that only a small subset of functional types in C++ convert to C function pointers without a lot of extra work to write trampolines. |
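To make the point about conversions concrete, here is a minimal sketch of which C++ callables convert to a plain C function pointer. The `c_callback` type and `call_through_c_api` function are hypothetical stand-ins, not part of tskit; the checks themselves rely only on standard C++ behaviour.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical C-style callback type, standing in for whatever a C API would accept.
using c_callback = int (*)(void *);

// Hypothetical C-side entry point: all it can hold is a raw function pointer.
inline int
call_through_c_api(c_callback cb, void *data)
{
    return cb(data);
}

inline int
demo_conversions()
{
    int offset = 41;

    // Non-capturing lambda: implicitly converts to a function pointer.
    auto stateless = [](void *) -> int { return 1; };
    // Capturing lambda: a closure object carrying state; no conversion exists.
    auto capturing = [offset](void *) -> int { return offset + 1; };

    static_assert(std::is_convertible<decltype(stateless), c_callback>::value,
        "non-capturing lambdas convert to function pointers");
    static_assert(!std::is_convertible<decltype(capturing), c_callback>::value,
        "capturing lambdas do not");
    static_assert(
        !std::is_convertible<std::function<int(void *)>, c_callback>::value,
        "std::function does not either");

    (void) capturing; // only inspected at compile time
    return call_through_c_api(stateless, nullptr);
}
```

Anything carrying state (a capturing lambda, a `std::function`, a bound member function) needs some extra layer before a C API can call it, which is the "trampoline" work mentioned above.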
It should be possible to wrap the C++ lambda fanciness with a simple function call though, right? So, you define your function int
edge_sort_func(tsk_table_sorter_t *self, tsk_size_t start, tsk_flags_t options)
{
// Define fancy C++ stuff in here!
return ret;
} That should all just compile down to something simple enough then? |
In order for that to work, your lambda expressions will need to do awful things like capture global variables. What is usually needed is the reverse of your example. The closure should have the signature of the simpler function, and itself call some more complex function that takes advantage of captured variables. |
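A minimal sketch of that "reverse" arrangement, with hypothetical names throughout (`mock_tables`, `sort_with_scratch`, and `make_sorter` are illustrations, not tskit API): the closure exposes the simple signature and forwards to a richer function using its captured state.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a table collection.
struct mock_tables
{
    std::vector<double> left;
};

// The "simple" signature that the caller of the callback knows about.
using simple_sort_fn = std::function<int(mock_tables &)>;

// The "complex" function: it needs extra state beyond the tables.
inline int
sort_with_scratch(mock_tables &tables, std::vector<double> &scratch, bool descending)
{
    scratch.assign(tables.left.begin(), tables.left.end());
    if (descending) {
        std::sort(scratch.begin(), scratch.end(), std::greater<double>());
    } else {
        std::sort(scratch.begin(), scratch.end());
    }
    tables.left = scratch;
    return 0;
}

// The closure captures the extra arguments and exposes the simple signature.
// The caller must keep `scratch` alive for as long as the closure is used.
inline simple_sort_fn
make_sorter(std::vector<double> &scratch, bool descending)
{
    return [&scratch, descending](mock_tables &tables) -> int {
        return sort_with_scratch(tables, scratch, descending);
    };
}
```

Because the returned closure captures state, it cannot be handed to a C API as a bare function pointer, which is where the difficulty discussed in this thread comes from.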
Oh, OK, I was thinking too much like Python then. Would you mind pasting in an outline of what this would look like ideally from the C++ client? |
Will do. I'll try to post an example that actually compiles and does
things, which means I'll make a mock of the setup, or at least some minimal
abstraction of it. It'll take a bit of doing.
…On Fri, May 15, 2020, 9:40 AM Jerome Kelleher ***@***.***> wrote:
Oh, OK, I was thinking too much like Python then. Would you mind pasting
in an outline of what this would look like ideally from the C++ client?
|
I think the |
The void * is normally provided to the callback, not by it. Often, it is
something about the internal state of the function calling the callback.
We need to do more than just satisfy slim here. We need it to work in
general, which is a bit harder.
…On Fri, May 15, 2020, 9:45 AM Ben Haller ***@***.***> wrote:
I think the void * tbd in @molpopgen <https://github.com/molpopgen>'s
original proposal would suffice for SLiM. From that we could recover the
simulation object, which would get us to the table collection, which would
let us do the sort in parallel ourselves. BTW, it would be good for the
comparator function to be public in tskit, if it isn't already, so that one
would not have to hard-code the details of that in the client. (Although
one might choose to do so anyway, for efficiency; but one shouldn't *have*
to.)
|
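The client-data idea from the quoted comment could be sketched roughly as follows. All names here (`mock_simulation`, `mock_run_sort`, and so on) are hypothetical stand-ins rather than tskit or SLiM API, and a serial `std::sort` stands in for whatever parallel sort a client would actually use.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for a table collection and a client simulation object.
struct mock_table_collection
{
    std::vector<int> edge_parent;
};

struct mock_simulation
{
    mock_table_collection tables;
    int sort_calls = 0;
};

using mock_sort_callback = int (*)(void *client_data);

// Client callback: recover the simulation from the opaque pointer,
// reach the tables through it, and sort them.
inline int
simulation_sort_callback(void *client_data)
{
    auto *sim = static_cast<mock_simulation *>(client_data);
    if (sim == nullptr) {
        return -1;
    }
    std::sort(sim->tables.edge_parent.begin(), sim->tables.edge_parent.end());
    ++sim->sort_calls;
    return 0;
}

// Mock of the library side: it forwards the pointer without interpreting it.
inline int
mock_run_sort(mock_sort_callback cb, void *client_data)
{
    return cb(client_data);
}
```

The library never needs to know the type behind the pointer; only the client casts it back.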
Oh, sorry, I misunderstood the purpose of the |
@bhaller -- if you want access to the pop object, then your callback becomes a closure, meaning that you bind (by one of the 3 or 4 possible mechanisms) your population, leaving a final signature of If it helps, I can write in the comments of the example how function composition works? It'll take some time to write, so may not happen this week. |
No need; that is not what I was suggesting. I would simply pass in a |
Let's get clarification from @jeromekelleher -- is the intent that the |
Assuming for the moment that I originally over-thought the problem, meaning @bhaller is correct and that the The case I was originally envisioning, with tskit sending some internal state object to the callback, is a lot more complex to handle. #include <vector>
#include <iostream>
#include <functional>
// Mock some tskit stuff
struct tsk_mock_table_collection
{
};
using tsk_mock_edge_sort_callback = int (*)(tsk_mock_table_collection *, void *);
struct tsk_mock_table_sorter
{
tsk_mock_edge_sort_callback cback;
};
int
tsk_mock_default_edge_sorting_fxn(tsk_mock_table_collection *tables, void *)
{
std::cout << "default callback\n";
return 0;
}
int
tsk_mock_table_sorter_init(tsk_mock_table_sorter *sorter)
{
sorter->cback = &tsk_mock_default_edge_sorting_fxn;
return 0;
}
int
tsk_mock_sort_tables(tsk_mock_table_sorter *sorter, tsk_mock_table_collection *tables,
void *client_data)
{
if (sorter->cback != nullptr)
{
// Propagate the callback's status code to the caller
return sorter->cback(tables, client_data);
}
else
{
std::cout << "not sorting the tables, so hope you did your homework!\n";
}
return 0;
}
// Mock some client stuff
int
vanilla_client_callback(tsk_mock_table_collection *, void *)
{
std::cout << "vanilla client callback\n";
return 0;
}
struct _edge
{
// left, right, parent, child,
// used for sorting
};
int
cpp_callback_1(tsk_mock_table_collection *, void *)
{
std::cout << "cpp callback 1\n";
std::vector<_edge> edges;
// copy from tables to edges
// sort via std::sort
// copy back
return 0;
}
int
cpp_callback_2(tsk_mock_table_collection *, void *data)
{
std::cout << "cpp callback 2\n";
auto edges = static_cast<std::vector<_edge> *>(data);
if (edges == nullptr)
{
return -1;
}
edges->clear(); // re-use those big allocations
// copy/sort/copy
return 0;
}
// Typedef for a C++ callback via std::function
using std_function_callback = std::function<int(tsk_mock_table_collection *, void *)>;
int
main(int argc, char **argv)
{
tsk_mock_table_sorter sorter;
tsk_mock_table_sorter_init(&sorter);
tsk_mock_table_collection tables;
tsk_mock_sort_tables(&sorter, &tables, nullptr);
// I promise that I know what I am doing
sorter.cback = nullptr;
tsk_mock_sort_tables(&sorter, &tables, nullptr);
// Now, explore more complex setups
sorter.cback = &vanilla_client_callback;
tsk_mock_sort_tables(&sorter, &tables, nullptr);
// A non-capturing lambda suffices
const auto lambda_callback_0 = [](tsk_mock_table_collection *, void *) -> int {
std::cout << "lambda callback 0\n";
return 0;
};
sorter.cback = lambda_callback_0;
tsk_mock_sort_tables(&sorter, &tables, nullptr);
const auto lambda_callback_1
= [](tsk_mock_table_collection *tables, void *tbd) -> int {
std::cout << "lambda callback 1 dispatches to ";
return cpp_callback_1(tables, tbd);
};
sorter.cback = lambda_callback_1;
tsk_mock_sort_tables(&sorter, &tables, nullptr);
sorter.cback = &cpp_callback_2;
std::vector<_edge> edges;
tsk_mock_sort_tables(&sorter, &tables, &edges);
} |
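The mock above declares `std_function_callback` but never uses it. One way it could be wired in, assuming the client-data pointer is free for the client to use, is a small trampoline. `std_function_trampoline` and `demo_trampoline` are hypothetical names; the mock types are re-declared so the sketch compiles on its own.

```cpp
#include <cassert>
#include <functional>

// Re-declared from the mock above so this block stands alone.
struct tsk_mock_table_collection
{
};
using tsk_mock_edge_sort_callback = int (*)(tsk_mock_table_collection *, void *);
using std_function_callback = std::function<int(tsk_mock_table_collection *, void *)>;

// Hypothetical trampoline: a plain function with the C-compatible signature.
// It treats client_data as a pointer to a std_function_callback and invokes it.
inline int
std_function_trampoline(tsk_mock_table_collection *tables, void *client_data)
{
    auto *fn = static_cast<std_function_callback *>(client_data);
    if (fn == nullptr || !*fn) {
        return -1;
    }
    return (*fn)(tables, nullptr);
}

inline int
demo_trampoline()
{
    int calls = 0;
    // A capturing lambda, which could not be assigned to cback directly.
    std_function_callback counting
        = [&calls](tsk_mock_table_collection *, void *) -> int {
              ++calls;
              return 0;
          };
    tsk_mock_edge_sort_callback cback = &std_function_trampoline;
    tsk_mock_table_collection tables;
    int ret = cback(&tables, &counting);
    return ret == 0 ? calls : -1;
}
```

The cost of this approach is that the client-data slot is taken up by the `std::function` itself, so any further state has to live in the closure's captures.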
I never did anything using the |
Looks good to me @molpopgen, I'll try to get a PR with a proposal for the tskit API soon. |
Sounds good. I think I was being completely thick originally and had forgotten everything about C callbacks. |
Sorting an edge table is quite expensive, and the current implementation can be out-performed by client code. The main motivation here is to speed up forward simulations.
It has been proposed to allow a callback to be passed to tsk_table_collection_sort that would look something like:
I'd like to propose adding the following callback as a "built in" in tskit:
The idea is that this callback does nothing, meaning that client code is assuming 100% responsibility here and accepting that the integrity checker will double-check their work. It turns out to be very fiddly to get complex callbacks in other languages to work when the calling conventions differ from C. (Read: my attempts to mock this up so far all segfault.) Thus, allowing the step to be skipped internally would be useful.
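As an illustration only, a do-nothing callback paired with an after-the-fact check might look like the sketch below. Every name is hypothetical and this is not the callback proposed above; a single sort key stands in for the real edge ordering requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified edge table: one sort key per edge.
struct mock_edge_table
{
    std::vector<double> parent_time;
};

using mock_edge_sort_callback = int (*)(mock_edge_table *, void *);

// Hypothetical "built in" that does nothing: the client promises
// that the edges are already sorted.
inline int
mock_skip_edge_sort(mock_edge_table *, void *)
{
    return 0;
}

// Mock integrity check: keys must be in nondecreasing order.
inline int
mock_check_edge_order(const mock_edge_table *edges)
{
    for (std::size_t j = 1; j < edges->parent_time.size(); j++) {
        if (edges->parent_time[j - 1] > edges->parent_time[j]) {
            return -1;
        }
    }
    return 0;
}

// Run the callback, then let the checker double-check the client's work.
inline int
mock_sort_then_check(mock_edge_table *edges, mock_edge_sort_callback cb, void *data)
{
    int ret = cb(edges, data);
    if (ret != 0) {
        return ret;
    }
    return mock_check_edge_order(edges);
}
```

With the no-op callback, correctly pre-sorted input passes and unsorted input is caught by the check rather than silently accepted.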