constify rest of public C API and tsk_tree_t types #940

castedo · 2020-10-27T01:09:23Z

Builds on top of #939 and final PR for #844.

Consitfies more or less everything else except haplotype_matching.h. Wasn't clear whether that header is used enough to be worth constifying. I've barely read it.

This the final PR for where a few final touches in trees.h cause a bunch of changes across many files.

AdminBot-tskit · 2020-10-27T01:10:59Z

📖 Docs for this PR can be previewed here

codecov · 2020-10-27T02:49:10Z

Codecov Report

Merging #940 into main will increase coverage by 0.00%.
The diff coverage is 97.29%.

@@           Coverage Diff           @@
##             main     #940   +/-   ##
=======================================
  Coverage   93.40%   93.40%           
=======================================
  Files          25       25           
  Lines       20357    20362    +5     
  Branches      825      825           
=======================================
+ Hits        19014    19019    +5     
  Misses       1306     1306           
  Partials       37       37

Impacted Files	Coverage Δ
c/tskit/genotypes.c	`94.66% <94.44%> (+0.06%)`	⬆️
c/tskit/convert.c	`92.04% <100.00%> (ø)`
c/tskit/haplotype_matching.c	`95.09% <100.00%> (ø)`
c/tskit/stats.c	`85.15% <100.00%> (ø)`
c/tskit/tables.c	`90.63% <100.00%> (ø)`
c/tskit/trees.c	`94.75% <100.00%> (ø)`
_tskitmodule.c	`90.21% <0.00%> (+<0.01%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bd5eb19...8aaf2b2. Read the comment docs.

castedo · 2020-11-02T21:50:42Z

Congrats on reaching the last of 4 commits to constify almost all of tskit! Here's some high level commentary.

Firstly, I have not tried to const literally everything in tskit. As mentioned I've completely skipped haplotype_matching.h/c. Most everything in all the other files has been constified. Everything in the "tree API" is now const correct. But surely there are some other smaller stuctures and functions which I have skipped. I was mainly working towards full constification of the tree API and was opportunistic about other code if it looked relatively easy.

So, more so than the other commits, this one touches more files even though it's less lines changed. Until now, everything I've changed I doubt actually changed what machine code is compiled in any non-trivial way. It's mostly all been 'compiler type sugar'.

But in this commit there are two exceptions.

a function in test_stats.c needed a minor logic change which I don't think adversely affects the test.
The tsk_vargen_t type in genotypes.h has a variable change and tsk_vargen_init() in genotypes.c is implemented differently.

The old code has bool sample_index_map_allocated to track whether tsk_id_t *sample_index_map points to the one inside the tree sequence or whether it's a copy of the alternative sample index map that is optionally passed to tsk_vargen_init. With const correct typing this is problematic because a pointer to the sample index map inside the tree sequence should be const. And trying to use tsk_safe_free with a const pointer and strict const warnings turning into errors opens a can of worms.

So I went with the solution of dropping the bool and have two separate pointers to a sample index map. One is a const pointer which always points to the sample index map to use and the other pointer is optional and is what needs to get freed if it is there. In the process of making this change, I went I ahead and split out some init logic into a separate function.

jeromekelleher

Looks great, thanks again for all the hard work here @castedo. One minor comment above.

c/tskit/genotypes.c

c/tskit/genotypes.h

castedo · 2020-11-03T12:48:43Z

All resolved. Good to merge.

jeromekelleher

LGTM, thanks @castedo!

- stats.{h,c} - newick tree converter - tsk_vargen_t - kc_distance and tsk_diff_iter

castedo · 2020-11-03T14:34:44Z

@jeromekelleher and @benjeffery, so now that this is finally all merged in. I'm curious what worked well or not so well as far as having lots of branches?

Definitely one take away is to only have ONE branch marked as non-draft when there is a chain of dependent branches/PR.

Did this feel like too many small merges? I was tilting towards lots of small PRs as an experiment.

With a chain of changes like this, was it worth it to have the "future" branches available as GitHub PRs rather than just topic branchs that can mentioned and then "manually" diffed?

One take way I'm thinking of following is that there is never more than one draft PR among the "future" branchs. So even though there might be a logical chain of 3+ branches, one is non-draft read-to-review, the next one is draft, and all further one exist as non-PR topic branches that are in-waiting and can be mentioned/linked to, but aren't yet github PRs.

benjeffery · 2020-11-03T23:57:35Z

I think it worked better than having one huge PR. Was managing the branches annoying? Multiple open PRs isn't an issue as long as they are ready to discuss.

jeromekelleher · 2020-11-04T09:24:58Z

From the maintainers perspective having a sequence of PRs is infinitely better than one giant one. I didn't really pay attention to how you were managing the sequencing @castedo, I just focused on the sequentially-ready ones. This worked well for me.

castedo · 2020-11-11T20:59:41Z

@benjeffery

Was managing the branches annoying?

I did not find managing multiple non-rebased "topic" branches annoying. However I did find attempting to have multiple rebased pr/... branches corresponding to each topic branch, all at the same time, was too much.

Another thing I discovered is that my git-prepr script does not gracefully handle mergebot doing a force push of a pr/... branch. I'm probably going to try and make a helper script on top of git-isograft to better handle that scenario.

So my take-away from this experiment is the following:

Assuming one wants to preserve git history for a topic branch (it's easier to not), manage the topic branch with this life-cycle:

create topic branch named, say, foobar (and don't bother immediately creating a pr/foobar branch)
once there is reason to discuss, create draft PR and branch pr/foobar that just follows foobar (and don't bother immediately rebasing-squashing-grafting pr/foobar)
once the PR is non-draft and next to be merged, then rebase-squash-graft pr/foobar, a la git-prepr
use TBD helper script to clean up after mergebot does a forced push of the pr/foobar branch

That last step 6) is needed so that all other topic branches depending on foobar get handled properly with git merge.

castedo force-pushed the pr/const_c_api branch 2 times, most recently from 610f23b to 64a6525 Compare October 27, 2020 01:20

castedo marked this pull request as draft October 27, 2020 15:41

castedo mentioned this pull request Oct 28, 2020

constify C stats api #939

Merged

castedo force-pushed the pr/const_c_api branch 2 times, most recently from 93db149 to d6e1f3a Compare November 2, 2020 21:47

castedo marked this pull request as ready for review November 2, 2020 21:50

castedo requested a review from jeromekelleher November 2, 2020 21:50

jeromekelleher reviewed Nov 3, 2020

View reviewed changes

c/tskit/genotypes.c Outdated Show resolved Hide resolved

c/tskit/genotypes.h Show resolved Hide resolved

castedo force-pushed the pr/const_c_api branch from d6e1f3a to c364619 Compare November 3, 2020 12:38

jeromekelleher approved these changes Nov 3, 2020

View reviewed changes

jeromekelleher added the AUTOMERGE-REQUESTED label Nov 3, 2020

constify rest of C API including types, plus

8aaf2b2

- stats.{h,c} - newick tree converter - tsk_vargen_t - kc_distance and tsk_diff_iter

AdminBot-tskit force-pushed the pr/const_c_api branch from c364619 to 8aaf2b2 Compare November 3, 2020 13:54

mergify bot merged commit 1c6ad9c into tskit-dev:main Nov 3, 2020

mergify bot removed the AUTOMERGE-REQUESTED label Nov 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

constify rest of public C API and tsk_tree_t types #940

constify rest of public C API and tsk_tree_t types #940

Uh oh!

castedo commented Oct 27, 2020

Uh oh!

AdminBot-tskit commented Oct 27, 2020

Uh oh!

codecov bot commented Oct 27, 2020 •

edited

Loading

Uh oh!

castedo commented Nov 2, 2020

Uh oh!

jeromekelleher left a comment

Uh oh!

Uh oh!

Uh oh!

castedo commented Nov 3, 2020

Uh oh!

jeromekelleher left a comment

Uh oh!

castedo commented Nov 3, 2020 •

edited

Loading

Uh oh!

benjeffery commented Nov 3, 2020

Uh oh!

jeromekelleher commented Nov 4, 2020

Uh oh!

castedo commented Nov 11, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

constify rest of public C API and tsk_tree_t types #940

constify rest of public C API and tsk_tree_t types #940

Uh oh!

Conversation

castedo commented Oct 27, 2020

Uh oh!

AdminBot-tskit commented Oct 27, 2020

Uh oh!

codecov bot commented Oct 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

castedo commented Nov 2, 2020

Uh oh!

jeromekelleher left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

castedo commented Nov 3, 2020

Uh oh!

jeromekelleher left a comment

Choose a reason for hiding this comment

Uh oh!

castedo commented Nov 3, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benjeffery commented Nov 3, 2020

Uh oh!

jeromekelleher commented Nov 4, 2020

Uh oh!

castedo commented Nov 11, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov bot commented Oct 27, 2020 •

edited

Loading

castedo commented Nov 3, 2020 •

edited

Loading