SPIKE: 2019 01 04 codecov returns #813

thedavidmeister · 2019-01-04T00:33:37Z

progress report/squirl 2019-01-07

deeper rabbit hole than expected, originally was thinking it would be a simple matter of dependency hell then pinning some versions

something that has changed recently, either in tarpaulin or in our code, is changing the way that state setup/teardown is treated

current rust nightly is 2018-12-26
only version of tarpaulin that supports this nightly is 0.6.11

originally we had a problem due to nightly features and transitive dependencies, which is why we disabled tarpaulin: xd009642/tarpaulin#179

the transitive deps seem to be resolved

3 new issues appeared:

instrumentation clashes reported by the compiler
segfaults
bad context state access

instrumentation clashes

this appears to be caused by incremental compilation

RN the only solution is to disable incremental compilation, or accept the clashes

apparently the clashes aren't fatal, they indicate potentially inaccurate measurements due to LOC mappings being off, no idea how major/minor the problem might be (didn't get that far tbh)

the instrumentation clash warnings may not be errors as such, a multiple lines of code may map to the same addresses and that just warns on occurrences of that which may affect the accuracy of results

xd009642/tarpaulin#190 (comment)

segfaults

dunno, seeking help upstream xd009642/tarpaulin#190

I hope it has nothing to do with zmq pytorch/pytorch#2507

definitely never seen segfaults in anything else i've done with rust, but here's a guide that looks helpful to debug https://jvns.ca/blog/2017/12/23/segfault-debugging/

bad context state access

this is the bit i think we need to clean up first

actually i'm not sure exactly why our regular cargo test runs aren't failing

what i see is that we initialise a lot of "live" things (loops n' channels n' globals n' shit) in our tests into a "sort of" initialised state, then throw logic at them that assumes something fully initialised - these tests should fail, tarpaulin is doing the correct thing here i believe

some examples of what i mean:

Context::new creates None action channels, then reducers unwrap() action channels
the "mock" network has live ticks that can panic if the context state isn't ready for network logic
the action channels also have live ticks that can panic if the context state isn't ready for actions

this becomes extra gnarly to debug when we throw concurrency and inability to cleanly tear down our own state into the mix (i guess cargo test is forcefully dropping our state for us "fast enough" to prevent it blowing up, while tarpaulin keeps event loops "long enough" to panic 🤷‍♂️ )

on circle, and locally when running hc-tarpaulin i can setup a live ticking network that allows the test it was intended for to pass, then see that same ticking network blow up a None action loop 5 tests later because the context wasn't torn down properly in the original test (or is there some global state being reused here? 🤔 ) - i tested this by creating a mock network with no-ops to the test that died on a network tick

these are pretty common problems and we identified them a while ago, but now it's blocking us for whatever reason

proposal

i'm hoping none of this is controversial... i see this as a relatively straightforward tech debt cleanup thing

real question is when/how to prioritise it considering it seems to be a requirement for getting codecov back up and running (unless we get kcov working, that is also a potential alternative but a lot more complex from an ops perspective because it relies on kcov being installed locally correctly)

split everything cleanly into unit tests and integration/scenario tests
- unit tests are stateless and make up 90% of our coverage
  - fast, reliable, focused, deterministic, etc.
  - is what TDD is talking about
  - use real mocks/fakes/stubs or whatever we want to call it https://en.wikipedia.org/wiki/Mock_object
  - deterministic, stateless, presents only the interface to what we are trying to test (no internals)
  - simplest example is implementing traits (e.g. NetWorker) where every method is a no-op, so we can satisfy the compiler when we're not actively testing that trait (e.g. not testing the network)
  - a mock context would be great, simulating the whole state but always returning known test data but every method is a no-op or at least free of side effects
- integration/scenario tests may have state but must cleanly teardown any/all state and be minority of tests overall
  - slow, stateful, complex, high maintenance overhead over time
  - "shotgun" approach to testing, often overestimates codecov etc. simply by booting (often test frameworks allow integration tests to be ignored for codecov calculations)
  - harder to make deterministic
  - often need to block/wait on a lot of moving parts to progress to the next step
  - a little goes a long way https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html
  - clean state teardown is surprisingly not easy with low level rust primitives only (e.g. Arc) because you can't go "the other way" to trigger a teardown for everything holding a reference
  - should use fully initialised state to avoid false positive/negatives caused by simulating impossible states while missing real-world behaviour (e.g. don't have a live network ticking over without an action channel set 😅 ), we shouldn't write tests against things we know can't happen and should be writing tests for things we know will happen
  - no shared/global state for anything that can be setup/torn down between tests!
gracefully handle the case where optional state is None without panicking
- e.g. the unwrap at the top of get_entry_from_dht is pointing to an action channel that defaults to None
- pretty much any unwrap in the codebase that is pointing to a stateful Option should be treated with suspicion at this point
- this goes the other way too, if we're unwrap()ing something, is it really optional?
deterministic tests
- if the tests were deterministic, changing the testing harness would not break the test logic

prior art

unit test reducers zome tests for reducers should call the reducer directly instead of using a live global instance reduce #199
non-determinism in scenario tests flaky app spec test for networking #688
shutdown containers properly WIP: Proper shutdown of network and instances when stopping Container #803
other work @maackle is doing RN? links?
isolating the action loop logic better isolate side effects of instance action loop so they can be unit tested #169
signals/waiting/determinism at scenario test level
- Add waitFor util function for scenario tests #703
- Tracing (Internal) Signals #732
using type system better for Context setup/config Clean up interface for creating Container; leverage the type system #739

codecov · 2019-01-04T04:25:17Z

Codecov Report

❗ No coverage uploaded for pull request base (develop@0887587). Click here to learn what that means.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             develop     #813   +/-   ##
==========================================
  Coverage           ?   64.05%           
==========================================
  Files              ?      145           
  Lines              ?     5003           
  Branches           ?        0           
==========================================
  Hits               ?     3204           
  Misses             ?     1799           
  Partials           ?        0

Impacted Files	Coverage Δ
net_connection/src/protocol.rs	`61.91% <ø> (ø)`
core/src/network/reducers/get_entry.rs	`69.57% <ø> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0887587...2c5126a. Read the comment docs.

codecov · 2019-01-04T04:25:17Z

Codecov Report

❗ No coverage uploaded for pull request base (develop@cd80cff). Click here to learn what that means.
The diff coverage is 34.62%.

@@            Coverage Diff             @@
##             develop     #813   +/-   ##
==========================================
  Coverage           ?   67.56%           
==========================================
  Files              ?      154           
  Lines              ?     5490           
  Branches           ?        0           
==========================================
  Hits               ?     3709           
  Misses             ?     1781           
  Partials           ?        0

Impacted Files	Coverage Δ
cmd/src/cli/test_context.rs	`0% <ø> (ø)`
container_api/src/context_builder.rs	`92.86% <ø> (ø)`
core/src/network/reducers/get_entry.rs	`65.39% <ø> (ø)`
net/src/mock_worker.rs	`77.64% <ø> (ø)`
core/src/nucleus/ribosome/callback/genesis.rs	`100% <ø> (ø)`
core/src/dht/actions/add_link.rs	`76.93% <ø> (ø)`
core/src/network/reducers/send_direct_message.rs	`60% <ø> (ø)`
core/src/network/reducers/get_links.rs	`72% <ø> (ø)`
core/src/network/reducers/publish.rs	`47.68% <ø> (ø)`
net/src/p2p_config.rs	`27.59% <0%> (ø)`
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd80cff...115271e. Read the comment docs.

…ochain-rust into 2019-01-04-codecov-returns

thedavidmeister · 2019-01-06T11:37:07Z

seeing this local:

.......................................thread '<unnamed>' panicked at 'Action channel not initialized', src/libcore/option.rs:1008:5
free(): invalid pointer
Error a segfault occured when executing test

first hit on google is something about zmq, hope that's a red herring -_-

pytorch/pytorch#2507

maackle · 2019-01-07T19:32:51Z

I have some more general thoughts I'd like to share, but I'll do that in a separate comment after this.

To reply to the immediate concerns here, I do feel it's very important to be able to shut things down completely, especially for testing. We're building a creature that is supposed to easily glom onto other similar creatures and start talking to them as soon as they're visible over a resilient, unenclosable network. That's exactly what we don't want for isolated tests! It seems we may already be running into some issues of lingering networks bleeding into each other, and it's only going to get worse.

The two ways that tests can interfere with each other are across time (the next test starts before the last one's network stopped) and across threads. To fully remedy that interference, we need

a way to shut down everything (partial work done in WIP: Proper shutdown of network and instances when stopping Container #803)
a way to make networks reliably independent per test case
- this is partially handled by the practice of jostling the DNA UUID per test, but we currently run all mock networks on the same MockSingleton so that's a further way for them to be coupled. Add "network name" as mock network config #820 allows for separate mock networks but is not yet merged. It's not the best because it still depends on the user to make the networks unique by choosing a unique string, but at least this makes it possible.

maackle · 2019-01-07T19:54:38Z

As promised, here are a lot more thoughts. TL;DR, this is all stuff that I think will help us prevent more of the kinds of problems in @thedavidmeister's report, but most of it is not immediately actionable in that we could use it to fix these tests in the next few weeks.

clean state teardown is surprisingly not easy with low level rust primitives only (e.g. Arc) because you can't go "the other way" to trigger a teardown for everything holding a reference

It's true, my PR for shutdown behavior was an attempt to shut things down "pretty well", but it's hard to be confident that it's a 100% clean shutdown as soon as we introduce Cloneable references like Arc and (indirectly in networking) Sender. Sometimes these are unavoidable but sometimes they are. In any case we can't be compiler-sure of shutdown unless we can actually deallocate and drop an entity, which is not possible for these cloneable references. The system is not super duper complex yet so someone who knows the architecture in and out could probably write some solid shutdown functionality (without the help of the compiler), but I'm not that person yet.

As things get more complex that shutdown ability might get hairy to maintain. @thedavidmeister and I talked about writing lifecycle management tools eventually. As soon as things get hairy (is that now?) we might consider that.

pretty much any unwrap in the codebase that is pointing to a stateful Option should be treated with suspicion at this point

I actually secretly spent an evening last month trying to see if I could refactor Context to remove state and manually inject the state in places where it's actually needed. It became clear that it's a huge undertaking; the state Arc in Context is deeply rooted. My takeaway was that State and Context are circularly dependent on each other, and that State doesn't really belong in Context.

Everything I've said so far points to a strong lesson I've learned from working with Rust and other languages: when working with a strong type system, trust the type system! If something turns out to be awkward to express, or feels downright wrong (many Option::unwraps), it's easy to blame it on the language and find a way to work around it more conveniently or comfortably. But really those difficulties are a blessing, they give us a chance to rethink the design to make it less stateful and prone to runtime inconsistencies. Rust has been hard to learn, it is a shame if we don't fully leverage its power. It slows us down, but in a good way (unless we don't heed its advice, and then it's just plain slow for no reason).

For instance, any time we have a type that contains an Option that's not really optional, we introduce a possible invalid state. One simple way we can avoid that is by splitting the type into two separate types, for instance Context and ContextWithState, so that we are forced use the one we need, and then all other code will have to reflect those two possible meanings of Context, which is more explicit and verbose, but also perfectly illustrates exactly what's going on with no question. Sometimes by restructuring things it falls out even more cleanly than that. If you look at paritytech's jsonrpc code, they make extensive use of the type system, doing stuff I can barely understand, to make strong compiler guarantees.

With well-placed use of types we can make invalid states uncompilable and hence impossible. That's so much more powerful than just being careful that we always initialize things in the right order. The closer we can get to this ideal the more resilient the code will be, the less we'll have to rely on integration tests, and the more we'll be able to onboard new people more quickly and collaborate in general.

I know a lot of code was written and established while still learning Rust, when it was hard to even get anything to compile. Maybe y'all have learned similar lessons already. I just wanted to say it explicitly even if we all know it now.

This has been a bit of a rant about something seemingly tangential, but I say it to back up @thedavidmeister's point that we could stand to be more disciplined and rigorous in testing. If we're also more rigorous in our use of the type system, we can greatly reduce the number of moving parts and our tests will cover more of the actual space of possible outcomes. If we do it to the utmost, we won't have to worry about whether something is partially or fully or incorrectly initialized; if the object exists then it is good to go. Then we can focus more on unit tests and all the logic the compiler can't check.

real question is when/how to prioritise it considering it seems to be a requirement for getting codecov back up and running

Yes, this is the real question. Most of what I suggested is more long-term and not things we can change in the code right now. I think in the short term, I think deterministic tests are high priority, and to get those we need clean shutdown and truly independent mock networks. As far as the correctness of state initialization, we just have to be careful, and moving forward, I'd like to see more of that carefulness offloaded to the type system.

thedavidmeister · 2019-02-08T21:31:56Z

FYI all following this thread

there is some progress on the upstream segfaults:

xd009642/tarpaulin#190 (comment)

minimal repro identified
happens when there is more than one test with threads
probable cause identified through debug output

thedavidmeister · 2019-05-15T19:24:02Z

PR to just deal with segfaults and instrumentation issues in tarpaulin at #1405

thedavidmeister added 2 commits January 4, 2019 11:29

wip on tarpaulin support

68c57f9

try codecov on circle

a72cf36

thedavidmeister added the review label Jan 4, 2019

thedavidmeister added 12 commits January 4, 2019 11:34

run code-coverage job on circle

bcde228

build wasm in hc-codecov

a2311a4

use nix properly in codecov on circle

0190862

rollback date change in nix

b89d153

skip broken test in tarpaulin

157963b

debug tarpaulin on circle

8a42ff1

debug tarpaulin circle

bd0ae5a

debug tarpaulin circle

37f2609

debug tarpaulin

c819d3f

upload codecov even if fails

64088e3

debug codecov

075a4f4

tweak broken tests for tarpaulin

2c5126a

thedavidmeister added 5 commits January 4, 2019 15:30

lower codecov to 64%

dccd122

include codecov in the warm tools for docker

6e5ce82

install tarpaulin only in tools warming for docker

3427a7f

do a tarpaulin run in the nightly docker build

8e0cf8a

lint

1dc9d20

thedavidmeister changed the title ~~WIP: 2019 01 04 codecov returns~~ 2019 01 04 codecov returns Jan 4, 2019

thedavidmeister added 7 commits January 4, 2019 16:52

tweak codecov target

18125bf

don't incrementally compile codecov

fadc655

Merge branch 'develop' into 2019-01-04-codecov-returns

5316763

tweaking tarpaulin args

096e6b4

Merge branch '2019-01-04-codecov-returns' of github.com:holochain/hol…

8af299e

…ochain-rust into 2019-01-04-codecov-returns

remove reduce_get_entry_timeout_test as broken

e6c5a00

more broken tests in tarpaulin

d291799

thedavidmeister added 18 commits January 6, 2019 18:20

Update commit.rs

3498f33

Update debug.rs

a5d339c

Update get_entry.rs

2194d71

Update get_links.rs

ebc3bc9

Update link_entries.rs

7104e0b

Update genesis.rs

cb53f52

Update receive.rs

fa8cb24

allow dockers to build

9c46739

get_entry_roundtrip is broken (timeout)

06b9722

lint

34a9689

WIP on mock mock network

7f43d34

use mock mock network in test context

cc859ff

debug tarp

da625fd

debug

bb75696

lint

c8318d2

debug

819d5bc

fix reduce_publish_test

45a874d

stop running dockers on tarp branch

f845801

thedavidmeister changed the title ~~WIP: 2019 01 04 codecov returns~~ SPIKE: 2019 01 04 codecov returns Jan 7, 2019

thedavidmeister and others added 2 commits January 7, 2019 17:39

debug test concurrency issues

c909e30

Merge branch 'develop' into 2019-01-04-codecov-returns

5f52abe

rustfmt

115271e

zippy added the upstream-action-needed upstream action need label May 6, 2019

thedavidmeister closed this May 15, 2019

zippy deleted the 2019-01-04-codecov-returns branch July 5, 2019 13:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SPIKE: 2019 01 04 codecov returns #813

SPIKE: 2019 01 04 codecov returns #813

thedavidmeister commented Jan 4, 2019 •

edited

Loading

codecov bot commented Jan 4, 2019

codecov bot commented Jan 4, 2019 •

edited

Loading

thedavidmeister commented Jan 6, 2019

maackle commented Jan 7, 2019

maackle commented Jan 7, 2019 •

edited

Loading

thedavidmeister commented Feb 8, 2019

thedavidmeister commented May 15, 2019

SPIKE: 2019 01 04 codecov returns #813

SPIKE: 2019 01 04 codecov returns #813

Conversation

thedavidmeister commented Jan 4, 2019 • edited Loading

progress report/squirl 2019-01-07

instrumentation clashes

segfaults

bad context state access

proposal

prior art

codecov bot commented Jan 4, 2019

Codecov Report

codecov bot commented Jan 4, 2019 • edited Loading

Codecov Report

thedavidmeister commented Jan 6, 2019

maackle commented Jan 7, 2019

maackle commented Jan 7, 2019 • edited Loading

thedavidmeister commented Feb 8, 2019

thedavidmeister commented May 15, 2019

thedavidmeister commented Jan 4, 2019 •

edited

Loading

codecov bot commented Jan 4, 2019 •

edited

Loading

maackle commented Jan 7, 2019 •

edited

Loading