fix: improving flooding #270

ErikBjare · 2022-03-20T09:37:24Z

Changes the behavior of aw-server-rust flooding to work the same as flooding in aw-core.

Part of the issue is described here: ActivityWatch/activitywatch#602 (comment)

Better tests
~~Forward flooding instead of meet-in-middle (aw-core uses flood-from-longest though)~~
- Tbh I'm not sure what is best, but forward flooding feels like it introduces some duration-bias towards later events, which seems unfair.
- Flood-from-longest will lead to the lowest relative error, I guess, and they have some sort of higher probability to occur anyway.

TODO

More tests (goal: ~100% coverage for flood.rs)

ErikBjare · 2022-03-20T09:39:38Z

aw-transform/src/flood.rs

-                // Extend e1 to the middle between e1 and e2
-                e1.duration = e1.duration + (gap / 2);


This was a major issue, as if the gap is negative that could lead to e1.duration getting shorter.

I guess that's true, but if you have negative gaps the data is already unreliable as it is. So there's not much of an improvement, since the raw data is already "bad".

It's intended to fix this: ActivityWatch/activitywatch#602

Which I consider a significant improvement, as now it should handle it as well as aw-server-python.

aw-transform/src/flood.rs

codecov · 2022-03-20T09:41:27Z

Codecov Report

Merging #270 (bb96465) into master (6c62a7b) will increase coverage by 0.03%.
The diff coverage is 52.63%.

@@            Coverage Diff             @@
##           master     #270      +/-   ##
==========================================
+ Coverage   50.56%   50.60%   +0.03%     
==========================================
  Files          45       45              
  Lines        6259     6280      +21     
==========================================
+ Hits         3165     3178      +13     
- Misses       3094     3102       +8

Impacted Files	Coverage Δ
aw-transform/src/flood.rs	`60.25% <52.63%> (-8.17%)`	⬇️
aw-models/src/duration.rs	`25.61% <0.00%> (+0.82%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c62a7b...bb96465. Read the comment docs.

ErikBjare · 2022-03-21T11:12:46Z

Hoping coverage will improve with that last commit...

ErikBjare · 2022-03-21T12:12:26Z

Can't figure out why coverage is how it is (more should be covered than is reported? or am I just dumb?), and unable to generate reports locally...

@johan-bjareholt could you take a look?

ErikBjare · 2022-03-21T12:13:10Z

aw-transform/src/flood.rs

+            debug_assert!(gap >= chrono::Duration::seconds(0));
+
+            // If data is the same, we should merge them.
+            if e1.data == e2.data {
                // Choose the longest event and set the endtime to it


Suggested change

// Choose the longest event and set the endtime to it

// Set the endtime of e1 to whichever endtime is last of [e1, e2], drop e2.

Or am I reading the diff wrong?

That's correct, but it's an very unlikely scenario that e2 starts before e1. You would have to have a very crappy watcher to accomplish that.
So good to fix, but probably won't make much (any?) difference.

very unlikely scenario

Unfortunately not too unlikely. Not sure if they appear due to some faulty queuing, heartbeat logic, or race condition, but they pop up every now and then...

johan-bjareholt · 2022-03-21T19:33:00Z

@ErikBjare I don't know why, but the coverage has never been very reliable. You could pretty easily run it with gdb and step through if you really want to be sure for a specific test.

ErikBjare · 2022-03-22T09:10:28Z

@johan-bjareholt My best theory is that Rust somehow optimizes away a codepath (it notably shows lines that are later duplicated in a different if-clause as "oncovered") and that leads to it now showing in the coverage?

Anyway, would be really nice to have coverage working on the long run. Spent a couple hours trying to get it working better yesterday, but not much luck.

Also, you should check out ActivityWatch/activitywatch#724, might provide better background for some of the things I'm trying to make more robust.

ErikBjare · 2022-03-22T09:33:16Z

According to https://shift.click/blog/github-actions-rust/#coverage-with-cargo-tarpaulin, cargo-tarpaulin should be a lot more reliable.

Not LLVM-based, nor supports branch coverage, but might still be better than the status quo... (where entire tests seem to not show up in the coverage)

Tried it out, here the output:

Mar 22 10:33:17.663  INFO cargo_tarpaulin::report: Coverage Results:
|| Tested/Total Lines:
|| aw-client-rust/src/lib.rs: 54/61
|| aw-client-rust/tests/test.rs: 52/53
|| aw-datastore/src/datastore.rs: 314/396
|| aw-datastore/src/legacy_import.rs: 0/95
|| aw-datastore/src/worker.rs: 191/235
|| aw-datastore/tests/datastore.rs: 275/279
|| aw-models/src/bucket.rs: 9/9
|| aw-models/src/duration.rs: 4/4
|| aw-models/src/event.rs: 18/20
|| aw-models/src/key_value.rs: 4/4
|| aw-models/src/timeinterval.rs: 47/56
|| aw-models/src/tryvec.rs: 59/72
|| aw-query/src/datatype.rs: 119/154
|| aw-query/src/functions.rs: 247/287
|| aw-query/src/interpret.rs: 110/135
|| aw-query/src/lexer.rs: 50/54
|| aw-query/src/lib.rs: 7/10
|| aw-query/src/parser.rs: 127/136
|| aw-query/tests/query.rs: 255/286
|| aw-server/src/config.rs: 24/53
|| aw-server/src/device_id.rs: 0/8
|| aw-server/src/dirs.rs: 26/31
|| aw-server/src/endpoints/bucket.rs: 71/98
|| aw-server/src/endpoints/cors.rs: 16/16
|| aw-server/src/endpoints/export.rs: 12/14
|| aw-server/src/endpoints/hostcheck.rs: 65/68
|| aw-server/src/endpoints/import.rs: 15/15
|| aw-server/src/endpoints/mod.rs: 58/64
|| aw-server/src/endpoints/query.rs: 15/15
|| aw-server/src/endpoints/settings.rs: 37/40
|| aw-server/src/endpoints/util.rs: 21/31
|| aw-server/src/logging.rs: 0/29
|| aw-server/src/main.rs: 0/59
|| aw-server/tests/api.rs: 317/321
|| aw-server/tests/macros.rs: 4/4
|| aw-sync/src/main.rs: 0/9
|| aw-sync/src/sync.rs: 57/151
|| aw-sync/tests/sync.rs: 73/76
|| aw-transform/src/chunk.rs: 27/27
|| aw-transform/src/classify.rs: 103/104
|| aw-transform/src/filter_keyvals.rs: 61/61
|| aw-transform/src/filter_period.rs: 42/42
|| aw-transform/src/find_bucket.rs: 27/28
|| aw-transform/src/flood.rs: 157/157
|| aw-transform/src/heartbeat.rs: 62/66
|| aw-transform/src/merge.rs: 51/52
|| aw-transform/src/period_union.rs: 42/42
|| aw-transform/src/sort.rs: 24/24
|| aw-transform/src/split_url.rs: 31/33
||
82.76% coverage, 3380/4084 lines covered

82.7% is a lot better than the ~50% reported by kcov...

ErikBjare · 2022-03-22T09:37:19Z

Anyway, merging this. Might open another PR to switch to tarpaulin.

ErikBjare commented Mar 20, 2022

View reviewed changes

aw-transform/src/flood.rs Show resolved Hide resolved

ErikBjare commented Mar 20, 2022

View reviewed changes

aw-transform/src/flood.rs Outdated Show resolved Hide resolved

fix: WIP fixing flooding

93f99a8

ErikBjare force-pushed the dev/fixing-flooding branch from c4d2cc7 to 93f99a8 Compare March 20, 2022 09:46

fix: further improved flooding logic and tests

682fafc

test: added flooding test

bb96465

ErikBjare requested a review from johan-bjareholt March 21, 2022 12:12

ErikBjare commented Mar 21, 2022

View reviewed changes

ErikBjare changed the title ~~fix: WIP fixing flooding~~ fix: improving flooding Mar 22, 2022

ErikBjare merged commit a8b8a20 into master Mar 22, 2022

ErikBjare deleted the dev/fixing-flooding branch March 22, 2022 09:38

ErikBjare mentioned this pull request Mar 22, 2022

ci: switched to tarpaulin for code coverage #271

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: improving flooding #270

fix: improving flooding #270

ErikBjare commented Mar 20, 2022 •

edited

ErikBjare Mar 20, 2022 •

edited

johan-bjareholt Mar 21, 2022

ErikBjare Mar 22, 2022

codecov bot commented Mar 20, 2022 •

edited

ErikBjare commented Mar 21, 2022

ErikBjare commented Mar 21, 2022

ErikBjare Mar 21, 2022 •

edited

johan-bjareholt Mar 21, 2022

ErikBjare Mar 22, 2022

johan-bjareholt commented Mar 21, 2022

ErikBjare commented Mar 22, 2022

ErikBjare commented Mar 22, 2022 •

edited

ErikBjare commented Mar 22, 2022

		// Extend e1 to the middle between e1 and e2
		e1.duration = e1.duration + (gap / 2);

	// Choose the longest event and set the endtime to it
	// Set the endtime of e1 to whichever endtime is last of [e1, e2], drop e2.

fix: improving flooding #270

fix: improving flooding #270

Conversation

ErikBjare commented Mar 20, 2022 • edited

TODO

ErikBjare Mar 20, 2022 • edited

Choose a reason for hiding this comment

johan-bjareholt Mar 21, 2022

Choose a reason for hiding this comment

ErikBjare Mar 22, 2022

Choose a reason for hiding this comment

codecov bot commented Mar 20, 2022 • edited

Codecov Report

ErikBjare commented Mar 21, 2022

ErikBjare commented Mar 21, 2022

ErikBjare Mar 21, 2022 • edited

Choose a reason for hiding this comment

johan-bjareholt Mar 21, 2022

Choose a reason for hiding this comment

ErikBjare Mar 22, 2022

Choose a reason for hiding this comment

johan-bjareholt commented Mar 21, 2022

ErikBjare commented Mar 22, 2022

ErikBjare commented Mar 22, 2022 • edited

ErikBjare commented Mar 22, 2022

ErikBjare commented Mar 20, 2022 •

edited

ErikBjare Mar 20, 2022 •

edited

codecov bot commented Mar 20, 2022 •

edited

ErikBjare Mar 21, 2022 •

edited

ErikBjare commented Mar 22, 2022 •

edited