Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracking of data received from remote peers #345

Merged
merged 12 commits into from
Sep 19, 2023
Merged

Conversation

hannahhoward
Copy link
Collaborator

Goals

Provide more insight into how much data we're downloading during retrievals from each peer

Implementation

Uses ipfs/boxo#308 to which peers send blocks for Bitswap

  • create a DataReceivedEvent
  • dispatch DataReceivedEvent in all protocols for each time more data arrives
  • in aggregate event recorder, add data received events for each peer and collect into a total data transferred value for each retrieval attempt
  • for Bitswap, I am not attempting to do per peer first byte recording, cause it just doesn't make a lot of sense. This does mean the retrieval attempts may look a bit weird cause there will be a 'Bitswap' attempt and then attempts for each peer. We may want to drop the Bitswap one, though it's where the over Bitswap error gets recorded.
  • update tests

@codecov-commenter
Copy link

codecov-commenter commented Jul 4, 2023

Codecov Report

Merging #345 (7e37b00) into main (70fc1d1) will increase coverage by 0.52%.
The diff coverage is 96.34%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #345      +/-   ##
==========================================
+ Coverage   76.06%   76.58%   +0.52%     
==========================================
  Files          84       85       +1     
  Lines        6320     6377      +57     
==========================================
+ Hits         4807     4884      +77     
+ Misses       1256     1240      -16     
+ Partials      257      253       -4     
Files Changed Coverage Δ
pkg/types/types.go 81.08% <ø> (+2.70%) ⬆️
pkg/retriever/bitswapretriever.go 95.97% <88.46%> (+0.10%) ⬆️
...g/aggregateeventrecorder/aggregateeventrecorder.go 90.62% <100.00%> (+0.85%) ⬆️
pkg/events/blockreceived.go 100.00% <100.00%> (ø)
pkg/internal/testutil/verifier.go 96.49% <100.00%> (+1.36%) ⬆️
pkg/retriever/graphsyncretriever.go 86.33% <100.00%> (+0.52%) ⬆️
pkg/retriever/httpretriever.go 86.82% <100.00%> (+0.10%) ⬆️
pkg/server/http/ipfs.go 90.59% <100.00%> (+1.28%) ⬆️
pkg/server/http/servertimingssubscriber.go 100.00% <100.00%> (ø)

... and 9 files with indirect coverage changes

@rvagg
Copy link
Member

rvagg commented Jul 5, 2023

Nice, looks good so far. What's left to do?

I do wonder whether "DataReceived" should be "BlockReceived", make it block-focused all round rather than pretending that there's some other unit of "data" that could be received. Aside from my comments in verifiedcar.go about this, I'm thinking that it would be good to wire this up to the CLI because currently we print out a summary with "Blocks" and "Bytes" but we get the "Blocks" count from the CarWriter Put() count, which includes duplicates, but the "Bytes" from the RetrievalStats, so there's a discrepancy that we don't explain, which is especially clear when you get a CAR with 3 blocks but it says 8 were received. We could listen to the "DataReceived" event and accumulate both bytes and block counts from there and then do some nicer, more verbose reporting, but it'd be a bit funny listening for "DataReceived" and incrementing a "block" counter—except for the fact that we happen to know there's a 1:1 relationship between those events and blocks.

@kylehuntsman
Copy link
Contributor

kylehuntsman commented Jul 7, 2023

I do wonder whether "DataReceived" should be "BlockReceived", make it block-focused all round rather than pretending that there's some other unit of "data" that could be received.
...
We could listen to the "DataReceived" event and accumulate both bytes and block counts from there and then do some nicer, more verbose reporting, but it'd be a bit funny listening for "DataReceived" and incrementing a "block" counter—except for the fact that we happen to know there's a 1:1 relationship between those events and blocks.

I agree and I'd be down for a BlockReceived event that had the block's byte length. How would graphsync work with a BlockReceived event? I only see it talking in bytes.

Copy link
Contributor

@kylehuntsman kylehuntsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this approach looks good to me aside from whether or not we track blocks for bytes.

@rvagg
Copy link
Member

rvagg commented Sep 15, 2023

Revived and updated.

Depends on ipld/go-trustless-utils#8

I dealt with the aggregate event "Bitswap" attempt problem by treating it as a parent attempt, so every bitswap peer also has their bytes added to the "Bitswap" attempt. We're just going to have to remember this when inspecting the table for that kind of information.

@rvagg
Copy link
Member

rvagg commented Sep 15, 2023

Renamed DataReceived to BlockReceived.

@rvagg
Copy link
Member

rvagg commented Sep 15, 2023

Suspicious persistent failures both here and #420 on windows relating mainly, I think, to timeouts. Original form of this branch before I force pushed updates was also failing, for reference that's here: https://github.com/filecoin-project/lassie/actions/runs/5462484065/job/14787992551 - interestingly those failures are on the http retriever; the failures on #420 are on bitswap.

Perhaps this adds some major inefficiencies?

rvagg added a commit to filecoin-project/lassie-event-recorder that referenced this pull request Sep 18, 2023
rvagg added a commit to filecoin-project/lassie-event-recorder that referenced this pull request Sep 19, 2023
@rvagg rvagg merged commit 948b6a9 into main Sep 19, 2023
8 of 9 checks passed
@rvagg rvagg deleted the feat/add-bitswap-tracking branch September 19, 2023 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants