New values for DHT Provider Record Republish and Expiration (22h/48h, RFM17) #451
Conversation
Bumping Republish and Expiration sounds sensible ("reduce the overhead without interfering with the performance and reliability").
@mxinden any concerns from non-IPFS side of things?
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Wonderful to see these optimizations based on comprehensive studies!
Can you bump the revision of the specification at the top of the document?
kad-dht/README.md
Outdated
> to prevent storing potentially outdated address information. Implementations that choose
> to keep the network address (i.e., the `multiaddress`) of the providing peer should do it for **the
> first 10 mins** after the provider record (re-)publication. The setting of 10 mins follows
> the DHT Routing Table refresh interval. After that, peers provide
> the provider's `peerID` only, in order to avoid pointing to stale network addresses
> (i.e., the case where the peer has moved to a new network address).
If I recall correctly, this is the status quo in the Golang implementation, correct? Is there any data backing up this decision? What I am surprised by is that addresses go stale so quickly. Is that really the case on IPFS today?
> this is the status quo in the Golang implementation, correct?

Yup.

> Is there any data backing up this decision?

Nope :) But we plan to start some investigation asap. See: ipfs/kubo#9264, protocol/prodeng#22 and probe-lab/thunderdome#91 as an optimisation.

> What I am surprised by is that addresses go stale so quickly. Is that really the case on IPFS today?

Given that IPFS DHT servers have public addresses, I doubt that they go stale so quickly. This might change when hole punching is widespread, but the optimisation proposed in the above issues won't hurt, I believe :)
Given that we are not sure this optimization is a good idea, how about only documenting it in this specification once we know it is a good idea?
Otherwise new implementations like Iroh (//CC @dignifiedquire) would have to implement this despite not being an optimization in the first place.
In fact, we've figured out that this has changed to 30mins: libp2p/go-libp2p@c282179 - I've rephrased the text accordingly to make it more generic and mention this value for the kubo implementation.
There's also this: ipfs/kubo#9264 - any views more than welcome! :)
kad-dht/README.md
Outdated
> and nodes that store and serve provider records need to make sure that the Multihashes whose
> records they store are still served by the content provider.
How do they do this? Is this happening today on the IPFS DHT?
This is done (somewhat indirectly) through the expiration of the provider record: if the content provider does not republish the record within the republish interval, then nodes stop serving that provider record after the expiration interval, on the assumption that the content provider is no longer interested in keeping this content live.
I think the above phrasing implies this being an active process on the provider record storage node. What do you think of rephrasing this?
I've rephrased to the following - let me know if it reads better:
"Content needs to be reachable, despite peer churn;
and nodes that store and serve provider records should not serve records for stale content,
i.e., content that the original provider does not wish to make available anymore."
kad-dht/README.md
Outdated
> remain online when clients ask for the record. In order to
> guarantee this, while taking into account the peer churn, content providers
> republish the records they want to provide every 24 hours.
> 2. **Provider Record Expiration Interval (48hrs):** The network needs to provide
This should be harmonized with the PUT_VALUE expiration time as well, no?
@yiannisbot while we're changing the times any reason not to do this too? Seems like it'd be reasonable since expiration is based on the same properties. It'd also make things easier to reason about.
You mean in the PR? Yes, the intention is to change both the republish and the expiration interval. Otherwise it would make no sense (or well, it would be confusing). Or do you mean something else?
cc: @cortze who is working to submit the relevant PR.
Linking here the PR libp2p/go-libp2p-kad-dht#793 to increase the expiration time of the PRs to 48h
> You mean in the PR

No, I mean to change the expiration time for `PUT_VALUE` records in addition to `ADD_PROVIDER` records.
I.e., for the IPFS Public DHT the expiration time for `PUT_VALUE` (i.e. IPNS and the deprecated public key records) is 36hrs: https://github.com/libp2p/go-libp2p-kad-dht/blob/dae5a9a5bd9c7cc8cfb5073c711bc308efad0ada/internal/config/config.go#L117. It seems like this could be 48hrs as well rather than 36.
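Harmonizing the two would mean configuring both lifetimes to 48h. A rough sketch of what that looks like when constructing a node is below; the option names (`MaxRecordAge` for `PUT_VALUE` records, `ProvideValidity` for provider records) follow my reading of the go-libp2p-kad-dht options API, so treat this as an assumption and check against the version you build with.

```go
package main

import (
	"context"
	"time"

	"github.com/libp2p/go-libp2p"
	dht "github.com/libp2p/go-libp2p-kad-dht"
)

// newDHT constructs a DHT node with both record lifetimes set to the
// harmonized 48h value proposed in this discussion. Hypothetical
// configuration sketch, not a drop-in snippet.
func newDHT(ctx context.Context) (*dht.IpfsDHT, error) {
	h, err := libp2p.New()
	if err != nil {
		return nil, err
	}
	return dht.New(ctx, h,
		// PUT_VALUE records (IPNS, deprecated public keys): 36h -> 48h.
		dht.MaxRecordAge(48*time.Hour),
		// ADD_PROVIDER records: 24h -> 48h.
		dht.ProvideValidity(48*time.Hour),
	)
}
```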
@cortze should we change that too then? It makes sense. Do you want to create the PR and set it to 48hrs?
Just created it! here is the link to the PR -> go-libp2p-kad-dht#794
kad-dht/README.md
Outdated
> remain online when clients ask for the record. In order to
> guarantee this, while taking into account the peer churn, content providers
> republish the records they want to provide every 24 hours.
Perhaps this has already been considered, but given this is an implicit protocol change it might help implementers to know what they should use as the republish interval. 24hrs matches the current expiration time so reproviding every 24hrs while the network is (slowly) upgrading might not be great. Perhaps it's fine in networks like the IPFS Public DHT since some nodes will upgrade quickly (e.g. Hydras, people autodeploying the latest kubo Docker containers, etc.) but just wanted to flag this.
Right, so you suggest not having the new republish interval the same as the old expiration interval as this will become confusing and might have side effects as well? I guess a valid workaround is to have the republish interval set to something a bit smaller (say, 20hrs?) for the transition period? Any better approaches?
I've set this to 22hrs. @mxinden @aschmahmann let me know if that works for you.
@lidel @mxinden @aschmahmann I've addressed all comments, apart from the one suggesting to have the PR on IPFS before merging this. Can you have another look to see if everything is ready? In the meantime, we'll work to get the PR ready on the IPFS side of things - feel free to do so as well if you wish :)
Thanks for the follow-ups @yiannisbot.
Other than the specification revision bump and the kubo pull request, this looks good to me.
> It is also worth noting that the keys for provider records are multihashes. This
> is because:
>
> - Provider records are used as a rendezvous point for all the parties who have
>   advertised that they store some piece of content.
> - The same multihash can be in different CIDs (e.g. CIDv0 vs CIDv1 of a SHA-256 dag-pb object,
>   or the same multihash but with different codecs such as dag-pb vs raw).
> - Therefore, the rendezvous point should converge on the minimal thing everyone agrees on,
>   which is the multihash, not the CID.
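The rendezvous argument above is easy to see in code: if the store is keyed by multihash, announcements arriving via different CID encodings of the same content land on a single entry. This is a toy sketch with strings standing in for real multihashes, not the go-libp2p-kad-dht provider store.

```go
package main

import "fmt"

// providerStore indexes provider records by multihash rather than CID,
// so announcements for CIDv0 and CIDv1 of the same content rendezvous
// at one entry. A plain string stands in for a real multihash here.
type providerStore struct {
	byMultihash map[string][]string // multihash -> provider peer IDs
}

func (s *providerStore) addProvider(multihash, peerID string) {
	if s.byMultihash == nil {
		s.byMultihash = make(map[string][]string)
	}
	s.byMultihash[multihash] = append(s.byMultihash[multihash], peerID)
}

func main() {
	var s providerStore
	// Two hypothetical announcements for the same SHA-256 digest: one
	// reached the announcer as a CIDv0 (dag-pb), the other as a CIDv1
	// (raw codec). Both index the same provider record.
	s.addProvider("sha256:abcd", "peerA")
	s.addProvider("sha256:abcd", "peerB")
	fmt.Println(len(s.byMultihash["sha256:abcd"])) // both providers found at one key
}
```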
🙏
Here is the PR to change the republish interval in kubo: ipfs/kubo#9326
Here is the second PR libp2p/go-libp2p-kad-dht#793 to increase the expiration time of the PRs to 48h.
License: MIT Signed-off-by: Marcin Rataj <lidel@lidel.org>
@mxinden @marten-seemann bumped revision in the header, mind giving this final review?
Kubo team wants to ship this with the next Kubo 0.18 release, but we want to make sure we have the specs merged first:
- reprovide interval in kubo is merged: feat: increase default Reprovider.Interval ipfs/kubo#9326
- expiration interval in feat: increase expiration time for Provider Records to 48h (RFM17) go-libp2p-kad-dht#793 and feat: increase the max record age to 48h (PUT_VALUE, RFM17) go-libp2p-kad-dht#794 is waiting for review/release
- @marten-seemann lmk if it is ok for me to do it
- this spec is in sync with the two above
fwiw I've updated revision as requested, and made it easier to find/eyeball both numbers:
Heil Hydra, I guess. If the records are staying around that long, we should do this.
There is also a general DX/UX improvement that comes with raising the ceiling of expiration: trying to keep an IPNS website alive with spotty internet access is tricky. In an effort to include this in Kubo 0.18.0-rc1 before the holidays, I've released go-libp2p-kad-dht v0.20.0 with the 48h expiration.
Should we go ahead and update js-libp2p-kad-dht (and js-ipfs' reprovide interval)?
Thanks to everyone involved here.
Tracked here libp2p/rust-libp2p#3229. Contributions are always welcome. That said, not a requirement to move forward here. Thanks for stewarding this @lidel.
This patch applies changes from libp2p/specs#451. In particular, the new defaults are: - Record Expiration: 48h - Record Republish Interval: 22h Closes #3229. Pull-Request: #3230.
This PR updates the description of the Provider Record settings and, most importantly, proposes new values for both the republish interval and the expiration interval. The new proposed values are:

- Provider Record Republish Interval: 22h
- Provider Record Expiration Interval: 48h

They are based on the comprehensive study published here: https://github.com/protocol/network-measurements/blob/master/results/rfm17-provider-record-liveness.md