Gateway tracking whether requested content is in Database #14

Open
vasco-santos opened this issue Feb 10, 2022 · 8 comments

@vasco-santos
Member

We want to know if gateway requested CIDs are root CIDs stored in Content table (and also if they are Pinned).

Requirements:

  • Keep state with counter of:
    • requested CIDs stored
    • requested CIDs pinned
    • requested CIDs pinQueued
    • requested CIDs not stored
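
A minimal sketch of the counter state listed above, assuming the gateway can get a lookup result for each requested CID. The `ContentRecord` shape, the `pinStatus` field and the function names are hypothetical and not taken from the nft.storage codebase. It also assumes that a pinned or pin-queued CID counts as stored too, which the requirements do not spell out.

```typescript
// Hypothetical lookup result: null means the CID is not a root CID in the Content table.
interface ContentRecord {
  cid: string;
  pinStatus?: "Pinned" | "PinQueued"; // assumed status names
}

interface RequestCounters {
  stored: number;
  pinned: number;
  pinQueued: number;
  notStored: number;
}

function newRequestCounters(): RequestCounters {
  return { stored: 0, pinned: 0, pinQueued: 0, notStored: 0 };
}

// Update the counters for one gateway request.
// Assumption: "stored" overlaps with "pinned" and "pinQueued".
function recordRequest(counters: RequestCounters, record: ContentRecord | null): void {
  if (record === null) {
    counters.notStored += 1;
    return;
  }
  counters.stored += 1;
  if (record.pinStatus === "Pinned") {
    counters.pinned += 1;
  } else if (record.pinStatus === "PinQueued") {
    counters.pinQueued += 1;
  }
}
```

If the four groups should be mutually exclusive instead, the `stored` increment would move into a final `else` branch.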
@vasco-santos vasco-santos added kind/enhancement A net-new feature or improvement to an existing feature need/triage Needs initial labeling and prioritization labels Feb 10, 2022
@vasco-santos
Member Author

@dchoi27 let me know if you have other thoughts/ideas of things we should look into in this context.

Probably a special case if we fail to request but content is in the DB?
Or track how "old" the requested content is? Maybe a histogram with buckets like 0.5h, 1h, 2h, 4h, 12h, 24h, 3 days, ... +Inf
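
A rough sketch of the age histogram idea, using only the bucket bounds named above plus +Inf (the "..." leaves any further buckets unspecified, so none are added). The class and field names are hypothetical. Each observation goes into the first bucket whose upper bound covers its age, so the counts are per bucket rather than cumulative.

```typescript
// Upper bounds in hours: 0.5h, 1h, 2h, 4h, 12h, 24h, 3 days, +Inf.
const AGE_BUCKET_UPPER_BOUNDS_HOURS: number[] = [0.5, 1, 2, 4, 12, 24, 72, Infinity];

const MS_PER_HOUR = 60 * 60 * 1000;

class ContentAgeHistogram {
  // One count per bucket, in the same order as the bounds above.
  readonly counts: number[] = AGE_BUCKET_UPPER_BOUNDS_HOURS.map(() => 0);

  // Record one request and return the index of the bucket it fell into.
  observe(uploadedAt: Date, requestedAt: Date): number {
    const ageHours = (requestedAt.getTime() - uploadedAt.getTime()) / MS_PER_HOUR;
    if (Number.isNaN(ageHours)) {
      throw new Error("invalid upload or request timestamp");
    }
    // A negative age (clock skew) lands in the first bucket.
    const index = AGE_BUCKET_UPPER_BOUNDS_HOURS.findIndex((bound) => ageHours <= bound);
    this.counts[index] = (this.counts[index] ?? 0) + 1;
    return index;
  }
}
```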

@dchoi27
Contributor

dchoi27 commented Feb 10, 2022

Yes, for sure how old the content is (when it was requested vs. when it was first uploaded).
Could you tell me more about "if we fail to request but content is in the DB"? Like if a user requests data we have but we can't fetch it?

Can we track the metrics around the response for each of the groups above? E.g. if it's pinQueued, does it take longer / less reliable to fetch?
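
One way the per-group response metrics could look, as a sketch only: it keeps request count, failure count and total latency per group, then derives success rate and mean latency. The group names mirror the counters in the issue description; the class, method and field names are hypothetical.

```typescript
type RequestGroup = "stored" | "pinned" | "pinQueued" | "notStored";

interface GroupStats {
  requests: number;
  failures: number;
  totalLatencyMs: number;
}

interface GroupSummary {
  requests: number;
  successRate: number;
  meanLatencyMs: number;
}

class GroupResponseMetrics {
  private readonly stats = new Map<RequestGroup, GroupStats>();

  // Record the outcome of one gateway response for the given group.
  observe(group: RequestGroup, latencyMs: number, ok: boolean): void {
    const current = this.stats.get(group) ?? { requests: 0, failures: 0, totalLatencyMs: 0 };
    current.requests += 1;
    current.totalLatencyMs += latencyMs;
    if (!ok) {
      current.failures += 1;
    }
    this.stats.set(group, current);
  }

  // Return null when no request has been observed for the group yet.
  summary(group: RequestGroup): GroupSummary | null {
    const current = this.stats.get(group);
    if (current === undefined) {
      return null;
    }
    return {
      requests: current.requests,
      successRate: (current.requests - current.failures) / current.requests,
      meanLatencyMs: current.totalLatencyMs / current.requests,
    };
  }
}
```

Comparing `summary("pinQueued")` with `summary("pinned")` would then show whether pin-queued content is slower or less reliable to fetch.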

@vasco-santos
Member Author

> Could you tell me more about "if we fail to request but content is in the DB"? Like if a user requests data we have but we can't fetch it?

Yes, so this would be targeting the incomplete uploads.

> Can we track the metrics around the response for each of the groups above? E.g. if it's pinQueued, does it take longer / less reliable to fetch?

Yes, that's a good idea

@dchoi27 dchoi27 changed the title Gateway tracking wether requested content is in Database Gateway tracking whether requested content is in Database Feb 11, 2022
@dchoi27
Contributor

dchoi27 commented Feb 11, 2022

Awesome, SGTM

@JeffLowe

This sounds very similar to the needs and plans we have for niftysave (discussed as recently as today with @mikeal). I'm pulling in @the-simian here. You two may want to sync up on the roadmap so that implementing this meets both needs.

@olizilla
Contributor

@dchoi27 how important are these stats to us? In order to make this work, nftstorage/nft.storage#1386 adds logic to hit the nftstorage db for every single CID that is requested from the gateway. That seems like an amplification point where a spike in traffic to the gateway causes a spike in requests to the nftstorage db... two systems that are currently isolated from each other become co-dependent.

In the worst case, a sustainable increase in gateway traffic could be an unsustainable increase in nftstorage db reads... we can and will continue to optimise and grow that db, but I'd feel more comfortable if we ditched these metrics and kept the gateways separate from the nft.storage api.

Also notable: adding these stats makes the current gateway impl less reusable / in need of more customisation to be used as a web3.storage gateway.

@dchoi27
Contributor

dchoi27 commented Mar 17, 2022

So I think the main goals of these stats would be to:

  • See if we can draw patterns for when we have performance issues (i.e., get some more visibility into Cluster as a black box)
  • Understand user behavior so we can better optimize for it when warming the cache

The former probably gets solved by IPFS Elastic Provider in the long run, so if there are good reasons not to do a live lookup for every CID to understand its pin status at the time, it's probably not worth doing. But for the latter, it'd be great to at least have periodic datasets with samples showing a CID and when it was requested vs. when it was uploaded, if there's a way to do that asynchronously and without risking the performance of the entire database.

@vasco-santos
Member Author

> if there's a way to do that asynchronously, and in a way that doesn't risk the performance of the entire database.

The solution here is to go through the logs and get the metrics from a separate analyser, like a DigitalOcean App similar to the checkup tool Alan built.
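
A sketch of what such an offline analyser might do, under several assumptions that are not in this thread: gateway logs are available as JSON lines with `cid` and `requestedAt` fields, and upload times come from a periodic export keyed by CID rather than a live database query. All type and function names are hypothetical.

```typescript
interface RequestLogEntry {
  cid: string;
  requestedAt: string; // ISO 8601 timestamp (assumed log format)
}

interface AgeRow {
  cid: string;
  requestedAt: string;
  uploadedAt: string | null; // null when the CID is not in the export
  ageHours: number | null;
}

// Parse JSON-lines log text, skipping blank or malformed lines.
function parseRequestLog(text: string): RequestLogEntry[] {
  const entries: RequestLogEntry[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed = JSON.parse(trimmed) as Partial<RequestLogEntry> | null;
      if (parsed && typeof parsed.cid === "string" && typeof parsed.requestedAt === "string") {
        entries.push({ cid: parsed.cid, requestedAt: parsed.requestedAt });
      }
    } catch {
      continue; // not valid JSON, ignore the line
    }
  }
  return entries;
}

// Join requests with upload times from an export (CID -> ISO upload timestamp).
function joinWithUploads(entries: RequestLogEntry[], uploads: Map<string, string>): AgeRow[] {
  return entries.map((entry) => {
    const uploadedAt = uploads.get(entry.cid) ?? null;
    const ageHours =
      uploadedAt === null
        ? null
        : (Date.parse(entry.requestedAt) - Date.parse(uploadedAt)) / (60 * 60 * 1000);
    return { cid: entry.cid, requestedAt: entry.requestedAt, uploadedAt, ageHours };
  });
}
```

Because the join runs against an export, gateway traffic spikes would not translate into extra reads on the live database.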

@vasco-santos vasco-santos transferred this issue from nftstorage/nft.storage Apr 7, 2022