Ponder fetches & indexes more logs than necessary #25

Closed
Tracked by #55
0xOlias opened this issue Dec 29, 2022 · 0 comments · Fixed by #100

0xOlias commented Dec 29, 2022

This issue has two related parts.

  1. Ponder currently fetches all events emitted by contracts included in ponder.config.js, regardless of which events the user has provided handlers for. We could significantly reduce the backfill load for certain projects by fetching event data lazily, i.e. only fetching events that the user has provided handlers for. Example: I want to index the ownership changes (OwnershipTransferred) for an ERC20 contract without indexing every Transfer event.

  2. Ponder currently adds all events available for a contract to the indexing queue, regardless of which events the user has provided handlers for. In the indexing task function, if a handler is not found for the given event log, it just does nothing and returns early. Instead, when adding logs to the indexing queue, we could filter to only include logs for events that the user has provided handlers for.

Part 2 is much easier to implement than part 1. The parts can be implemented independently.
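
As a rough illustration of both parts, here's a minimal sketch, assuming logs look roughly like standard JSON-RPC log objects. `buildLogFilter` (part 1) builds an `eth_getLogs` filter restricted to handled events, and `filterHandledLogs` (part 2) drops unhandled logs before they reach the indexing queue. The `Log` shape and both function names are hypothetical, not Ponder's actual internals. The two constants are the keccak256 hashes of the `Transfer(address,address,uint256)` and `OwnershipTransferred(address,address)` event signatures.

```typescript
// Hypothetical minimal log shape; real log objects carry more fields.
interface Log {
  address: string;
  blockNumber: number;
  topics: string[]; // topics[0] is the signature hash for non-anonymous events
  data: string;
}

// keccak256 hashes of the canonical event signatures.
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";
const OWNERSHIP_TRANSFERRED_TOPIC =
  "0x8be0079c531659141344cd1fd0a4f28419497f9722a3daafe3b4186f6b6457e0";

// Part 1: only request logs whose first topic is one of the handled events.
// In an eth_getLogs filter, an array in the first topic position means "any of these".
function buildLogFilter(
  address: string,
  fromBlock: number,
  toBlock: number,
  handledTopics: string[],
) {
  return {
    address,
    fromBlock: "0x" + fromBlock.toString(16),
    toBlock: "0x" + toBlock.toString(16),
    topics: [handledTopics],
  };
}

// Part 2: filter already-fetched logs before adding them to the indexing queue.
function filterHandledLogs(logs: Log[], handledTopics: string[]): Log[] {
  const handled = new Set(handledTopics.map((t) => t.toLowerCase()));
  return logs.filter((log) => {
    const topic0 = log.topics[0];
    return topic0 !== undefined && handled.has(topic0.toLowerCase());
  });
}
```

For the OwnershipTransferred example above, both functions would be called with `[OWNERSHIP_TRANSFERRED_TOPIC]`, so Transfer logs are never requested (part 1) or never queued (part 2).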

Here's a rough sketch for how to go about Part 2 (this could be a huge perf win):

  • During the backfill, when we get a new log, use the source’s ABI interface to decode the log and add the event name (signature?) and decoded parameter data to the database in optional columns
  • Add a parameter to the CacheStore.getLogs function that allows us to filter by event name/signature. Also update that query to get the block and transaction associated with each log, and probably add pagination, because the response could get massive with all the blocks and transactions.
  • In handleNewLogs, pass the list of handled event signatures to the new getLogs function
  • Update the worker to remove basically all the setup crap - the log object now has everything it needs (event name, parsed params, associated block and transaction)
  • Now what happens when the user initially did the backfill with a partial ABI, then adds a complete ABI? An ABI change like this would only take effect on a full server reload. So maybe, as an async process on startup, query the cache store for all the logs for each source that have an undefined signature, and attempt to parse them using the current ABI. If any are a hit, update those records…
  • This implementation would also solve #36 (Improve error message for "no matching event")
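
To make the getLogs changes in the list above a bit more concrete, here's a minimal in-memory sketch of a filtered, cursor-paginated query that returns each log together with its block and transaction. Everything here is an assumption for illustration: the `CacheData` shape, the `signature` field, and the `getLogs` / `getAllHandledLogs` signatures are hypothetical and do not reflect the real CacheStore.

```typescript
// All names below are hypothetical; this is not Ponder's actual CacheStore.
interface CachedLog {
  id: number; // monotonically increasing, used as the pagination cursor
  blockHash: string;
  transactionHash: string;
  signature: string; // event signature stored at backfill time
}
interface Block { hash: string; number: number; }
interface Transaction { hash: string; from: string; }
interface LogWithContext { log: CachedLog; block: Block; transaction: Transaction; }
interface CacheData {
  logs: CachedLog[];
  blocks: Map<string, Block>;
  transactions: Map<string, Transaction>;
}
interface GetLogsOptions { signatures: string[]; limit: number; cursor: number | null; }
interface LogPage { items: LogWithContext[]; nextCursor: number | null; }

// Return one page of logs matching the handled signatures, each joined with
// its block and transaction so the worker needs no extra lookups.
function getLogs(data: CacheData, opts: GetLogsOptions): LogPage {
  const wanted = new Set(opts.signatures);
  const after = opts.cursor ?? -1;
  const matches = data.logs
    .filter((log) => log.id > after && wanted.has(log.signature))
    .sort((a, b) => a.id - b.id);
  const pageLogs = matches.slice(0, opts.limit);
  const items = pageLogs.map((log) => {
    const block = data.blocks.get(log.blockHash);
    const transaction = data.transactions.get(log.transactionHash);
    if (!block || !transaction) {
      throw new Error(`Missing block or transaction for log ${log.id}`);
    }
    return { log, block, transaction };
  });
  const last = pageLogs[pageLogs.length - 1];
  const hasMore = matches.length > pageLogs.length;
  return { items, nextCursor: hasMore && last !== undefined ? last.id : null };
}

// Walk every page; roughly what handleNewLogs could do before queueing.
function getAllHandledLogs(
  data: CacheData,
  signatures: string[],
  pageSize: number,
): LogWithContext[] {
  const all: LogWithContext[] = [];
  let cursor: number | null = null;
  while (true) {
    const page: LogPage = getLogs(data, { signatures, limit: pageSize, cursor });
    all.push(...page.items);
    if (page.nextCursor === null) return all;
    cursor = page.nextCursor;
  }
}
```

A real implementation would push the filter, join, and limit into the database query rather than doing them in memory; the point is only the shape of the API.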

Edit 2/10/23

AFAICT, it's not possible to verify that a provided ABI is correct and complete for a given contract. So, the implementation here should not persist decoded event signatures or parameters. Instead, what Ponder can do is persist each log topic separately (and create indexes for them). Then, when logs need to be fetched to add to the handler queue, the query can include a clause that only matches event logs whose first topic is the hash of the signature of an event that the user has provided a handler for - something like AND topic_0 IN ('0xABC', '0x123'). This is actually pretty simple, but it does unfortunately require a migration of the cache store tables and a change in the code that persists logs.
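
Here's a minimal sketch of what building that query could look like, assuming a `logs` table with `address`, `block_number`, `log_index`, and `topic_0` columns and `?` placeholders. The table, column, index, and function names are all hypothetical; the real cache store schema may differ.

```typescript
// Hypothetical schema and names, for illustration only.
interface SqlQuery {
  text: string;
  params: (string | number)[];
}

// An index like this would keep the topic_0 filter cheap.
const CREATE_TOPIC_0_INDEX =
  "CREATE INDEX IF NOT EXISTS logs_topic_0_idx ON logs (topic_0)";

// Build a parameterized query that only returns logs for handled events.
function buildHandledLogsQuery(
  contractAddress: string,
  fromBlock: number,
  handledTopic0s: string[],
): SqlQuery {
  if (handledTopic0s.length === 0) {
    // "IN ()" is not valid SQL, and with no handlers nothing should match.
    return { text: "SELECT * FROM logs WHERE 1 = 0", params: [] };
  }
  // One placeholder per topic, so values are bound rather than interpolated.
  const placeholders = handledTopic0s.map(() => "?").join(", ");
  return {
    text:
      "SELECT * FROM logs WHERE address = ? AND block_number >= ? " +
      `AND topic_0 IN (${placeholders}) ` +
      "ORDER BY block_number, log_index",
    params: [contractAddress, fromBlock, ...handledTopic0s],
  };
}
```

Because topic_0 is just the raw first topic of the log, this filter works without trusting the user's ABI at write time; only the set of handled topics changes when the ABI or handlers change.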
