
Event Filtering Query is very slow #21778

Open
quentinlesceller opened this issue Nov 2, 2020 · 5 comments

@quentinlesceller

System information

Geth version:

Geth
Version: 1.9.22-stable
Git Commit: c71a7e26a8b1e332bbf3262d88ba3ff32071456c
Architecture: amd64
Protocol Versions: [65 64 63]
Go Version: go1.15
Operating System: linux
GOPATH=
GOROOT=go

Geth is running as a daemon and is fully synced.

/usr/bin/geth --cache 2048 --http --http.api web3,eth --http.port 8545 --http.addr 0.0.0.0 --http.corsdomain * --http.vhosts *

The server is behind a firewall, so no ports are accessible from the outside. Another machine makes the calls to the RPC server.

OS & Version: Ubuntu 20.04

Expected behaviour

The goal is to fetch the latest implementation address for a proxy ERC-20 following the OpenZeppelin Upgradeable interface.
The way I found to do this is to fetch the latest Upgraded(address) event.

The code below does just that with the USDC contract (0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48). It works with Infura but does not work with my local geth node.

The BlockByHash call is there to test that the node is responding properly. The block query returns almost instantly; however, the ethereum.FilterQuery is extremely slow and hangs for a long time without a response. Any advice?

Actual behaviour

The query just hangs for a very long time.

Steps to reproduce the behaviour

package main

import (
	"context"
	"fmt"
	"log"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	client, err := ethclient.Dial("http://127.0.0.1:8545")
	if err != nil {
		log.Fatal(err)
	}

	contractAddress := common.HexToAddress("0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48")
	// sha3 of []byte("Upgraded(address)")
	topicHash := common.HexToHash("0xbc7cd75a20ee27fd9adebab32041f755214dbc6bffa90cc0225b39da2e5c2d3b")
	query := ethereum.FilterQuery{
		FromBlock: big.NewInt(10743414),
		ToBlock:   nil,
		Addresses: []common.Address{
			contractAddress,
		},
		Topics: [][]common.Hash{{topicHash}},
	}
	b, err := client.BlockByHash(context.Background(), common.HexToHash("0x4e5a468e1e30332bee7bd3f2fe7ae60386651b4e98e0996548de238e503a1418"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(b)
	logs, err := client.FilterLogs(context.Background(), query)
	if err != nil {
		log.Fatal(err)
	}

	if len(logs) == 0 {
		log.Fatal("empty")
	}
	highestHeight := logs[0].BlockNumber
	upgradedAddress := common.BytesToAddress(logs[0].Data)
	for _, l := range logs {
		if l.BlockNumber > highestHeight {
			// Track the highest block seen so far, keeping the
			// implementation address from the most recent event.
			highestHeight = l.BlockNumber
			upgradedAddress = common.BytesToAddress(l.Data)
		}
	}

	fmt.Println(upgradedAddress.String())
}

Thank you.

@AusIV
Contributor

AusIV commented Nov 11, 2020

Infura has a separate log index that allows it to search logs much more quickly than a conventional node can. A conventional node has to check your queries against the bloom filters for every block, and the receipts of any blocks where there is a hit on the bloom filters. For millions of blocks, that takes a long time.
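The false-positive behaviour described above can be illustrated with a toy Bloom filter (a sketch for intuition only; `toyBloom` is hypothetical and much smaller than geth's actual per-block logsBloom, which is 2048 bits with 3 bits set per indexed item):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyBloom is a deliberately tiny Bloom filter (256 bits, 3 hash
// functions). A miss is definitive, but a hit only means "maybe
// present", so every bloom hit still forces a receipt load to confirm.
type toyBloom struct{ bits [256]bool }

// positions derives 3 bit positions for s using seeded FNV-1a hashes.
func (b *toyBloom) positions(s string) [3]uint32 {
	var out [3]uint32
	for i := range out {
		h := fnv.New32a()
		fmt.Fprintf(h, "%d:%s", i, s)
		out[i] = h.Sum32() % uint32(len(b.bits))
	}
	return out
}

func (b *toyBloom) add(s string) {
	for _, p := range b.positions(s) {
		b.bits[p] = true
	}
}

func (b *toyBloom) maybeContains(s string) bool {
	for _, p := range b.positions(s) {
		if !b.bits[p] {
			return false // definite miss: the block can be skipped
		}
	}
	return true // possible hit: receipts must be loaded and checked
}

func main() {
	var b toyBloom
	b.add("Upgraded(address)")
	fmt.Println(b.maybeContains("Upgraded(address)")) // always true
	fmt.Println(b.maybeContains("Transfer(address,address,uint256)"))
}
```

This is why a conventional node can cheaply skip blocks whose bloom misses, but must still pay the receipt-lookup cost for every bloom hit, genuine or not.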

My team has developed Flume, an open-source version of Infura's log indexer. It takes a long while and several hundred gigabytes of disk to index all the data, but once the index is built, responses are blazing fast.

@ligi
Member

ligi commented Nov 12, 2020

Do you need to search 76 days worth of logs or can you make the span a bit shorter? Can you specify how slow exactly it is on your setup?
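One way to act on this suggestion (a sketch, not code from the issue; `backwardRanges` is a hypothetical helper): split the block span into fixed-size windows and scan them newest-first, one FilterQuery per window, stopping at the first window that returns a hit, since only the most recent Upgraded event matters.

```go
package main

import "fmt"

// backwardRanges splits the inclusive span [from, to] into windows of at
// most size blocks, ordered newest-first, so a backward scan can stop at
// the first window containing a matching log. Assumes from <= to.
func backwardRanges(from, to, size uint64) [][2]uint64 {
	var out [][2]uint64
	end := to
	for {
		start := from
		if end >= from+size-1 {
			start = end - size + 1
		}
		out = append(out, [2]uint64{start, end})
		if start == from {
			break
		}
		end = start - 1
	}
	return out
}

func main() {
	// The issue's span: block 10743414 up to roughly 11241994.
	for _, r := range backwardRanges(10743414, 11241994, 100000) {
		fmt.Printf("FromBlock=%d ToBlock=%d\n", r[0], r[1])
	}
}
```

Each window maps to the FromBlock/ToBlock fields of an ethereum.FilterQuery; when the latest Upgraded event is recent, the scan typically returns after only a few windows instead of touching the whole 500K-block range.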

@holiman
Contributor

holiman commented Nov 12, 2020

Some probably relevant context: the filter range is from 10743414 to nil, which resolves to 11241994 as of today, so almost 500K blocks.

@holiman
Contributor

holiman commented Nov 12, 2020

The receipt lookup is a bit suboptimal. We do use a bloom filter to deduce which blocks are even interesting. That's pretty fast. Afterwards, we need to go through the hits and compare the topic more granularly, which filters out some false positives. And then we deliver the results.

However, the initial loading of receipts is not a simple "load from disk". It's a three-step process:

  1. Load receipts from disk,
  2. Load block body from disk
  3. Use block body to populate receipt metadata.

So what we could do is

  1. Load 'interesting' receipts from disk, without metadata
  2. Filter out those that aren't interesting (doesn't fit topic etc)
  3. Load metadata

That would probably reduce the time a little bit. YMMV.
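The proposed reordering can be sketched with hypothetical, simplified types (`rawReceipt` and `filterThenDecorate` are illustrative names, not geth's internal structures): discard bloom false positives by topic before paying for the block-body load that populates receipt metadata.

```go
package main

import "fmt"

// rawReceipt stands in for a receipt as loaded from disk in step 1,
// before any metadata from the block body has been attached. The first
// log topic is enough to reject bloom false positives.
type rawReceipt struct {
	block  uint64
	topic0 string
}

// filterThenDecorate applies the reordered pipeline: step 2 (topic
// filtering) runs before step 3 (metadata population), so the expensive
// block-body load only happens for receipts that actually match.
func filterThenDecorate(hits []rawReceipt, wantTopic string) []uint64 {
	var interesting []uint64
	for _, r := range hits {
		if r.topic0 != wantTopic {
			continue // bloom false positive: no body load needed
		}
		// Step 3 would load the block body here to fill in metadata.
		interesting = append(interesting, r.block)
	}
	return interesting
}

func main() {
	hits := []rawReceipt{
		{100, "Upgraded(address)"},
		{101, "Transfer(address,address,uint256)"}, // false positive
		{102, "Upgraded(address)"},
	}
	fmt.Println(filterThenDecorate(hits, "Upgraded(address)"))
}
```

The saving comes entirely from how many bloom hits are false positives: each one rejected in step 2 skips a block-body read that the current order always pays for.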

@quentinlesceller
Author

Do you need to search 76 days worth of logs or can you make the span a bit shorter?

I need to know the address of the latest implementation of the Upgradeable Contract (OpenZeppelin Upgradeable interface). I haven't found any other way to do it, so basically I scan the blocks for the latest Upgraded event. I need to fetch the exact address in order to check whether or not I have the ABI stored in the DB.
Since this doesn't seem to be really feasible with my geth node without additional indexes, I think I'll hardcode the ABI of the original proxy contract for now, as I don't think Centre USDC will ever change it.

Can you specify how slow exactly it is on your setup?

To be honest, it was something well above 5 to 10 minutes, so I stopped the query.
