Efficient way to get multiple blocks / transactions? #1194

Closed
Zvezdin opened this issue Nov 22, 2017 · 8 comments
Labels
2.x 2.0 related issues Feature Request Stale Has not received enough activity

Comments

@Zvezdin
Zvezdin commented Nov 22, 2017

Hello. I've been using web3 for a while now and I'm currently trying to use it to extract every block and transaction to store in my own database. Unfortunately, web3's current API doesn't provide an efficient way to get multiple items - just block by block. What I came up with as a fast way to get the whole blockchain is to request 10 blocks (async) at once. When a block (including TXs) arrives, a new one is requested. This whole process takes less than 24 hours for me.

However, I'm also trying to get all transaction receipts from TXs to contracts. Getting that via the aforementioned method would take a week, if not two, because each individual receipt is a different request.

Is there any faster way to get multiple receipts at once?
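The "10 blocks in flight, request a new one when one arrives" approach described above can be sketched as a small worker pool. This is only an illustration: `fetchBlock` is a hypothetical stand-in for any promise-returning call (for example a wrapper around web3's `getBlock`), and the function name and default window size are assumptions, not part of web3.

```javascript
// Sliding-window fetch: keeps at most `concurrency` requests in flight and
// starts a new request as soon as any previous one settles.
// `fetchBlock` is a hypothetical, injected function: (blockNumber) => Promise<block>.
async function fetchBlocksWindowed(fetchBlock, from, to, concurrency = 10) {
  const results = new Map();
  let next = from;

  // Each worker claims the next unclaimed block number once its previous
  // request has finished. JavaScript is single-threaded, so `next++` is safe.
  async function worker() {
    while (next <= to) {
      const n = next++;
      results.set(n, await fetchBlock(n));
    }
  }

  const total = Math.max(0, to - from + 1);
  const workers = Array.from({ length: Math.min(concurrency, total) }, () => worker());
  await Promise.all(workers);

  // Return blocks in ascending order regardless of arrival order.
  return [...results.keys()].sort((a, b) => a - b).map((n) => results.get(n));
}
```

The same pool works for receipts if the injected function fetches a receipt by transaction hash instead, although it still issues one request per receipt.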

@danuker
danuker commented Dec 2, 2017

I am also interested in this. However, apparently this is not a use case for which Ethereum is designed:
ethereum/go-ethereum#2104 (comment)

@medvedev1088
medvedev1088 commented Jun 29, 2018

See if this tool is useful to you: https://github.com/medvedev1088/ethereum-etl. It uses JSON-RPC batch requests to speed up exporting blocks, transactions, receipts, logs, and ERC20 transfers.

Also help upvote this feature request ethereum/go-ethereum#17044
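For readers who want to try batching from Node.js directly, here is a minimal sketch of a JSON-RPC 2.0 batch for receipts. It is not how ethereum-etl is implemented; the helper names are made up for illustration, `rpcUrl` is a placeholder for your own node's HTTP endpoint, and the global `fetch` assumes Node.js 18 or later. `eth_getTransactionReceipt` is the standard RPC method name.

```javascript
// Build one JSON-RPC 2.0 batch that asks for many receipts in a single request.
function buildReceiptBatch(txHashes) {
  return txHashes.map((hash, i) => ({
    jsonrpc: '2.0',
    id: i,
    method: 'eth_getTransactionReceipt',
    params: [hash],
  }));
}

// Batch responses may come back in any order, so pair them with requests by id.
function receiptsByHash(batch, responses) {
  const byId = new Map(responses.map((r) => [r.id, r]));
  const out = {};
  for (const req of batch) {
    const res = byId.get(req.id);
    // A missing or errored response is recorded as null rather than thrown.
    out[req.params[0]] = res && !res.error ? res.result : null;
  }
  return out;
}

// Assumption: rpcUrl points at a node that accepts batched HTTP requests.
async function fetchReceipts(rpcUrl, txHashes) {
  if (txHashes.length === 0) return {}; // an empty batch is not a valid request
  const batch = buildReceiptBatch(txHashes);
  const response = await fetch(rpcUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(batch),
  });
  return receiptsByHash(batch, await response.json());
}
```

Nodes and hosted providers may cap the batch size, so in practice the hash list would likely need to be split into several batches.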

@danuker
danuker commented Jun 29, 2018

@medvedev1088 There is also a separate project for indexing the Ethereum blockchain.
Check out if it suits your needs: https://quickblocks.io/

@mesqueeb
mesqueeb commented Jul 6, 2018

I'm writing a Node.js program that gets a large number of blocks (>100,000) and looks at each transaction inside to filter / analyse the data.
I'm currently using

eth.getBlock(i, true, handle)

However, even though the handle is called asynchronously, my program (if not manually throttled) blasts out 100,000 requests, which I think overloads the Infura websocket...

I have manually written a throttle that makes only 10 requests, waits until all of them are handled, then moves on to the next 10.

My question: I imagine that a lot of people face this same problem, so is there any method from web3 more appropriate to request multiple blocks? Or is there any design aspect I can improve on?

PS: I had a look at the other solutions, but I'm writing in Node.js...
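The manual throttle described above (send 10, wait for all 10, then send the next 10) might look roughly like this. It is a sketch under assumptions: `promisifyGetBlock` and `fetchBlocksInChunks` are illustrative names, and `eth` is assumed to expose the callback-style `getBlock(number, fullTx, callback)` shown in the comment, with a Node-style `(error, block)` callback.

```javascript
// Wrap a callback-style eth.getBlock(number, fullTx, cb) in a promise.
// Assumes cb is called Node-style as (error, block).
function promisifyGetBlock(eth, fullTx = true) {
  return (number) =>
    new Promise((resolve, reject) => {
      eth.getBlock(number, fullTx, (err, block) => (err ? reject(err) : resolve(block)));
    });
}

// Fetch blocks from..to in fixed-size chunks: each chunk is sent at once and
// must finish completely before the next chunk starts.
async function fetchBlocksInChunks(fetchBlock, from, to, chunkSize = 10) {
  const blocks = [];
  for (let start = from; start <= to; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, to);
    const numbers = [];
    for (let n = start; n <= end; n++) numbers.push(n);
    // Promise.all preserves input order, so blocks stay in ascending order.
    blocks.push(...(await Promise.all(numbers.map((n) => fetchBlock(n)))));
  }
  return blocks;
}
```

A chunked loop is simple but idles while the slowest request in each chunk finishes; a pool that refills as soon as any request completes keeps the connection busier.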

@Zvezdin
Author

Zvezdin commented Jul 6, 2018

@mesqueeb as far as I'm aware, there is none. I also took a similar approach in Node in one of my projects (https://github.com/Zvezdin/blockchain-predictor/blob/master/js/getter.js). In addition, I cached each block and other data locally so that it's only requested once. This gets the job done for my data analysis case. The other solutions mentioned in this thread are the only other valid ones I was able to find.
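The "request each block only once" idea can be sketched as a caching wrapper around whatever function fetches a block. This is not taken from the linked getter.js; `withBlockCache` is an illustrative name, and the in-memory `Map` is an assumption standing in for a persistent local store.

```javascript
// Wrap a block fetcher so each block number is requested at most once.
// `fetchBlock` is a hypothetical, injected function: (blockNumber) => Promise<block>.
// `cache` defaults to an in-memory Map; a persistent store is assumed to be
// swapped in for real use.
function withBlockCache(fetchBlock, cache = new Map()) {
  return async function cachedFetch(number) {
    if (cache.has(number)) return cache.get(number);
    // Cache the promise itself so concurrent callers share one request.
    const pending = fetchBlock(number);
    cache.set(number, pending);
    try {
      return await pending;
    } catch (err) {
      // Do not keep failed requests, so a later call can retry.
      cache.delete(number);
      throw err;
    }
  };
}
```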

@BenKnigge

I'm likely going to be implementing something for my own personal use sometime over the next day or two.

My plan is to add another API endpoint based on GetBlockByNumber in the file

https://github.com/ethereum/go-ethereum/blob/master/internal/ethapi/api.go

I was thinking of calling it GetBlocksByRange with the following signature.

GetBlocksByRange(ctx context.Context, fromBlockNr rpc.BlockNumber, toBlockNr rpc.BlockNumber, fullTx bool) ([]map[string]interface{}, error)

It's basically just a loop around the code in GetBlockByNumber. I imagine that returning 100s or 1000s of blocks in a single request would be significantly more efficient.
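Until such an endpoint exists, a client can approximate a range request by packing one standard `eth_getBlockByNumber` call per block into a single JSON-RPC batch. The sketch below only builds the payload; `buildBlockRangeBatch` is an illustrative name, not a geth or web3 function, and it does not implement the proposed `GetBlocksByRange`.

```javascript
// Build a JSON-RPC 2.0 batch covering fromBlock..toBlock (inclusive).
// Block numbers are encoded as hex quantities ("0x0", "0xff", ...),
// which is the format eth_getBlockByNumber expects.
function buildBlockRangeBatch(fromBlock, toBlock, fullTx = true) {
  if (!Number.isInteger(fromBlock) || !Number.isInteger(toBlock) ||
      fromBlock < 0 || toBlock < fromBlock) {
    throw new RangeError('expected integers with 0 <= fromBlock <= toBlock');
  }
  const batch = [];
  for (let n = fromBlock; n <= toBlock; n++) {
    batch.push({
      jsonrpc: '2.0',
      id: n, // use the block number as id so responses are easy to match
      method: 'eth_getBlockByNumber',
      params: ['0x' + n.toString(16), fullTx],
    });
  }
  return batch;
}
```

The resulting array can be sent as the body of one HTTP POST. A server-side range method would still save the per-request overhead that each entry in the batch carries.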

I haven't contributed to this project before, but I would love to get this into a future version of geth, if possible, once I've finished and tested it.

Does anyone have any thoughts on this prior to me commencing?

@nivida do you think you could help guide me through the geth contribution process once I'm done?

nivida added the "2.x 2.0 related issues" label Jun 20, 2019

@MartinWeise

Is there any progress made so far? BR

@github-actions
github-actions bot commented Jul 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions

7 participants