Adaptive log range #911
Conversation
Resolves #776. The idea is to adapt on the fly if calls start failing due to too many logs. Caching the previous step is an idea, but I don't think it would be that helpful, and it would increase complexity.
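A minimal sketch of that approach (hypothetical helper names, with `String` standing in for the web3 log and error types): shrink the per-request range whenever the provider rejects a call for returning too many logs, and otherwise advance past the range just covered.

```rust
// Sketch only, not the PR's exact code.
const TOO_MANY_LOGS_FINGERPRINT: &str = "ServerError(-32005)";

fn scan_logs<F>(from: u64, to: u64, initial_step: u64, fetch: F) -> Result<Vec<String>, String>
where
    F: Fn(u64, u64) -> Result<Vec<String>, String>,
{
    let mut step = initial_step;
    let mut start = from;
    let mut logs = Vec::new();
    while start <= to {
        let end = start.saturating_add(step).min(to);
        match fetch(start, end) {
            Ok(mut batch) => {
                logs.append(&mut batch);
                start = end + 1; // advance past the range we just covered
            }
            // Provider rejected the range: shrink it and retry. Note that a
            // single-block request that keeps failing retries forever here.
            Err(e) if e.contains(TOO_MANY_LOGS_FINGERPRINT) => step /= 10,
            Err(e) => return Err(e),
        }
    }
    Ok(logs)
}
```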
@yanivtal Yes, it's specific to Infura.
Looking good!
```rust
// Code returned by Infura if a request returns too many logs.
// web3 doesn't seem to offer a better way of checking the error code.
const TOO_MANY_LOGS_FINGERPRINT: &str = "ServerError(-32005)";
```
```rust
let eth_get_logs = move |eth_adapter: Self,
```
I'd prefer if this remained in its own method, taking in the error fingerprint.
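The shape being suggested might look roughly like this (all names hypothetical, with placeholder types so the sketch stands alone): the fetch stays a named method, and the provider-specific fingerprint comes in as a parameter rather than being baked into a closure.

```rust
struct EthAdapter;

impl EthAdapter {
    // Placeholder body: a real implementation would issue the eth_getLogs
    // call; callers match failures against `too_many_logs_fingerprint`.
    fn logs_in_range(
        &self,
        from: u64,
        to: u64,
        too_many_logs_fingerprint: &'static str,
    ) -> Result<Vec<String>, String> {
        let _ = (from, to, too_many_logs_fingerprint);
        Ok(Vec::new())
    }
}
```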
```rust
        .from_err()
    })
    .map_err(move |e| {
        e.into_inner().unwrap_or_else(move || {
```
It looks like we're losing this error message. Is that deliberate?
Yes, since this is retried indefinitely, this error can never be printed afaict.
```rust
let string_err = e.to_string();
if string_err.contains(TOO_MANY_LOGS_FINGERPRINT) {
    let new_step = (step / 10).max(1);
    debug!(logger, "Reducing log step"; "new_step" => new_step);
```
How about "Reducing block range size to scan for events", and "new_size" for the key, since that's what the environment variable is called?
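Applied to the quoted snippet, that suggestion would read roughly (same slog syntax as above):

```rust
debug!(logger, "Reducing block range size to scan for events"; "new_size" => new_step);
```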
This reverts commit 810892a.
Also, I realize it's correct to have a step of 0, since we always increase the start and advance by a block.
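A tiny runnable illustration of why a step of 0 still makes progress (illustrative loop, not the PR's code): the range end is `start + step`, and `start` is bumped past the end each iteration, so step 0 simply scans one block per request.

```rust
fn main() {
    let (mut start, to, step) = (100u64, 102u64, 0u64);
    while start <= to {
        let end = (start + step).min(to);
        // Prints 100..=100, 101..=101, 102..=102: one block per request.
        println!("requesting logs for blocks {}..={}", start, end);
        start = end + 1;
    }
}
```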
Review addressed. This will help with slow syncs, but I've seen subgraphs quickly drop to a range of 10 blocks. Bringing parallel log requests back will probably be worth it.
Are we not doing parallel log requests anymore? Another thought: How soon do we go back to the configured block range size? If we have a block range size of 10,000 and we have to reduce the range size to 5 to get past the first block in this range, does that mean we stick to a range size of 5 for the remaining 9,999 blocks?
No, we dropped it in the block triggers PR. I was ok with that since it could have complicated making this change, which we need quickly, but we should bring it back with adaptive ranges in mind. Yes, that's right. We can try to increase the range more often, but there is a tradeoff: at one extreme we keep the low range forever, and at the other we start from the large range every time.
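One middle ground between those extremes, sketched here as a hypothetical AIMD-style policy (not part of this PR): shrink fast on failure, grow gradually on success, capped at the configured size.

```rust
// Hypothetical recovery policy, not part of this PR.
struct AdaptiveRange {
    configured: u64, // e.g. the value of ETHEREUM_BLOCK_RANGE_SIZE
    current: u64,
}

impl AdaptiveRange {
    // Too-many-logs error: divide by 10, as the PR does.
    fn on_failure(&mut self) {
        self.current /= 10;
    }

    // Successful request: double the range, never exceeding the configured size.
    fn on_success(&mut self) {
        self.current = self.current.saturating_mul(2).min(self.configured).max(1);
    }
}
```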
Resolves #776. If we start getting this error even when requesting logs for a single block, we should re-evaluate in a new issue.
The first thing this does is have `ETHEREUM_FAST_SCAN_END` affect the block stream as a whole; we now multiply `ETHEREUM_BLOCK_RANGE_SIZE` by 10 when doing a fast scan. The one-size-fits-all nature of these values can be an issue, so I also added a special case to go one block at a time if there are unconditional block triggers. We should at some point change the way we control the amount of triggers processed at a time.

The adaptive logic for log ranges first tries requesting the entire given range; if that doesn't work, the step is reduced on each failure until we no longer get the 1000 logs error.
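The range-size behavior described above could be summarized in a sketch like the following (simplified; the real code reads these values from the environment):

```rust
// Sketch of the described behavior, not the PR's exact code.
fn effective_range_size(
    configured: u64,                       // ETHEREUM_BLOCK_RANGE_SIZE
    fast_scan: bool,                       // still below ETHEREUM_FAST_SCAN_END
    has_unconditional_block_triggers: bool,
) -> u64 {
    if has_unconditional_block_triggers {
        1 // special case: go one block at a time
    } else if fast_scan {
        configured * 10
    } else {
        configured
    }
}
```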
There are also some refactors along the way.