IP Filter #635

Merged
merged 4 commits into from
Oct 5, 2023

@peterjan (Member) commented Oct 2, 2023

Our IP filter isn't always able to resolve a host's IP. When resolution fails, we treat the host's IP as redundant, which at times causes unnecessary churn. This PR doesn't really change that, but while tracking down where the churn was coming from I found a few things I wanted to improve or suggest.

Unfortunately, I'm not sure we can really handle these errors any better. Currently we treat any failure to resolve a host's IP as a redundant IP, which will definitely cause some churn since we sometimes run into "server misbehaving" errors. We at least log them now, so we're more aware of them.

edit: @ChrisSchinnerl I refactored it a little because I wanted to get rid of the .Reset() and find a reasonably good spot to prune the cache. You can now ask the contractor for a newIPFilter that wraps a resolver type that has a cache.

@peterjan (Member, Author) commented Oct 3, 2023

@ChrisSchinnerl I found that on both my node and the integrity-checker node, the "server misbehaving" and "I/O timeout" errors that sometimes occur on DNS lookups cause contracts to flip-flop in and out of the contract set, leading to unnecessary migrations and spammy alert messages.

Currently we mark the host as redundant and move on, but maybe we could either increase the timeout or fall back to a cached lookup response when a lookup fails? We could put the IPFilter on the contractor instead and resort to cached addresses if we have one that isn't stale.

Review thread on autopilot/contractor.go (outdated, resolved)
Review thread on autopilot/ipfilter.go (outdated, resolved)

@ChrisSchinnerl (Member) commented

> @ChrisSchinnerl I found that on both my node and the integrity-checker node, the "server misbehaving" and "I/O timeout" errors that sometimes occur on DNS lookups cause contracts to flip-flop in and out of the contract set, leading to unnecessary migrations and spammy alert messages.
>
> Currently we mark the host as redundant and move on, but maybe we could either increase the timeout or fall back to a cached lookup response when a lookup fails? We could put the IPFilter on the contractor instead and resort to cached addresses if we have one that isn't stale.

Yeah, we could try that. Maybe give the cached entries a reasonable timeout, like a day or so.

@peterjan self-assigned this Oct 5, 2023
@ChrisSchinnerl merged commit e7cafa3 into master Oct 5, 2023
6 checks passed
@ChrisSchinnerl deleted the pj/ip-filter branch October 5, 2023 14:43