roachprod: ns-cloud-a1.googledomains.com lookup failure #111269
cc @cockroachdb/test-eng
herkolategan added a commit to herkolategan/cockroach that referenced this issue on Sep 26, 2023

Previously we observed a flake while trying to look up the IP of the DNS server `ns-cloud-a1.googledomains.com`. This change replaces the host name with its static IP (216.239.32.106) in order to reduce flakes.

Fixes: cockroachdb#111269
Epic: None
Release Note: None
herkolategan added a commit to herkolategan/cockroach that referenced this issue on Oct 25, 2023

Previously we observed a flake while trying to look up the IP of the DNS server `ns-cloud-a1.googledomains.com`. This change falls back to a known static IP (216.239.32.106) if the lookup fails, in order to reduce flakes.

Fixes: cockroachdb#111269
Epic: None
Release Note: None
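
The fallback described in this commit message could look roughly like the following. This is a minimal sketch, not the code from the commit: the `hostLookup` type, `systemLookup`, and `dnsServerIP` are hypothetical names, and the 5 second timeout is an assumption. Only the host name and the static IP come from the message above.

```go
package main

import (
	"context"
	"net"
	"time"
)

// Values taken from the commit message above.
const (
	dnsServerHost       = "ns-cloud-a1.googledomains.com"
	fallbackDNSServerIP = "216.239.32.106"
)

// hostLookup is a hypothetical seam so the lookup can be stubbed in tests.
type hostLookup func(host string) ([]string, error)

// systemLookup resolves host with Go's default resolver.
// The 5 second timeout is an assumption, not a value from the commit.
func systemLookup(host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return net.DefaultResolver.LookupHost(ctx, host)
}

// dnsServerIP returns the first address the lookup yields, or the known
// static IP when the lookup fails or returns no addresses.
func dnsServerIP(lookup hostLookup) string {
	addrs, err := lookup(dnsServerHost)
	if err != nil || len(addrs) == 0 {
		return fallbackDNSServerIP
	}
	return addrs[0]
}
```

Calling `dnsServerIP(systemLookup)` would use the real resolver. Passing the lookup as a parameter keeps the fallback path testable without network access.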
herkolategan added a commit to herkolategan/cockroach that referenced this issue on Nov 2, 2023

Previously we observed flakes while trying to look up the IP of the DNS server `ns-cloud-a1.googledomains.com`. This change adds a new function that uses multiple resolvers to resolve the IP. In the worst case it falls back to a known static IP (216.239.32.106), although this IP might not remain correct.

Fixes cockroachdb#111269
Epic: None
Release Note: None
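
One way to try several resolvers in order before giving up might look like this. It is a sketch under assumptions rather than the function added by the commit: `resolverAt` and `resolveWithFallback` are hypothetical names, and the timeouts are illustrative. The resolver addresses passed to `resolverAt` would be chosen by the caller.

```go
package main

import (
	"context"
	"net"
	"time"
)

// hostLookup is a hypothetical seam so individual resolvers can be stubbed.
type hostLookup func(host string) ([]string, error)

// resolverAt builds a lookup that sends its queries to the DNS server at
// addr (host:port) instead of the system-configured one.
func resolverAt(addr string) hostLookup {
	r := &net.Resolver{
		PreferGo: true, // use Go's resolver so the custom Dial is honored
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, addr)
		},
	}
	return func(host string) ([]string, error) {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		return r.LookupHost(ctx, host)
	}
}

// resolveWithFallback tries each lookup in order and returns the first
// address found. If every lookup fails or comes back empty, it returns the
// static fallback and reports that the fallback was used.
func resolveWithFallback(host, fallback string, lookups ...hostLookup) (ip string, usedFallback bool) {
	for _, lookup := range lookups {
		addrs, err := lookup(host)
		if err == nil && len(addrs) > 0 {
			return addrs[0], false
		}
	}
	return fallback, true
}
```

Returning `usedFallback` lets the caller log when the static IP was used, which matters because, as the message notes, that IP might not remain correct.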
herkolategan added a commit to herkolategan/cockroach that referenced this issue on Nov 17, 2023

Previously `net.LookupSRV` with a custom resolver was used to look up DNS records. This approach resulted in several flakes and required waiting on DNS servers to have the records available. The CLI is more stable, but has a greater call overhead.

Fixes cockroachdb#111269
Epic: None
Release Note: None
craig bot pushed a commit that referenced this issue on Nov 17, 2023

113934: roachprod: use gcloud CLI instead of net.LookupSRV r=renatolabs a=herkolategan

Previously `net.LookupSRV` with a custom resolver was used to look up DNS records. This approach resulted in several flakes and required waiting on DNS servers to have the records available. The CLI is more stable, but has a greater call overhead.

This PR also introduces a cache to reduce the cost of the `LookupSRVRecords` call, which could be called frequently depending on the origin of use. The cache is updated for any CRUD operations on the DNS entries, and a call to the CLI will not occur if any entry exists for the name the lookup is attempting. The names are also normalised to remove a trailing dot in order to make matching against the cache work correctly.

There is a small risk that the cache could go out of sync if any other roachprod process manipulates the records with a create, update or destroy operation while a continuous roachprod process is interacting with the entries. This risk is relatively small and usually applies to roachtest rather than everyday use of roachprod.

Fixes #111269
Epic: None
Release Note: None

113996: upgrade: use high priority txn's to update the cluster version r=fqazi a=fqazi

Previously, it was possible for the leasing subsystem to starve out attempts to set the cluster version during upgrades, since the leasing subsystem uses high priority txns for renewals. To address this, this patch makes the logic to set the cluster version high priority so it can't be pushed out by lease renewals.

Fixes: #113908
Release note (bug fix): Addressed a bug that could cause cluster version finalization to get starved out by descriptor lease renewals on larger clusters.

Co-authored-by: Herko Lategan <herko@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
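
The cache behaviour described in 113934 (names normalised by dropping a trailing dot, no expensive lookup when an entry already exists, cache updated on create, update and destroy) can be sketched as follows. This is an illustration under assumptions, not roachprod's implementation: `srvRecord`, `srvCache`, `fetchFunc`, and the method names are hypothetical, and `fetch` stands in for the gcloud CLI call, whose exact invocation is not shown in this issue.

```go
package main

import (
	"strings"
	"sync"
)

// srvRecord is a hypothetical, simplified SRV record.
type srvRecord struct {
	Target string
	Port   uint16
}

// fetchFunc stands in for the expensive lookup (the CLI call in the PR).
type fetchFunc func(name string) ([]srvRecord, error)

// srvCache keeps records per normalised name so repeated lookups can skip
// the expensive fetch.
type srvCache struct {
	mu      sync.Mutex
	records map[string][]srvRecord
	fetch   fetchFunc
}

func newSRVCache(fetch fetchFunc) *srvCache {
	return &srvCache{records: make(map[string][]srvRecord), fetch: fetch}
}

// normalizeName drops a trailing dot so "a.b." and "a.b" share one entry.
func normalizeName(name string) string {
	return strings.TrimSuffix(name, ".")
}

// Lookup returns cached records when an entry exists for the name and only
// calls fetch on a miss. Caching non-empty fetch results is an assumption.
func (c *srvCache) Lookup(name string) ([]srvRecord, error) {
	name = normalizeName(name)
	c.mu.Lock()
	defer c.mu.Unlock()
	if recs, ok := c.records[name]; ok {
		return recs, nil
	}
	recs, err := c.fetch(name)
	if err != nil {
		return nil, err
	}
	if len(recs) > 0 {
		c.records[name] = recs
	}
	return recs, nil
}

// Put mirrors a create or update of the DNS entries for name.
func (c *srvCache) Put(name string, recs []srvRecord) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.records[normalizeName(name)] = recs
}

// Delete mirrors a destroy of the DNS entries for name.
func (c *srvCache) Delete(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.records, normalizeName(name))
}
```

A cache like this only sees writes made through its own process, which is the out-of-sync risk the PR description mentions: records changed by another roachprod process would not be reflected until the entry is deleted or the process restarts.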
cockroach-teamcity pushed a commit to cockroach-teamcity/cockroach that referenced this issue on Nov 27, 2023

Previously `net.LookupSRV` with a custom resolver was used to look up DNS records. This approach resulted in several flakes and required waiting on DNS servers to have the records available. The CLI is more stable, but has a greater call overhead.

Fixes cockroachdb#111269
Epic: None
Release Note: None
annrpom pushed a commit to annrpom/cockroach that referenced this issue on Nov 29, 2023

Previously `net.LookupSRV` with a custom resolver was used to look up DNS records. This approach resulted in several flakes and required waiting on DNS servers to have the records available. The CLI is more stable, but has a greater call overhead.

Fixes cockroachdb#111269
Epic: None
Release Note: None
We observed a DNS flake on a custom run of roachtest [1]. It was unable to look up `ns-cloud-a1.googledomains.com`, the DNS server against which follow-up DNS requests would have been made.

[1] https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyGceBazel/11905017
Jira issue: CRDB-31847