Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webfinger timeout seems too short : increasing sidekiq error rate caused by PR #13836 #14091

Closed
ponapalt opened this issue Jun 19, 2020 · 3 comments · Fixed by #14919
Closed

Comments

@ponapalt
Copy link

Expected behaviour

Acceptable sidekiq error rate.

Actual behaviour

Sidekiq retry and dead job queue growing more than before.

Below is error message sample:

HTTP::TimeoutError
execution expired
/usr/home/mastodon/live/app/helpers/webfinger_helper.rb:8:in initialize' /usr/home/mastodon/live/app/helpers/webfinger_helper.rb:8:in open'
/usr/home/mastodon/live/app/helpers/webfinger_helper.rb:8:in `block in connect'

Steps to reproduce the problem

Upgrade mastodon to master branch d890abf (PR 13836) or newer.

Specifications

This problem is caused by PR #13836

Amendment Proposal

I think connection timeout is too short for busy servers.
It's difficult to find optimal value of timeout, but, based on my investigation, It seems 120sec. is good start for us.

line 29 of app/helpers/webfinger_helper.rb:

timeout_options: {
 write_timeout: 10,
 connect_timeout: 120,  # <== increase timeout value
 read_timeout: 10,
},

Note

This problem seems to occur while processing ActivityPub Announce activity (a.k,a, Boost, BT).
I haven't checked in detail, but there may be performance issues in processing Announce activity.

@Gargron
Copy link
Member

Gargron commented Jun 19, 2020

I believe raised error rates is a natural effect of tightened timeout limits. Previously those requests were holding up the Sidekiq queue by possibly minutes. The idea is to fail fast and retry later because it allows more successful requests to be processed instead of waiting for just one thing.

@ponapalt
Copy link
Author

Thanks for your comment.
I thought twice about shortening the timeout, and it's also a convincing strategy for saving resources.

It seems I was too concerned about missing toots due to too short timeout.
I'd like to close this issue after waiting a while for any other opinions.

Gargron added a commit that referenced this issue Oct 4, 2020
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix #14091
@Gargron
Copy link
Member

Gargron commented Oct 4, 2020

I apologize for dismissing this bug report. I believe there was something amiss after all.

I noticed that my dead job queue has many entries for HTTP::TimeoutError. When I check the domain from a browser, it opens instantly. When I retry the job, it raises HTTP::TimeoutError. I was able to narrow it down to the goldfinger gem and vanilla HTTP.rb that it uses. Something around its default DNS resolution and timeout handling is not correct. Therefore, I am submitting a fix that removes the goldfinger gem and instead re-implements simplified webfinger resolution using our own, more advanced Request class.

@Gargron Gargron reopened this Oct 4, 2020
Gargron added a commit that referenced this issue Oct 7, 2020
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix #14091
Gargron added a commit that referenced this issue Oct 7, 2020
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix #14091
mashirozx added a commit to mashirozx/mastodon that referenced this issue Oct 16, 2020
* Bump babel-jest from 26.3.0 to 26.5.2 (mastodon#14945)

Bumps [babel-jest](https://github.com/facebook/jest/tree/HEAD/packages/babel-jest) from 26.3.0 to 26.5.2.
- [Release notes](https://github.com/facebook/jest/releases)
- [Changelog](https://github.com/facebook/jest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/facebook/jest/commits/v26.5.2/packages/babel-jest)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @github/webauthn-json from 0.5.5 to 0.5.6 (mastodon#14946)

Bumps [@github/webauthn-json](https://github.com/github/webauthn-json) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/github/webauthn-json/releases)
- [Commits](github/webauthn-json@v0.5.5...v0.5.6)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump sass from 1.26.11 to 1.26.12 (mastodon#14947)

Bumps [sass](https://github.com/sass/dart-sass) from 1.26.11 to 1.26.12.
- [Release notes](https://github.com/sass/dart-sass/releases)
- [Changelog](https://github.com/sass/dart-sass/blob/master/CHANGELOG.md)
- [Commits](sass/dart-sass@1.26.11...1.26.12)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump eslint-plugin-react from 7.21.2 to 7.21.3 (mastodon#14950)

Bumps [eslint-plugin-react](https://github.com/yannickcr/eslint-plugin-react) from 7.21.2 to 7.21.3.
- [Release notes](https://github.com/yannickcr/eslint-plugin-react/releases)
- [Changelog](https://github.com/yannickcr/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](jsx-eslint/eslint-plugin-react@v7.21.2...v7.21.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump mini-css-extract-plugin from 0.11.0 to 0.11.3 (mastodon#14949)

Bumps [mini-css-extract-plugin](https://github.com/webpack-contrib/mini-css-extract-plugin) from 0.11.0 to 0.11.3.
- [Release notes](https://github.com/webpack-contrib/mini-css-extract-plugin/releases)
- [Changelog](https://github.com/webpack-contrib/mini-css-extract-plugin/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/mini-css-extract-plugin@v0.11.0...v0.11.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump jest from 26.4.2 to 26.5.2 (mastodon#14951)

Bumps [jest](https://github.com/facebook/jest) from 26.4.2 to 26.5.2.
- [Release notes](https://github.com/facebook/jest/releases)
- [Changelog](https://github.com/facebook/jest/blob/master/CHANGELOG.md)
- [Commits](jestjs/jest@v26.4.2...v26.5.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump eslint from 7.6.0 to 7.10.0 (mastodon#14948)

Bumps [eslint](https://github.com/eslint/eslint) from 7.6.0 to 7.10.0.
- [Release notes](https://github.com/eslint/eslint/releases)
- [Changelog](https://github.com/eslint/eslint/blob/master/CHANGELOG.md)
- [Commits](eslint/eslint@v7.6.0...v7.10.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* update themes

* Remove dependency on goldfinger gem (mastodon#14919)

There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix mastodon#14091

* Fix unread notification marker not updating when mounting column (mastodon#14954)

* Fix issue checking for last unread notification when there are gaps (mastodon#14960)

* add & fix themes

* update theme

* fix theme

* fix theme

* Add IP-based rules (mastodon#14963)

* Fix browser notification permission request logic (mastodon#13543)

* Add notification permission handling code

* Request notification permission when enabling any notification setting

* Add badge to notification settings when permissions insufficient

* Disable alerts by default, requesting permission and enable them on onboarding

* Add duration parameter to muting. (mastodon#13831)

* Adding duration to muting.

* Remove useless checks

* helm: add optional cron job to run `tootctl remove media` (mastodon#14396)

* Change how CDN_HOST is passed down to make assets build reproducible (mastodon#14381)

* Change how CDN_HOST is passed down to make assets build reproducible

* Change webpacker/webpack configuration to dynamically load publicPath based on meta header

* Fix embedded layout missing the cdn-host meta header

* Bump compression-webpack-plugin from 6.0.2 to 6.0.3 (mastodon#14979)

Bumps [compression-webpack-plugin](https://github.com/webpack-contrib/compression-webpack-plugin) from 6.0.2 to 6.0.3.
- [Release notes](https://github.com/webpack-contrib/compression-webpack-plugin/releases)
- [Changelog](https://github.com/webpack-contrib/compression-webpack-plugin/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/compression-webpack-plugin@v6.0.2...v6.0.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump sass-loader from 10.0.2 to 10.0.3 (mastodon#14977)

Bumps [sass-loader](https://github.com/webpack-contrib/sass-loader) from 10.0.2 to 10.0.3.
- [Release notes](https://github.com/webpack-contrib/sass-loader/releases)
- [Changelog](https://github.com/webpack-contrib/sass-loader/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/sass-loader@v10.0.2...v10.0.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump imports-loader from 1.1.0 to 1.2.0 (mastodon#14976)

Bumps [imports-loader](https://github.com/webpack-contrib/imports-loader) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/webpack-contrib/imports-loader/releases)
- [Changelog](https://github.com/webpack-contrib/imports-loader/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/imports-loader@v1.1.0...v1.2.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump tzinfo-data from 1.2020.1 to 1.2020.2 (mastodon#14966)

Bumps [tzinfo-data](https://github.com/tzinfo/tzinfo-data) from 1.2020.1 to 1.2020.2.
- [Release notes](https://github.com/tzinfo/tzinfo-data/releases)
- [Commits](tzinfo/tzinfo-data@v1.2020.1...v1.2020.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rubocop from 0.92.0 to 0.93.0 (mastodon#14967)

Bumps [rubocop](https://github.com/rubocop-hq/rubocop) from 0.92.0 to 0.93.0.
- [Release notes](https://github.com/rubocop-hq/rubocop/releases)
- [Changelog](https://github.com/rubocop-hq/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v0.92.0...v0.93.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump file-loader from 6.1.0 to 6.1.1 (mastodon#14974)

Bumps [file-loader](https://github.com/webpack-contrib/file-loader) from 6.1.0 to 6.1.1.
- [Release notes](https://github.com/webpack-contrib/file-loader/releases)
- [Changelog](https://github.com/webpack-contrib/file-loader/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/file-loader@v6.1.0...v6.1.1)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump eslint-plugin-react from 7.21.3 to 7.21.4 (mastodon#14968)

Bumps [eslint-plugin-react](https://github.com/yannickcr/eslint-plugin-react) from 7.21.3 to 7.21.4.
- [Release notes](https://github.com/yannickcr/eslint-plugin-react/releases)
- [Changelog](https://github.com/yannickcr/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](jsx-eslint/eslint-plugin-react@v7.21.3...v7.21.4)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump terser-webpack-plugin from 4.2.2 to 4.2.3 (mastodon#14971)

Bumps [terser-webpack-plugin](https://github.com/webpack-contrib/terser-webpack-plugin) from 4.2.2 to 4.2.3.
- [Release notes](https://github.com/webpack-contrib/terser-webpack-plugin/releases)
- [Changelog](https://github.com/webpack-contrib/terser-webpack-plugin/blob/master/CHANGELOG.md)
- [Commits](webpack-contrib/terser-webpack-plugin@v4.2.2...v4.2.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump eslint from 7.10.0 to 7.11.0 (mastodon#14975)

Bumps [eslint](https://github.com/eslint/eslint) from 7.10.0 to 7.11.0.
- [Release notes](https://github.com/eslint/eslint/releases)
- [Changelog](https://github.com/eslint/eslint/blob/master/CHANGELOG.md)
- [Commits](eslint/eslint@v7.10.0...v7.11.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump sass from 1.26.12 to 1.27.0 (mastodon#14973)

Bumps [sass](https://github.com/sass/dart-sass) from 1.26.12 to 1.27.0.
- [Release notes](https://github.com/sass/dart-sass/releases)
- [Changelog](https://github.com/sass/dart-sass/blob/master/CHANGELOG.md)
- [Commits](sass/dart-sass@1.26.12...1.27.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump jest from 26.5.2 to 26.5.3 (mastodon#14969)

Bumps [jest](https://github.com/facebook/jest) from 26.5.2 to 26.5.3.
- [Release notes](https://github.com/facebook/jest/releases)
- [Changelog](https://github.com/facebook/jest/blob/master/CHANGELOG.md)
- [Commits](jestjs/jest@v26.5.2...v26.5.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix a bear check when the activity object is nil (mastodon#14981)

* Change how missing desktop notifications permission is displayed (mastodon#14985)

Add missing controls for new notification type

* Fix strings that could not be translated (mastodon#14980)

* Fix translation string (mastodon#14986)

* update theme

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eugen Rochko <eugen@zeonfederated.com>
Co-authored-by: ThibG <thib@sitedethib.com>
Co-authored-by: OSAMU SATO <satosamu@gmail.com>
Co-authored-by: Alex Dunn <dunn.alex@gmail.com>
Co-authored-by: Takeshi Umeda <noel.yoshiba@gmail.com>
Co-authored-by: mayaeh <mayaeh@marimo-net.org>
Gargron added a commit that referenced this issue Oct 19, 2020
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix #14091
highemerly pushed a commit to highemerly/mastodon that referenced this issue Dec 30, 2020
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix mastodon#14091
aquarla pushed a commit to iwatedon/mastodon that referenced this issue Dec 24, 2021
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix mastodon#14091
shouo1987 pushed a commit to CrossGate-Pawoo/mastodon that referenced this issue Dec 7, 2022
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix mastodon#14091
aquarla pushed a commit to iwatedon/mastodon that referenced this issue Mar 11, 2024
There are edge cases where requests to certain hosts timeout when
using the vanilla HTTP.rb gem, which the goldfinger gem uses. Now
that we no longer need to support OStatus servers, webfinger logic
is so simple that there is no point encapsulating it in a gem, so
we can just use our own Request class. With that, we benefit from
more robust timeout code and IPv4/IPv6 resolution.

Fix mastodon#14091
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants