-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor user agent handling and introduce site-specific quirks #5157
Conversation
We now use a format string for the user_agent setting and parse both backend's default user agents to get the needed information. Fixes #513
No point in unpacking them into their components just to pack them again.
WhatsApp Web: See #4445 Google Accounts: See #5147 Slack: See #4669 (With older QtWebEngine versions, Slack refuses to work, despite it (seemingly) working fine.) Dell Support Google Docs: See #4805 and https://bugreports.qt.io/browse/QTBUG-69652 Note that the default UA with Chrome/... shows a "browser not supported" warning. Fixes #4810
Seems fine to me. The reason for unpacking was because id was tracked per
Values, but if we track it globally that's not needed.
The UA parsing changes seems like a hack that will break at any change in the
format, but I don't personally care (I use a constant UA and I'm not going to
maintain that). Is there an escape-hatch to truly use the un-altered
user-agent from the backend in case the format changes?
I would personally resolve the quirk UAs before putting them in settings on
startup, and just use constants for overrides. I haven't profiled the
regression here yet, but reading the UA from config is one of the slowest
parts of per-request work in qutebrowser, and I'm not looking forward to even
more overhead here.
|
Ah, thanks for the explanation!
We've already had that hack (for getting the Chromium version) and other projects (e.g. Otter) had something similar for a long time - thanks to user agents being a legacy compatibility disaster, it's very unlikely to change. I also want to take a look at some crash reports where the Chromium version parsing didn't work - from a quick look, most are with qutebrowser in a weird state, but I'm not sure what'll happen with the current code.
Nope. I don't think one is needed, but I'll think about it some more.
That's why this PR adds caching there 😉 So this only matters for the first request to whatever website, and it prevents paying a cost for quirk UAs you're never going to use. |
j
Florian Bruhin writes:
That's why this PR adds caching there 😉 So this only matters for the
first request to whatever website, and it prevents paying a cost for quirk
UAs you're never going to use.
Ah, I missed that, this seems fine then :)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good. I'm sure you died a little more inside while writing this though. The web is such a wonderful place.
user_agents = { | ||
'https://web.whatsapp.com/': no_qtwe_ua, | ||
'https://accounts.google.com/ServiceLogin': firefox_ua, | ||
'https://*.slack.com/*': new_chrome_ua, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slack complains on Qt 5.11 (it lets you dismiss the message now though, from memory) and doesn't complain on 5.13. I've only tested briefly on 5.12 and it doesn't appear to complain there either. So if you can bother checking for 5.11 before adding the slack one I think that would be appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this - apparently Slack is quite aggressive in blocking old Chromium versions despite them working fine (at least according to what @jgkamat said in #4669). That's why I ended up just sending Chrome/99
no matter what Qt version, so we don't need to update this a lot in the future. Maybe it'd be more reasonable to send a "real" Chrome version and only on affected Qt versions? But then we somehow need to notice when it breaks again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose tomorrow they could decide to start blocking whatever chome versions 5.12 is based on as well.
@@ -333,6 +381,8 @@ def init(args): | |||
global_settings = WebEngineSettings(_SettingsWrapper()) | |||
global_settings.init_settings() | |||
|
|||
_init_site_specific_quirks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put this in _update_settings()
too in case people have quirks off by default but then run into a broken site and don't want to restart. Although I suppose turning off still wouldn't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a bit confusing when turning on a setting without restart works, but turning it off requires a restart. I'm also guessing people won't disable those much - the only reason I see to disable them is for testing the quirks.
desc: 'Enable quirks (such as faked user agent headers) needed to get | ||
specific sites to work properly.' | ||
|
||
# emacs: ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fixes emacs' buggy syntax highlighting which sometimes highlights the rest of the file as a string... There's another instance or two elsewhere in the file as well.
default: null | ||
default: 'Mozilla/5.0 ({os_info}) | ||
AppleWebKit/{webkit_version} (KHTML, like Gecko) | ||
{qt_key}/{qt_version} {upstream_browser_key}/{upstream_browser_version} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any downside to sticking qutebrowser/{qutebrowser_version}
in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will probably break many sites that have strict parsing of the UA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe (I will test!) but if so I would think the same was true of the QtWebEngine bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here was to keep the QtWebEngine user agent 1:1 the same to what it was before, and keeping the QtWebKit user agent somewhat close to stock as well (it sends something like qutebrowser/1.8.9
if an application sets that version, but Qt/5.14.0
for applications which don't set anything).
if not config.val.content.site_specific_quirks: | ||
return | ||
|
||
# default_ua = ("Mozilla/5.0 ({os_info}) " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this comment need to stay here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep it, as it makes it easier to add new user agent strings below by copy-pasting this one and adjusting it in some way.
Note to self: music.amazon.com is broken as well, see this Reddit post. From a quick test, I only can make it show different error messages but I can't actually make it work... |
Turns out it can - I fixed that in dcf8d83 and merged everything. Thanks @jgkamat and @toofar for the reviews! |
This PR refactors user-agent handling so we don't directly use the default UA from QtWebKit/QtWebEngine anymore (#513). To do so, the
user_agent
setting now is a template string, getting filled from information we parse from the default UA.With that change, we can also fix issues like #2800 properly, as we now can get a proper UA without having a QWebEngineProfile/QWebPage handy.
I also added some code so specific
ScopedValue
settings can be hidden fromqute://configdiff
, which fixes #4076. Finally, this allows us to introduce "site specific quirks", i.e. custom user agents for badly behaving websites such as Google, WhatsApp or Slack (#4810 and others).Since this turned out to be a quite big change, I'd love some feedback. Looking at the commits, for most commits one of you made some changes in those areas, so I'd appreciate if you'd at least review those if time allows - thanks!
Values.__init__
(@jgkamat, ditto)