Add a new http-user-agent config. #209

rand0mdud3 · 2022-03-05T18:28:16Z

Gives the ability to override the HTTP user agent.

This should take care of #154

Also related to #120 (which applies to the mail user agent).

Gives the ability to override the HTTP user agent.

auouymous · 2022-03-07T01:59:19Z

LGTM. However, it was decided to deprecate user-agent and alias it to mail-user-agent, but I'm not sure how to do that. @Ekleog Do you know how?

Idea for fixing: add a setting mail-user-agent that is an alias to current user-agent, and add a deprecation warning to user-agent. #154 can then add an http-user-agent setting

amiryal

Seems like the old user_agent config already was used for HTTP, in the optional redirect filter:

rss2email/rss2email/post_process/redirect.py

Line 66 in 853ec71

request.add_header('User-agent', feed.user_agent)

Can you change it to use the new http_user_agent?

I found it while drafting a solution for adding a mail-user-agent option, BTW. When I have time, and if the branch allows it, I will push my work onto here.

Ekleog · 2022-03-07T16:22:19Z

@auouymous

LGTM. However, it was decided to deprecate user-agent and alias it to mail-user-agent, but I'm not sure how to do that. @Ekleog Do you know how?

We're probably doing a release soon as the changes are slowly piling up, so I'd say something like:

Add a mail-user-agent option and make previous uses of user-agent use mail-user-agent instead
Have the old user-agent option just set mail-user-agent and output a warning that on the next release the user-agent option will be removed and people should already migrate to mail-user-agent
Update the documentation
Add the change to the changelog
Make the release
Make user-agent error out instead of warning+redirecting to mail-user-agent

Does that make sense?

Ekleog · 2022-03-07T16:25:12Z

Or actually, given @amiryal's message about the fact that the previous user-agent was overloaded and the previous stage was basically a mess, we could skip the deprecation step and just write down the removal of the user-agent option in the changelog; though people's configuration files will break it may not be that bad because people won't read the warning messages anyway as it'll all be automated away.

(BTW, all this is relying on the fact that I'm assuming anyone using rss2email would know how to edit the configuration file if pointed to it by a warning/error; I'm thinking this is a reasonable enough assumption)

@amiryal

Noticed by @amiryal (thanks!)

rand0mdud3 · 2022-03-13T22:22:31Z

Apologies for the lag, fixed that use case I missed, thanks @amiryal!

amiryal · 2022-03-31T09:39:02Z

I found it while drafting a solution for adding a mail-user-agent option, BTW. When I have time, and if the branch allows it, I will push my work onto here.

That line of work ☝️ did not get anywhere useful. Since it was already overloaded for use in HTTP, and since I can’t think of a use case for 2 different values for HTTP and message-header agent strings*, I think we can kidnap user-agent and use it for both. (I.e. neither http-user-agent nor mail-user-agent should be added, and instead use the existing user-agent config for HTTP.)

* Frankly, I can’t think of a use case for changing the message header at all. It seems like this one is purely informational for the end recipient, so it’s only useful to change it if one likes to hide the use of r2e for whatever reason. If that is the case, then we can direct users to write a custom post_process hook.

amiryal · 2022-03-31T09:45:06Z

neither http-user-agent nor mail-user-agent should be added, and instead use the existing user-agent config for HTTP

@rand0mdud3 if you agree with the above, then I think it would be OK to implement it in this PR. Just remember to reflect the additional use of the config option in the manual.

Ekleog · 2022-04-01T09:57:07Z

@amiryal For backwards-compatibility reasons, I don't think changing the meaning of user-agent from "something somewhat useless but that doesn't impact usage" to "something that will change all outgoing http requests" would be a good idea.

So, just for this reason, I don't think reusing user-agent for http headers right now would be a good idea.

That said, just removing user-agent altogether and adding http-user-agent for the http header would probably make sense to me :)

auouymous · 2022-04-19T05:43:23Z

I agree with not using user-agent for HTTP and instead adding a new http-user-agent, but I don't fully agree with removing user-agent as it isn't much code and no one likes having features they use removed from a program. Do post-process hooks need to be added to rss2email or will it load them from a user owned directory? Can multiple post-process hooks be used? If both are possible, then maybe a post-process hook could replace user-agent.

Fun fact, f57513a#diff-86d7b95911fb37a51f57d39f3ab50678bdf8f85d9d6a1853bf05eb75729e8461L64 in #74 hijacked the static http user agent to add a customizable mail user agent.

amiryal · 2022-04-19T14:19:11Z

Fun fact, f57513a#diff-86d7b95911fb37a51f57d39f3ab50678bdf8f85d9d6a1853bf05eb75729e8461L64 in #74 hijacked the static http user agent to add a customizable mail user agent.

Somehow I missed that! So, effectively, f57513a introduced an unintended change of behaviour in 3.11, which we are now trying to untangle. Anecdotally, that same commit also broke the redirect post process, which was fixed in 6e859aa (version 3.12), and another fix for its fallout was spread across 30ca438 and 745c1c4.

I would put it this way:

A good netizen program identifies itself as the user-agent in all outgoing requests. (I.e. rss2email should not fall back to using feedparser’s identity.)
In light of the above, f57513a introduced a bug in 3.11.
To fix the bug, restore rss2email into the UA string and explain the effects clearly in the release notes.

Ignoring the discussion thus far, I hope we can agree on at least the first two bullets. The third bullet raises a couple of contention points:

If users started to rely on feedparser in the UA, is it OK to break it for them? To that I would answer yes, it’s OK, because with fixing the bug I would like to bring our users back to good netizenship, and the release notes should explain that.
If users started to rely on the fact that changing the mail header doesn’t affect the HTTP UA, is it OK to break it for them? To that I would answer yes, it’s OK, because that “feature” was never documented as such (documentation for the user-agent config was introduced in 3.11 together with the bug, never mentioning HTTP requests at all), and the release notes should explain that.

TL;DR

If updating the manual accordingly and adding the support of some good release notes to explain the situation, I am still in favour of:

neither http-user-agent nor mail-user-agent should be added, and instead use the existing user-agent config for HTTP

amiryal · 2022-04-19T14:29:03Z

Do post-process hooks need to be added to rss2email or will it load them from a user owned directory?

The latter — they simply need to be somewhere in sys.path.

Can multiple post-process hooks be used?

Only one can be named in the configuration, but there is nothing to stop users from calling multiple others in a chain from their custom post-process hook.

Add a new http-user-agent config.

b5191c9

Gives the ability to override the HTTP user agent.

amiryal requested changes Mar 7, 2022

View reviewed changes

Fix previous use of user_agent

1808864

Noticed by @amiryal (thanks!)

auouymous mentioned this pull request Mar 31, 2022

Feature Request: Modify the subject line #82

Closed

amiryal mentioned this pull request Apr 20, 2022

Fix a couple of HTTP user agent regressions #216

Merged

amiryal closed this in #216 Aug 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a new http-user-agent config. #209

Add a new http-user-agent config. #209

rand0mdud3 commented Mar 5, 2022

auouymous commented Mar 7, 2022

amiryal left a comment

Ekleog commented Mar 7, 2022

Ekleog commented Mar 7, 2022

rand0mdud3 commented Mar 13, 2022

amiryal commented Mar 31, 2022

amiryal commented Mar 31, 2022

Ekleog commented Apr 1, 2022

auouymous commented Apr 19, 2022

amiryal commented Apr 19, 2022

amiryal commented Apr 19, 2022

Add a new http-user-agent config. #209

Add a new http-user-agent config. #209

Conversation

rand0mdud3 commented Mar 5, 2022

auouymous commented Mar 7, 2022

amiryal left a comment

Choose a reason for hiding this comment

Ekleog commented Mar 7, 2022

Ekleog commented Mar 7, 2022

rand0mdud3 commented Mar 13, 2022

amiryal commented Mar 31, 2022

amiryal commented Mar 31, 2022

Ekleog commented Apr 1, 2022

auouymous commented Apr 19, 2022

amiryal commented Apr 19, 2022

TL;DR

amiryal commented Apr 19, 2022