Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP statuses cleanup #230

Merged
merged 2 commits into from Jan 14, 2024
Merged

Conversation

amiryal
Copy link
Member

@amiryal amiryal commented Nov 4, 2022

Fixes #229 by not attempting to enumerate all non-error statuses explicitly.

While researching for this fix, I noticed that HTTP status 308 (Moved Permanently) is another form of permanent redirect and corrected the code to include that case. Note that currently feedparser does not handle HTTP status 308 as a redirect, but this will change when (hopefully) kurtmckee/feedparser#326 gets merged and released there.

In addition, while I was in the area, I added handling of status 410 (Gone) in similar vein to the handling of permanent redirects, by marking the feed as inactive (as suggested in the feedparser documentation).

Other than permanent redirect statuses (301 and 308, which we care about for
updating the feed with the new location), treat all statuses below 400 as
success without hard-coding any other number.
@amiryal amiryal changed the title Http statuses cleanup HTTP statuses cleanup Nov 4, 2022
@auouymous
Copy link
Contributor

The PR looks good but r2e doesn't appear to save the url or active changes made by these statuses. Do either of these work for you?

r2e add test301 http://borg.qzx.com/test-rss.php?test=301
r2e -VVVV run test301 --no-send 2>&1 |grep redirect

r2e add test410 http://borg.qzx.com/test-rss.php?test=410
r2e -VVVV run test410 --no-send 2>&1 |grep deactivate

@arunpersaud
Copy link

Just wanted to check in what the status is. Would be great to get this merged (I currently have to patch this for rss2email to work on my computer). Thanks!

@amiryal
Copy link
Member Author

amiryal commented Jan 8, 2024

Do either of these work for you?
[…]

I didn’t try, but I trust you that they don’t.

@auouymous: any objection to merging this despite the seemingly useless code paths? If anyone cares, it can still be cleaned up afterwards.

@auouymous auouymous merged commit ea07fe0 into rss2email:master Jan 14, 2024
@auouymous
Copy link
Contributor

I added some comments in the code 2f8274b. I believe that is the only way to save these changes.

@auouymous
Copy link
Contributor

@amiryal While looking at #240, I found the new 410 also logs a bunch of errors, because it doesn't return or raise to avoid the checks. Changing to a warning, and returning seems to produce the cleanest log output. Do you agree with this change?

--- a/rss2email/feed.py
+++ b/rss2email/feed.py
@@ -413,10 +413,11 @@ class Feed (object):
             self.url = parsed['url']
             # TODO: `url` is not saved -- add config option to call feeds.save_config() in run command
         elif status == 410:
-            _LOG.info('deactivate {} because {} is gone'.format(
+            _LOG.warning('deactivate {} because {} is gone'.format(
                     self.name, self.url))
             self.active = False
             # TODO: `active` is not saved -- add config option to call feeds.save_config() in run command
+            return
         elif status >= 400:
             raise _error.HTTPError(status=status, feed=self)

@amiryal
Copy link
Member Author

amiryal commented Jan 14, 2024

@auouymous

Do you agree with this change?

Yes, LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

303 HTTP status issue
3 participants