Recently unable to parse Reddit's web feeds (RSS) #14

why-not-try-calmer · 2021-10-14T15:03:11Z

import requests
import atoma
response = requests.get("https://www.reddit.com/r/<insert subreddit here>.rss")
decoded = response.content
parsed = atoma.parse_atom_bytes(decoded)

will yield
raise FeedXMLError('Not a valid XML document')
It used to work flawlessly. I'll look into the details I can get when debugging and update this Issue accordingly.

NicolasLM · 2021-10-18T09:27:46Z

Hi, you should check the status code of the response. It seems that Reddit quickly starts returning 429 errors with an HTML body, which cannot be parsed as XML.
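A minimal sketch of that check, assuming the same `requests` and `atoma` calls as in the snippet above. `FeedFetchError`, `check_feed_response` and `fetch_feed` are hypothetical names used only for illustration; they are not part of atoma.

```python
class FeedFetchError(Exception):
    """Hypothetical error for responses that should not be handed to the parser."""


def check_feed_response(status_code: int, body: bytes) -> bytes:
    """Return the body unchanged if it plausibly holds a feed, otherwise raise."""
    if status_code != 200:
        # e.g. 429 Too Many Requests, which arrives with an HTML error page
        raise FeedFetchError(f"unexpected HTTP status {status_code}")
    head = body.lstrip()[:100].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        # A 200 response can still carry HTML instead of Atom
        raise FeedFetchError("body looks like HTML, not an Atom feed")
    return body


def fetch_feed(url: str):
    """Fetch a feed and parse it only after the response passed the checks."""
    # Imported lazily so check_feed_response works without these packages.
    import atoma
    import requests

    response = requests.get(url, timeout=10)
    body = check_feed_response(response.status_code, response.content)
    return atoma.parse_atom_bytes(body)
```

With this in place, a rate-limited request fails with an explicit status error instead of `FeedXMLError('Not a valid XML document')`.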

why-not-try-calmer · 2021-10-19T13:50:21Z

Hi Nicolas, thank you for keeping a vigilant eye on your issues! I did get 200 status codes, but the contents were probably truncated because of the way I was sending requests. Your library is very likely not the culprit. I will get back to you when I have time to pin down exactly what went wrong. Take care!

why-not-try-calmer closed this as completed May 9, 2023
