Error while adding gazzetta.it rss feeds #1260

Open
Paolo7297 opened this issue Feb 21, 2024 · 9 comments


@Paolo7297

Whenever I try to add a gazzetta.it feed (like https://www.gazzetta.it/dynamic-feed/rss/section/Calcio/Serie-A.xml), it throws this error: org.xml.sax.SAXParseException: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
Is there a way to bypass the error? The feed works fine in other RSS readers.
Thanks!

[Screenshot, 2024-02-21 at 14:21:15]

@Athou
Owner

Athou commented Feb 21, 2024

There is a DOCTYPE declaration at the top of the feed, which is unusual.


The parser CommaFeed is using actively blocks feeds with a DOCTYPE declaration for security reasons (see rometools/rome#203 and https://en.wikipedia.org/wiki/Billion_laughs_attack).
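For readers unfamiliar with this check, here is a minimal sketch of how the feature flag named in the error message causes a parser to reject any document that contains a DOCTYPE. It uses the JDK's built-in DOM parser rather than ROME's own parser setup, and the class and method names (`DoctypeCheck`, `parseError`) are hypothetical, chosen only for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of CommaFeed or ROME.
final class DoctypeCheck {
    // Feature name taken verbatim from the error message in this issue.
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Returns null if the XML parses, otherwise the parser's error message. */
    static String parseError(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // With this feature enabled, any DOCTYPE declaration is a fatal error.
            factory.setFeature(DISALLOW_DOCTYPE, true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, a feed that starts with `<!DOCTYPE rss ...>` yields an error message, while the same feed without the declaration returns `null`.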

Maybe I can remove the DOCTYPE from the XML before parsing occurs. I'll see what I can do.
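One way such a pre-processing step might look is sketched below. This is an illustration only, not CommaFeed's actual code: `DoctypeStripper` and `stripDoctype` are hypothetical names, and the regular expression assumes a simple DOCTYPE whose optional internal subset contains no `]` characters.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of CommaFeed or ROME.
final class DoctypeStripper {
    // Matches "<!DOCTYPE ...>" with an optional internal subset in [...].
    // Assumption: the internal subset, if present, contains no ']' characters.
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>\\[]*(\\[[^\\]]*\\])?[^>]*>");

    /** Removes the first DOCTYPE declaration, leaving the rest of the XML untouched. */
    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

Note that dropping the DOCTYPE also drops any entities it declares, so a feed that relies on custom entities would still fail to parse afterwards.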

In the meantime, you could contact the website and ask them to remove the DOCTYPE declaration.

@Paolo7297
Author

Thanks!
Actually, their contact form isn't working. I hope it will be fixed in the next few days.

@travisbeard

I also have several feeds that could not be imported when switching from Feedly. It would be nice to have an option to ignore this error.

@Athou
Owner

Athou commented Apr 11, 2024

> I also have several that could not be imported when switching from feedly. It would be nice to have an option ignore.

Do you get the same error as above? What are the feed urls that are not working?

@travisbeard

This is no longer bothering me. It turned out that each of these sites had two feeds, one with a DOCTYPE and one without. The website parser in CommaFeed finds the wrong one by default, but the second feed worked for all of these sites.

@dstutz

dstutz commented Oct 4, 2024

I am also getting this exact error on some private GitLab EE Activity feeds, and I can't add them. I would love to have these feeds available.

The feed doesn't appear to have a doctype, though.

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
<title>******activity</title>
<link href="https://**********?feed_token=glft-******" rel="self" type="application/atom+xml"/>

GitLab Enterprise Edition v16.11.10-ee

I was able to get an issues feed subscribed.

@Athou
Owner

Athou commented Oct 4, 2024

I seem to be able to subscribe to this URL, though. Are you getting a "DOCTYPE is disallowed" error? Is there an error with a stack trace in the log files?

@dstutz

dstutz commented Oct 4, 2024

In the UI: [screenshot]

Nothing is showing up in the logs. I am also still running v4.6.0. I've been putting off updating the config for the new Quarkus versions.

Yeah, I tested with some public gitlab.org projects and it works fine there. I don't know if it's specifically the EE version I'm trying to subscribe to or something else. Like I said, I was able to subscribe to the issues feed for one of the sub-projects.

@dstutz

dstutz commented Oct 4, 2024

OK, I changed the log level to DEBUG and got some output. Maybe the feed token isn't working properly because of the way that instance is set up: the request appears to be redirected to a login page, and I guess THAT is what fails to parse (yes, at the top of the login page: <!DOCTYPE html>).

DEBUG [2024-10-04 10:08:16,878] com.commafeed.backend.HttpGetter: fetching https://*****feed url with token******
DEBUG [2024-10-04 10:08:17,728] com.commafeed.frontend.resource.FeedREST: Could not parse feed from https://*****/users/sign_in : Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.

 Causing: com.rometools.rome.io.FeedException: Could not parse feed from https://*******/users/sign_in : Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
! at com.commafeed.backend.feed.parser.FeedParser.parse(FeedParser.java:85)
! at com.commafeed.backend.feed.FeedFetcher.fetch(FeedFetcher.java:46)
! at com.commafeed.frontend.resource.FeedREST.fetchFeedInternal(FeedREST.java:242)
! at com.commafeed.frontend.resource.FeedREST.fetchFeed(FeedREST.java:269)
! at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
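The log above shows the parser being handed the HTML sign-in page rather than the feed. A fetcher could, in principle, detect that case before parsing and report something clearer than a DOCTYPE error. The sketch below is only an illustration under that assumption: `HtmlSniffer` and `looksLikeHtml` are hypothetical names and not part of CommaFeed.

```java
import java.util.Locale;

// Hypothetical helper, not part of CommaFeed.
final class HtmlSniffer {
    /**
     * Guesses whether a response is an HTML page (for example a login page)
     * rather than a feed, from its Content-Type header and the start of the body.
     */
    static boolean looksLikeHtml(String contentType, String body) {
        if (contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("text/html")) {
            return true;
        }
        if (body == null) {
            return false;
        }
        // Only the first few characters matter; ignore leading whitespace and case.
        String head = body.stripLeading();
        head = head.substring(0, Math.min(head.length(), 64)).toLowerCase(Locale.ROOT);
        return head.startsWith("<!doctype html") || head.startsWith("<html");
    }
}
```

A caller could combine this with a comparison between the requested URL and the final URL after redirects (here, `/users/sign_in`) to tell the user that the feed requires authentication.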
