Loading hackernews #340
Comments
From #341:
Hi Steve! It looks like you are getting HTML pages instead of XML. Can you show us the code that does the HTTP request?
My bad on the Reddit one: I used http and they want https. Chrome fixed it for me when I was testing, so I did not realize. :(
Cool! What about hackernews? Can you show us the code that does the HTTP request?
URL: http://news.ycombinator.com/rss
private boolean getFeed(final int index)
{
    String urlString = "";
    String provider = "";
    String entryURL;
    SyndFeed feed;
    urlString = Statics.rssFeeds.get(index).urlString;
    provider = Statics.rssFeeds.get(index).toThisName;
    System.out.println(Utilities.get_date("CST", 0, 16) + " Loading feed: " + urlString + " for " + provider + " " + index);
    try
    {
        feed = new SyndFeedInput().build(new XmlReader(new URL(urlString)));
        Thread.yield();
        for (SyndEntry entry : feed.getEntries())
        {
            entryURL = entry.getUri();
            if ((entryURL.toLowerCase().startsWith("http")) == false) entryURL = entry.getLink();
            saveUrl(entryURL, index, entry.getTitle(), entry.getPublishedDate(), provider);
        }
    }
    catch(Exception e)
    {
        System.out.println(Utilities.get_date("CST", 0, 16) + " ********************************************************************************************* Failed: " + urlString + " for " + provider);
        return false;
    }
    return true;
}
And this is the Wired URL: https://www.wired.com/feed/rss
@Steveaxelrod007 The problem is that the URL class you're using is very basic and doesn't handle many common use cases in the world of HTTP. For example, it doesn't follow redirects, so you end up trying to parse the intermediate redirect page instead of the final destination.
The solution is to use any other HTTP library.
I am using this ROME call to load the URL:
feed = new SyndFeedInput().build(new XmlReader(new URL(urlString)));
What do you suggest I try?
@Steveaxelrod007 I would suggest using the Jersey HTTP client, but there is also the Apache HTTP client, which I think is more popular. We have an example for the HTTP client here: #276
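For anyone who would rather not add a dependency, here is a minimal JDK-only sketch of the redirect handling discussed above. It is not the example from #276 and not part of ROME. It assumes the failure comes from an http to https redirect, which `HttpURLConnection` does not follow automatically. The class name `FeedFetcher`, the helper names, the timeouts, and the redirect limit are all made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class FeedFetcher {

    // True for the HTTP status codes that point at another location.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Opens the URL and follows redirects by hand, including redirects
    // that switch from http to https. The caller must close the stream.
    static InputStream openFollowingRedirects(String urlString, int maxRedirects)
            throws IOException {
        URL url = new URL(urlString);
        for (int hop = 0; hop <= maxRedirects; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false); // we handle redirects ourselves
            conn.setConnectTimeout(10_000);         // assumed timeouts, in milliseconds
            conn.setReadTimeout(20_000);

            int status = conn.getResponseCode();
            if (isRedirect(status)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location header from " + url);
                }
                url = new URL(url, location); // also resolves relative locations
                continue;
            }
            if (status != HttpURLConnection.HTTP_OK) {
                conn.disconnect();
                throw new IOException("Unexpected HTTP status " + status + " from " + url);
            }
            return conn.getInputStream();
        }
        throw new IOException("More than " + maxRedirects + " redirects for " + urlString);
    }
}
```

The returned stream could then be handed to ROME in place of the URL, for example `new SyndFeedInput().build(new XmlReader(FeedFetcher.openFollowingRedirects(urlString, 5)))`. A full HTTP client, as suggested above, would also cover cookies, compression, and retries, which this sketch ignores.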
When I try to load "https://news.ycombinator.com/rss" I get
com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 6: The element type "hr" must be terminated by the matching end-tag "</hr>".
    at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:236)
    at com.rometools.rome.io.SyndFeedInput.build(SyndFeedInput.java:150)
    at com.axee.safetyNet.SearchTopNews.getFeed(SearchTopNews.java:556)
    at com.axee.safetyNet.SearchTopNews.lambda$goodUrls$8(SearchTopNews.java:121)
    at com.axee.safetyNet.SearchTopNews$$Lambda$14/1774033198.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.jdom2.input.JDOMParseException: Error on line 6: The element type "hr" must be terminated by the matching end-tag "</hr>".
    at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:232)
    at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:303)
    at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1196)
    at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:233)
I checked the XML and it looks fine and the XML validator says it is ok.
Thank you.
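An unclosed `hr` element is typical of an HTML page rather than a feed, so one way to confirm what the server actually returned is to look at the first bytes of the response before parsing it. The sketch below is only a rough heuristic under that assumption. `FeedSniffer` and `looksLikeHtml` are hypothetical names, not ROME API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class FeedSniffer {

    // Heuristic: does the start of the response body look like an HTML page?
    // Pass in the first few hundred bytes of the body.
    static boolean looksLikeHtml(byte[] firstBytes) {
        String head = new String(firstBytes, StandardCharsets.UTF_8)
                .replace("\uFEFF", "")       // drop a UTF-8 byte order mark, if any
                .trim()
                .toLowerCase(Locale.ROOT);
        return head.startsWith("<!doctype html") || head.startsWith("<html");
    }
}
```

If this returns true for the bytes the application receives, the parser is being fed a web page (for example a redirect or error page), even though the same URL opened in a browser shows valid XML.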