If a sitemap is erroneously detected as a plain-text sitemap (cf. #144), SiteMapParser may report all or most of the file content as "bad url". This may result either in many log messages:
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [<?xml version="1.0" encoding="UTF-8"?>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <url>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <loc>http://www.azonline.de/Sport/Fussball/1.-Bundesliga/2640159-Hamburger-SV-Bruchhagen-Lasse-mich-bei-der-Sportchef-Suche-nicht-hetzen</loc>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:news>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:publication>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:name>Allgemeine Zeitung</news:name>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:language>ger</news:language>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ </news:publication>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:publication_date>2016-12-22T14:52:00Z</news:publication_date>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <news:title>Hamburger SV : Bruchhagen: Lasse mich bei der Sportchef-Suche nicht hetzen</news:title>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ </news:news>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ </url>]
2016-12-22 13:55:26.628 c.s.SiteMapParser [WARN] Bad url: [ <url>]
... (5000 lines following)
or even in one very long message (more than 180 kB in a single line):
2016-12-22 14:43:22.173 c.s.SiteMapParser [WARN] Bad url: [<?xml version="1.0" encoding="UTF-8"?> <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:n="http://www.google.com/schemas/sitemap-news/0.9"> <url> <loc>http://www.hotnews.ro/stiri-politic-21489493-liviu-dragnea-cer-public-serviciilor-secrete-spuna-daca-exista-vreo-problema-securitate-legata-sotul-premierului-propus-sevil-shhaideh.htm</loc> <n:news> <n:publication> <n:name>HotNews.ro</n:name> <n:language>ro</n:language> </n:publication> <n:genres>PressRelease</n:genres> <publication_date>2016-12-22T16:16:35</publication_date> <n:title><![CDATA[Liviu Dragnea: Cer public serviciilor secrete sa spuna daca exista vreo problema de securitate legata de sotul premierului propus Sevil Shhaideh]]></n:title> <n:keywords><![CDATA[]]></n:keywords> </n:news> </url> ...
There should be a limit on both the maximum number of lines and the line length that are logged as errors. This avoids consequential errors, e.g.:
2016-12-22 14:43:22,176 ERROR Unable to write to stream UDP:localhost:514 for appender syslog
2016-12-22 14:43:22,176 ERROR An exception occurred processing Appender syslog org.apache.logging.log4j.core.appender.AppenderLoggingException: Error flushing stream UDP:localhost:514
...
Caused by: java.io.IOException: Message too long (sendto failed)
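A minimal sketch of what such a limit could look like, assuming a small helper that sits between the plain-text parsing loop and the logger. The class `BadUrlLogLimiter`, its methods, and both limits are hypothetical names for illustration only; this is not the existing SiteMapParser code.

```java
/**
 * Hypothetical helper (not part of crawler-commons): caps how many
 * "Bad url" messages are produced per sitemap and how long each one may be.
 */
final class BadUrlLogLimiter {
    private final int maxMessages; // assumed limit on messages per sitemap
    private final int maxLength;   // assumed limit on characters quoted per message
    private int seen = 0;

    BadUrlLogLimiter(int maxMessages, int maxLength) {
        this.maxMessages = maxMessages;
        this.maxLength = maxLength;
    }

    /** Returns the message to log, or null once the message limit is exceeded. */
    String next(String line) {
        seen++;
        if (seen > maxMessages) {
            return null; // suppressed; only counted
        }
        return "Bad url: [" + truncate(line, maxLength) + "]";
    }

    /** Number of bad lines that were counted but not logged. */
    int suppressed() {
        return Math.max(0, seen - maxMessages);
    }

    /** Cuts the string to at most max characters and notes how much was dropped. */
    static String truncate(String s, int max) {
        if (s.length() <= max) {
            return s;
        }
        return s.substring(0, max) + "... (" + (s.length() - max) + " more chars)";
    }
}
```

The parser would pass every rejected line to `next(...)`, log only non-null results, and emit one summary message with `suppressed()` at the end. That keeps each message short enough for appenders with a size limit, such as syslog over UDP.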