http://www.soliant.com/feeds/jobs-sitemap/
returns the following HTTP header:
Content-Type: text/html; charset=utf-8
As a result, the underlying sitemap parser can't handle it properly.
What we can do is run the clue-based detection regardless of whether the doc has been declared as a sitemap and, if it matches, force the mime-type to 'application/xml', since the clue reliably indicates an XML doc.
For this particular URL, not setting the mime-type at all does not work either, as the content lacks the XML declaration <?xml version="1.0" encoding="UTF-8"?>, which Tika uses to guess the mime-type.
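A minimal sketch of the proposed behaviour, not the project's actual code: the class name SitemapSniffer, the method names, the 1024-byte sniff limit and the choice of the sitemaps.org namespace prefix as the clue are all assumptions made for illustration. It checks the start of the content for the clue and, on a match, overrides whatever mime-type the server declared.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named for illustration only.
final class SitemapSniffer {

    // Assumed clue: the namespace prefix used by sitemap documents.
    static final String CLUE = "http://www.sitemaps.org/schemas/sitemap/";

    // Assumed limit: only the beginning of the content is inspected.
    static final int MAX_SNIFF_BYTES = 1024;

    // Returns true if the clue appears within the first MAX_SNIFF_BYTES bytes.
    static boolean looksLikeSitemap(byte[] content) {
        int len = Math.min(content.length, MAX_SNIFF_BYTES);
        // ISO-8859-1 maps every byte to one char, so decoding never fails;
        // the clue itself is plain ASCII.
        String head = new String(content, 0, len, StandardCharsets.ISO_8859_1);
        return head.contains(CLUE);
    }

    // Forces 'application/xml' when the clue matches, regardless of the
    // declared mime-type; otherwise leaves the declared value untouched.
    static String effectiveMimeType(String declared, byte[] content) {
        return looksLikeSitemap(content) ? "application/xml" : declared;
    }
}
```

With this approach a response declared as text/html; charset=utf-8 and missing the XML declaration would still be handed to the sitemap parser as XML, as long as the clue is found near the start of the content.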