This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

Error 3: post not parsable Element "root" from Namespace "" expected #280

Closed
Kvothe1970 opened this issue Oct 31, 2018 · 7 comments

@Kvothe1970

Kvothe1970 commented Oct 31, 2018

Hi, this has been happening for the past three days. It just pops up on multiple blogs. Johannes, I believe you changed the behaviour there a while ago for the hidden blogs and the unparsable content, but this may be unrelated. For me, the downloads stop at those posts.

The internal error message is unfortunately in German so I will add a screenshot.

Can anyone else confirm this behaviour?
[image: screenshot of the error message]

This happens on both hidden and non-hidden blogs for me.

@johanneszab
Owner

Thanks for sharing this issue.

I've started to look into it. It seems like they randomly return a blank page instead of content. They do not even return an HTTP error code, but instead a 200 (OK) with an empty body. Weird.

You can even try this in the browser by refreshing this API page several times, and it will randomly be empty.
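
To illustrate why a blank 200 response ends in a parse error rather than an HTTP error, here is a minimal Python sketch. It is not the application's actual code: `parse_api_response` is a hypothetical helper, and the sample XML in the usage note is only an illustrative shape. The point is that an XML parser handed an empty document fails because no root element is present, so the blank case has to be checked before parsing.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_api_response(status_code: int, body: str) -> Optional[ET.Element]:
    """Parse an XML API response; return None for a blank HTTP 200 body.

    Hypothetical helper for illustration only.
    """
    if status_code == 200 and not body.strip():
        # Blank "success" response: there is nothing to parse, so signal
        # this to the caller instead of letting the XML parser fail.
        return None
    # Raises ET.ParseError if the body is not well-formed XML.
    return ET.fromstring(body)
```

Calling `ET.fromstring("")` directly raises `ET.ParseError`, which is the Python analogue of the "root element expected" message in the issue title. With the check in place, `parse_api_response(200, "")` returns `None`, and the caller can decide whether to skip or retry.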

@johanneszab
Owner

Did you encounter the same for hidden tumblr blogs?

@Kvothe1970
Author

I’m reasonably certain I did. It’s not reliably reproducible because of the random behaviour you noticed in the API. I’ll run the crawl again now and tomorrow morning using only the hidden blogs.

@Kvothe1970
Author

Just ran the first round. First the hidden blogs, no troubles. After that I ran the normal blogs and it happened twice in a sample size of about 90 blogs compared to 30 hidden blogs.

@johanneszab
Owner

Thanks for running the test.

I'll fix it tomorrow. Should be a simple one: I'd just have to wrap a method around here and re-call the api if the page is empty.
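
The retry idea could look roughly like the following Python sketch. It is an assumption-laden illustration, not the code that was committed: `fetch_with_retry`, its parameters, and the `fetch` callable (which is assumed to return a status code and body text) are all hypothetical names.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Call `fetch` until it returns a non-blank HTTP 200 body.

    Hypothetical sketch: `fetch` stands in for the real API request and
    must return a (status_code, body) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status != 200:
            # Real HTTP errors are not retried here; only blank 200s are.
            raise RuntimeError(f"unexpected HTTP status {status}")
        if body.strip():
            return body
        if attempt < max_attempts:
            # Blank page with status 200: wait briefly, then ask again.
            time.sleep(delay_seconds)
    raise RuntimeError(f"empty response after {max_attempts} attempts")
```

Capping the number of attempts keeps the crawl from looping forever if the server keeps answering with blank pages; the cap of three and the delay are arbitrary choices for this sketch.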

@Kvothe1970
Author

That’s brilliant, thank you :)

johanneszab added a commit that referenced this issue Nov 1, 2018
Retries the Tumblr blog API v1 request if the server returns an empty HTTP 200 response.
@Kvothe1970
Author

Just updated, crawled all blogs. No more errors, thanks for the quick fix :)
