Arbitrary Redirection #42

Closed
zapbot opened this issue Jun 4, 2015 · 5 comments

zapbot commented Jun 4, 2015

What steps will reproduce the problem?
1. Spider a site that issues a redirect.
2. Let the spider crawl the redirected message.
3. The request comes from the original message, while the response comes from the redirected message.

What is the expected output? What do you see instead?
When this happens, the HttpMessage is not only inconsistent (and is saved and shown
in the site tree that way), but all anchors crawled from this message are also given
the base URL of the original message, which makes them unusable.
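
To illustrate why the wrong base URL matters, here is a minimal sketch using only `java.net.URI` from the JDK. The URLs, the `BaseUrlDemo` class, and the `resolve` helper are hypothetical and are not taken from the ZAP code base; the point is only that the same relative anchor resolves to different absolute URLs depending on which message's URL is used as the base.

```java
import java.net.URI;

public class BaseUrlDemo {
    // Resolve an anchor's href against the URL of the page it was found on.
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical URLs: the spider requested 'requested' and was redirected to 'redirected'.
        String requested = "http://example.com/old/index.html";
        String redirected = "http://example.com/new/app/index.html";
        String href = "page2.html"; // relative anchor found in the redirected response body

        // Base taken from the original request: points at a page that may not exist.
        System.out.println(resolve(requested, href));
        // Base taken from the redirect target: the URL the server actually meant.
        System.out.println(resolve(redirected, href));
    }
}
```

With these assumed URLs, the first line printed is under `/old/` and the second under `/new/app/`, so every relative link harvested from the response ends up in the wrong place when the original request URL is used as the base.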

What version of the product are you using? On what operating system?
Version 1.1.1

Please provide any additional information below.
I suggest that, for the spider at least, the HttpSender should not follow redirects.
Instead, keep the redirect response intact and handle it in the spider thread, e.g. by
adding the redirect URL to the spider queue.
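
A rough sketch of that suggestion, under stated assumptions: the `RedirectAwareSpider` class, its `Response` type, and the in-memory `site` map are all hypothetical stand-ins for a sender that has redirect following turned off. None of this is ZAP's actual spider or HttpSender code. Each 3xx response is recorded as its own message, and its `Location` target is resolved and queued as a separate request.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RedirectAwareSpider {
    // Hypothetical stand-in for a response fetched with redirects disabled.
    public static final class Response {
        final int status;
        final String location; // value of the Location header, or null
        public Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    // Crawl from 'start'; a 3xx is kept as its own message and its target is queued.
    static List<String> crawl(String start, Map<String, Response> site) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> visited = new ArrayList<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            String url = queue.poll();
            Response res = site.get(url);
            if (res == null) {
                continue; // URL not present in this toy site
            }
            // Request URL and response status stay paired for every message.
            visited.add(url + " -> " + res.status);
            boolean isRedirect = res.status >= 300 && res.status < 400 && res.location != null;
            if (isRedirect) {
                // Resolve Location against the URL that produced it, then queue it.
                String target = URI.create(url).resolve(res.location).toString();
                if (seen.add(target)) {
                    queue.add(target);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, Response> site = Map.of(
            "http://example.com/old", new Response(302, "/new/"),
            "http://example.com/new/", new Response(200, null));
        crawl("http://example.com/old", site).forEach(System.out::println);
    }
}
```

In this toy setup the redirect and its target show up as two separate entries, each with its own request URL, so anchors parsed from the second response would be resolved against the redirect target rather than the original request. The `seen` set also keeps a redirect loop from being queued forever.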

Original issue reported on code.google.com by amjad.masad on 2011-01-08 08:41:39

zapbot commented Jun 4, 2015

(No text was entered with this change)

Original issue reported on code.google.com by psiinon on 2011-01-08 12:09:39

zapbot commented Jun 4, 2015

(No text was entered with this change)

Original issue reported on code.google.com by THC202 on 2012-01-09 15:49:15

zapbot commented Jun 4, 2015

r1126

Original issue reported on code.google.com by THC202 on 2012-01-09 16:51:55

zapbot commented Jun 4, 2015

(No text was entered with this change)

Original issue reported on code.google.com by psiinon on 2012-04-08 13:19:51

lock bot commented Nov 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Nov 2, 2019