NUTCH-2398: Save content of redirected robots.txt under redirect target URL #199

Merged
sebastian-nagel merged 1 commit into apache:master from
sebastian-nagel:NUTCH-2398-store-robots-txt-redirect
Jul 17, 2017
Conversation

@sebastian-nagel
Contributor

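Do not use the original URL (http://example.com/robots.txt) to store both
the redirect response (HTTP 301) and the response of the redirect target.
Save the content of the redirected robots.txt under the redirect target URL instead.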

See also commoncrawl#4.
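
To illustrate the intended behavior, here is a minimal sketch in Java. It is not the code from this PR and does not use Nutch's API: `RobotsRedirectSketch`, `Response`, `Fetcher`, and `fetchRobots` are hypothetical names introduced only for this example. It assumes a fetcher that returns a status code, an optional `Location` value, and a body, and it records each response under the URL that actually produced it.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

class RobotsRedirectSketch {

    /** Hypothetical stand-in for a fetched HTTP response (not a Nutch class). */
    static final class Response {
        final int status;
        final String location; // value of the Location header, or null
        final String body;

        Response(int status, String location, String body) {
            this.status = status;
            this.location = location;
            this.body = body;
        }
    }

    /** Hypothetical fetcher abstraction (not a Nutch interface). */
    interface Fetcher {
        Response fetch(String url);
    }

    /**
     * Follows redirects starting at robotsUrl and returns every response,
     * keyed by the URL that produced it. The 301 stays under the original
     * URL; the robots.txt content is stored under the redirect target.
     */
    static Map<String, Response> fetchRobots(String robotsUrl, Fetcher fetcher,
                                             int maxRedirects) {
        Map<String, Response> stored = new LinkedHashMap<>();
        String url = robotsUrl;
        for (int i = 0; i <= maxRedirects; i++) {
            Response response = fetcher.fetch(url);
            if (response == null) {
                break; // nothing fetched, nothing to store
            }
            stored.put(url, response); // key is the URL actually requested
            boolean isRedirect = response.status >= 300 && response.status < 400
                    && response.location != null;
            if (!isRedirect) {
                break;
            }
            // Resolve a possibly relative Location against the current URL.
            url = URI.create(url).resolve(response.location).toString();
        }
        return stored;
    }
}
```

With this layout, a lookup by http://example.com/robots.txt yields only the redirect response, and the robots.txt rules are found under the target URL.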

@sebastian-nagel merged commit 620b85d into apache:master on Jul 17, 2017
@sebastian-nagel deleted the NUTCH-2398-store-robots-txt-redirect branch on August 12, 2017 14:43