NUTCH-2451 protocol-ftp to resolve relative URL when following redirects #241

Closed

HiranChaudhuri
Contributor

The change suggested in Jira seems to help. I no longer observe MalformedURLExceptions after this modification.
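For readers without the Jira issue at hand, here is a minimal sketch of the general idea: a relative redirect target is resolved against the URL that was originally requested instead of being parsed on its own. This is not the code from the PR. The class `RedirectResolver`, the method `resolve`, and the `example.org` URLs are hypothetical names chosen for illustration. Only the two-argument `java.net.URL` constructor is a real JDK API (it is deprecated in recent JDK releases but still available).

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RedirectResolver {

    // Hypothetical helper (not taken from Nutch): resolve a redirect target
    // against the URL that was originally requested. An absolute target is
    // returned unchanged, a relative one is interpreted relative to the base.
    public static URL resolve(URL base, String target) throws MalformedURLException {
        return new URL(base, target);
    }

    public static void main(String[] args) throws MalformedURLException {
        // Assumed example values; a directory requested without a trailing slash.
        URL base = new URL("ftp://example.org/pub/docs");
        String target = "docs/";

        try {
            // Parsing the relative target on its own fails: it has no protocol.
            new URL(target);
        } catch (MalformedURLException e) {
            System.out.println("standalone parse failed: " + e.getMessage());
        }

        // Resolving against the base yields ftp://example.org/pub/docs/
        System.out.println(resolve(base, target));
    }
}
```

The single-argument constructor is what raises `MalformedURLException` for a target such as `docs/`, which would be consistent with the exceptions described above, assuming the redirect target was previously parsed without its base.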

Hiran Chaudhuri added 19 commits September 22, 2017 12:07
…mHandlers without being dependent on externally installed protocol handlers.
It is not a full implementation, take it more as a showcase for the
plugin system modifications that allow new protocols to be introduced
with a minimum installation effort.

The skeleton implementation we have so far allows foo://... urls to be
injected into crawldb.
…tory.

Now any number of instances can be created, reconfigured or destroyed.

The only problem might occur when a URL needs to be opened: which of the
many PluginRepositories should deliver the protocol?
nutch phases can use protocol plugins as well. May not be the best
approach.
Empty content seems to make Nutch fail later.
we do not need to implement yet another such mechanism.
TODO removed, log messages adapted.
@jorgelbg
Member

Hi @HiranChaudhuri, do you mind adding a descriptive title/description to the PR and squashing your commits?

@sebastian-nagel changed the title from "Nutch 2451" to "NUTCH-2451 protocol-ftp to resolve relative URL when following redirects" on Dec 5, 2017
@sebastian-nagel
Contributor

Thanks, @HiranChaudhuri! The commit fixing the URL redirection in protocol-ftp is cherry-picked and integrated into master/1.x and 2.x.
