Skip to content

NUTCH-2757 : Indexer-elastic: add authentication options#508

Merged
sebastian-nagel merged 4 commits intoapache:masterfrom
balashashanka:NUTCH-2757
Apr 19, 2020
Merged

NUTCH-2757 : Indexer-elastic: add authentication options#508
sebastian-nagel merged 4 commits intoapache:masterfrom
balashashanka:NUTCH-2757

Conversation

@balashashanka
Copy link
Copy Markdown
Contributor

Added username and password as default credentials to the HTTP Client

LOG.debug("No cluster name provided so using default");
}
//Only add username and password if password is configured
if(StringUtils.isNotBlank(password)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, @balashashanka! Looks good and should work.

Minor point: If possible try to apply the Nutch Eclipse Code Formatting rules and also remove trailing white space. Are you using Eclipse? If not let me know or whether you need help to set it up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sebastian-nagel, did the formatting.

Copy link
Copy Markdown
Member

@jorgelbg jorgelbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all thanks for the PR @balashashanka. I left a couple of questions/comments.

<param name="cluster" value=""/>
<param name="index" value="nutch"/>
<!--Configure username and password if necessary, default user is "elastic"
<param name="user" value="user"/>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Perhaps we should use username instead of user. For the Solr indexer configuration, we use username and also the comment mentions username.

Copy link
Copy Markdown
Member

@jorgelbg jorgelbg Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: perhaps here we should use as well a default elastic value for the username? That way matches the DEFAULT_USER inside the Java class.

LOG.debug("No cluster name provided so using default");
}
// Only add username and password if password is configured
if (StringUtils.isNotBlank(password)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Solr indexer, we have a dedicated auth boolean key to verify if the authentication must be set (see https://github.com/apache/nutch/blob/master/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java#L97). I'm not sure I have a strong preference for either option (the boolean auth key vs checking for the empty password key) but I would err in the side of consistency since both Indexers work in a similar way.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jorgelbg, thanks for pointing this out. I can change the code to follow the same.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@balashashanka sure, let's make this part the same for both indexers

Copy link
Copy Markdown
Contributor

@sebastian-nagel sebastian-nagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @balashashanka - code formatting is fine now. Meanwhile I've also verified that authentication works as expected (cf. NUTCH-2778 but that's clearly another issue). Please address @jorgelbg's remarks. Thanks!

@balashashanka balashashanka requested a review from jorgelbg April 16, 2020 09:00
Copy link
Copy Markdown
Member

@jorgelbg jorgelbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Great job! I left one tiny comment in the README but it is super Nit thing 😄.

@@ -32,10 +32,12 @@ Parameter Name | Description | Default value
--|--|--
host | Comma-separated list of hostnames to send documents to using [TransportClient](https://static.javadoc.io/org.elasticsearch/elasticsearch/5.3.0/org/elasticsearch/client/transport/TransportClient.html). Either host and port must be defined or cluster. |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that the or cluster part of the sentence is now obsolete (due to the removal) but that is not a blocker from my side. If you fix it before merging it would be great.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, kind of missed this one. I removed this in the latest commit.

@sebastian-nagel sebastian-nagel merged commit 6f51618 into apache:master Apr 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants