Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-valued text support to the analyzer API #10847

Closed

Conversation

johtani
Copy link
Contributor

@johtani johtani commented Apr 28, 2015

Add support array text as a multi-valued for AnalyzeRequestBuilder and REST API

Closes #3023

Add support array text as a multi-valued for AnalyzeRequestBuilder
Add support array text as a multi-valued for Analyze REST API
Add docs

Closes elastic#3023
@johtani johtani force-pushed the fix/add_support_text_array_analyze_api branch from 1d1f894 to 55c0f5b Compare April 28, 2015 10:56
@@ -18,6 +18,19 @@ curl -XGET 'localhost:9200/_analyze' -d '

coming[2.0.0, body based parameters were added in 2.0.0]

If text parameter is provided as array of string, it analyze as a multi-valued.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change this:

If text parameter is provided as array of string, it analyze as a multi-valued.

to If text parameter is provided as array of strings, it is analyzed as a multi-valued field.

@dakrone
Copy link
Member

dakrone commented Apr 29, 2015

Left some comments but generally looks good, thanks @johtani!

@johtani
Copy link
Contributor Author

johtani commented Apr 30, 2015

Thanks for your comments. Fixed. @dakrone

@dakrone
Copy link
Member

dakrone commented May 1, 2015

LGTM

* @param index The index name
* @param text The array of text to analyze
*/
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if using varargs here makes sense, I think it makes these different methods ambigous?
what's the difference between client.prepareAnalyze("index", "text1"); and client.prepareAnalyze("text1", "text2"); and client.prepareAnalyze("text1");?

Maybe we should got for new methods that accept an array instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

I think we should drop the prepareAnalyze(String... text) arity, and force specifying the index if you want to use the varargs analysis. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another option.
Drop prepareAnalyze() with varargs which added this PR.
And add addText(String...) to AnalyzeRequestBuilder.java

Hmm, is it hard to understand?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johtani but in that case users would have to specify the first value via prepareAnalyze and the following ones using addText ?

Along same lines as Lee's comment, how about having a single AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text); and drop all the others? It would be a breaking change though for the java api. Otherwise we should have the following without varargs which doesnt look great:

AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String text);
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String[] text);
AnalyzeRequestBuilder prepareAnalyze(String text);
AnalyzeRequestBuilder prepareAnalyze(String[] text);

I think compromises between the above two solutions won't work well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@javanna Yeah, my option is not good...

I think Lee's comment is the following:

    AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String text);
    AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
    AnalyzeRequestBuilder prepareAnalyze(String text);

or

    AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
    AnalyzeRequestBuilder prepareAnalyze(String text);

however, I think it is easy to understand that we have the other name of method, like
prepareAnalyze and prepareAnalyzeForMultiValued

Change names of IndicesAdminClient methods

Closes elastic#3023
@johtani
Copy link
Contributor Author

johtani commented May 2, 2015

I push changing names of methods with varargs.

*
* @param text The array of text to analyze
*/
AnalyzeRequestBuilder prepareAnalyzeWithMultiValued(String... text);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two methods go a bit against the naming convention that we have here for prepare* methods. I really think we should go for one of the options I mentioned here. What do you think @dakrone ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, and I completely agree about the naming convention.
Other question.
If we can change I/F about prepareAnalyze, what do you think about dropping all text parameters for I/F simplification and adding the setter of text/texts to AnalyzeRequestBuilder?

Of course, we are hard to find the difference between prepareAnalyze(String text), that is old version, and prepareAnalyze(String index), that is new one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about dropping all text parameters for I/F simplification and adding the setter of text/texts to AnalyzeRequestBuilder?

that is an option too. For now I'd just expose a prepareAnalyze() then with no parameters, leave the existing ones for bw comp, and add the needed setters where needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, AnalyzeRequestBuilder doesn't have methods related text parameters.
If we have prepareAnalyze(), we should setText(String)/setText(String...) to AnalyzeRequestBuilder.java

Remove bad naming method
Add prepareAnalyze() with no params
Add setText to AnalyzeRequestBuilder

Closes elastic#3023
@dakrone
Copy link
Member

dakrone commented May 14, 2015

LGTM, thanks @johtani!

analyzeRequest.text(text);
if (texts == null || texts.length == 0) {
texts = new String[]{ RestActions.getRestContent(request).toUtf8() };
analyzeRequest.text(texts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: we seem to support a single value only here. What is the difference between this case and the below case where we check the presence of arrays? It is just me not knowing tha analyze api I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

Analyze API has three way to receive text.

  1. request parameter
  2. request body, raw text
  3. request body, JSON parameter

This case is 2.
In this case, I think it is difficult which letter we assume a delimiter.
And we leave this case for backward compatibility.

Then, I think it's OK we don't support multi valued text for this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation. case 2) is same as case 1) I guess. The only way we could make it work with multiple values would be by allowing for comma separated values, but then the comma coyuldn't be part of the text anymore. Let's leave it as-is.

@javanna
Copy link
Member

javanna commented May 15, 2015

left a couple more comments, mainly around bw comp but this is targeted for 2.0 so it shouldn't be a problem. LGTM!

@johtani
Copy link
Contributor Author

johtani commented May 15, 2015

Pushed

@johtani johtani closed this May 15, 2015
@kevinkluge kevinkluge removed the review label May 15, 2015
@johtani
Copy link
Contributor Author

johtani commented May 15, 2015

This PR breaks the java API.
This PR changes the return type of the text getter String[] instead of String in AnalyzeRequest.

@clintongormley clintongormley changed the title Analysis: Add multi-valued text support Add multi-valued text support Jun 6, 2015
@clintongormley clintongormley changed the title Add multi-valued text support Add multi-valued text support to the analyzer API Aug 13, 2015
@johtani johtani deleted the fix/add_support_text_array_analyze_api branch April 25, 2016 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow AnalyzeRequestBuilder to be given an arbitrary text array
5 participants