New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multi-valued text support to the analyzer API #10847
Add multi-valued text support to the analyzer API #10847
Conversation
Add support array text as a multi-valued for AnalyzeRequestBuilder Add support array text as a multi-valued for Analyze REST API Add docs Closes elastic#3023
1d1f894
to
55c0f5b
Compare
@@ -18,6 +18,19 @@ curl -XGET 'localhost:9200/_analyze' -d ' | |||
|
|||
coming[2.0.0, body based parameters were added in 2.0.0] | |||
|
|||
If text parameter is provided as array of string, it analyze as a multi-valued. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change this:
If text parameter is provided as array of string, it analyze as a multi-valued.
to If text parameter is provided as array of strings, it is analyzed as a multi-valued field.
Left some comments but generally looks good, thanks @johtani! |
Fix some comments Closes elastic#3023
Thanks for your comments. Fixed. @dakrone |
LGTM |
* @param index The index name | ||
* @param text The array of text to analyze | ||
*/ | ||
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if using varargs here makes sense, I think it makes these different methods ambigous?
what's the difference between client.prepareAnalyze("index", "text1");
and client.prepareAnalyze("text1", "text2");
and client.prepareAnalyze("text1");
?
Maybe we should got for new methods that accept an array instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point
I think we should drop the prepareAnalyze(String... text)
arity, and force specifying the index if you want to use the varargs analysis. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have another option.
Drop prepareAnalyze()
with varargs which added this PR.
And add addText(String...)
to AnalyzeRequestBuilder.java
Hmm, is it hard to understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johtani but in that case users would have to specify the first value via prepareAnalyze and the following ones using addText ?
Along same lines as Lee's comment, how about having a single AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
and drop all the others? It would be a breaking change though for the java api. Otherwise we should have the following without varargs which doesnt look great:
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String text);
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String[] text);
AnalyzeRequestBuilder prepareAnalyze(String text);
AnalyzeRequestBuilder prepareAnalyze(String[] text);
I think compromises between the above two solutions won't work well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna Yeah, my option is not good...
I think Lee's comment is the following:
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String text);
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
AnalyzeRequestBuilder prepareAnalyze(String text);
or
AnalyzeRequestBuilder prepareAnalyze(@Nullable String index, String... text);
AnalyzeRequestBuilder prepareAnalyze(String text);
however, I think it is easy to understand that we have the other name of method, like
prepareAnalyze
and prepareAnalyzeForMultiValued
Change names of IndicesAdminClient methods Closes elastic#3023
I push changing names of methods with varargs. |
* | ||
* @param text The array of text to analyze | ||
*/ | ||
AnalyzeRequestBuilder prepareAnalyzeWithMultiValued(String... text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, and I completely agree about the naming convention.
Other question.
If we can change I/F about prepareAnalyze, what do you think about dropping all text parameters for I/F simplification and adding the setter of text/texts to AnalyzeRequestBuilder?
Of course, we are hard to find the difference between prepareAnalyze(String text), that is old version, and prepareAnalyze(String index), that is new one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about dropping all text parameters for I/F simplification and adding the setter of text/texts to AnalyzeRequestBuilder?
that is an option too. For now I'd just expose a prepareAnalyze()
then with no parameters, leave the existing ones for bw comp, and add the needed setters where needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, AnalyzeRequestBuilder doesn't have methods related text
parameters.
If we have prepareAnalyze()
, we should setText(String)/setText(String...) to AnalyzeRequestBuilder.java
Remove bad naming method Add prepareAnalyze() with no params Add setText to AnalyzeRequestBuilder Closes elastic#3023
LGTM, thanks @johtani! |
analyzeRequest.text(text); | ||
if (texts == null || texts.length == 0) { | ||
texts = new String[]{ RestActions.getRestContent(request).toUtf8() }; | ||
analyzeRequest.text(texts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: we seem to support a single value only here. What is the difference between this case and the below case where we check the presence of arrays? It is just me not knowing tha analyze api I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question.
Analyze API has three way to receive text.
- request parameter
- request body, raw text
- request body, JSON parameter
This case is 2.
In this case, I think it is difficult which letter we assume a delimiter.
And we leave this case for backward compatibility.
Then, I think it's OK we don't support multi valued text for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the explanation. case 2) is same as case 1) I guess. The only way we could make it work with multiple values would be by allowing for comma separated values, but then the comma coyuldn't be part of the text anymore. Let's leave it as-is.
left a couple more comments, mainly around bw comp but this is targeted for 2.0 so it shouldn't be a problem. LGTM! |
remove setSource JSON Closes elastic#3023
Pushed |
This PR breaks the java API. |
Add support array text as a multi-valued for AnalyzeRequestBuilder and REST API
Closes #3023