Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-10145][table] Add replace supported in TableAPI and SQL #6576

Closed
wants to merge 4 commits into from

Conversation

Guibo-Pan
Copy link
Contributor

What is the purpose of the change

This pull request add replace supported in TableAPI and SQL

Brief change log

  • Add replace supported in TableAPI and SQL

Verifying this change

This change is already covered by existing tests, such as ScalarFunctionsTest#testReplace.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)

@Guibo-Pan
Copy link
Contributor Author

cc @yanghua @tillrohrmann a small feature~

Copy link
Contributor

@yanghua yanghua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Guibo-Pan thanks for your contribution, I left some comment you can consider.

</td>
<td>
<p>Returns a new string from <i>STRING</i> replaced <i>STRING1</i>(non-overlapping) with <i>STRING2</i>.</p>
<p>E.g., <code>'hello world'.replace('world', 'flink')</code> returns 'hello flink'; <code>'ababab'.replace('abab', 'z')</code> returns "zab"</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace all '' to ""

</td>
<td>
<p>Returns a new string from <i>STRING</i> replaced <i>STRING1</i>(non-overlapping) with <i>STRING2</i>.</p>
<p>E.g., <code>'hello world'.replace('world', 'flink')</code> returns 'hello flink'; <code>'ababab'.replace('abab', 'z')</code> returns "zab"</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace all '' to ""

* Create a new string of the given string with non-overlapping occurrences
* of given search replaced with replacement.
*
* @param search
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove @param and @return to match the style of other methods

@@ -410,3 +410,27 @@ case class ToBase64(child: Expression) extends UnaryExpression with InputTypeSpe
override def toString: String = s"($child).toBase64"

}

/**
* Returns the string `str` with all non-overlapping occurrences
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove all the "``" , just describe the function

@@ -94,6 +94,21 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
"his is a test String.")
}

@Test
def testReplace(): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add more test for invalid input such : empty string, null or a string is not available of the original string

@@ -149,6 +149,9 @@ class SqlExpressionTest extends ExpressionTestBase {
testSqlApi("RPAD('hi',4,'??')", "hi??")
testSqlApi("FROM_BASE64('aGVsbG8gd29ybGQ=')", "hello world")
testSqlApi("TO_BASE64('hello world')", "aGVsbG8gd29ybGQ=")
testSqlApi("REPLACE('hello world', 'world', 'flink')", "hello flink")
testSqlApi("REPLACE('ababab', 'abab', 'Z')", "Zab")
testSqlApi("REPLACE('ababab', 'a', '')", "bbb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this test file is a show case, so just add one example .

@Guibo-Pan
Copy link
Contributor Author

@yanghua thanks for your suggestion, I've made an update.

Copy link
Member

@xccui xccui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @Guibo-Pan. I left some comments, mainly about the style and the code duplication.

Thanks, Xingcan

</td>
<td>
<p>Returns a new string from <i>string</i> replaced <i>search</i>(non-overlapping) with <i>replacement</i>.</p>
<p>E.g., <code>REPLACE("hello world", "world", "flink")</code> returns "hello flink"; <code>REPLACE("ababab", "abab", "z")</code> returns "zab"</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use single quotes for strings in SQL an Java <code> contexts.

<tr>
<td>
{% highlight text %}
REPLACE(string, search, replacement)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For style consistency, just use string1, string2 and string3 here.

{% endhighlight %}
</td>
<td>
<p>Returns a new string from <i>string</i> replaced <i>search</i>(non-overlapping) with <i>replacement</i>.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns a new string which replaces <i>string2</i> with <i>string3</i> (non-overlapping) from <i>string1</i>. Please update the Java and Scala documentation correspondingly.

</td>
<td>
<p>Returns a new string from <i>STRING</i> replaced <i>STRING1</i>(non-overlapping) with <i>STRING2</i>.</p>
<p>E.g., <code>"hello world".replace("world", "flink")</code> returns "hello flink"; <code>"ababab".replace("abab", "z")</code> returns "zab"</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget the period at the end of a sentence.

@@ -459,6 +459,15 @@ trait ImplicitExpressionOperations {
}
}

/**
* Create a new string of the given string with non-overlapping occurrences
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use singular for docs here (Create -> Creates).

* Create a new string of the given string with non-overlapping occurrences
* of given search replaced with replacement.
*/
def replace(search: Expression,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linebreak-style should be consistent with others (check the overlap method for an example).

@@ -67,6 +67,14 @@ object ScalarSqlFunctions {
OperandTypes.ONE_OR_MORE,
SqlFunctionCategory.STRING)

val REPLACE = new SqlFunction(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The REPLACE SqlFunction has been provided in org.apache.calcite.sql.fun.SqlStdOperatorTable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use SqlStdOperatorTable.REPLACE at first, however that REPLACE function can not handle null params, so I create a new function which refer to SqlStdOperatorTable.REPLACE.

@@ -118,6 +118,13 @@ object BuiltInMethods {
Types.lookupMethod(
classOf[ScalarFunctions], "concat_ws", classOf[String], classOf[Array[String]])

val REPLACE = Types.lookupMethod(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The REPLACE method has been provided in org.apache.calcite.util.BuiltInMethod. There's no need to duplicate it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same reason like the above. The calcite builtin REPLACE can not handle null parameters, which may throw NPE.

* Returns the string str with all non-overlapping occurrences
* of search replaced with replacement.
*/
def replace(str: String, search: String, replacement: String): String = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replace() method has been provided in org.apache.calcite.runtime.SqlFunctions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same reason like the above. The calcite builtin replace can not handle null parameters, which may throw NPE.
The builtin replace is

public static String replace(String s, String search, String replacement) {
    return s.replace(search, replacement);
  }

and in java.lang.String#replace it is

public String replace(CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
                this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

Here it calls toString method, that will cause Exception when any of the thress params is null.

@xccui
Copy link
Member

xccui commented Aug 21, 2018

Hi @Guibo-Pan, in SQL convention, any null argument will lead to a null result. Thus when generating a scalar function call, all the arguments (expressions) will be first checked to ensure not null. See

@Guibo-Pan
Copy link
Contributor Author

Thanks for your patient review, @xccui . I have update the docs and styles.

I am not so familiar with CodeGenerator so far. At the comment you said:

in SQL convention, any null argument will lead to a null result. Thus when generating a scalar function call, all the arguments (expressions) will be first checked to ensure not null

I did make some test cases on null inputs, it seems that the builtin replace can not pass. So I change the builtin replace to my replace in commit b262a4.
I think maybe something is misunderstand by me, can you give me more guidance to help me understand it? Thanks.

@xccui
Copy link
Member

xccui commented Aug 21, 2018

Hi @Guibo-Pan, see xccui@e374c3e for an example. Besides, merging commits from other branches into an active PR leads to chaos. Use git rebase instead.

@Guibo-Pan
Copy link
Contributor Author

Guibo-Pan commented Aug 21, 2018

@xccui thanks very much, I've updated it, and used rebase instead. Can you take a second look on it? thanks.

@xccui
Copy link
Member

xccui commented Sep 21, 2018

Hi @Guibo-Pan, I wonder if you could rebase your code to resolve these conflicts.
Thanks, Xingcan

Change-Id: I7da6b3e81e1f9f9a177b54d6f0abb0fa41bcb7f7
Change-Id: I2b6d95d63b438e3d3e9d176ab5c18c4718fdeb1e
Change-Id: I47e5ca5b75f374d824df4f12aff61382fef7a186
Change-Id: Ibeb0e1bf2cc68bb3cfeb6ab7c1f7d791096d017b
@Guibo-Pan
Copy link
Contributor Author

Hi @xccui, I've removed the conflicts. Thanks~

@xccui
Copy link
Member

xccui commented Sep 26, 2018

Thanks for the update @Guibo-Pan. I'll have a final check and merge this.
Best, Xingcan

@asfgit asfgit closed this in 24af70f Sep 27, 2018
zhangxinyu1 pushed a commit to zhangxinyu1/flink that referenced this pull request Sep 27, 2018
Clarkkkkk pushed a commit to Clarkkkkk/flink that referenced this pull request Mar 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants