[FLINK-11830][table-planner-blink] Introduce CodeGeneratorContext to maintain reusable statements #7906

wuchong · 2019-03-06T03:30:34Z

What is the purpose of the change

Introduce CodeGeneratorContext to maintain reusable statements for code generation.

The CodeGeneratorContext keeps all reusable statements and serves as the foundation for code generation. In the future, we will introduce FunctionCodeGeneration, AggregateCodeGeneration, and others, which will depend on the CodeGeneratorContext to store reusable statements.
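As a rough illustration of what "maintaining reusable statements" can mean, here is a minimal sketch in Java. It is not the code from this PR: the class name `ReusableStatementsSketch` and all of its methods are hypothetical, and the real CodeGeneratorContext is written in Scala and tracks more kinds of statements. The sketch only shows the idea of collecting each statement once and emitting it later into a section of the generated class.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical, simplified stand-in for the idea behind CodeGeneratorContext. */
final class ReusableStatementsSketch {
    // Insertion-ordered sets: a statement added twice is emitted only once,
    // in the position where it was first added.
    private final Set<String> memberStatements = new LinkedHashSet<>();
    private final Set<String> initStatements = new LinkedHashSet<>();
    private int nameCounter = 0;

    /** Returns a fresh term such as "field$0" so generated names do not clash. */
    String newName(String prefix) {
        return prefix + "$" + (nameCounter++);
    }

    /** Registers a field declaration for the generated class body. */
    void addReusableMember(String declaration) {
        memberStatements.add(declaration);
    }

    /** Registers a statement for the generated constructor. */
    void addReusableInit(String statement) {
        initStatements.add(statement);
    }

    /** Renders all registered member declarations, one per line. */
    String reuseMemberCode() {
        return String.join("\n", memberStatements);
    }

    /** Renders all registered constructor statements, one per line. */
    String reuseInitCode() {
        return String.join("\n", initStatements);
    }
}
```

A code generator would call `addReusableMember` wherever it needs a field and splice `reuseMemberCode()` into the class template once at the end, so independent generators can share one context without emitting duplicate declarations.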

Brief change log

Introduce CodeGeneratorContext.
Introduce CompileUtils in table-runtime-blink to support compiling code at runtime.
Introduce GeneratedClass in table-runtime-blink to wrap the generated code and class name for easy instantiation.
Copy TableConfig to table-planner-blink because CodeGeneratorContext depends on it.
Introduce CodeGenUtils to package utilities for codegen (will add various generateXXX in FLINK-11788).
Introduce TypeCheckUtils to check types.

Verifying this change

The CodeGeneratorContext will be covered later in FLINK-11788.

Added a CompileUtilsTest to test compiling classes with a cache.
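The caching behaviour such a test would exercise can be sketched as follows. This is an assumption-laden illustration, not the CompileUtils from this PR: `CompileCacheSketch` is a hypothetical name, the actual compiler is abstracted behind a `Function` so the sketch needs no compiler library, and keying the cache on the class name assumes generated class names are unique.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of a compile cache; T is whatever the compiler produces. */
final class CompileCacheSketch<T> {
    // Assumption: a generated class name identifies its code uniquely.
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;

    CompileCacheSketch(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    /** Compiles the code on first request and returns the cached result afterwards. */
    T compile(String className, String code) {
        return cache.computeIfAbsent(className, name -> compiler.apply(code));
    }
}
```

A test in this style would compile the same class twice and check that the compiler ran only once and that both calls returned the same object.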

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): (no)
The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
The serializers: (no)
The runtime per-record code paths (performance sensitive): (no)
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
The S3 file system connector: (no)

Documentation

Does this pull request introduce a new feature? (no)
If yes, how is the feature documented? (not applicable)

flinkbot · 2019-03-06T03:31:18Z

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

✅ 1. The [description] looks good.
- Approved by @KurtYoung [committer]
✅ 2. There is [consensus] that the contribution should go into Flink.
- Approved by @KurtYoung [committer]
❓ 3. Needs [attention] from.
✅ 4. The change fits into the overall [architecture].
- Approved by @KurtYoung [committer]
✅ 5. Overall code [quality] is good.
- Approved by @KurtYoung [committer]

Please see the Pull Request Review Guide for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
@flinkbot approve all to approve all aspects
@flinkbot approve-until architecture to approve everything until architecture
@flinkbot attention @username1 [@username2 ..] to require somebody's attention
@flinkbot disapprove architecture to remove an approval you gave earlier

JingsongLi

Thanks for the effort. I have some comments.

JingsongLi · 2019-03-06T03:44:03Z

...e/flink-table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGenUtils.scala

+    case _ => false
+  }
+
+  def needCloneRefForType(t: InternalType): Boolean = t match {


This method is useless.

JingsongLi · 2019-03-06T03:44:35Z

...e/flink-table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGenUtils.scala

+    case InternalTypes.TIMESTAMP => boxedTypeTermForType(InternalTypes.LONG)
+
+    case InternalTypes.STRING => BINARY_STRING
+    case InternalTypes.BINARY => "byte[]"


InternalTypes.BINARY is an ArrayType; remove this case and let the ArrayType case handle it.

JingsongLi · 2019-03-06T03:44:57Z

...e/flink-table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGenUtils.scala

+
+    case _: RowType => classOf[BaseRow].getCanonicalName
+    case _: DecimalType => throw new UnsupportedOperationException
+    case _: ArrayType => throw new UnsupportedOperationException


ArrayType and MapType are already supported.

JingsongLi · 2019-03-06T03:48:09Z

...e/flink-table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGenUtils.scala

+  def boxedTypeTermForExternalType(t: TypeInformation[_]): String = t match {
+    // From PrimitiveArrayTypeInfo we would get class "int[]", scala reflections
+    // does not seem to like this, so we manually give the correct type here.
+    case INT_PRIMITIVE_ARRAY_TYPE_INFO => "int[]"


I tried getTypeClass().getCanonicalName() and it outputs byte[], so it seems to work?

Yes, I think we can use getTypeClass().getCanonicalName() instead.
Not sure why "int[]" was hardcoded in the original code.
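The behaviour discussed here can be checked with plain Java reflection. The sketch below uses only `java.lang.Class` methods; the helper class `ArrayTypeTermSketch` is a hypothetical name introduced for illustration and is not part of Flink.

```java
/** Hypothetical helper contrasting the two class-name forms for array types. */
final class ArrayTypeTermSketch {
    /** Source-level name, usable inside generated Java code. */
    static String typeTerm(Class<?> clazz) {
        return clazz.getCanonicalName();
    }

    /** JVM binary name, which is not valid Java source for arrays. */
    static String binaryName(Class<?> clazz) {
        return clazz.getName();
    }
}
```

For example, `typeTerm(int[].class)` yields `int[]` and `typeTerm(byte[].class)` yields `byte[]`, whereas `binaryName(int[].class)` yields `[I`. That difference is one plausible reason to prefer the canonical name when the string is pasted into generated source.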

JingsongLi · 2019-03-06T03:52:22Z

...table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGeneratorContext.scala

+      Thread.currentThread().getContextClassLoader)
+    references += objCopy
+
+    val clsName = Option(className).getOrElse(obj.getClass.getName)


getCanonicalName is better?
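For context on this question, the two methods differ for nested classes, and `getCanonicalName` has a caveat of its own: it returns null for anonymous and local classes. The sketch below shows both points using standard reflection only; `ClassNameSketch` and `sourceTerm` are hypothetical names, and failing fast on a missing canonical name is just one possible policy.

```java
/** Hypothetical helper that insists on a source-level class name. */
final class ClassNameSketch {
    /**
     * Returns the canonical (source-level) name of the class.
     * Fails fast when there is none, e.g. for anonymous or local classes.
     */
    static String sourceTerm(Class<?> clazz) {
        String canonical = clazz.getCanonicalName();
        if (canonical == null) {
            throw new IllegalArgumentException(
                "no source-level name for " + clazz.getName());
        }
        return canonical;
    }
}
```

For the nested class `java.util.AbstractMap.SimpleEntry`, `getName` returns `java.util.AbstractMap$SimpleEntry` while `sourceTerm` returns `java.util.AbstractMap.SimpleEntry`; only the latter compiles when written into generated source.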

JingsongLi · 2019-03-06T03:53:05Z

...table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGeneratorContext.scala

+    val byteArray = InstantiationUtil.serializeObject(obj)
+    val objCopy: AnyRef = InstantiationUtil.deserializeObject(
+      byteArray,
+      Thread.currentThread().getContextClassLoader)


obj.getClass.getClassLoader?
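The choice of class loader matters because deserialization has to look up every class in the stream by name. The following sketch shows a serialization-based deep copy using only the JDK; it is not Flink's InstantiationUtil, and `DeepCopySketch` and `LoaderAwareInputStream` are hypothetical names. The class loader is an explicit parameter, so a caller can pass either the thread context class loader or `obj.getClass().getClassLoader()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Hypothetical sketch of a deep copy via Java serialization. */
final class DeepCopySketch {

    /** ObjectInputStream that resolves classes through an explicitly chosen loader. */
    private static final class LoaderAwareInputStream extends ObjectInputStream {
        private final ClassLoader classLoader;

        LoaderAwareInputStream(InputStream in, ClassLoader classLoader) throws IOException {
            super(in);
            this.classLoader = classLoader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                // A null loader means the bootstrap class loader.
                return Class.forName(desc.getName(), false, classLoader);
            } catch (ClassNotFoundException e) {
                // Fall back to the default lookup, which also handles primitive types.
                return super.resolveClass(desc);
            }
        }
    }

    /** Serializes the object to bytes and reads it back as an independent copy. */
    static <T extends Serializable> T deepCopy(T obj, ClassLoader classLoader) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new LoaderAwareInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()), classLoader)) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deep copy failed", e);
        }
    }
}
```

If the object's class was loaded by a user-code class loader that the context class loader cannot see, the lookup in `resolveClass` is where the two choices would diverge.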

JingsongLi · 2019-03-06T03:54:12Z

...table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGeneratorContext.scala

+  private def addReusableObject_(obj: AnyRef, fieldTerm: String, fieldTypeTerm: String): Unit = {
+    val idx = references.length
+    // make a deep copy of the object
+    val byteArray = InstantiationUtil.serializeObject(obj)


merge code with addReferenceObj(obj: AnyRef, className: String = null)?

JingsongLi · 2019-03-06T03:55:51Z

...e/flink-table-runtime-blink/src/main/java/org/apache/flink/table/generated/CompileUtils.java

+		try {
+			compiler.cook(code);
+		} catch (Throwable t) {
+			// TODO: println pretty code


I don't think printing prettified code is a good idea, because its line numbers would not match the line numbers reported in the compile error.

Yes, you are right. I will remove the TODO and reconsider it later if needed.

KurtYoung

Most of the changes look good to me. However, I think some of the code is very error-prone, and I opened some follow-up issues to track it.

KurtYoung · 2019-03-06T04:01:15Z

...table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGeneratorContext.scala

+  * The context for code generator, maintaining various reusable statements that could be insert
+  * into different code sections in the final generated class.
+  */
+class CodeGeneratorContext(val tableConfig: TableConfig) {


I created a follow-up issue, FLINK-11831, to further separate the reused members by codegen purpose.

It is a child issue of "Finalize the Blink SQL merging efforts"; we can do it in the final steps.

KurtYoung · 2019-03-06T04:08:35Z

...e/flink-table-planner-blink/src/main/scala/org/apache/flink/table/codegen/CodeGenUtils.scala

+    */
+  def className[T](implicit m: Manifest[T]): String = m.runtimeClass.getCanonicalName
+
+  def needCopyForType(t: InternalType): Boolean = t match {


Functions like this would be better as methods of InternalType; otherwise there is no guarantee that someone who adds a new InternalType will remember to update this function.
I will also open a follow-up issue for this.
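The design point can be sketched in Java as follows. All names here (`TypePropertySketch`, `SketchType`, `needsCopy`, and the two example types) are hypothetical, and which types actually need a copy is an illustrative assumption rather than Flink's real rule. The idea is that an abstract method forces every new type to answer the question at compile time, while a pattern match with a default case silently accepts new types.

```java
/** Hypothetical sketch: the property lives on the type, not in a utility match. */
final class TypePropertySketch {

    /** Stand-in for InternalType; every implementation must decide for itself. */
    interface SketchType {
        boolean needsCopy();
    }

    /** Illustrative assumption: a primitive-like type is safe to reuse. */
    static final class IntSketchType implements SketchType {
        @Override
        public boolean needsCopy() {
            return false;
        }
    }

    /** Illustrative assumption: a mutable composite type must be copied. */
    static final class RowSketchType implements SketchType {
        @Override
        public boolean needsCopy() {
            return true;
        }
    }

    /** The utility shrinks to a delegation, so there is no default case to forget. */
    static boolean needCopyForType(SketchType t) {
        return t.needsCopy();
    }
}
```

Adding a third `SketchType` without implementing `needsCopy` fails to compile, which is the guarantee the utility-function approach lacks.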

KurtYoung · 2019-03-06T04:20:59Z

...ink-table-planner-blink/src/main/scala/org/apache/flink/table/typeutils/TypeCheckUtils.scala

+
+  def isMap(dataType: InternalType): Boolean = dataType.isInstanceOf[MapType]
+
+  def isComparable(dataType: InternalType): Boolean =


This is also very error-prone.

wuchong · 2019-03-06T05:51:47Z

Hi @JingsongLi @KurtYoung , thanks for reviewing.

I agree with your suggestions and have addressed the comments.

KurtYoung · 2019-03-06T06:40:54Z

+1 to merge

KurtYoung · 2019-03-06T06:41:04Z

@flinkbot approve all

wuchong · 2019-03-06T09:48:24Z

Merging...

[FLINK-11830][table-planner-blink] Introduce CodeGeneratorContext to maintain reusable statements

e8ae499

rmetzger added the review=description? label Mar 6, 2019

JingsongLi reviewed Mar 6, 2019

fix checkstyle

118eb59

KurtYoung reviewed Mar 6, 2019

wuchong added 2 commits March 6, 2019 13:47

address comments

fa4f25f

address comments

8086603

fix compile

adaf77f

rmetzger added review=approved ✅ and removed review=description? labels Mar 6, 2019

asfgit closed this in ec3b36c Mar 6, 2019

rmetzger added the component=SQL/Planner label Mar 18, 2019

HuangZhenQiu pushed a commit to HuangZhenQiu/flink that referenced this pull request Apr 22, 2019

[FLINK-11830][table-planner-blink] Introduce CodeGeneratorContext to maintain reusable statements. This closes apache#7906

97ce053

flinkbot added component=TableSQL/Planner and removed component=SQL/Planner labels Mar 17, 2022
