-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-33792] Generate the same code for the same logic #23984
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoudan Thanks for the PR, it looks great. My main point is that we don't need to add a config for it, it can be done by default.
@@ -233,6 +233,16 @@ private TableConfigOptions() {} | |||
.withDescription( | |||
"Specifies a threshold where class members of generated code will be grouped into arrays by types."); | |||
|
|||
@Documentation.TableOption(execMode = Documentation.ExecMode.BATCH_STREAMING) | |||
public static final ConfigOption<Boolean> INDEPENDENT_NAME_COUNTER_ENABLED = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this config, I think it's ok to change the behavior by default, it's no meaning to expose it to users.
@@ -80,23 +80,25 @@ object HashCodeGenerator { | |||
} | |||
""".stripMargin | |||
|
|||
System.out.println(code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove it.
@libenchao Thanks for reviewing, I have remove the configuration, please take a look when you have time. |
@zoudan The PR looks good to me, except that the CI is failing, could you fix that? |
@flinkbot run azure |
1 similar comment
@flinkbot run azure |
@libenchao CI is passed, please take a look when you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoudan Thanks for the updating, using 'parent context' is a great idea. I'll left a few minor comments, others looks good to me.
...ble-planner/src/main/scala/org/apache/flink/table/planner/codegen/CodeGeneratorContext.scala
Outdated
Show resolved
Hide resolved
@@ -124,14 +124,27 @@ object CodeGenUtils { | |||
|
|||
private val nameCounter = new AtomicLong | |||
|
|||
def newName(name: String): String = { | |||
s"$name$$${nameCounter.getAndIncrement}" | |||
def newName(context: CodeGeneratorContext = null, name: String): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one has a default value for context
, but newNames
does not have, is there any rational behind this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter section with *-parameter is not allowed to have default arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that do you really need to have a default value, IIUC, we want all places to pass context
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it, removed
require(names.toSet.size == names.length, "Duplicated names") | ||
val newId = nameCounter.getAndIncrement | ||
names.map(name => s"$name$$$newId") | ||
if (context == null || context.getNameCounter == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering that if newNames
can be optimized to something like names.map(name => newName(context, name))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned new names use the same index, I keep it the same as before. So, I did not call newName
...k-table-planner/src/test/scala/org/apache/flink/table/planner/codegen/CodeGenUtilsTest.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoudan Thanks for contribution, I will take a look in the next few days.
@flinkbot run azure |
1 similar comment
@flinkbot run azure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 from my side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoudan Thanks for contribution, LGTM overall, I just left some minor comments.
@@ -87,10 +90,11 @@ object LongHashJoinGenerator { | |||
def genProjection( | |||
tableConfig: ReadableConfig, | |||
classLoader: ClassLoader, | |||
types: Array[LogicalType]): GeneratedProjection = { | |||
types: Array[LogicalType], | |||
parentCtx: CodeGeneratorContext): GeneratedProjection = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify the method signature to
def genProjection(
types: Array[LogicalType],
parentCtx: CodeGeneratorContext): GeneratedProjection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I do not get your point, I think tableConfig
and classLoader
are needed in this method. And I only add the parameter parentCtx
without change anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just mean that we can get the tableConfig
and classLoader
from parentCtx
, but this isn't an important point, ignore it now.
names.map(name => s"$name$$$newId") | ||
if (context == null || context.getNameCounter == null) { | ||
val newId = nameCounter.getAndIncrement | ||
// Add an 'i' in the middle to distinguish from nameCounter in CodeGeneratorContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why here will have conflicts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be some scenarios where it is not convenient for us to obtain class level CodeGeneratorContext, and we could use the nameCounter in CodeGenUtils
to generate new names. In these cases we may use nameCounter from CodeGenUtils
and CodeGeneratorContext
in the same class.
@Override | ||
@Nullable | ||
public CodeGeneratorContext getCodeGeneratorContext() { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some annotation to explain why here returns null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change it to non null
@flinkbot run azure |
@flinkbot run azure |
@flinkbot run azure |
@lsyldliu I have updated my code, please have a took when you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What is the purpose of the change
This pull request ensure that we generate the same code for the same logic, it is a precondition for sharing generated classes between different jobs.
Brief change log
CodeGeneratorContext
and use it when we generate names for variables.Verifying this change
Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation