Skip to content

Conversation

@FangYongs
Copy link
Contributor

What is the purpose of the change

When flink perform create temporary function or create system function with a jar file, it will register the jar file to ResourceManager and add it to the ClassLoader. This cause the SHOW JARS statement can list the jar files for these functions, and even when a job does not use these function, the jar file will be aded to the job's classpath.

This PR aims to address this issue with three steps:

  1. Create temporary or system function
    a> Download the jar file from remote as needed, which is the same as before for these functions.
    b> Add jar files for temporary and system functions to a new class FunctionResourceManager in ResourceManager when they are created.
    c> Create a new classloader with the added jar files and validate the function class
  2. Job uses temporary or system function
    a> Register the jar files for the functions to ResourceManager, which will be added to the classpath of the job
    b> Add the jar files to the classloader for the job
  3. Drop temporary or system function
    a> There is reference counter in FunctionResourceManager for each jar file
    b> Register or unregister function to the FunctionResourceManager will update the reference counter
    c> The jar file in FunctionResourceManager will be removed when the reference counter comes to 0

In this way, the jar files for temporary or system function will not be listed in SHOW JARS and will not be added to the classpath and classloader when a job does not use it.

Brief change log

  • Added FunctionResourceManager to register and unregister jar files for temporary and system functions
  • Don't register jar files to resourceInfos and userClassLoader when temporary and system functions are created, register jar files to FunctionResourceManager
  • Register jar files to resourceInfos and userClassLoader when the functions are used in a job

Verifying this change

This change added tests and can be verified as follows:

  • Added SHOW JARS; in function.q after creating a temporary functions and the result is empty
  • Updated FunctionITCase.testCreateTemporarySystemFunctionByUsingJar to show jars after creating system function
  • Added ResourceManagerTest.testRegisterFunctionResource to verify the jar file will no be added to classloader when it is registered in FunctionResourceManager
  • Added ResourceManagerTest.testRegisterFunctionResource to check the jar file can be registered and unregistered in FunctionResourceManager with reference counter

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no) no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no) no
  • The serializers: (yes / no / don't know) no
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know) no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) no
  • The S3 file system connector: (yes / no / don't know) no

Documentation

  • Does this pull request introduce a new feature? (yes / no) no
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch from 6ded434 to 12c7871 Compare July 4, 2023 02:52
@flinkbot
Copy link
Collaborator

flinkbot commented Jul 4, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@FangYongs
Copy link
Contributor Author

Hi @fsk119 @lsyldliu Please help to review this PR when you're free, thanks

Copy link
Member

@fsk119 fsk119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution.

@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch 2 times, most recently from d9717b3 to a7067ea Compare July 4, 2023 11:00
@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch from dcc1bec to 418eb19 Compare July 12, 2023 05:47
@FangYongs
Copy link
Contributor Author

Hi @fsk119 , I have updated this PR, please help to review it again, thanks

@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch from e96dea5 to 7ab3ec3 Compare July 12, 2023 10:18
@KarmaGYZ
Copy link
Contributor

Thanks for the PR, @FangYongs . Please rebase it on the latest master and see my comments.

@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch from 7ab3ec3 to 27d11c5 Compare October 11, 2023 08:50
@FangYongs
Copy link
Contributor Author

@fsk119 @KarmaGYZ I have updated this PR, please have a look again, thanks

@FangYongs FangYongs force-pushed the FLINK_32512_temporary_function_not_register_rm branch from 27d11c5 to e8f015a Compare October 13, 2023 14:01
Copy link
Contributor

@KarmaGYZ KarmaGYZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. LGTM except for 2 minor comments.

Copy link
Member

@fsk119 fsk119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KarmaGYZ KarmaGYZ force-pushed the FLINK_32512_temporary_function_not_register_rm branch from d1fe483 to 2136e23 Compare October 19, 2023 02:37
@KarmaGYZ
Copy link
Contributor

@flinkbot run azure

@KarmaGYZ KarmaGYZ merged commit c2e14ff into apache:master Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants