Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][fn] Optimize Function Worker startup by lazy loading and direct zip/bytecode access #22122

Merged

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Feb 26, 2024

Motivation

Currently the Functions Worker eagerly unpacks and extracts all Pulsar IO connectors when the function worker starts up.
This takes a significant amount of CPU resources and also slows down the startup. On Kubernetes with an empheral filesystem, this will happen every single time the Functions Worker is restarted.
When CPU is limited with k8s resource limits on the Functions Worker, this problem is even worse.

Modifications

Refactor the Functions Worker in a way where it is possible to do minimal work at startup time. This requires avoiding class loading and therefore this PR contains a refactoring where the previous logic that was coupled with classloading is now decoupled. Instead of classloading, Byte Buddy library is used to read the required metadata from the functions and connector packages.
Instead of using the JarFile API, NarClassloader has been modified to use the ZipFile API since the Nar files don't need any features that are specific to Jar file handling.

Additional Context

There was a previous PR #17902 that made the unpacking run in parallel. Those changes aren't needed any more since startup will be fast when no nar files are unpacked on startup time.
This PR doesn't require a PIP since this doesn't change external interfaces. This doesn't introduce new external features and is also a bug fix for the maintained Pulsar releases.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Feb 26, 2024
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

This is an amazing improvement

@codecov-commenter
Copy link

codecov-commenter commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 59.84733% with 263 lines in your changes missing coverage. Please review.

Project coverage is 73.55%. Comparing base (1b1cfb5) to head (f1ea2d9).
Report is 587 commits behind head on master.

Files with missing lines Patch % Lines
.../pulsar/functions/utils/FunctionRuntimeCommon.java 41.02% 43 Missing and 3 partials ⚠️
.../apache/pulsar/functions/utils/ValidatorUtils.java 13.04% 38 Missing and 2 partials ⚠️
...he/pulsar/functions/utils/FunctionFilePackage.java 68.57% 19 Missing and 3 partials ⚠️
...apache/pulsar/functions/utils/SinkConfigUtils.java 53.19% 16 Missing and 6 partials ⚠️
...ache/pulsar/functions/utils/io/ConnectorUtils.java 32.14% 19 Missing ⚠️
...ache/pulsar/functions/utils/SourceConfigUtils.java 51.61% 10 Missing and 5 partials ⚠️
.../apache/pulsar/functions/utils/FunctionCommon.java 64.10% 8 Missing and 6 partials ⚠️
...rg/apache/pulsar/functions/utils/io/Connector.java 56.25% 13 Missing and 1 partial ⚠️
...he/pulsar/functions/worker/rest/api/SinksImpl.java 57.57% 10 Missing and 4 partials ⚠️
...he/pulsar/functions/utils/FunctionConfigUtils.java 69.76% 10 Missing and 3 partials ⚠️
... and 11 more
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #22122       +/-   ##
=============================================
+ Coverage     34.05%   73.55%   +39.49%     
- Complexity    12150    32612    +20462     
=============================================
  Files          1659     1877      +218     
  Lines        129418   139502    +10084     
  Branches      14173    15299     +1126     
=============================================
+ Hits          44079   102610    +58531     
+ Misses        79497    28937    -50560     
- Partials       5842     7955     +2113     
Flag Coverage Δ
inttests 24.62% <2.13%> (-0.12%) ⬇️
systests 24.24% <29.16%> (?)
unittests 72.83% <57.86%> (+40.96%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...pulsar/functions/runtime/thread/ThreadRuntime.java 72.00% <100.00%> (+16.00%) ⬆️
...g/apache/pulsar/functions/worker/WorkerConfig.java 92.98% <100.00%> (+12.17%) ⬆️
...ache/pulsar/functions/worker/FunctionActioner.java 74.93% <100.00%> (+72.70%) ⬆️
...e/pulsar/functions/worker/PulsarWorkerService.java 70.26% <100.00%> (+17.16%) ⬆️
...ulsar/functions/worker/rest/api/ComponentImpl.java 48.01% <100.00%> (+47.47%) ⬆️
...a/org/apache/pulsar/common/nar/NarClassLoader.java 54.92% <86.66%> (+14.60%) ⬆️
.../pulsar/functions/utils/LoadedFunctionPackage.java 90.90% <90.90%> (ø)
...sar/functions/utils/functions/FunctionArchive.java 88.23% <88.23%> (+28.23%) ⬆️
...java/org/apache/pulsar/common/nar/NarUnpacker.java 78.68% <76.92%> (+0.11%) ⬆️
...ulsar/functions/worker/rest/api/FunctionsImpl.java 56.67% <83.33%> (+55.91%) ⬆️
... and 16 more

... and 1496 files with indirect coverage changes

@lhotari lhotari merged commit bbc6224 into apache:master Feb 26, 2024
56 of 57 checks passed
lhotari added a commit that referenced this pull request Feb 26, 2024
…rect zip/bytecode access (#22122)

(cherry picked from commit bbc6224)
lhotari added a commit that referenced this pull request Feb 26, 2024
…rect zip/bytecode access (#22122)

(cherry picked from commit bbc6224)
lhotari added a commit that referenced this pull request Feb 26, 2024
…rect zip/bytecode access (#22122)

(cherry picked from commit bbc6224)

# Conflicts:
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/PulsarWorkerService.java
#	tests/integration/src/test/java/org/apache/pulsar/tests/integration/topologies/PulsarCluster.java
lhotari added a commit to lhotari/pulsar that referenced this pull request Feb 26, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)
lhotari added a commit to lhotari/pulsar that referenced this pull request Feb 26, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)
(cherry picked from commit 597cfa1)

# Conflicts:
#	pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java
#	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/thread/ThreadRuntime.java
#	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/worker/ConnectorsManager.java
#	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/worker/WorkerConfig.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionCommon.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionConfigUtils.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SinkConfigUtils.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SourceConfigUtils.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/ValidatorUtils.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/FunctionUtils.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/Functions.java
#	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/io/ConnectorUtils.java
#	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionCommonTest.java
#	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionConfigUtilsTest.java
#	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/SinkConfigUtilsTest.java
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/FunctionsManager.java
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/ComponentImpl.java
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImpl.java
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SinksImpl.java
#	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SourcesImpl.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/FunctionAssignmentTailerTest.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImplTest.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v2/FunctionApiV2ResourceTest.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/FunctionApiV3ResourceTest.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SinkApiV3ResourceTest.java
#	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SourceApiV3ResourceTest.java
#	tests/docker-images/latest-version-image/conf/functions_worker.conf
@Technoboy- Technoboy- added this to the 3.3.0 milestone Feb 28, 2024
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 1, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)
(cherry picked from commit 3d3606b)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 5, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)
(cherry picked from commit 597cfa1)

(cherry picked from commit d608bfc)

 Conflicts:
	pulsar-broker/pom.xml
	pulsar-functions/localrun/src/main/java/org/apache/pulsar/functions/LocalRunner.java
	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/thread/ThreadRuntime.java
	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/thread/ThreadRuntimeFactory.java
	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/worker/FunctionsManager.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionCommon.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SinkConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/FunctionArchive.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/FunctionUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/io/ConnectorUtils.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionCommonTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionConfigUtilsTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/SinkConfigUtilsTest.java
	pulsar-functions/worker/pom.xml
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/ComponentImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SinksImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SourcesImpl.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImplTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v2/FunctionApiV2ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/FunctionApiV3ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SinkApiV3ResourceTest.java
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)

 Conflicts:
	pulsar-common/src/main/java/org/apache/pulsar/common/nar/NarUnpacker.java
	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/worker/ConnectorsManager.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionCommon.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SinkConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SourceConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/ValidatorUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/FunctionUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/io/ConnectorUtils.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionCommonTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionConfigUtilsTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/SinkConfigUtilsTest.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/PulsarWorkerService.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/ComponentImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SinksImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SourcesImpl.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v2/FunctionApiV2ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/FunctionApiV3ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SinkApiV3ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SourceApiV3ResourceTest.java
	tests/docker-images/latest-version-image/conf/functions_worker.conf
	tests/integration/src/test/java/org/apache/pulsar/tests/integration/topologies/PulsarCluster.java
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)
(cherry picked from commit 3d3606b)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 7, 2024
…rect zip/bytecode access (apache#22122)

(cherry picked from commit bbc6224)

 Conflicts:
	pulsar-common/src/main/java/org/apache/pulsar/common/nar/NarUnpacker.java
	pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/worker/ConnectorsManager.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionCommon.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/FunctionConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SinkConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/SourceConfigUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/ValidatorUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/functions/FunctionUtils.java
	pulsar-functions/utils/src/main/java/org/apache/pulsar/functions/utils/io/ConnectorUtils.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionCommonTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/FunctionConfigUtilsTest.java
	pulsar-functions/utils/src/test/java/org/apache/pulsar/functions/utils/SinkConfigUtilsTest.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/PulsarWorkerService.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/ComponentImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/FunctionsImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SinksImpl.java
	pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/rest/api/SourcesImpl.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v2/FunctionApiV2ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/FunctionApiV3ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SinkApiV3ResourceTest.java
	pulsar-functions/worker/src/test/java/org/apache/pulsar/functions/worker/rest/api/v3/SourceApiV3ResourceTest.java
	tests/docker-images/latest-version-image/conf/functions_worker.conf
	tests/integration/src/test/java/org/apache/pulsar/tests/integration/topologies/PulsarCluster.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants