-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tiered Storage] Prevent Class Loader Leak; Restore Offloader Directory Override #9878
[Tiered Storage] Prevent Class Loader Leak; Restore Offloader Directory Override #9878
Conversation
Failed the check style. I'll be able to take a look and fix it in an hour. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Users could change the offload directory in the namespace level or the topic-level through the admin API, right? |
Yes, the the |
Yes, you are right. Great work! Maybe we could find a way to make the same offloaders nar packages in the different directories only be loaded once. For example, in the |
How would you determine which nar packages are the same? I agree that it would be good to try to prevent unnecessary class loading, but I also think users that are loading offloaders from two different directories are likely doing so intentionally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/pulsarbot run-failure-checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaeljmarshall the change is great! Can you add a test to cover the code path?
2bf2987
to
6af4d21
Compare
In putting together the test, I noticed that the implementation would be cleaner with new class and test just that class, so I refactored the original change. @sijie - let me know what you think about this test. I had wanted to include the following: // Should only be called a single time because of the caching
PowerMockito.verifyStatic(OffloaderUtils.class, Mockito.times(1)); but it failed, and I'm not quite sure why. If you think it is important to include a test on the idempotency of loading an offloaders class, let me know. |
Failed checkstyle. I should be able to fix it tomorrow morning. |
/pulsarbot run-failure-checks |
0e18196
to
fa12747
Compare
Had merge conflicts. Fixed with a rebase. |
/pulsarbot run-failure-checks |
2 similar comments
/pulsarbot run-failure-checks |
/pulsarbot run-failure-checks |
…ry Override (#9878) In Pulsar 2.7.0, there is a class loader leak. It looks like #8739 fixed the leak by only loading the offloader classes for the directory configured in `broker.conf`. However, the solution in #8739 ignores the fact that an offload policy can override the the offloaded directory. As such, there could be a regression in 2.7.1 if users are providing multiple offload directories. This PR returns the functionality without reintroducing the class loader leak. Update the `PulsarService` and the `PulsarConnectorCache` classes to use a map from directory strings to `Offloaders`. The new `Map` has keys of type `String`, but we could use keys of type `Path` and then normalize the paths to ensure that `./offloaders` and `offloaders` result in a single class loader. However, it looks like the `normalize` method in the path class has a warning about symbolic links. As such, I went with the basic `String` approach, which might lead to some duplication of loaded classes. Below is the javadoc for `normalize`, in case that helps for a design decision. ```java /** * Returns a path that is this path with redundant name elements eliminated. * * <p> The precise definition of this method is implementation dependent but * in general it derives from this path, a path that does not contain * <em>redundant</em> name elements. In many file systems, the "{@code .}" * and "{@code ..}" are special names used to indicate the current directory * and parent directory. In such file systems all occurrences of "{@code .}" * are considered redundant. If a "{@code ..}" is preceded by a * non-"{@code ..}" name then both names are considered redundant (the * process to identify such names is repeated until it is no longer * applicable). * * <p> This method does not access the file system; the path may not locate * a file that exists. Eliminating "{@code ..}" and a preceding name from a * path may result in the path that locates a different file than the original * path. This can arise when the preceding name is a symbolic link. * * @return the resulting path or this path if it does not contain * redundant name elements; an empty path is returned if this path * does have a root component and all name elements are redundant * * @see #getParent * @see #toRealPath */ Path normalize(); ``` This change is a code cleanup without any test coverage that should be covered by other tests. If required, I can create some tests. (cherry picked from commit 6c3ebbb)
Motivation
In Pulsar 2.7.0, there is a class loader leak. It looks like #8739 fixed the leak by only loading the offloader classes for the directory configured in
broker.conf
. However, the solution in #8739 ignores the fact that an offload policy can override the the offloaded directory. As such, there could be a regression in 2.7.1 if users are providing multiple offload directories.This PR returns the functionality without reintroducing the class loader leak.
Modifications
Update the
PulsarService
and thePulsarConnectorCache
classes to use a map from directory strings toOffloaders
.Alternative Approaches
The new
Map
has keys of typeString
, but we could use keys of typePath
and then normalize the paths to ensure that./offloaders
andoffloaders
result in a single class loader. However, it looks like thenormalize
method in the path class has a warning about symbolic links. As such, I went with the basicString
approach, which might lead to some duplication of loaded classes. Below is the javadoc fornormalize
, in case that helps for a design decision.Verifying this change
This change is a code cleanup without any test coverage that should be covered by other tests. If required, I can create some tests.
Does this pull request potentially affect one of the following parts:
Documentation
None needed.
Acknowledgements
Thanks to @st0ckface for helping me identify the source of the class loader leak in 2.7.0.