Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClassLoader streamlining and composability #3049

Closed
plexus opened this issue Sep 9, 2021 · 12 comments
Closed

ClassLoader streamlining and composability #3049

plexus opened this issue Sep 9, 2021 · 12 comments
Labels
high priority Tickets of particular importance

Comments

@plexus
Copy link
Contributor

plexus commented Sep 9, 2021

As part of #3037 I'm trying to get a better sense of how classloaders are used across nREPL, Orchard, clojure.main, and elsewhere. I've been doing some deep diving into this topic, and have collected a bunch of helpers to help inspect and diagnose classloader related issues.

This is a meta issue where I will for starters try to better document the situation, and collect links to various issues and PRs that have been created over time.

There's a lot of getting/setting of the thread-local context classloader, and some binding of clojure.compiler.LOADER. This doesn't compose well, and issues are hard to diagnose. It's easy to end up in a situation where different parts see different classloaders. E.g. when you (require ...) it uses one, but when you do a var lookup in Orchard it uses an other one. It's also easy for one tool to set a classloader, which then gets replaced by a different one, or which no longer is accessible (because it has been replaced or because we're on a different thread) by the time it's supposed to be used.

Tools that currently do classloader tricks

  • Pomegrenate
  • lambdaisland.classpath
  • dynapath
  • nREPL sideloader
  • nREPL session middleware
  • nREPL interruptible eval
  • clj-refactor
  • Virgil
  • boot
  • enrich-classpath

Issues

And many more (cider-nrepl issues, nrepl issues, orchard issue).

@plexus
Copy link
Contributor Author

plexus commented Sep 13, 2021

My current thinking is that we need to evolve better patterns which are more explicit, and that we need to reduce the reliance on the "ContextClassloader". The latter being a thread-local and mutable field is where a lot of our predicament comes from.

Some initial guidelines (subject to change, discussion welcome):

  • if you rely on a specific classloader to be present when loading resources, then allow that classloader to be passed in explicitly, or if that's not feasible then at least make it configurable via a var/atom/dynamic binding
  • don't get/set ContextClassloader if you don't have to. If you need the classloader to be set in a certain scope then dynamically bind clojure.lang.Compiler/LOADER
  • if you want the classloader that Clojure currently "sees" then use clojure.lang.RT/baseLoader, this will use Compiler/LOADER or ContextClassloader depending on which are set
  • if you need a DynamicClassloader to manipulate the classpath then find the "root" classloader, i.e. traverse parents until you find the DynamicClassloader that sits directly above the application classloader
  • be aware that when you do add libraries there (this is what enrich-classpath or add-lib do) then this breaks clojure.java.classpath/classpath (Jira CLASSPATH-9)
  • avoid (try (.setContextClassloader ...) ... (finally (.setContextClassloader ...))) as a pattern to temporarily set the classloader, any contextclassloader changes that happen in between will be discarded. Bind Compiler/LOADER instead

@vemv
Copy link
Member

vemv commented Sep 13, 2021

be aware that when you do add libraries there (this is what enrich-classpath or add-lib do)

enrich-classpath doesn't use classloaders at all

@plexus
Copy link
Contributor Author

plexus commented Sep 13, 2021

I may be misunderstanding something. Isn't it enrich-classpath that adds src.zip? because that definitely ends up on the root classloader's path.

@vemv
Copy link
Member

vemv commented Sep 13, 2021

It doesn't do that at the moment, the day I get to implement it I wouldn't use classloaders either. Associating this library to https://clojure.atlassian.net/browse/CLASSPATH-9 is therefore misleading.

That issue is caused by Orchard's classloader-based adding of src.zip, which we currently intend to get rid of.

Ref: https://github.com/clojure-emacs/orchard/blob/ca5fae9f1a8292956a9f3d25cdb3dc160e4c80df/src/orchard/java.clj#L67

that definitely ends up on the root classloader's path.

that's a different topic, one thing is performing classloader tricks or breaking clojure.java.classpath/classpath; another is placing items in the classpath beforehand via e.g. a Lein plugin (as opposed to doing it at runtime).

Of course, adding things to the classpath will make those items reachable by classloaders; that's the whole point (and a fairly clean thing to do: items are placed at the tail of the classpath for minimum interference).

If you believe it's problematic feel free to expand on it, but please be mindful of conflating issues.

Other than that kudos for the writeup / initiative!

@bbatsov
Copy link
Member

bbatsov commented Sep 13, 2021

@plexus Thanks for summarizing the state of affairs. This was long overdue. Here a few thoughts from me:

  • dynapath should go away from Orchard - it has caused us a ton of grief, after JDK 9 was released, and @vemv's work on enrich-classpath will render it redundant there
  • clj-refactor/refactor-nrepl currently don't do any classpath manipulation, if I remember correctly (as hotloading dependencies has been broken for a while); of course, this will change if the Pomegranate PR is completed/merged.
  • It seems to me that Boot is almost dead at this point, so we probably shouldn't worry that much about interactions with it, although if there are some known problems that we can solve on that front that'd be great.
  • Orchard reimplements a part of clojure.java.classpath to better fit the needs of tooling and to avoid a runtime dependency, but I think this part wasn't doing anything with classloaders.

The guidelines outlined by you make sense to me, and they should be fairly easy to adopt. I have to admit that in the early days of nREPL/CIDER I didn't give much thought to the ramifications of our approach, plus things worked reasonably well until JDK 9, which was a wake up call. 😆

@bbatsov bbatsov added the high priority Tickets of particular importance label Sep 13, 2021
@plexus
Copy link
Contributor Author

plexus commented Sep 13, 2021

Thanks for clarifying @vemv, it seems I did indeed misunderstand enrich-classpath.

one thing is performing classloader tricks or breaking clojure.java.classpath/classpath; another is placing items in the classpath beforehand via e.g. a Lein plugin (as opposed to doing it at runtime).

Agreed, manipulating the classpath outside of the JVM process is outside of the scope of what we are talking about here. Adding items at runtime is. Even if the code does not create or replace classloaders, it can make assumptions about the classloader state that don't always hold.

@plexus
Copy link
Contributor Author

plexus commented Sep 15, 2021

I did some more testing on this, and my main impression is that we should start getting rid of some of the noise.

clojure.main/repl adds a new DynamicClassLoader to the stack on every invocation, this is a long known issue. It really should check first if there is already a DynamicClassLoader as the current or an ancestor of the context classloader, and if not refrain from doing anything. I'm not sure if there's been an honest attempt to get something like this upstream, but I think we can start by making our own version of clojure.main/repl. I checked and there aren't too many private functions in there so I think we can copy just this function and make the necessary changes. This won't make a big difference in itself, but it will make it much easier to inspect and debug anything related to the classloader.

Similarly when cloning a session we create a new DynamicClassloader and set it on the session original PR. A thread inherits the context classloader from the thread it was forked from, and generally by that time the CCL has been set (we run a no-op :eval very early on), so if it's already there we don't need to set a new one. I briefly ran this by @cgrand and he didn't see a reason not to share the CL between threads, since they are stateless.

Then there's misc/with-session-classloader (used in sideloader, when loading print functions, and when loading deferred middleware) which sets the contextclassloader, then afterwards tries to "reset" it, which unless you are using the sideloader is so far always a double no-op, except that the second call risks undoing other classloader changes. So there I think we can also be more conservative and only set/reset the contextClassLoader when the session classloader differs from the current contextclassloader.

Even better would perhaps be to not touch the contextclassloader, and only bind Compiler/LOADER (which it also does). Clojure will give precedence to Compiler/LOADER if it is set (see clojure.lang.RT/baseLoader) so this should be enough, but probably isn't, because of code out there that calls getContextClassLoader when what they really want is the baseLoader...

I have a local PoC branch that I'm using to try to dogfood this stuff while working on real world projects. Will keep folks posted and submit the changes I feel most confident about in incremental PRs.


Maybe of interest, a debug log of what currently happens when you start nREPL and connect from cider (no cider-nrepl). Every two lines is a call to setContextClassLoader.

[thread] file:line:col
  + instance -> parent

nREPL starts, we immediately call setContextClassLoader 4 times for no reason.
This is because we load the built-in middleware via the dynamic loader. These
are calls inside dynamic-loader that use with-session-classloader.

[main] nrepl.middleware.dynamic-loader:: (user/cl-app)
 No-op
[main] nrepl.middleware.dynamic-loader:: (user/cl-app)
 No-op
[main] nrepl.middleware.dynamic-loader:: (user/cl-app)
 No-op
[main] nrepl.middleware.dynamic-loader:: (user/cl-app)
 No-op

Now the server is ready, and the client connects. You can see it immediately
building up a chain of classlaoders.

[clojure-agent-send-off-pool-3] clojure.main:51:5 (user/cl-7a032ecb)
  +  clojure.lang.DynamicClassLoader@7a032ecb -> app 
[clojure-agent-send-off-pool-3] clojure.main:51:5 (user/cl-6cc5539d)
  +  clojure.lang.DynamicClassLoader@6cc5539d -> 7a032ecb

First clone request

[nREPL-session-f1ab66a0-7753-4caa-bbc8-e740a81b0a1b] nrepl.middleware.session:211:25 (user/cl-4c481042)
  +  clojure.lang.DynamicClassLoader@4c481042 -> 6cc5539d 
[clojure-agent-send-off-pool-3] clojure.main:51:5 (user/cl-326d65b6)
  +  clojure.lang.DynamicClassLoader@326d65b6 -> 6cc5539d 
[clojure-agent-send-off-pool-3] clojure.main:51:5 (user/cl-38bed28e)
  +  clojure.lang.DynamicClassLoader@38bed28e -> 326d65b6

Second clone request

[nREPL-session-8428b667-5ec3-4c27-9ac5-c745a03a376d] nrepl.middleware.session:211:25 (user/cl-16eeb3c4)
  +  clojure.lang.DynamicClassLoader@16eeb3c4 -> 38bed28e 
[nREPL-session-f1ab66a0-7753-4caa-bbc8-e740a81b0a1b] clojure.main:51:5 (user/cl-7c4e20bd)
  +  clojure.lang.DynamicClassLoader@7c4e20bd -> 4c481042 
[nREPL-session-8428b667-5ec3-4c27-9ac5-c745a03a376d] clojure.main:51:5 (user/cl-3b6cf511)
  +  clojure.lang.DynamicClassLoader@3b6cf511 -> 16eeb3c4 
[nREPL-session-8428b667-5ec3-4c27-9ac5-c745a03a376d] clojure.main:51:5 (user/cl-14b9e978)
  +  clojure.lang.DynamicClassLoader@14b9e978 -> 3b6cf511 

The user hasn't done anything yet and we have instantiated 9 classloaders and called setContextClassloader 13 times.

@bbatsov
Copy link
Member

bbatsov commented Sep 15, 2021

clojure.main/repl adds a new DynamicClassLoader to the stack on every invocation, this is a long known issue. It really should check first if there is already a DynamicClassLoader as the current or an ancestor of the context classloader, and if not refrain from doing anything. I'm not sure if there's been an honest attempt to get something like this upstream, but I think we can start by making our own version of clojure.main/repl. I checked and there aren't too many private functions in there so I think we can copy just this function and make the necessary changes. This won't make a big difference in itself, but it will make it much easier to inspect and debug anything related to the classloader.

Originally nREPL used its own REPL implementation to avoid problems like this one. I recall that after adopting clojure.main/repl Chas regretted this decision, but as clojure.main/repl more or less got the job done we never spent much time thinking about trying to improve it upstream or replacing it.

@bbatsov
Copy link
Member

bbatsov commented Sep 15, 2021

This was the comment from Chas I was thinking about - nrepl/nrepl#8 (comment)

@plexus
Copy link
Contributor Author

plexus commented Sep 16, 2021

Interesting observation today. futures are scheduled on the clojure.lang.Agent/soloExecutor, which is a fixed size threadpool. So calling future may create a new thread or reuse an existing thread. Threads inherit the ContextClassloader from the thread they are created from.

This means that when you call future, you don't know what he contextclassloader inside that future will be. It may be the one you currently have, it may be one that was the contextclassloader at some time in the past, or it may be one that was set during a previous execution of a future.

This also means that if any future/agent threads were created before clojure.main/repl was called, then these will have the application classloader set, not Clojure's DynamicClassLoader.

@vemv
Copy link
Member

vemv commented Aug 25, 2023

Hey @plexus , is this issue good to close by now?

@plexus
Copy link
Contributor Author

plexus commented Aug 28, 2023

Not sure what the current situation is, but I'm fine with closing this. I'm no longer actively following up on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high priority Tickets of particular importance
Projects
None yet
Development

No branches or pull requests

3 participants