Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move ThreadContextMap implementations to log4j-core #2593

Closed
wants to merge 4 commits into from

Conversation

ppkarwasz
Copy link
Contributor

We move ThreadContextMap alternative implementations to log4j-core. These thread context maps are not useful to other log4j-api implementations:

  • log4j-to-slf4j delegates everything to SLF4J's MDCAdapter,
  • log4j-to-jul uses the no-op context map.

We move `ThreadContextMap` alternative implementations to `log4j-core`.
These thread context maps are not useful to other `log4j-api`
implementations:

* `log4j-to-slf4j` delegates everything to SLF4J's `MDCAdapter`,
* `log4j-to-jul` uses the no-op context map.
@ppkarwasz ppkarwasz force-pushed the feature/2.x/cleanup-thread-context2 branch from 80cce50 to 9c512f3 Compare May 20, 2024 10:44
@ppkarwasz ppkarwasz force-pushed the feature/2.x/cleanup-thread-context2 branch from 9c512f3 to 2b87b8d Compare May 22, 2024 05:59
@ppkarwasz ppkarwasz marked this pull request as ready for review May 22, 2024 10:19
@rgoers
Copy link
Member

rgoers commented May 22, 2024

Question: With the changes I recently made are these alternative implementations really of value to anything? The primary motivation for them seemed to be to reduce the number of garbage objects being created or the amount of copying being done. But now the algorithm used no longer needs to copy the maps, which makes me wonder if they are even needed.

@jengebr
Copy link
Contributor

jengebr commented May 22, 2024

Question: With the changes I recently made are these alternative implementations really of value to anything? The primary motivation for them seemed to be to reduce the number of garbage objects being created or the amount of copying being done. But now the algorithm used no longer needs to copy the maps, which makes me wonder if they are even needed.

@rgoers what changes are you referring to? I don't see anything that eliminates the copies.

@rgoers
Copy link
Member

rgoers commented May 22, 2024

@jengebr ThreadContextDataInjector was doing all the copying into the LogEvent by making a copy of the current ThreadContextMap and then passing that copy so the fields could be included in the LogEvent's Map. That class is now deprecated and removed in 3.0.0. It was replaced with the ContextData utility class where the addAll() method now directly populates the ThreadContext data of the LogEvent from the appropriate context objects. This is actually a side effect as the change was made to allow all context data from ContextDataProviders to be included in the LogEvent's map as well as in Lookups and Filters.

@jengebr
Copy link
Contributor

jengebr commented May 22, 2024

@rgoers thank you for explaining. Unfortunately I support several applications that rely on CloseableThreadContext to track short-lived context, and those operations trigger copies. CloseableThreadContext$Instance.putAll() is a good example: it calls ThreadContext.putAll(), which in turn delegates the work to the various implementations and performs copies.

@rgoers
Copy link
Member

rgoers commented May 22, 2024

@jengebr I would suggest you look at ScopedContext that I recently added to Log4j-API. It is a much better way to manage a short-lived context.

@jengebr
Copy link
Contributor

jengebr commented May 22, 2024

I agree that it looks good, but the current api meets our needs, especially once perf improves via StringArrayThreadContextMap. There's no upside to modifying our existing applications to use it.

New applications are of course a different beast.

@rgoers
Copy link
Member

rgoers commented May 22, 2024

@jengebr Thanks. I personally would love your feedback on ScopedContext and if there is anything it might be missing that would make it easier to support your use case. I would generally agree that modifying existing applications that are already working should be avoided. If you can provide feedback on that I would suggest not adding it to this PR as it isn't really related.

Also, I am curious as to your perspective on qos-ch/slf4j#372 and https://www.slf4j.org/apidocs-2.1.0-alpha0/org.slf4j/org/slf4j/MDCAmbit.html. Would you prefer that over ScopedContext or vice versa?

@ppkarwasz
Copy link
Contributor Author

Question: With the changes I recently made are these alternative implementations really of value to anything? The primary motivation for them seemed to be to reduce the number of garbage objects being created or the amount of copying being done. But now the algorithm used no longer needs to copy the maps, which makes me wonder if they are even needed.

You are right, maybe we can drop some implementations, but others must stay:

  • John's StringArrayThreadContextMap outperforms DefaultThreadContextMap and is web-application safe, even if the user forgets to clear the map at the end of processing! We could however remove DefaultThreadContextMap and rename StringArrayThreadContextMap to DefaultThreadContextMap,
  • GarbageFreeSortedArrayThreadContextMap should stay, since it is the only one that provides a provably garbage-free experience, when accessed through ThreadContext. Both ScopedContext and CloseableThreadContext generate temporary objects. We could rewrite it to use a mutable version of John's UnmodifiableArrayBackedMap though,
  • CopyOnWriteSortedArrayThreadContextMap can probably be removed, since Java 8 made StringMap-based implementations obsolete.

In general I think that people will be using ThreadContext for a very long time, so we might as well improve its implementations. Sure the ThreadContext API is very error prone compared to ScopedContext, but so is the usage of locks vs synchronized blocks or ThreadLocal vs the new ScopedValue. Experienced programmers have been using those correctly for years.

Anyway, the main point of this PR is to make log4j-api lighter:

  • the API started at 115 KiB (which already twice the size of the current slf4j-api),
  • by the time 2.7 was published, it reached 213 KiB,
  • release 2.17.1 has a size of 294 KiB,
  • release 2.23.1: 334 KiB,
  • now we are oscillating around 372 KiB.

Since only log4j-core uses our ThreadContextMap implementations (log4j-to-slf4j uses whatever Logback provides, log4j-to-jul uses No-op), we can move the implementations to log4j-core, where we can publish a major version every couple of years. In Log4j API on the other hand we can not even remove the interfaces that we introduced so that Java 7 users can profit from Java 8 improvements. Those interfaces will be there long after Java 7 ceases to be used.

@rgoers
Copy link
Member

rgoers commented May 23, 2024

@ppkarwasz In general I agree with this proposal. If StringArrayThreadContextMap performs the best the I would be in favor of making it the default implementation.
I am not sure how you are using the term "garbage free" here. Our documentation talks about "garbage free logging" and that is what the unit tests check for - no garbage being created while logging. Garbage created outside of logging is considered to be OK. By that definition ScopedContext is always garbage free and ThreadContextMap probably is now too, since we no longer copy maps when populating LogEvents. Note that with ScopedContext you should be able to even limit the object creation there by keeping the reference to an Instance.

Yes, I understand ThreadContextMap isn't going away. It can't as long as SLF4J is supported. But I really dislike the efforts to "fix" it. I really don't want to support "MDCArmpit" (misspelling is intentional) - https://www.slf4j.org/apidocs-2.1.0/org/slf4j/MDCAmbit.html - as it doesn't really "fix" anything and is just lipstick on a pig.

As far as making log4j-api lighter goes.... Using the size of the jar file is meaningless. It could be 1TB but if you are only using 1 class from it that is all that is going to be loaded. I don't believe it is even going to read the whole file from disk. It will just read the directory and directly load the class files it needs as they are referenced. What is far more impactful is the number and size of the classes being loaded. By far the worst culprit is Logger. But we made the conscious decision to explode that interface to improve performance by having so many overloaded methods. Also for comparison:
commons-lang: 277K
commons-text: 241K
guava: 2,982K

Toi be honest, this is the first time I have ever actually looked at the size of these jars. I simply don't care since I know there is very little value in it.

My point was simply that I am not sure we even need other implementations other than the default.

I do agree that the API should contain 1 functional implementation that meets the majority of use cases.

@jengebr
Copy link
Contributor

jengebr commented May 23, 2024

Responding to a few details:

As far as making log4j-api lighter goes.... Using the size of the jar file is meaningless. It could be 1TB but if you are only using 1 class from it that is all that is going to be loaded.

Spring and Tomcat both scan the entire classpath to look for annotated classes, and more classes translate into a slower scan. However, moving classes from one library to another is neutral in this regard.

Yes, I understand ThreadContextMap isn't going away. It can't as long as SLF4J is supported. But I really dislike the efforts to "fix" it.

Improving library implementations with no application change is a clear win, since it's effectively a free improvement to every application. I'm not sure if that's what you mean by "fix"?

@rgoers
Copy link
Member

rgoers commented May 23, 2024

No - SLF4J's "fix" introduces a new class, MDCAmbit, that you would use instead of MDC. As far as I can tell all it really does is add a close method. In contrast, ScopedContext doesn't need a close method since everything is automatically cleaned up when you exit the block. As Piotr pointed out it is similar to the difference between syncrhonized and lock.

@ppkarwasz
Copy link
Contributor Author

@ppkarwasz In general I agree with this proposal. If StringArrayThreadContextMap performs the best the I would be in favor of making it the default implementation. I am not sure how you are using the term "garbage free" here. Our documentation talks about "garbage free logging" and that is what the unit tests check for - no garbage being created while logging. Garbage created outside of logging is considered to be OK.

The GarbageFreeSortedArrayStringMap satisfies a stronger definition of garbage-free. Code like this one:

try {
    ThreadContext.put("key1", "value1");
    ThreadContext.put("key2", "value2");
    logger.info("Hello {}!", "ThreadContext");
} finally {
    ThreadContext.remove("key1");
    ThreadContext.remove("key2");
}

generate no garbage, whereas:

ScopedContext.where("key1", "value1")
        .where("key2", "value2")
        .run(() -> logger.info("Hello {}!", "ScopedContext"));

creates several temporary objects.

@rgoers
Copy link
Member

rgoers commented May 24, 2024

Yes, I assumed that is what you meant. However, nowhere in https://logging.apache.org/log4j/2.x/manual/garbagefree.html or https://logging.apache.org/log4j/2.x/manual/thread-context.html do we make such a guarantee. We do say ThreadContextMap is not garbage free by default can be be made garbage free by setting a property, but notably this appears under the section "Disabling Garbage Free Logging". I take that to mean that, like everything else in this section of the manual, it is only guaranteeing garbage free during logging. If we really want to make such a guarantee we should explicitly state it. Personally, I don't think we should.

}
}
public static void putAll(final Map<String, String> map) {
contextMap.putAll(map);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please maintain separate if/else branches for the most common instance types, as in the previous version. This avoids virtual method lookups for those types.

contextMap.remove(key);
}
}
contextMap.removeAll(keys);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please maintain separate if/else branches for the most common instance types, as in the previous version. This avoids virtual method lookups for those types.

@ppkarwasz ppkarwasz marked this pull request as draft June 18, 2024 11:36
@ppkarwasz
Copy link
Contributor Author

I am converting this to a draft, since I'll use a slightly different approach:

  • I'll move StringArrayThreadContextMap to be the DefaultThreadContextMap,
  • I'll remove CopyOnWriteSortedArrayThreadContextMap,
  • I'll move GarbageFreeSortedArrayThreadContextMap to log4j-core.

@rgoers
Copy link
Member

rgoers commented Jun 18, 2024

@ppkarwasz Have you reconciled the problems I was seeing with StringArrayThreadContextMap in the perf tests?

@ppkarwasz
Copy link
Contributor Author

@rgoers,

No, I didn't have time to look at it yet.

@ppkarwasz
Copy link
Contributor Author

Closing, since #2690 deprecates it.

@ppkarwasz ppkarwasz closed this Jun 24, 2024
@ppkarwasz ppkarwasz deleted the feature/2.x/cleanup-thread-context2 branch June 24, 2024 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants