Fix missing application logs in Spring Boot services #553

Draft
dushyantk1509 wants to merge 2 commits into linkedin:main from dushyantk1509:dushyantk1509/fix-tables-service-logging

@dushyantk1509
Collaborator

Summary

Application-level logs from the tables, housetables, and jobs Spring Boot services are silently dropped in the Docker recipes (oh-only, oh-hadoop, oh-hadoop-spark). Only Spring Boot's own banner and a handful of early startup lines reach stdout; anything logged via LoggerFactory.getLogger(...) in application code (controllers, catalog, metrics aspect, audit, etc.) disappears. This PR restores those logs.

Root cause

Iceberg 1.5.x (com.linkedin.iceberg:iceberg-data:1.5.2.10) pulls slf4j-api up to 2.x on the runtime classpath. SLF4J 2.x discovers bindings via ServiceLoader (org.slf4j.spi.SLF4JServiceProvider) and ignores the SLF4J 1.x-style bindings that were already packaged:

  • log4j-slf4j-impl:2.13.3 — via spring-boot-starter-log4j2:2.3.4.RELEASE
  • slf4j-log4j12:1.7.25 — transitively via Hadoop

With no 2.x-compatible provider on the classpath, SLF4J falls back to the NOP logger. The only clue in docker logs is a handful of lines:

SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at [.../log4j-slf4j-impl-2.13.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Ignoring binding found at [.../slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

Spring Boot's own early startup lines still appear because they bypass SLF4J, which is why the issue isn't obvious at first glance.
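
One way to check whether a container has hit this fallback is to scan its log output for the warning lines quoted above. This is a minimal sketch; the helper name `slf4j_nop_detected` is hypothetical and not part of this PR.

```shell
# Hypothetical helper (not part of this PR): reads log text on stdin and
# exits 0 if it contains SLF4J's "no provider / NOP fallback" warnings.
slf4j_nop_detected() {
  grep -Eq 'SLF4J: (No SLF4J providers were found|Defaulting to no-operation \(NOP\) logger implementation)'
}

# Example use against the container named in "Testing Done" below.
# SLF4J prints these warnings to stderr, hence the 2>&1:
#   docker logs oh-only-openhouse-tables-1 2>&1 | slf4j_nop_detected \
#     && echo "SLF4J fell back to the NOP logger"
```

The check only looks for the two fallback lines, so it stays valid even if the list of ignored bindings changes.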

Fix

In the shared buildSrc/src/main/groovy/openhouse.springboot-conventions.gradle (applied by all three Spring Boot services and by iceberg/openhouse/internalcatalog / htscatalog):

  1. Exclude the two SLF4J 1.x-style bindings from every configuration:
    • log4j-slf4j-impl
    • slf4j-log4j12
  2. Add org.apache.logging.log4j:log4j-slf4j2-impl:2.25.4, which implements SLF4JServiceProvider and is discovered by SLF4J 2.x.

The version is pinned to 2.25.4 to match what log4j-core was already resolving to via the existing CVE-driven [2.17.1, 3[ constraint. This keeps the full log4j family at exactly 2.25.4 — no downgrade from the pre-fix classpath.
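
To confirm that the log4j family stays on a single version, the jar names of a resolved classpath can be reduced to their distinct versions. The sketch below assumes jar file names arrive one per line on stdin; the helper name `log4j_family_versions` is hypothetical.

```shell
# Hypothetical helper (not part of this PR): reads jar file names on stdin
# and prints the distinct versions of log4j-core, log4j-api and
# log4j-slf4j2-impl, one per line. A single output line means the family
# is aligned on one version.
log4j_family_versions() {
  sed -nE 's#^(.*/)?log4j-(core|api|slf4j2-impl)-([0-9][0-9.]*)\.jar$#\3#p' | sort -u
}
```

With the pin described above, the expected output for the packaged jars is the single line `2.25.4`; more than one line would indicate that some requester pulled part of the family to a different version.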

Packaged logging jars (bootJar diff)

Before:

slf4j-api-2.0.7.jar
log4j-slf4j-impl-2.13.3.jar      <- SLF4J 1.x binding, ignored by slf4j-api 2.x
slf4j-log4j12-1.7.25.jar         <- SLF4J 1.x binding, ignored by slf4j-api 2.x
log4j-core-2.25.4.jar
log4j-api-2.25.4.jar

After:

slf4j-api-2.0.17.jar
log4j-slf4j2-impl-2.25.4.jar     <- SLF4J 2.x binding, active
log4j-core-2.25.4.jar            <- unchanged
log4j-api-2.25.4.jar             <- unchanged
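
The same comparison can be scripted by filtering a jar listing for the two SLF4J 1.x-style bindings. This is a sketch only; the helper name `legacy_slf4j_bindings` is hypothetical, and it assumes one jar path per line on stdin.

```shell
# Hypothetical helper (not part of this PR): reads jar paths on stdin and
# prints any SLF4J 1.x-style bindings (log4j-slf4j-impl, slf4j-log4j12).
# Prints nothing and exits non-zero when none are present.
legacy_slf4j_bindings() {
  grep -E '(^|/)(log4j-slf4j-impl|slf4j-log4j12)-[0-9][^/]*\.jar$'
}

# Example use against a Spring Boot fat jar, whose dependencies live under
# BOOT-INF/lib/ (<bootJar> is a placeholder for the actual jar path):
#   unzip -Z1 <bootJar> | legacy_slf4j_bindings
```

The pattern deliberately does not match `log4j-slf4j2-impl`, so the "After" listing above yields no output.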

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Manually tested on a local Docker setup:
./gradlew dockerUp -Precipe=oh-only

TOKEN=$(cat build/common/resources/main/dummy.token)
curl -H "content-type: application/json" -H "authorization: Bearer $TOKEN" \
  -XPOST http://localhost:8000/v1/databases/d3/tables/ \
  --data-raw '{"tableId":"t1","databaseId":"d3","baseTableVersion":"INITIAL_VERSION","clusterId":"LocalFSCluster","schema":"{\"type\":\"struct\",\"fields\":[{\"id\":1,\"required\":true,\"name\":\"id\",\"type\":\"string\"}]}","tableProperties":{"key":"value"}}'

docker logs oh-only-openhouse-tables-1

Before: the request returns 201, but docker logs shows nothing from the request path: no OpenHouseInternalCatalog, no DefaultStorageSelector, no MetricsAspect timings, no audit or request-payload JSON. The same NOP warning block shown above also appears in the openhouse-housetables logs.

After: the same request produces the expected trace, e.g.:

c.l.o.i.c.OpenHouseInternalCatalog       : House table entry not found d3.t1
c.l.o.c.s.s.i.DefaultStorageSelector     : Selected storage type=local for d3.t1
c.l.o.t.r.i.MetricsAspect                : OpenHouseInternalRepositoryImpl.findById for table d3.t1 took 360 ms
.o.t.r.i.OpenHouseInternalRepositoryImpl : Creating a new user table: d3.t1 with schema: table { ... }
o.a.i.BaseMetastoreTableOperations       : Successfully committed to table d3.t1 in 109 ms
c.l.o.t.r.i.MetricsAspect                : OpenHouseInternalRepositoryImpl.save for table d3.t1 took 166 ms
{"eventTimestamp":"...","operationType":"CREATE","operationStatus":"SUCCESS",...}
{"startTimestamp":"...","method":"POST","uri":"/v1/databases/d3/tables/","statusCode":201,...}

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

Dushyant Kumar and others added 2 commits April 18, 2026 15:39
Iceberg 1.5.x pulls slf4j-api 2.x onto the runtime classpath, but the
log4j2 starter still ships the SLF4J 1.x-style log4j-slf4j-impl and
Hadoop transitively brings slf4j-log4j12. SLF4J 2.x discovers bindings
via ServiceLoader and ignores 1.x bindings, so it silently defaults to
NOP - every LoggerFactory.getLogger(...) call in application code was
dropped in the docker setup (Spring Boot's own startup lines still
appeared because they bypass SLF4J).

Exclude the two 1.x bindings in the shared springboot convention and
add log4j-slf4j2-impl, which implements SLF4JServiceProvider. Verified
end-to-end with './gradlew dockerUp -Precipe=oh-only': creating a
table now surfaces OpenHouseInternalCatalog, DefaultStorageSelector,
MetricsAspect, and request audit logs that were previously silent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous pin at 2.20.0 made log4j-slf4j2-impl the highest explicit
requester for log4j-core, pulling the whole log4j family down from
2.25.4 (the version that was shipping pre-fix, selected via the
[2.17.1, 3[ CVE constraint) to 2.20.0. Functionally safe - all critical
Log4Shell-class CVEs are fixed in 2.17.1 - but it's a needless
5-minor downgrade.

Bumping the pin to 2.25.4 keeps log4j-core at 2.25.4, identical to the
pre-fix classpath state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@dushyantk1509 dushyantk1509 changed the title from "Dushyantk1509/fix tables service logging" to "Fix missing application logs in Spring Boot services" Apr 18, 2026
@cbb330
Collaborator

cbb330 commented Apr 18, 2026

Thanks for looking at this.

#511 overlaps with this PR and extends it a bit further. It addresses the same root cause (SLF4J 2.x ignoring the 1.x-style bindings → NOP fallback) and applies the same core fix (exclude log4j-slf4j-impl / slf4j-log4j12, add log4j-slf4j2-impl).

On top of that, #511 also:

  • Adds log4j-1.2-api to bridge Hadoop's direct Log4j 1.x API callers through Log4j2 (needed for Actuator to control Hadoop loggers, which gives us the insight needed to tune our HDFS client for latency)
  • Excludes the EOL log4j:log4j artifact (affected by CVE-2019-17571)
  • Scopes the logging unification to modules with the Spring Boot plugin, and keeps test fixtures logging-neutral
  • Adds a script to toggle log levels via /actuator/loggers at runtime, without needing restarts

Mind taking a look at #511? If it looks good to you, we can merge that one and close this in its favor.
