Skip to content

Conversation

@suvodeep-pyne
Copy link
Contributor

@suvodeep-pyne suvodeep-pyne commented Nov 19, 2025

Introduces ThrottledLogger utility to provide rate-limited exception logging for record transformers, solving the problem where high-frequency errors pollute logs and starve low-frequency critical errors.

Key Changes

  • Use throttled logging by default to log 5 exceptions per exception class per min
  • Remove record logging in transformers in case of exceptions. This prevents both performance and data sharing related issues

Core Implementation

  • New ThrottledLogger utility

    • Implements per-exception-class rate limiting using custom Token Bucket algorithm
    • Each exception type gets independent rate limiter (So we still at least see which exception with min perf overhead)
    • Non-thread-safe by design (needs to be fast)
    • Falls back to DEBUG logging when rate=0 (backward compatible behavior)
  • New RateLimiter class

    • Token Bucket implementation
    • No additional dependency
    • Designed for fast single-threaded access
  • Updated 6 transformers to use throttled logger:

    • FilterTransformer
    • DataTypeTransformer
    • ExpressionTransformer
    • ComplexTypeTransformer
    • SchemaConformingTransformer
    • TimeValidationTransformer
  • New configuration: IngestionConfig.ingestionExceptionLogRateLimitPerMin (default: 5)

Testing

  • Added tests for RateLimiter and ThrottledLogger to cover important cases.

Benefits

  1. Solves Noisy Neighbor Problem: High-frequency NumberFormatException won't starve low-frequency ConnectException
  2. Production Visibility: When configured, exceptions log at WARN/ERROR level instead of DEBUG
  3. Controlled Log Volume: Prevents log flooding during data quality issues
  4. Zero Dependencies: Uses custom Token Bucket implementation, no Guava required
  5. Efficient: Package-private implementation, minimal allocation overhead

Example Output

WARN  [FilterTransformer] Caught exception while executing filter function...
java.lang.NumberFormatException: For input string: "abc"

[... 4 more similar logs within 1 minute ...]

[After rate limit window passes]
WARN  [FilterTransformer] Dropped 9995 occurrences of NumberFormatException
WARN  [FilterTransformer] Caught exception while executing filter function...

Meanwhile, different exception types (e.g., ConnectException) log immediately using independent rate limiters.

Implementation Details

  • API simplified: Transformers pass IngestionConfig directly to logger
  • Rate conversion logic (per-min → tokens/nanosecond) encapsulated in RateLimiter
  • Memory-bounded: Exception class count typically 10-50
  • Token bucket algorithm provides smooth rate limiting with burst handling
  • @VisibleForTesting constructor allows controlled time in tests

- Simplified API: PinotThrottledLogger now accepts IngestionConfig and
  table name directly, eliminating duplicated rate conversion logic across
  transformers
- Backward compatible: Changed default ingestionExceptionLogRateLimitPerMin
  from 5 to 0; rate=0 falls back to DEBUG logging (original behavior)
- Added ServerMeter emission: LOGS_DROPPED_BY_THROTTLED_LOGGER metric tracks
  suppressed logs per table when table name is provided
- Updated all transformers: FilterTransformer, DataTypeTransformer,
  ComplexTypeTransformer, ExpressionTransformer, SchemaConformingTransformer,
  TimeValidationTransformer now use simplified constructor
- Test updates: PinotThrottledLoggerTest updated to verify DEBUG fallback
  behavior when rate=0
…dLogger

Key changes:
- Log throttled exceptions at DEBUG level to ensure no exception info is lost
- Simplify from ConcurrentHashMap/AtomicLong to HashMap/long for single-threaded usage
- Combine two maps into one ExceptionState wrapper for single lookup
- Update javadoc to clarify three-level logging behavior
- Remove thread-safety test (single-threaded per TransformPipeline instance)

Logging behavior after this change:
- Rate ≤ 0: All logs at DEBUG (backward compatible)
- Rate > 0, within quota: WARN/ERROR with suppression counts
- Rate > 0, quota exceeded: DEBUG while tracking suppression counts
@codecov-commenter
Copy link

codecov-commenter commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 88.31169% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.19%. Comparing base (3224ce6) to head (13181e8).
⚠️ Report is 49 commits behind head on master.

Files with missing lines Patch % Lines
...org/apache/pinot/common/utils/ThrottledLogger.java 86.20% 3 Missing and 1 partial ⚠️
...ocal/recordtransformer/ComplexTypeTransformer.java 50.00% 2 Missing ⚠️
...recordtransformer/SchemaConformingTransformer.java 50.00% 1 Missing ⚠️
...l/recordtransformer/TimeValidationTransformer.java 66.66% 1 Missing ⚠️
...va/org/apache/pinot/spi/utils/CommonConstants.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17239      +/-   ##
============================================
+ Coverage     63.17%   63.19%   +0.01%     
- Complexity     1428     1432       +4     
============================================
  Files          3121     3133      +12     
  Lines        184814   185904    +1090     
  Branches      28332    28401      +69     
============================================
+ Hits         116760   117473     +713     
- Misses        59033    59366     +333     
- Partials       9021     9065      +44     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.16% <88.31%> (+0.01%) ⬆️
java-21 63.15% <88.31%> (+<0.01%) ⬆️
temurin 63.19% <88.31%> (+0.01%) ⬆️
unittests 63.18% <88.31%> (+0.01%) ⬆️
unittests1 55.59% <77.92%> (-0.36%) ⬇️
unittests2 33.87% <68.83%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good. Please update the PR description about the concurrency part

}

private void logWithRateLimit(String msg, Throwable t, BiConsumer<String, Throwable> consumer) {
if (_permitsPerSecond <= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider modeling this as a separate DebugOnlyLogger. We can also have a UnthrottledLogger if necessary

…ensive unit tests. Simplify `ThrottledLogger` by removing inlined RateLimiter and update related tests.
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

}
LOGGER.debug("Caught exception while transforming complex types for record: {}", record.toString(), e);
_throttledLogger.warn(
String.format("Caught exception while transforming complex types for record: %s", record.toString()), e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using String.format as it is quite expensive. Concatanation should be cheaper.

Given LOGGER.debug() doesn't do anything when debug level log is not enabled, and we are already paying overhead of constructing the string, we should consider either:

  • Expose acquire() from _throttledLogger
  • Pass arguments the same way as what LOGGER expects

Not introduced in this PR, but we shouldn't call record.toString(). Instead, we should directly pass record

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed record logging itself. that address both performance and data risks.

@suvodeep-pyne suvodeep-pyne changed the title Add rate-limited exception logging for transformers [ingestion] Add rate-limited exception logging for transformers Nov 25, 2025
@Jackie-Jiang Jackie-Jiang merged commit d78b39b into apache:master Nov 25, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants