Description
The product uses a regular expression with a worst-case computational complexity that is inefficient and possibly exponential.
Summary
A DATASOURCE WRITE user can hang Overlord worker threads indefinitely with a single sampler request, degrading or denying service on the cluster control plane.
Root cause
Line numbers pinned to druid-31.0.2@230605ec33db326c37154a03bcc4edfccc40203b.
processing/src/main/java/org/apache/druid/data/input/impl/RegexInputFormat.java:50-60:
public RegexInputFormat(
    @JsonProperty("pattern") String pattern,
    @JsonProperty("listDelimiter") @Nullable String listDelimiter,
    @JsonProperty("columns") @Nullable List<String> columns
)
{
  this.pattern = pattern;
  this.listDelimiter = listDelimiter;
  this.columns = columns;
  this.compiledPatternSupplier = Suppliers.memoize(() -> Pattern.compile(pattern));
}
RegexInputFormat compiles the user-supplied @JsonProperty pattern with no complexity or length limit and applies Matcher.matches() to each input line. The sampler runs in the Overlord JVM (CliOverlord.java:460). TimedShutoffInputSourceReader only checks its volatile closed flag at iterator boundaries (:89-101), so it cannot interrupt an in-progress Matcher.matches(). The attacker also controls timeoutMs and can set it to 0.
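One common way to make a java.util.regex match abortable is to hand the matcher a CharSequence wrapper that refuses further reads once a budget is spent. The sketch below is illustrative only and is not Druid code: BudgetedCharSequence, RegexBudgetExceededException, BoundedRegex and the maxCharReads parameter are hypothetical names. It assumes the matcher reads its input through charAt(), so an exception thrown there unwinds out of Matcher.matches().

```java
import java.util.regex.Pattern;

/** Hypothetical exception type, thrown when the read budget is spent. */
final class RegexBudgetExceededException extends RuntimeException {
    RegexBudgetExceededException(String message) {
        super(message);
    }
}

/** Hypothetical wrapper: a CharSequence view that allows at most a fixed number of charAt() calls. */
final class BudgetedCharSequence implements CharSequence {
    private final CharSequence inner;
    // Single-element array so that subSequence() views draw from the same budget.
    private final long[] remaining;

    BudgetedCharSequence(CharSequence inner, long maxCharReads) {
        this(inner, new long[] {maxCharReads});
    }

    private BudgetedCharSequence(CharSequence inner, long[] remaining) {
        this.inner = inner;
        this.remaining = remaining;
    }

    @Override
    public char charAt(int index) {
        // Every character the regex engine inspects costs one unit of budget.
        if (--remaining[0] < 0) {
            throw new RegexBudgetExceededException("regex read budget exhausted");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new BudgetedCharSequence(inner.subSequence(start, end), remaining);
    }

    @Override
    public String toString() {
        return inner.toString();
    }
}

/** Hypothetical helper showing how a caller might apply the wrapper. */
final class BoundedRegex {
    static boolean matches(Pattern pattern, String line, long maxCharReads) {
        // Throws RegexBudgetExceededException instead of backtracking without bound.
        return pattern.matcher(new BudgetedCharSequence(line, maxCharReads)).matches();
    }
}
```

A caller would catch RegexBudgetExceededException and treat the line as unparseable. A read budget is used here rather than a wall-clock deadline because it is deterministic; a deadline check inside charAt() would follow the same shape.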
Exploit scenario (static hypothesis — unverified):
A DATASOURCE WRITE user POSTs a sampler spec with InlineInputSource.data='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaX', RegexInputFormat.pattern='^(.*a){20}$', and timeoutMs=0. The Overlord thread enters catastrophic backtracking and never returns to the iterator boundary. A few concurrent requests exhaust the Jetty pool.
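The backtracking cost of this pattern family can be observed safely at reduced scale. The sketch below is a local illustration, not a reproduction against Druid: CountingCharSequence and BacktrackingGrowth are hypothetical names, and the repetition count is lowered from 20 to a small value so that every match attempt terminates quickly. It counts how many characters the regex engine reads while failing to match a run of 'a' characters followed by 'X'.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: counts charAt() calls made by the regex engine. */
final class CountingCharSequence implements CharSequence {
    private final CharSequence inner;
    long reads;

    CountingCharSequence(CharSequence inner) {
        this.inner = inner;
    }

    @Override
    public char charAt(int index) {
        reads++;
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        // Not used by Matcher.matches(); returned view is uncounted.
        return inner.subSequence(start, end);
    }

    @Override
    public String toString() {
        return inner.toString();
    }
}

final class BacktrackingGrowth {
    /** Character reads spent matching ^(.*a){reps}$ against n 'a' characters followed by 'X'. */
    static long charReads(int reps, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        sb.append('X');

        Pattern pattern = Pattern.compile("^(.*a){" + reps + "}$");
        CountingCharSequence input = new CountingCharSequence(sb.toString());
        // The input ends in 'X', so the match must fail after exploring the alternatives.
        pattern.matcher(input).matches();
        return input.reads;
    }
}
```

Comparing, for example, charReads(3, 20) with charReads(3, 40) shows the work growing much faster than the input length. Raising the repetition count toward 20 on an input of the size used in the scenario is what would make the match effectively non-terminating, so that configuration should not be run outside an isolated test.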