Policy2 by cecemei · Pull Request #17865 · apache/druid

cecemei · 2025-04-02T19:35:33Z

Fixes #XXXX.

Description

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...

Release note

Key changed/added classes in this PR

MyFoo
OurBar
TheirBaz

This PR has:

Changes
---------
- Bind `SegmentMetadataCache` only once to `HeapMemorySegmentMetadataCache` in `SQLMetadataStorageDruidModule`
- Invoke start and stop of the cache from `DruidOverlord` rather than on lifecycle start/stop
- Do not override the binding in `CliOverlord`

…pec was unmodified (apache#17707)

Add an optional query parameter called skipRestartIfUnmodified to the /druid/indexer/v1/supervisor endpoint. Callers can set skipRestartIfUnmodified=true to not restart the supervisor if the spec is unchanged.

Example:

curl -X POST --header "Content-Type: application/json" -d @supervisor.json localhost:8888/druid/indexer/v1/supervisor?skipRestartIfUnmodified=true
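The curl call above could also be issued from Java. The following is a minimal sketch that only builds the request and does not send it; the class and method names are hypothetical, the host and port are copied from the curl example, and the `{}` body stands in for the contents of supervisor.json.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds (but does not send) the POST request from the curl example.
class SupervisorSubmitSketch {
    // Appends skipRestartIfUnmodified=true only when the caller asks for it.
    static URI supervisorUri(String hostAndPort, boolean skipRestartIfUnmodified) {
        String base = "http://" + hostAndPort + "/druid/indexer/v1/supervisor";
        return URI.create(skipRestartIfUnmodified ? base + "?skipRestartIfUnmodified=true" : base);
    }

    // specJson is the supervisor spec; the caller is assumed to have read it from a file.
    static HttpRequest buildRequest(URI uri, String specJson) {
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(specJson))
            .build();
    }

    public static void main(String[] args) {
        // "{}" is a placeholder spec for illustration, not a valid supervisor spec.
        HttpRequest request = buildRequest(supervisorUri("localhost:8888", true), "{}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would additionally need a `java.net.http.HttpClient`; that step is left out so the sketch runs without a Druid cluster.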

Changes
---------
- Emit time lag from Kafka similar to Kinesis as metrics `ingest/kafka/lag/time`, `ingest/kafka/maxLag/time`, `ingest/kafka/avgLag/time`
- Add new method in `KafkaSupervisor` to fetch timestamps of latest records in stream to compute time lag
- Add new field `emitTimeLagMetrics` in `KafkaSupervisorIOConfig` to toggle emission of new metrics

Changes
---------
- Usages of skife config had been deprecated in apache#14695 and `LegacyBrokerParallelMergeConfig` is the last config class that still uses it.
- Remove `org.skife.config` from pom, licenses, log4j2.xml, etc.
- Add validation for deleted property paths in `StartupInjectorBuilder.PropertiesValidator`
- Use the replacement flattened configs (which remove the `.task` and `.pool` substring)

Changes
---------
- Add field `taskLimits` to the following worker select strategies: `equalDistribution`, `equalDistributionWithCategorySpec`, `fillCapacityWithCategorySpec`, `fillCapacity`
- Add sub-fields `maxSlotCountByType` and `maxSlotRatioByType` to `taskLimits`
- Apply these limits per worker when assigning new tasks

---------

Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
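To make the per-worker limit concrete, here is a rough sketch of how a count limit and a ratio limit might be checked before assigning a task. It is not the Druid implementation: the class and method names are hypothetical, each task is assumed to occupy one slot, and taking the stricter of the two limits when both are set is an assumption, since the commit message does not say how they combine.

```java
import java.util.Map;

// Sketch only: hypothetical per-worker check for maxSlotCountByType / maxSlotRatioByType.
class TaskLimitsSketch {
    private final Map<String, Integer> maxSlotCountByType;
    private final Map<String, Double> maxSlotRatioByType;

    TaskLimitsSketch(Map<String, Integer> maxSlotCountByType, Map<String, Double> maxSlotRatioByType) {
        this.maxSlotCountByType = maxSlotCountByType;
        this.maxSlotRatioByType = maxSlotRatioByType;
    }

    // Slots on one worker that tasks of this type may use (assumption: stricter limit wins).
    int allowedSlots(String taskType, int workerCapacity) {
        int allowed = workerCapacity;
        Integer count = maxSlotCountByType.get(taskType);
        if (count != null) {
            allowed = Math.min(allowed, count);
        }
        Double ratio = maxSlotRatioByType.get(taskType);
        if (ratio != null) {
            allowed = Math.min(allowed, (int) Math.floor(ratio * workerCapacity));
        }
        return allowed;
    }

    // True if the worker can take one more task of this type without exceeding its limit.
    boolean canAssign(String taskType, int workerCapacity, int runningOfType) {
        return runningOfType < allowedSlots(taskType, workerCapacity);
    }

    public static void main(String[] args) {
        // Task type names and limit values are illustrative only.
        TaskLimitsSketch limits = new TaskLimitsSketch(Map.of("index_parallel", 3), Map.of("kill", 0.25));
        System.out.println(limits.allowedSlots("index_parallel", 10));
        System.out.println(limits.canAssign("kill", 10, 2));
    }
}
```

A task type without an entry in either map is limited only by the worker's capacity in this sketch.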

…7790)

Currently, query stack traces are logged only when "debug: true" is set in the query context. This patch additionally logs stack traces targeted at the DEVELOPER or OPERATOR personas, because for these personas, stack traces are useful more often than not.

We continue to omit stack traces by default for USER and ADMIN, because these personas are meant to interact with the API, not with code or logs. Skipping stack traces minimizes clutter in the logs.
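The rule described above can be written as a small predicate. This is a sketch under stated assumptions, not the code from the patch: the enum and method names are hypothetical and only mirror the four personas named in the commit message.

```java
// Sketch only: decides whether a query failure's stack trace should be logged.
class StackTraceLoggingSketch {
    // Mirrors the personas named in the commit message; not Druid's own type.
    enum Persona { USER, ADMIN, OPERATOR, DEVELOPER }

    static boolean shouldLogStackTrace(Persona targetPersona, boolean debugInQueryContext) {
        // "debug: true" in the query context always enables stack traces.
        if (debugInQueryContext) {
            return true;
        }
        // Otherwise log only for personas that work with code or logs.
        return targetPersona == Persona.DEVELOPER || targetPersona == Persona.OPERATOR;
    }

    public static void main(String[] args) {
        for (Persona persona : Persona.values()) {
            System.out.println(persona + " -> " + shouldLogStackTrace(persona, false));
        }
    }
}
```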

Enables Calcite*Test-s and quidem tests to run queries with Dart. This needed some minor tweaks:
- changed to use interfaces in some places
- renamed DartWorkerClient to DartWorkerClientImpl and made DartWorkerClient an interface
- reused existing parts of the MSQ test system to run the query

…I response (apache#17840)

Adds support for an optional filename query parameter to the /druid/v2/sql/statements/{queryId}/results API. When provided, the response will include a header Content-Disposition: attachment; filename="{filename}", which will instruct a web browser to save the response as a file rather than displaying it inline.

This save-as-attachment behavior could be achieved by adding a "download" attribute to the results link, but this only works for same-origin URLs (as in the Web Console). If the UI origin is different from the Druid API origin, browsers will ignore the attribute and serve the results inline, which is poor UX for files that are potentially very large.

For the sake of consistency, all successful responses in SqlStatementResource.doGetResults may include this header, even if there are no results.

Release note

Improved: The "Get query results" statements API supports an optional filename query parameter. When provided, the response will instruct web browsers to save the results as a file instead of showing them inline (via the Content-Disposition header).
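As an illustration of the header described above, the sketch below turns an optional filename parameter into a Content-Disposition value. The class and method names are hypothetical, and the character check is an assumption made for the sketch; the commit message does not describe how the real endpoint validates the filename.

```java
import java.util.Optional;

// Sketch only: maps an optional filename query parameter to a Content-Disposition value.
class ContentDispositionSketch {
    static Optional<String> headerValue(String filename) {
        // No parameter means no header, so the browser may display the results inline.
        if (filename == null || filename.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: reject quotes, backslashes and control characters so the
        // quoted-string stays well-formed.
        for (char c : filename.toCharArray()) {
            if (c == '"' || c == '\\' || c < 0x20 || c == 0x7f) {
                throw new IllegalArgumentException("Unsupported character in filename");
            }
        }
        return Optional.of("attachment; filename=\"" + filename + "\"");
    }

    public static void main(String[] args) {
        // "results.csv" is an illustrative filename.
        System.out.println(headerValue("results.csv").orElse("(no header)"));
    }
}
```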

…est (apache#17841)

Changes:
- Fix flakiness in SegmentBootstrapperTest
- Make TestSegmentCacheManager thread safe by moving from ArrayList to CopyOnWriteArrayList
- Modify assertions to disregard list ordering since order of list modifications is not always deterministic
- Fix flaky KinesisIndexTask tests.
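The two test fixes listed above follow a common pattern, sketched here with hypothetical names (this is not the actual TestSegmentCacheManager). A test double records calls from several threads in a `CopyOnWriteArrayList`, and the assertion compares sorted copies so that the order of arrival does not matter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch only: a thread-safe recorder plus an order-insensitive comparison.
class ThreadSafeRecorderSketch {
    // CopyOnWriteArrayList tolerates concurrent add() calls; a plain ArrayList does not.
    private final List<String> observed = new CopyOnWriteArrayList<>();

    void record(String segmentId) {
        observed.add(segmentId);
    }

    List<String> observed() {
        return new ArrayList<>(observed);
    }

    // Records each id from its own thread, so the resulting order is not deterministic.
    static List<String> recordConcurrently(List<String> segmentIds) {
        ThreadSafeRecorderSketch recorder = new ThreadSafeRecorderSketch();
        List<Thread> threads = new ArrayList<>();
        for (String id : segmentIds) {
            Thread thread = new Thread(() -> recorder.record(id));
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return recorder.observed();
    }

    // Compares contents (including duplicates) while ignoring order.
    static boolean sameElementsIgnoringOrder(List<String> expected, List<String> actual) {
        List<String> a = new ArrayList<>(expected);
        List<String> b = new ArrayList<>(actual);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Segment ids are illustrative.
        List<String> ids = List.of("seg1", "seg2", "seg3");
        System.out.println(sameElementsIgnoringOrder(ids, recordConcurrently(ids)));
    }
}
```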

…ixing bugs (apache#17844)

* better debounce
* better cumpose filter
* hook up preview filters
* better stack handling
* fix some props
* refactor stack to facet
* fix hover part 1
* line hover part 2
* start adding moduleWhere
* info popover
* add filter icon
* toggle button
* module filter bar

cecemei and others added 30 commits March 3, 2025 08:16

Some policy config

8640a92

add checks for SegmentMetadataQuery

b086f18

Add thread.sleep for flaky.

ff21a33

auth config

ad864d6

format, and remove temp folder rules

3b9fabc

added NoopPolicyEnforcer and RestrictAllTablesPolicyEnforcer class

cf18a05

Support pushing and streaming task payload for HDFS (apache#17742)

7e9aef5

Implement pushTaskPayload/streamTaskPayload as introduced in apache#14887 for HDFS storage to allow larger mm-less ingestion payloads when using HDFS as the deep storage location.

Remove usages of deprecated API Files.write() (apache#17761)

65755d2

* Add deprecated com.google.common.io.Files#write to forbiddenApis
* Replace deprecated Files.write()

Doc: Fix description typo for sqlserver metadata store (apache#17771)

4da2b33

Mistakenly categorized under deep storage instead of metadata store.

Docs: Remove semicolon from example (apache#17759)

dc2780a

Restrict segment metadata kill query till maxInterval from last kill …

b1fcf95

…task time (apache#17770)

Changes
---------
- Use `maxIntervalToKill` to determine search interval for killing unused segments.
- If no segment has been killed for the datasource yet, use durationToRetain

Reduce noisy coordinator logs (apache#17779)

0420632

fix processed row formatting (apache#17756)

33d49a5

Web console: add suggestions for table status filtering. (apache#17765)

ae84dda

* suggest filter values when known
* update snapshots
* add more d
* fix load rule clamp
* better segment timeline init

remove NullValueHandlingConfig, NullHandlingModule, NullHandling (apa…

06ad7b1

…che#17778)

Docs: Add SQL query example (apache#17593)

2bb9d3d

* Docs: Add query example
* Update after review
* Update query
* Update docs/api-reference/sql-api.md

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

More logging cleanup on Overlord (apache#17780)

45424ba

Remove maven.twttr repo from pom (apache#17797)

c4a4f4d

remove usage of dependency:go-offline from build scripts - as it tries to download excluded artifacts

---------

Co-authored-by: Zoltan Haindrich <kirk@rxd.hu>

fix bug (apache#17791)

980c492

Set useMaxMemoryEstimates=false for MSQ tasks (apache#17792)

e417008

Web console: fix go to task selecting correct task type (apache#17788)

5096d5b

* fix go to task selecting correct task type
* support autocompact also
* support scheduled_batch, refactor
* one more state and update tests

Fix single container config creates failing peon tasks (apache#17794)

f6fbfce

* Fix single container config creates failing peon tasks
* More obvious array error output

Update k8s-jobs.md reference (apache#17805)

f4440b5

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

adarshsanjeev and others added 14 commits March 31, 2025 00:50

Web console: download follow up (apache#17845)

9910777

* set filename
* update download button
* added markdown support
* add test
* better download
* fix TSV
* better download behaviour and tests
* always show download all button

update TestSegmentCacheManager

2bca650

Merge commit '3e62978a96' into policy

3682332

revert some style changes

159f50e

validate datasource in CachingClusteredClient as well

0ad463a

fix build failure and update style

06bfcce

SegmentMetadataQuery should allow NoRestrictionPolicy

af913e8

changes

e610403

Merge branch 'master' into policy

106cfed

add inlineds test

fbccd53

Merge branch 'segmentquery' into policy2

6abddd1

github-actions bot added the Area - Dependencies label Apr 2, 2025

cecemei added 7 commits April 2, 2025 18:00

add sanity check on segment

ef89d30

update exception message

e4baa4e

Merge branch 'master' into policy2

366af99

Merge branch 'policy' into policy2

8bdac85

inject policy enforcer

b2a9ef4

Merge branch 'policy' into policy2

aa58334

Merge branch 'segmentquery' into policy2

2d888af

github-actions bot added the Area - Batch Ingestion, Area - Querying, Area - Segment Format and Ser/De, and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Apr 3, 2025

cecemei added 3 commits April 3, 2025 15:32

Merge branch 'master' into policy

92417f3

add PolicyEnforcer binding in MSQTestBase

9063bd0

Merge branch 'policy' into policy2

6c2d6cd

cecemei closed this Apr 7, 2025


Policy2 #17865
cecemei wants to merge 82 commits into apache:master from cecemei:policy2


20 participants
