Segmentcountasheader by ravisharm · Pull Request #16847 · apache/druid

ravisharm · 2024-08-06T11:07:03Z

Fixes #XXXX.

Description

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...

Release note

Key changed/added classes in this PR

MyFoo
OurBar
TheirBaz

This PR has:

The extension packaging included both shaded and unshaded dependencies in the classpath. Shading should not be necessary in this case. Also excludes guava dependencies, which are already provided by Druid and don't need to be added to the extensions jars.

* METRICS-1302: Added prefix support for resource labels.
* Addressed review comments.
* Added and moved configs to ingestion spec, optimized code.
* Addressed review comments
* Updated metric dimension and other review comments
* Flipped ternary operator
* Moved from NullHandling to StringUtils.
* Removed unnecessary HashMap.
* Removed verbosity for instance variables.

* Added getters for configs, labels for distribution metric.
* Addressed review comments
* Removed extra brackets in JsonProperty.

Align protobuf dependencies to use the main pom one

- fix millisecond resolution being dropped when converting timestamps
- remove unnecessary conversion of ByteBuffer to ByteString
- make test code a little more concise
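As an illustration of the first point, here is a minimal sketch of how sub-second resolution can be lost when a protobuf-style timestamp (seconds plus nanoseconds) is converted to epoch milliseconds. The class and method names are hypothetical and are not taken from the extension's code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the OpenCensus extension.
public class TimestampMillis {
    // Lossy variant: ignores the nanosecond field, so every timestamp
    // is rounded down to a whole second.
    public static long secondsOnlyMillis(long seconds, int nanos) {
        return TimeUnit.SECONDS.toMillis(seconds);
    }

    // Keeps millisecond resolution by adding the nanosecond field,
    // truncated to whole milliseconds.
    public static long toEpochMillis(long seconds, int nanos) {
        return Math.addExact(
            Math.multiplyExact(seconds, 1000L),
            TimeUnit.NANOSECONDS.toMillis(nanos));
    }

    public static void main(String[] args) {
        System.out.println(secondsOnlyMillis(1_600_000_000L, 123_456_789)); // 1600000000000
        System.out.println(toEpochMillis(1_600_000_000L, 123_456_789));     // 1600000000123
    }
}
```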

- remove the need to parse timestamps into their own column
- reduce the number of times we copy maps of labels
- pre-size hashmaps and arrays when possible
- use loops instead of streams in critical sections

Combined, these changes improve parsing performance by about 15%.

- added benchmark for reference
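A rough sketch of two of the techniques listed above (pre-sizing a HashMap and using a plain loop rather than a stream), under the assumption that labels arrive as key/value pairs. `LabelMaps` and `prefixed` are hypothetical names chosen for illustration only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the extension's actual implementation.
public class LabelMaps {
    // Copies key/value pairs into a map, prepending a prefix to each key.
    public static Map<String, String> prefixed(String prefix, List<String[]> labels) {
        // Pre-size so that labels.size() entries fit without a rehash
        // at HashMap's default load factor of 0.75.
        int capacity = (int) (labels.size() / 0.75f) + 1;
        Map<String, String> out = new HashMap<>(capacity);
        // A plain loop avoids the per-call overhead of building a stream pipeline.
        for (String[] kv : labels) {
            out.put(prefix + kv[0], kv[1]);
        }
        return out;
    }
}
```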

…tFormat (#26)

InputRowParsers have been deprecated in favor of InputFormat. This implements the InputFormat version of the OpenCensus Protobuf parser, and deprecates the existing InputRowParser implementation.

- the existing InputRowParser behavior is unchanged.
- the InputFormat behaves like the InputRowParser, except for the default resource prefix, which now defaults to "resource." instead of empty.
- both implementations internally delegate to OpenCensusProtobufReader, which is covered by the existing InputRowParser tests.

…che#14281) (#139) (cherry picked from commit 4ff6026)

Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com>

* Downgrade busybox version to fix k8s IT (apache#14518)
* Add TargetArch needed in distribution/Dockerfile
* Fix linting

Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com>

- remove our custom profile to build using dockerfile-maven-plugin, since that plugin is no longer maintained.
- remove our custom Dockerfile patches since we can now use the BUILD_FROM_SOURCE argument to decide if we want to build the tarball outside of docker.

…" (#147) This reverts our custom patch from commit 7cf2de4. The necessary Java 17 exports are now included as part of 25.0.0 in https://github.com/confluentinc/druid/blob/25.0.0-confluent/examples/bin/run-java#L27-L56 which is now called by the druid.sh docker startup script as well. The exports for java.base/jdk.internal.perf=ALL-UNNAMED are no longer needed since apache#12481 (comment)

… cache (#145) (#148)

* utilize workflow level caching to publish the built artifacts to the tests. otherwise turn off all caching of .m2 etc
* remove .m2/settings.xml to ensure build passes without internal artifact store

Co-authored-by: Jeremy Kuhnash <111304461+jkuhnashconfluent@users.noreply.github.com>

* Debian based base image upgrade
* updated suggestions
* Update Dockerfile
* minor correction

…erlying inputRow map instead of eagerly copying (apache#13406) (apache#13447)" (#155) This reverts commit 23500a4.

Metrics that contain the NoRecordedValue Flag are being written to Druid with a 0 value. We should properly handle them in the backend
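A minimal sketch of such a filter, assuming the flag follows the OpenTelemetry data point flags convention, in which the lowest bit of the `flags` field marks "no recorded value". The `Point` class and the method names are hypothetical simplifications and not the classes used in the extension.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual backend code.
public class NoRecordedValueFilter {
    // Assumption: bit 0 of the flags field means "no recorded value".
    public static final int NO_RECORDED_VALUE_MASK = 0x1;

    // Simplified stand-in for a metric data point.
    public static final class Point {
        public final double value;
        public final int flags;

        public Point(double value, int flags) {
            this.value = value;
            this.flags = flags;
        }
    }

    public static boolean hasNoRecordedValue(Point p) {
        return (p.flags & NO_RECORDED_VALUE_MASK) != 0;
    }

    // Returns only the points that carry a recorded value, so flagged
    // points are skipped instead of being written with a value of 0.
    public static List<Point> keepRecorded(List<Point> points) {
        List<Point> kept = new ArrayList<>(points.size());
        for (Point p : points) {
            if (!hasNoRecordedValue(p)) {
                kept.add(p);
            }
        }
        return kept;
    }
}
```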

…d TLS support (apache#14827) (#159)

This PR updates the library used for the Memcached client to the AWS ElastiCache client: https://github.com/awslabs/aws-elasticache-cluster-client-memcached-for-java

This enables us to use the option of encrypting data in transit: Amazon ElastiCache for Memcached now supports encryption of data in transit.

For clusters running the Memcached engine, ElastiCache supports Auto Discovery: the ability for client programs to automatically identify all of the nodes in a cache cluster, and to initiate and maintain connections to all of these nodes. See: Benefits of Auto Discovery - Amazon ElastiCache

AWS has forked spymemcached 2.12.1, and has since added all the patches included in 2.12.2 and 2.12.3 as part of the 1.2.0 release. So, this can now be considered an equivalent drop-in replacement.

GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticache/AmazonElastiCacheClient.html#AmazonElastiCacheClient--

How to enable TLS with ElastiCache

On server side: https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/in-transit-encryption-mc.html#in-transit-encryption-enable-existing-mc

On client side: GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.

…ress CVEs (#164)

* Bump org.xerial.snappy:snappy-java from 1.1.8.4 to 1.1.10.5
* Add licenses

Upgraded Avro to 1.11.1

(cherry picked from commit 72cf91f)

Co-authored-by: Tejaswini Bandlamudi <96047043+tejaswini-imply@users.noreply.github.com>

…n to address CVEs (#164)" (#166) This reverts commit 185d655.

Create new profiles to enable only the used extensions during the build. This helps address CVEs that were being flagged due to the unused extensions.

Co-authored-by: Keerthana Srikanth <ksrikanth@confluent.io>

…pache#14995) (#189)

…re visibility (#191)

* Add indexer level task metrics to provide more visibility in the task distribution (apache#15991)

Changes: Add the following indexer level task metrics:

- `worker/task/running/count`
- `worker/task/assigned/count`
- `worker/task/completed/count`

These metrics will provide more visibility into the task distribution across indexers. We often see task skew across indexers, and with these metrics the imbalance would be easier to catch.

The StatsD client sometimes drops metrics when its queue of unprocessed messages, bounded by queueSize, is completely full. This causes some high-cardinality metrics, such as per-partition lag, to be dropped. Several StatsD client parameters can be set at initialization to increase the client's capacity so that it drops metrics less often. The properties queueSize, poolSize, processorWorkers and senderWorkers will now be configurable at runtime.

…#15)

* Add additional header to support segment count
* Fix import and header emit code

kfaraz and others added 30 commits December 22, 2022 18:38

[maven-release-plugin] prepare release druid-25.0.0-rc2

9a78059

Bring dockerfile up to date

dc78d75

add opencensus extension

64265a6

make checkstyle happy

19d4a9c

bump pom version for opencensus extension

533a141

METRICS-516: Adding Resource labels in OpenCensus Extension

f3527c9

bump extension version to match release

c2534d6

confluent-extensions with custom transform specs (#9)

7426bcd

fix extraction transform serde (#10)

923dc0e

fix check-style build errors

52f79ac

setup semaphore build

0d27196

add checkstyle

03ff750

fix edge cases for internal topics

6c083c9

Added getters for configs, labels for distribution metric. (#15)

9f2d969

* Added getters for configs, labels for distribution metric.
* Addressed review comments
* Removed extra brackets in JsonProperty.

Default resource label prefix to blank - Backward Compatibility (#16)

0b7f53c

update opencensus parent pom version

951f9e6

update opencensus extensions for 0.19.x

45d4ae7

update parent pom version for confluent-extensions

f6d3e6d

Add the capability to speed up S3 uploads using AWS transfer manager

bc7ede5

fix conflicting protobuf dependencies

034c91b

Align protobuf dependencies to use the main pom one

fix timestamp milliseconds in OpenCensusProtobufInputRowParser

f928e42

- fix millisecond resolution being dropped when converting timestamps
- remove unnecessary conversion of ByteBuffer to ByteString
- make test code a little more concise

add default query context and update timeout to 30 sec

cb5094f

Setting default query lane from druid console.

e96c25f

Giving more heap space for test jvm in semaphore config.

d3247e9

update parent pom version for Confluent extensions

4229ba8

Add Java 11 image build and remove unused MySQL images

8fe4601

m-ghazanfar and others added 29 commits June 6, 2023 10:25

Fix jest and prettify checks

dc00aaf

Adding SegmentMetadataEvent and publishing them via KafkaEmitter (apache#14281) (#139)

12cc93c

(cherry picked from commit 4ff6026)

Downgrade busybox version to fix k8s IT (apache#14518) (#143)

515ad51

Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com>

Passing TARGETARCH in build_args to Docker build (#144)

8ed07a9

* Downgrade busybox version to fix k8s IT (apache#14518)
* Add TargetArch needed in distribution/Dockerfile
* Fix linting

Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com>

OBSDATA-1365: add support for debian based base images (#149)

acfc3f5

* Debian based base image upgrade
* updated suggestions
* Update Dockerfile
* minor correction

Revert "fix KafkaInputFormat with nested columns by delegating to underlying inputRow map instead of eagerly copying (apache#13406) (apache#13447)" (#155)

716544c

This reverts commit 23500a4.

Filter Out Metrics with NoRecordedValue Flag Set (#157)

cc0c452

Metrics that contain the NoRecordedValue Flag are being written to Druid with a 0 value. We should properly handle them in the backend

PRSP-3603 Bump org.xerial.snappy:snappy-java to latest version to address CVEs (#164)

185d655

* Bump org.xerial.snappy:snappy-java from 1.1.8.4 to 1.1.10.5
* Add licenses

[backport] Upgrade Avro to latest version (apache#14440) (#162)

64a76fe

Upgraded Avro to 1.11.1

(cherry picked from commit 72cf91f)

Co-authored-by: Tejaswini Bandlamudi <96047043+tejaswini-imply@users.noreply.github.com>

Revert "PRSP-3603 Bump org.xerial.snappy:snappy-java to latest version to address CVEs (#164)" (#166)

9812817

This reverts commit 185d655.

Upgrade Avro to latest version to address CVEs (#167)

b08bded

Fix dist-used profile to use Hadoop compile version (#173)

743dd74

Change user and package manager (#178)

6be7af8

Prevent multiple attempts to publish segments for the same sequence (apache#14995) (#189)

80e7a13

chore: update repo semaphore project

55ce769

Adding more logs to debug multiple checkpointing issue

4cd230b

minor change

6f3ec3d

[RCCA-17777]: Adding more details in the checkpointing logs. (#196)

6ed4208

[RCCA-17777]: Adding condition for log line. (#197)

8345538

OBSTEL-1601: Observability GH Team Cleanup (#201)

8c4c561

Enable BackPressure Metric (#205)

72ff8ed

OBSDATA-4891 Add additional header to broker to support segment count (#15)

6dcb126

* Add additional header to support segment count
* Fix import and header emit code

ravisharm closed this Aug 6, 2024

ravisharm wants to merge 109 commits into apache:25.0.0 from confluentinc:segmentcountasheader

ravisharm commented Aug 6, 2024


20 participants
