
CAMEL-23387: camel-telemetry - Add span decorators for AWS Polly, Rekognition, Textract, Transcribe, Translate, Comprehend and S3 Vectors #23083

Merged
oscerd merged 2 commits into apache:main from oscerd:ci-issue-CAMEL-23387-part6 on May 8, 2026

Conversation

@oscerd
Contributor

@oscerd oscerd commented May 8, 2026

Summary

Final AWS batch of span decorators for camel-telemetry. Closes out AWS coverage for CAMEL-23387 by adding decorators for the AI/ML group: text-to-speech (Polly), image AI (Rekognition), OCR (Textract), speech-to-text (Transcribe), translation (Translate), NLP (Comprehend) and vector search (S3 Vectors).

After this PR, all 36 AWS components in components/camel-aws/ that have a Camel scheme will have a corresponding SpanDecorator. The only remaining follow-up on CAMEL-23387 is the Google Cloud decorators mentioned in the original ticket, which is in scope for a separate JIRA.

Changes

New SpanDecorator implementations under org.apache.camel.telemetry.decorators:

  • AwsPollySpanDecorator (aws2-polly) — Text-to-speech. Tags: operation, voiceId, outputFormat, engine, languageCode. Lexicon content (PLS XML), the synthesized audio's S3 destination (bucket/key), the SNS topic ARN for notifications, and the requestCharacters response counter are not surfaced.
  • AwsRekognitionSpanDecorator (aws2-rekognition) — Image/video AI. Tags: operation, collectionId, jobId, jobName, faceId. Image data (binary), kms key id, large config objects (operations/output/human-loop config) and bulk facial-attribute / feature collections are not surfaced.
  • AwsTextractSpanDecorator (aws2-textract) — Document OCR. Tags: operation, s3Bucket, s3Object, jobId. The S3 object version, pagination tokens and feature-type collection are not surfaced.
  • AwsTranscribeSpanDecorator (aws2-transcribe) — Speech-to-text. Tags: transcriptionJobName, languageCode, mediaFormat, mediaUri. The Transcribe2Constants interface does not define an OPERATION header — operations are configured via the URI — so no operation tag is emitted (the span name from the URI already conveys the action). Vocabulary phrase lists, tag maps and the resource ARN are not surfaced.
  • AwsTranslateSpanDecorator (aws2-translate) — Translation. Tags: operation, sourceLanguage, targetLanguage. Custom-terminology name collections are not surfaced.
  • AwsComprehendSpanDecorator (aws2-comprehend) — NLP. Tags: operation, languageCode. Detection results (detected language, sentiment, scores) live on the OUT message and are not visible in beforeTracingEvent, so they are not surfaced.
  • AwsS3VectorsSpanDecorator (aws2-s3-vectors) — Vector search. Tags: operation, vectorBucketName, vectorIndexName, vectorId. Vector embedding data and query vectors (floats), metadata maps, similarity thresholds, distance metrics and response payloads (similarity scores, result counts, index status, bucket ARN) are not surfaced.

All seven decorators extend AbstractSpanDecorator (these are producer-only or producer+polling-consumer components without messaging-style ordering semantics) and are registered alphabetically in META-INF/services/org.apache.camel.telemetry.SpanDecorator. Unit tests cover header-to-tag extraction for each decorator.

Header constants are mirrored from each component's *Constants interface (with a Javadoc reference back to the source), matching the convention used by previous batches and AzureServiceBusSpanDecorator. This avoids creating hard dependencies from camel-telemetry to the AWS component modules.
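
The mirroring convention described above can be sketched as follows. This is a minimal, self-contained illustration and not the code merged in this PR: the class name `PollyTagSketch`, the `Example...` header names and the sample values are hypothetical, and plain maps stand in for the Camel message and the span. The real header names live in each component's *Constants interface and are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: header names are mirrored as plain strings so the
// telemetry module needs no compile-time dependency on the AWS component.
final class PollyTagSketch {

    // Assumed, illustrative header names (NOT the real Camel constants).
    static final String HEADER_OPERATION = "ExamplePollyOperation";
    static final String HEADER_VOICE_ID = "ExamplePollyVoiceId";
    static final String HEADER_LEXICON_CONTENT = "ExamplePollyLexiconContent";

    // Copy one header into a tag, but only when the header is present.
    static void copyIfPresent(Map<String, Object> headers, String header,
                              Map<String, String> tags, String tag) {
        Object value = headers.get(header);
        if (value != null) {
            tags.put(tag, value.toString());
        }
    }

    // Only explicitly listed headers become tags; the lexicon content
    // header is never read, so its payload cannot reach the span.
    static Map<String, String> extractTags(Map<String, Object> headers) {
        Map<String, String> tags = new LinkedHashMap<>();
        copyIfPresent(headers, HEADER_OPERATION, tags, "operation");
        copyIfPresent(headers, HEADER_VOICE_ID, tags, "voiceId");
        return tags;
    }

    public static void main(String[] args) {
        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put(HEADER_OPERATION, "synthesizeSpeech"); // illustrative value
        headers.put(HEADER_VOICE_ID, "Joanna");            // illustrative value
        headers.put(HEADER_LEXICON_CONTENT, "<lexicon>...</lexicon>");
        System.out.println(extractTags(headers));
    }
}
```

Because extraction is driven by an explicit list of headers rather than by iterating over everything on the message, adding a new header to a component does not silently add a new tag.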

Tag selection rationale

Same two rules applied across batches 3 through 6:

  1. Never emit values that may contain secrets, large payloads or PII — image bytes, audio bytes, vector embeddings, lexicon content, vocabulary phrases, encrypted vector metadata.
  2. Prefer the request target over the response payload: voiceId, s3Bucket/s3Object, transcriptionJobName, vectorIndexName, collectionId etc. Response data (detected sentiment in Comprehend, similarity scores in S3 Vectors, request character counts in Polly) is response-shaped and not visible in beforeTracingEvent.

In addition to the two rules above, this batch follows the IAM-principal-minimization principle established in earlier review fixes (KMS keyId, CloudTrail username, IAM userName, EKS roleArn): no userId from Rekognition collections, no kmsKeyId from Rekognition, no resourceArn from Transcribe.
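
One way to keep the principal-minimization rule from regressing is a small guard that a unit test could run over the tags a decorator produces. The sketch below is an assumption about how such a check might look, not something included in this PR; the class `IdentityTagGuardSketch` and its method are hypothetical, and the set of names is taken from the fields listed above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical guard: reports tag names that identify principals, keys
// or accounts, using the field names mentioned in the PR description.
final class IdentityTagGuardSketch {

    static final Set<String> IDENTITY_TAGS = Set.of(
            "userId", "kmsKeyId", "resourceArn", "endpointArn",
            "keyId", "username", "userName", "roleArn");

    // Returns the offending tag names in sorted order, or an empty set
    // when none of the emitted tags is identity-bearing.
    static Set<String> identityTagsIn(Map<String, String> tags) {
        Set<String> offending = new TreeSet<>(tags.keySet());
        offending.retainAll(IDENTITY_TAGS);
        return offending;
    }

    public static void main(String[] args) {
        // Sample tag values are illustrative only.
        Map<String, String> ok = Map.of("operation", "detectFaces", "collectionId", "c1");
        Map<String, String> bad = Map.of("operation", "detectFaces", "kmsKeyId", "k1");
        System.out.println(identityTagsIn(ok));  // prints []
        System.out.println(identityTagsIn(bad)); // prints [kmsKeyId]
    }
}
```

A name-based check like this only catches known field names; it would not detect an account ID embedded inside an otherwise innocuous value.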

Review-driven adjustments (commit 7e9fdfb)

After review feedback from @davsclaus, one field was dropped from the initial draft:

  • AwsComprehendSpanDecorator no longer emits endpointArn. The Comprehend custom-classifier endpoint ARN embeds the AWS account ID (e.g. arn:aws:comprehend:us-east-1:123456789012:document-classifier-endpoint/MyEndpoint), and surfacing account identifiers to observability backends is the same kind of identity disclosure that drove the IAM userName, KMS keyId, CloudTrail username and EKS roleArn drops in earlier batches.
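
To make the disclosure concrete: an ARN has the layout arn:partition:service:region:account-id:resource, so the account ID can be read straight out of the fifth colon-separated field. The helper below is an illustrative sketch (the class and method names are hypothetical and not part of camel-telemetry) applied to the example ARN above.

```java
// Hypothetical helper showing how easily the account ID is recovered
// from an ARN once it has been emitted as a span tag.
final class ArnAccountIdSketch {

    // ARN layout: arn:partition:service:region:account-id:resource
    // The limit of 6 keeps any colons inside the resource part intact.
    static String accountIdOf(String arn) {
        String[] parts = arn.split(":", 6);
        if (parts.length < 6 || !"arn".equals(parts[0])) {
            throw new IllegalArgumentException("not an ARN: " + arn);
        }
        return parts[4];
    }

    public static void main(String[] args) {
        String arn = "arn:aws:comprehend:us-east-1:123456789012:"
                + "document-classifier-endpoint/MyEndpoint";
        System.out.println(accountIdOf(arn)); // prints 123456789012
    }
}
```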

Test plan

  • mvn test in components/camel-telemetry passes (133 tests, including 44 AWS decorator tests covering 36 components total — all AWS coverage on CAMEL-23387)
  • Module-specific build (mvn -DskipTests install) succeeds
  • No code style or formatter changes required

Coverage on CAMEL-23387 (AWS — complete after this PR)

| Batch | PR | Components |
|---|---|---|
| 1 | #23038 (merged) | SQS, SNS, Kinesis, S3 |
| 2 | #23040 (merged) | DDB, DDB Streams, Lambda, EventBridge, SES, MQ, Kinesis Firehose, Bedrock |
| 3 | #23045 (merged) | Athena, CloudWatch, KMS, MSK, Step Functions, Timestream, Redshift Data, CloudTrail |
| 4 | #23077 (merged) | STS, IAM, Secrets Manager, Parameter Store, Security Hub, Config |
| 5 | #23081 (merged) | EC2, ECS, EKS |
| 6 | this PR | Polly, Rekognition, Textract, Transcribe, Translate, Comprehend, S3 Vectors |

Follow-ups still pending

  • Google Cloud decorators mentioned in CAMEL-23387's description — separate scope, separate JIRA. Not in scope for this PR.
  • Note on aws-xray: the original "Compute & Tracing" follow-up listed in earlier PRs mentioned camel-aws-xray, but that module was deprecated and removed in commit ba9f8c5 — it was a tracer integration with its own SegmentDecorator system, not a producer-style component, and there is nothing to add a SpanDecorator for.

After this PR merges, the AWS portion of CAMEL-23387 is complete and the JIRA can be closed pending the Google Cloud follow-up decision.


Claude Code on behalf of Andrea Cosentino

…ognition, Textract, Transcribe, Translate, Comprehend and S3 Vectors

Signed-off-by: Andrea Cosentino <ancosen@gmail.com>
@oscerd oscerd requested review from davsclaus and squakez May 8, 2026 09:47
@github-actions
Contributor

github-actions Bot commented May 8, 2026

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@github-actions
Contributor

github-actions Bot commented May 8, 2026

🧪 CI tested the following changed modules:

  • components/camel-telemetry
All tested modules (12 modules)
  • Camel :: Common Telemetry
  • Camel :: JBang :: MCP
  • Camel :: JBang :: Plugin :: Route Parser
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: JBang :: Plugin :: Validate
  • Camel :: Launcher :: Container
  • Camel :: Micrometer :: Observability 2
  • Camel :: Observability Services
  • Camel :: Opentelemetry 2
  • Camel :: Telemetry :: Dev
  • Camel :: YAML DSL :: Validator
  • Camel :: YAML DSL :: Validator Maven Plugin

⚙️ View full build and test results

Signed-off-by: Andrea Cosentino <ancosen@gmail.com>
@oscerd oscerd requested review from davsclaus and squakez May 8, 2026 11:39
@oscerd oscerd merged commit d65414d into apache:main May 8, 2026
6 checks passed
@oscerd oscerd deleted the ci-issue-CAMEL-23387-part6 branch May 8, 2026 12:45