FEAT: Support for S3 object tagging in file task#59
Divyanshu Tiwari (divyanshu-tiwari) merged 3 commits into main from
Conversation
Adds a `tags` field to the file task that is applied as the `x-amz-tagging` header on S3 PutObject (including the optional _SUCCESS marker). Tag values support macro/context templating so they can be evaluated per record. Validates S3 limits: up to 10 tags per object, key length up to 128 UTF-16 code units, and value length up to 256 UTF-16 code units. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds support for applying S3 object tags when writing via the file pipeline task, including config/schema updates and documentation.
Changes:
- Introduces `tags` in `file` task config (map of tag key → templated value).
- Applies tags to S3 `PutObject` requests and adds validation for S3 tag limits.
- Documents the new `tags` option, limits, and an example configuration.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/pkg/pipeline/task/file/s3.go | Build/encode S3 tagging header per record; validate S3 tag constraints; add UTF-16 length helper. |
| internal/pkg/pipeline/task/file/file.go | Add Tags to task config; validate tags at startup; propagate tags to success marker writer. |
| internal/pkg/pipeline/task/file/README.md | Document tags option, limits, and usage example. |
- Move tag validation out of task startup into writeS3File so read
mode and local-scheme writes are unaffected by tag config.
- Fix UTF-16 length accounting to use utf16.Encode (handles surrogate
code points safely instead of letting RuneLen return -1).
- Substitute an empty record when buildTags is called with nil (the
_SUCCESS marker case) so unresolved {{ context }} placeholders
surface as an explicit error rather than being uploaded verbatim.
- Document the success-marker restriction in the file task README.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds first-class support for applying Amazon S3 object tags when the file pipeline task writes to S3, including per-record tag value evaluation and documented configuration/limits.
Changes:
- Introduces a `tags` field on the `file` task config (map of tag key → templated value).
- Applies resolved tags on S3 `PutObject` uploads, including the optional `_SUCCESS` marker.
- Adds validation for S3 tag count and UTF-16 key/value length limits, and updates task documentation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| internal/pkg/pipeline/task/file/s3.go | Builds URL-encoded tag strings for S3 uploads and validates tagging constraints. |
| internal/pkg/pipeline/task/file/file.go | Adds Tags to the task configuration and propagates tags to success-marker writes. |
| internal/pkg/pipeline/task/file/README.md | Documents the new tags option, constraints, and usage examples. |
validateS3Tags runs on every writeS3File call, not just the first. Update the README to match so the described timing matches the code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
> The `_SUCCESS` marker is not tied to any record, so tag values for the success marker must only use static strings or startup-time templates (`env`, `secret`, `macro`). A tag that references `{{ context "..." }}` will fail at the success-marker write with `context keys were not set: ...`, since there is no record context to resolve against.
>
> If you need record-derived tag values, either drop the context reference from the success-marker tags, or disable `success_file`.
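A hypothetical config sketch of the restriction quoted above. Only `tags`, `success_file`, and the `env`/`context` template forms come from this PR; the surrounding keys and overall schema are assumptions for illustration:

```yaml
task: file
path: s3://my-bucket/output/
success_file: true
tags:
  team: data-platform                       # static value: safe for the _SUCCESS marker
  env: '{{ env "DEPLOY_ENV" }}'             # startup-time template: also safe
  # record_id: '{{ context "record_id" }}'  # record-derived: would fail on the _SUCCESS write
```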
Nitpick:
> context reference from the success-marker tags

The same tags are applied to both files, right? So should we change "success-marker tags" to just "tags"?
Description
This pull request adds support for applying S3 object tags when writing files to S3 using the `file` pipeline task. It introduces a new `tags` configuration option, validates S3 tagging constraints, and ensures tags are applied to both data files and optional success markers. The documentation is updated with usage details and examples.

S3 Object Tagging Support
- Added a `tags` field to the `file` task configuration, allowing users to specify S3 object tags as a map of key-value pairs. Tag values support macros and context templates. [1] [2]
- Tags are applied to both data files and the optional `success_file` marker when writing.

Documentation Updates
- Updated `README.md` for the `file` task to document the new `tags` option and S3 tagging constraints, and added a configuration example. [1] [2]

Types of changes
Checklist