importinto: require S3-like auth for nextgen import #68231

Merged
ti-chi-bot[bot] merged 3 commits into pingcap:master from D3Hunter:forbid-empty-auth-for-s3like
May 8, 2026

@D3Hunter D3Hunter commented May 8, 2026

What problem does this PR solve?

Issue Number: close #68226

Problem Summary:

In NextGen security enhanced mode, IMPORT INTO accepted S3-like storage URIs without explicit user-provided credentials. That allowed the object-store client to fall back to TiDB node-role credentials, which weakens the expected boundary for user-specified import sources.

What changed and how does it work?

This PR requires explicit authentication for S3-like IMPORT INTO sources when NextGen and SEM are enabled.

  • Adds normalized object-store query parameter matching so both dash and underscore spellings are handled consistently.
  • Defines shared S3-like query keys for access key, secret access key, and role ARN.
  • Rejects S3-like import paths unless they provide either a non-empty access key/secret access key pair or a non-empty role ARN.
  • Preserves the existing NextGen SEM behavior that rejects explicit external ID and injects the keyspace name as the external ID for allowed paths.
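
The acceptance rule in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `normalizeKey` and `checkExplicitAuth` are hypothetical names, and the literal query keys (`access-key`, `secret-access-key`, `role-arn`) are taken from the spellings that appear in the test URIs quoted later in this conversation.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// normalizeKey is a hypothetical helper: it lowercases a query key and maps
// underscores to dashes, so "access_key" and "access-key" compare equal.
func normalizeKey(k string) string {
	return strings.ReplaceAll(strings.ToLower(k), "_", "-")
}

// checkExplicitAuth is a hypothetical check. It accepts a URI only if it
// carries a non-empty access key/secret access key pair or a non-empty role ARN.
func checkExplicitAuth(rawURI string) error {
	u, err := url.Parse(rawURI)
	if err != nil {
		return err
	}
	q := u.Query()
	params := map[string]string{}
	for k := range q {
		// Record only non-empty values so an empty parameter never counts.
		if v := q.Get(k); v != "" {
			params[normalizeKey(k)] = v
		}
	}
	hasKeyPair := params["access-key"] != "" && params["secret-access-key"] != ""
	hasRoleARN := params["role-arn"] != ""
	if !hasKeyPair && !hasRoleARN {
		return errors.New("explicit credentials required: access key pair or role ARN")
	}
	return nil
}

func main() {
	for _, uri := range []string{
		"s3://bucket/path?access_key=ak&secret-access-key=sk", // mixed spellings
		"oss://bucket/path?role-arn=arn",
		"s3://bucket/path?access-key=ak", // partial pair
		"s3://bucket/path",               // no credentials
	} {
		fmt.Printf("%s -> %v\n", uri, checkExplicitAuth(uri))
	}
}
```

Matching on normalized keys means a partial pair or an empty value is rejected regardless of which spelling the user chose.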

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Unit tests:

  • ./tools/check/failpoint-go-test.sh pkg/planner/core -tags=intest,deadlock,nextgen -run TestProcessNextGenS3Path -count=1
  • ./tools/check/failpoint-go-test.sh pkg/executor -tags=intest,deadlock,nextgen -run TestNextGenS3ExternalID -count=1
  • make lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

In NextGen security enhanced mode, IMPORT INTO from S3-like storage now requires access key/secret access key credentials or a role ARN.

Summary by CodeRabbit

  • Bug Fixes

    • IMPORT INTO now enforces S3-like credential requirements under Security Enhanced Mode: requests without proper credentials are rejected; explicit external IDs are disallowed.
    • Query-parameter formats for credentials (kebab/snake case) are handled consistently.
  • Tests

    • Expanded test coverage for S3/OSS URL parsing and credential validation, including more error and no-error cases.

@ti-chi-bot ti-chi-bot Bot added the release-note label (Denotes a PR that will be considered when it comes time to generate release notes.) May 8, 2026
pantheon-ai Bot commented May 8, 2026

@D3Hunter I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added the sig/planner (SIG: Planner) and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels May 8, 2026
tiprow Bot commented May 8, 2026

Hi @D3Hunter. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai Bot commented May 8, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2c88a389-bf98-426e-90e5-5c9cf280980c

📥 Commits

Reviewing files that changed from the base of the PR and between feff8ac and 8b30d74.

📒 Files selected for processing (2)
  • pkg/util/sem/compat/BUILD.bazel
  • pkg/util/sem/compat/sem_integration_test.go

📝 Walkthrough

This PR enforces explicit S3-like storage authentication in NextGen IMPORT INTO operations under Security Enhanced Mode. It introduces S3 credential constants, centralizes query parameter key normalization, changes planner validation to require access-key+secret-access-key or role-arn while rejecting explicit external IDs, and updates unit/integration tests including sem-compat adjustments.

Changes

S3 Authentication Enforcement for NextGen IMPORT INTO

| Layer | File(s) | Summary |
| --- | --- | --- |
| S3 Storage Constants | pkg/objstore/s3like/store.go | Introduces S3AccessKey, S3SecretAccessKey, and S3RoleARN exported string constants for S3 authentication parameter names. |
| Query Parameter Normalization | pkg/objstore/parse.go | Adds NormalizeQueryParameterKey helper function that lowercases and converts underscores to dashes; refactors ExtractQueryParameters to use this helper for consistent key matching. |
| NextGen S3 Authentication Validation | pkg/planner/core/planbuilder.go | buildImportInto conditionally validates NextGen S3-like paths only when SEM is enabled; checkNextGenS3PathWithSem now rejects explicit external IDs and requires either access-key + secret-access-key or role-arn, matching normalized query parameter keys. |
| Plan Builder Unit Tests | pkg/planner/core/planbuilder_test.go | TestProcessNextGenS3Path expanded: external ID variants with differing casing, new loop accepting valid credentials in kebab and snake case, error cases broadened to include missing/empty/partial credentials and unsupported profile parameter. |
| Executor Integration Tests | pkg/executor/import_into_test.go | TestNextGenS3ExternalID adds SEM-enabled subtest for credential rejection; existing tests updated to include access-key and secret-access-key in S3 URIs for external ID, local sort, and unsupported options scenarios. |
| SEM Compat Integration Adjustments | pkg/util/sem/compat/BUILD.bazel, pkg/util/sem/compat/sem_integration_test.go | Adds kerneltype import and BUILD dep; sem-compat test gains a NextGen-specific branch that asserts IMPORT INTO with explicit external ID is rejected and verifies no rows inserted. |
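
The external-ID handling that the PR description says is preserved (reject an explicit external ID, then inject the keyspace name as the external ID for allowed paths) might look roughly like the sketch below. Everything here is an assumption made for illustration: `withTenantExternalID` is a hypothetical name, the `external-id` query key is assumed, and the summary above does not show where or how the actual code performs the injection.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// withTenantExternalID is a hypothetical helper. It rejects a URI that already
// sets an external ID (in any casing or dash/underscore spelling) and otherwise
// returns the URI with the keyspace name added as the external ID.
func withTenantExternalID(rawURI, keyspace string) (string, error) {
	u, err := url.Parse(rawURI)
	if err != nil {
		return "", err
	}
	q := u.Query()
	for k := range q {
		// Compare on the normalized form: lowercase, underscores as dashes.
		if strings.ReplaceAll(strings.ToLower(k), "_", "-") == "external-id" {
			return "", errors.New("external ID must not be set explicitly")
		}
	}
	q.Set("external-id", keyspace) // assumed key name
	u.RawQuery = q.Encode()        // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	out, err := withTenantExternalID("s3://bucket/data?role-arn=arn", "ks1")
	fmt.Println(out, err)

	_, err = withTenantExternalID("s3://bucket/data?role-arn=arn&External_ID=x", "ks1")
	fmt.Println(err)
}
```

Rejecting the user-supplied value before injecting one keeps the external ID under the server's control, which fits the tenant-isolation intent described in this PR.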

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

component/import, type/bugfix, size/M

Suggested reviewers

  • joechenrh
  • GMHDBJD

Poem

🐰 I hop through code with keen delight,
I check each key both day and night.
No secrets fall back to the node,
External IDs? Blocked on the road.
NextGen imports keep tenants tight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'importinto: require S3-like auth for nextgen import' clearly and concisely summarizes the main change: enforcing S3-like authentication for IMPORT INTO in NextGen mode. |
| Description check | ✅ Passed | The PR description follows the template well, includes the issue number, problem summary, what changed, checked test boxes, breaking compatibility flag, affected user behaviors, and release notes. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #68226 by enforcing explicit S3-like auth (AK/SK or role ARN), normalizing query parameters, and preserving tenant-isolation via external-id handling. |
| Out of Scope Changes check | ✅ Passed | All code changes are scoped to S3-like IMPORT INTO authentication validation, query parameter normalization, and test updates directly supporting the stated objectives. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.1)

Command failed



@D3Hunter D3Hunter changed the title from "planner, executor, objstore: require S3-like auth for nextgen import" to "importinto: require S3-like auth for nextgen import" May 8, 2026
D3Hunter commented May 8, 2026

/cherry-pick release-nextgen-20251011
/cherry-pick release-nextgen-202603

@ti-chi-bot (Member) commented:

@D3Hunter: once the present PR merges, I will cherry-pick it on top of release-nextgen-20251011/release-nextgen-202603 in the new PR and assign it to you.

Details

In response to this:

/cherry-pick release-nextgen-20251011
/cherry-pick release-nextgen-202603

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm label (Indicates a PR needs 1 more LGTM.) May 8, 2026
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
pkg/planner/core/planbuilder_test.go (1)

1147-1157: ⚡ Quick win

"No-error" block is missing cross-backend credential combinations.

The slice validates S3+AK/SK and OSS+role-ARN (in both dash and underscore forms), but omits:

  • oss://bucket?access-key=ak&secret-access-key=sk (and the _ form) — OSS with AK/SK
  • s3://bucket?role-arn=arn (and the _ form) — S3 with role ARN

Since the PR intent is "shared S3-like query keys for access key, secret access key, and role ARN" (applicable to both backends), a regression where checkNextGenS3PathWithSem only accepts role ARN for OSS or only accepts AK/SK for S3 would be invisible to this test.

🧪 Suggested additions
 	for _, str := range []string{
 		"s3://bucket?access-key=ak&secret-access-key=sk",
 		"s3://bucket?access_key=ak&secret_access_key=sk",
+		"s3://bucket?role-arn=arn",
+		"s3://bucket?role_arn=arn",
 		"oss://bucket?role-arn=arn",
 		"oss://bucket?role_arn=arn",
+		"oss://bucket?access-key=ak&secret-access-key=sk",
+		"oss://bucket?access_key=ak&secret_access_key=sk",
 	} {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/planner/core/planbuilder_test.go` around lines 1147 - 1157, The test's
"no-error" cases currently only cover S3 with AK/SK and OSS with role-ARN; add
the missing cross-backend credential combinations so checkNextGenS3PathWithSem
is validated for all forms: include
"oss://bucket?access-key=ak&secret-access-key=sk" and
"oss://bucket?access_key=ak&secret_access_key=sk" (OSS with AK/SK) and
"s3://bucket?role-arn=arn" and "s3://bucket?role_arn=arn" (S3 with role ARN) in
the same test loop to ensure the shared query-key behavior is enforced for both
backends.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c6e6e6f9-590f-41db-b137-1920456d7112

📥 Commits

Reviewing files that changed from the base of the PR and between f30c9a6 and feff8ac.

📒 Files selected for processing (5)
  • pkg/executor/import_into_test.go
  • pkg/objstore/parse.go
  • pkg/objstore/s3like/store.go
  • pkg/planner/core/planbuilder.go
  • pkg/planner/core/planbuilder_test.go

codecov Bot commented May 8, 2026

Codecov Report

❌ Patch coverage is 0% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.0293%. Comparing base (f30c9a6) to head (8b30d74).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #68231        +/-   ##
================================================
- Coverage   77.7304%   77.0293%   -0.7012%     
================================================
  Files          1990       1972        -18     
  Lines        551907     552793       +886     
================================================
- Hits         429000     425813      -3187     
- Misses       121987     126977      +4990     
+ Partials        920          3       -917     
Flag Coverage Δ
integration 41.5263% <0.0000%> (+1.7244%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4888% <ø> (ø)
parser ∅ <ø> (∅)
br 50.0597% <ø> (-13.0338%) ⬇️

@ti-chi-bot ti-chi-bot Bot added the lgtm label and removed the needs-1-more-lgtm label (Indicates a PR needs 1 more LGTM.) May 8, 2026
ti-chi-bot Bot commented May 8, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-05-08 09:29:31.908079555 +0000 UTC m=+432844.781429537: ☑️ agreed by GMHDBJD.
  • 2026-05-08 09:39:48.183500799 +0000 UTC m=+433461.056850772: ☑️ agreed by joechenrh.

ti-chi-bot Bot commented May 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: GMHDBJD, hawkingrei, joechenrh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the approved label May 8, 2026
@ti-chi-bot ti-chi-bot Bot merged commit 84548db into pingcap:master May 8, 2026
33 checks passed
@ti-chi-bot (Member) commented:

@D3Hunter: new pull request created to branch release-nextgen-20251011: #68233.
But this PR has conflicts, please resolve them!

@ti-chi-bot (Member) commented:

@D3Hunter: new pull request created to branch release-nextgen-202603: #68234.

@D3Hunter D3Hunter deleted the forbid-empty-auth-for-s3like branch May 8, 2026 12:29

Labels

  • approved
  • lgtm
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/planner: SIG: Planner
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

importinto: require explicit auth for S3-like storage in nextgen

5 participants