
Fix #27079: [BUG] S3 Ingestion Failure: Pydantic ValidationError 'ext…#27180

Merged
harshach merged 2 commits into main from S3_list_apis on Apr 9, 2026

Conversation

harshach (Collaborator) commented Apr 8, 2026

…ra_forbidden' for BucketArn in S3BucketResponse

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Fixed validation error:
    • Changed Pydantic extra config from forbid to ignore in S3BucketResponse, GCSBucketResponse, and SageMakerModel to handle extra fields like BucketArn and BucketRegion
  • Updated test data:
    • Added BucketArn and BucketRegion fields to S3 bucket mock responses in test cases
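The change summarized above can be illustrated with a minimal sketch, assuming Pydantic v2. The two model classes and the payload are hypothetical stand-ins, not the actual S3BucketResponse definition; only the `extra` setting and the BucketArn and BucketRegion keys come from this PR.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


# Hypothetical stand-in for the old behaviour: unknown keys are rejected.
class StrictBucket(BaseModel):
    model_config = ConfigDict(extra="forbid")
    name: str = Field(alias="Name")


# Hypothetical stand-in for the new behaviour: unknown keys are dropped.
class TolerantBucket(BaseModel):
    model_config = ConfigDict(extra="ignore")
    name: str = Field(alias="Name")


# Illustrative payload carrying the two extra fields named in the PR.
payload = {
    "Name": "demo-bucket",
    "BucketArn": "arn:aws:s3:::demo-bucket",
    "BucketRegion": "us-east-1",
}


def error_types(model, data):
    """Return the distinct Pydantic error types raised while validating data."""
    try:
        model.model_validate(data)
    except ValidationError as exc:
        return sorted({err["type"] for err in exc.errors()})
    return []


strict_errors = error_types(StrictBucket, payload)
tolerant = TolerantBucket.model_validate(payload)
```

With `extra="forbid"` each unknown key produces an `extra_forbidden` error, which matches the failure named in the issue title; with `extra="ignore"` the same payload validates and the unknown keys are simply not stored on the model.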

This will update automatically on new commits.

@harshach harshach requested a review from a team as a code owner April 8, 2026 19:36
Copilot AI review requested due to automatic review settings April 8, 2026 19:36
@github-actions github-actions bot added the backend and safe to test labels Apr 8, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses ingestion-time Pydantic ValidationError failures caused by cloud SDK responses including additional (previously forbidden) fields, primarily for S3 bucket listing responses.

Changes:

  • Relaxed Pydantic models to ignore unexpected fields in S3 bucket responses (and similarly for GCS bucket + SageMaker model response models).
  • Updated S3 unit test fixtures to include additional bucket fields (e.g., BucketArn, BucketRegion) to reproduce/guard the regression.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File | Description
ingestion/tests/unit/topology/storage/test_s3_storage.py | Extends mocked S3 list_buckets response with extra fields to validate tolerant parsing.
ingestion/src/metadata/ingestion/source/storage/s3/models.py | Changes S3BucketResponse to ignore extra keys to prevent extra_forbidden failures.
ingestion/src/metadata/ingestion/source/storage/gcs/models.py | Migrates GCSBucketResponse config to Pydantic v2 ConfigDict and ignores extra keys.
ingestion/src/metadata/ingestion/source/mlmodel/sagemaker/metadata.py | Updates SageMakerModel to ignore extra keys.
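The fixture change described for test_s3_storage.py might look roughly like the sketch below. This is not the repository's test: `BucketSketch`, `MOCK_LIST_BUCKETS`, and `parse_buckets` are hypothetical names, and the model fields are assumptions. It only shows a mocked list_buckets-style response that carries BucketArn and BucketRegion being parsed without error.

```python
from datetime import datetime, timezone
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field


class BucketSketch(BaseModel):
    """Hypothetical model; the real S3BucketResponse fields may differ."""

    model_config = ConfigDict(extra="ignore")
    name: str = Field(alias="Name")
    creation_date: Optional[datetime] = Field(default=None, alias="CreationDate")


# Mocked response in the shape of an S3 list_buckets call, extended with
# the extra per-bucket keys mentioned in this PR.
MOCK_LIST_BUCKETS = {
    "Buckets": [
        {
            "Name": "test_transactions",
            "CreationDate": datetime(2023, 1, 1, tzinfo=timezone.utc),
            "BucketRegion": "us-east-1",
            "BucketArn": "arn:aws:s3:::test_transactions",
        },
        {
            "Name": "test_sales",
            "CreationDate": datetime(2023, 1, 2, tzinfo=timezone.utc),
            "BucketRegion": "us-east-1",
            "BucketArn": "arn:aws:s3:::test_sales",
        },
    ]
}


def parse_buckets(response: dict) -> List[BucketSketch]:
    """Validate every bucket entry, tolerating keys the model does not declare."""
    return [BucketSketch.model_validate(b) for b in response.get("Buckets", [])]


buckets = parse_buckets(MOCK_LIST_BUCKETS)
```

Keeping the extra keys in the fixture means the test fails again if the model is ever switched back to forbidding unknown fields.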

  from typing import List, Optional

- from pydantic import BaseModel, Extra, Field
+ from pydantic import BaseModel, ConfigDict, Field
Copilot AI commented Apr 8, 2026

Extra was removed from the pydantic imports, but this module still references Extra in GCSContainerDetails.Config later in the file. This will raise a NameError at import time and break GCS ingestion/tests. Re-add the Extra import or (preferred, Pydantic v2 style) migrate GCSContainerDetails to model_config = ConfigDict(extra="forbid") and drop the inner Config class.
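The migration suggested in this comment could be sketched as follows, assuming Pydantic v2. `ContainerDetailsSketch` and its fields are hypothetical; the real GCSContainerDetails definition is not shown in this conversation. The point is only that `model_config = ConfigDict(extra="forbid")` replaces an inner `Config` class, so the `Extra` import is no longer needed.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ContainerDetailsSketch(BaseModel):
    """Hypothetical model standing in for GCSContainerDetails."""

    # Pydantic v2 style: replaces an inner `class Config` that set
    # `extra = Extra.forbid`, so `Extra` does not have to be imported.
    model_config = ConfigDict(extra="forbid")

    name: str = Field(..., description="Container name (assumed field)")
    prefix: Optional[str] = Field(None, description="Path prefix (assumed field)")


def rejects_unknown_keys() -> bool:
    """Return True if the model still refuses fields it does not declare."""
    try:
        ContainerDetailsSketch(name="bucket", unknown="x")
    except ValidationError:
        return True
    return False
```

Strictness is preserved here on purpose: this model keeps `extra="forbid"`, unlike the response models that the PR relaxes to `ignore`.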

  class SageMakerModel(BaseModel):
      model_config = ConfigDict(
-         extra="forbid",
+         extra="ignore",
Collaborator commented:

ignore is the default so we can just delete that completely
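A small sketch of the point made in this comment, assuming Pydantic v2: a model with no `extra` setting behaves like one with `extra="ignore"`. Both classes are hypothetical and unrelated to the real SageMakerModel fields.

```python
from pydantic import BaseModel, ConfigDict


class ExplicitIgnore(BaseModel):
    # Explicit setting, as in the first version of the change.
    model_config = ConfigDict(extra="ignore")
    name: str


class DefaultConfig(BaseModel):
    # No extra setting at all: Pydantic falls back to ignoring unknown keys.
    name: str


data = {"name": "demo-model", "UnexpectedKey": 1}

explicit = ExplicitIgnore.model_validate(data)
default = DefaultConfig.model_validate(data)
```

One caveat: `model_config` is inherited, so removing the setting is only equivalent when no parent class sets `extra` to something else.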

gitar-bot bot commented Apr 8, 2026

Code Review ✅ Approved

Fixes S3 ingestion failure caused by Pydantic ValidationError. No issues found.

github-actions bot (Contributor) commented Apr 8, 2026

🟡 Playwright Results — all passed (25 flaky)

✅ 3593 passed · ❌ 0 failed · 🟡 25 flaky · ⏭️ 207 skipped

Shard | Passed | Failed | Flaky | Skipped
🟡 Shard 1 | 453 | 0 | 4 | 2
🟡 Shard 2 | 639 | 0 | 3 | 32
🟡 Shard 3 | 647 | 0 | 4 | 26
🟡 Shard 4 | 618 | 0 | 4 | 47
🟡 Shard 5 | 606 | 0 | 1 | 67
🟡 Shard 6 | 630 | 0 | 9 | 33
🟡 25 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Database Service entity item action after rules disabled (shard 1, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Stored Procedure - customization should work (shard 1, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Glossary - customization should work (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/BulkImport.spec.ts › Keyboard Delete selection (shard 2, 1 retry)
  • Features/DomainTierCertificationVoting.spec.ts › DataProduct - Certification assign, update, and remove (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaDeletionUserProfile.spec.ts › User profile loads correctly before and after persona deletion (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DescriptionVisibility.spec.ts › Customized Table detail page Description widget shows long description (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Glossary Term Add, Update and Remove (shard 4, 1 retry)
  • Pages/Entity.spec.ts › User should be denied access to edit description when deny policy rule is applied on an entity (shard 4, 1 retry)
  • Pages/ExploreTree.spec.ts › Verify Database and Database Schema available in explore tree (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Login.spec.ts › Refresh should work (shard 6, 2 retries)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/ProfilerConfigurationPage.spec.ts › Non admin user (shard 6, 1 retry)
  • Pages/ServiceListing.spec.ts › should render the service listing page (shard 6, 1 retry)
  • Pages/Tag.spec.ts › Verify Owner Add Delete (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Should add, remove, and navigate to persona pages for Personas section (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@harshach harshach added the To release label Apr 9, 2026
@harshach harshach merged commit 2089b6a into main Apr 9, 2026
57 checks passed
@harshach harshach deleted the S3_list_apis branch April 9, 2026 00:36
github-actions bot (Contributor) commented Apr 9, 2026

Failed to cherry-pick changes to the 1.12.5 branch.
Please cherry-pick the changes manually.
You can find more details here.

sonarqubecloud bot commented Apr 9, 2026

harshach added a commit that referenced this pull request Apr 9, 2026
#27180)

* Fix #27079: [BUG] S3 Ingestion Failure: Pydantic ValidationError 'extra_forbidden' for BucketArn in S3BucketResponse

* Address comments
SaaiAravindhRaja pushed a commit to SaaiAravindhRaja/OpenMetadata that referenced this pull request Apr 12, 2026
…ionError 'ext… (open-metadata#27180)

* Fix open-metadata#27079: [BUG] S3 Ingestion Failure: Pydantic ValidationError 'extra_forbidden' for BucketArn in S3BucketResponse

* Address comments

Labels

backend, safe to test, To release

3 participants