Skip to content

MINOR: Address review comments from #27236#27255

Merged
ulixius9 merged 1 commit intomainfrom
fix/pr-27236-review-comments
Apr 14, 2026
Merged

MINOR: Address review comments from #27236#27255
ulixius9 merged 1 commit intomainfrom
fix/pr-27236-review-comments

Conversation

@ulixius9
Copy link
Copy Markdown
Member

@ulixius9 ulixius9 commented Apr 10, 2026

Summary

  • Extracted inline URL credential sanitization to a generic sanitize_url_credentials() utility in logger.py (alongside existing redacted_config)
  • Fixed misleading log prefix GitHubCloneReader::_clone_clone_repo since the function is shared across GitHub/GitLab/Bitbucket/Azure DevOps
  • Added BigQuery INFORMATION_SCHEMA edge case explanation to split_table_name comment in fqn.py
  • Added unit tests for sanitize_url_credentials

Test plan

  • Verify test_sanitize_url_credentials passes (PAT, oauth-basic, token-auth, and no-URL cases)
  • Verify existing test_clone_repo_error_does_not_leak_credentials and test_clone_repo_error_sanitizes_all_credential_formats still pass

🤖 Generated with Claude Code

- Extract URL credential sanitization to generic `sanitize_url_credentials` in logger utils
- Fix misleading log prefix `GitHubCloneReader::_clone` → `_clone_repo`
- Add BigQuery INFORMATION_SCHEMA context to `split_table_name` comment
- Add unit tests for `sanitize_url_credentials`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ulixius9 ulixius9 requested a review from a team as a code owner April 10, 2026 16:56
Copilot AI review requested due to automatic review settings April 10, 2026 16:56
@github-actions github-actions Bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Apr 10, 2026
@ulixius9 ulixius9 requested a review from pmbrull April 10, 2026 16:58
Comment thread ingestion/src/metadata/utils/logger.py
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Apr 10, 2026

Code Review 👍 Approved with suggestions 0 resolved / 1 findings

Addresses review comments with minor cleanup and improvements. Consider updating the URL credential sanitization regex to handle http:// schemes in addition to https://.

💡 Edge Case: sanitize_url_credentials does not handle http:// URLs

📄 ingestion/src/metadata/utils/logger.py:360

The regex r"https://[^@]+@" only matches https:// schemes. If a git remote or error message ever contains an http:// URL with embedded credentials (e.g., http://token@host/repo), the credentials will not be masked. This is unlikely for production git hosts but could surface in local/dev environments or custom git servers.

Suggested fix
return re.sub(r"https?://[^@]+@", lambda m: m.group(0)[:m.group(0).index('://') + 3] + '****@', message)
🤖 Prompt for agents
Code Review: Addresses review comments with minor cleanup and improvements. Consider updating the URL credential sanitization regex to handle http:// schemes in addition to https://.

1. 💡 Edge Case: sanitize_url_credentials does not handle http:// URLs
   Files: ingestion/src/metadata/utils/logger.py:360

   The regex `r"https://[^@]+@"` only matches `https://` schemes. If a git remote or error message ever contains an `http://` URL with embedded credentials (e.g., `http://token@host/repo`), the credentials will not be masked. This is unlikely for production git hosts but could surface in local/dev environments or custom git servers.

   Suggested fix:
   return re.sub(r"https?://[^@]+@", lambda m: m.group(0)[:m.group(0).index('://') + 3] + '****@', message)

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors URL credential redaction into a shared ingestion logging utility, updates clone-related error log labeling to match the shared _clone_repo helper, clarifies a BigQuery INFORMATION_SCHEMA edge case comment in FQN parsing, and adds unit tests for the new sanitizer.

Changes:

  • Added sanitize_url_credentials() in metadata.utils.logger and unit tests for common credential-in-URL formats.
  • Updated Looker repo-clone error handling to use the shared sanitizer and a more accurate log prefix.
  • Expanded split_table_name commentary to document BigQuery INFORMATION_SCHEMA multi-part table name behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
ingestion/tests/unit/utils/test_logger.py Adds unit tests validating URL credential sanitization behavior.
ingestion/src/metadata/utils/logger.py Introduces sanitize_url_credentials() utility alongside existing redaction helpers.
ingestion/src/metadata/utils/fqn.py Improves inline documentation for multi-part BigQuery table names.
ingestion/src/metadata/ingestion/source/dashboard/looker/utils.py Switches clone error sanitization to the shared utility and fixes the log prefix.

Comment on lines +359 to +360
"""Mask credentials embedded in URLs (e.g., https://token@host)"""
return re.sub(r"https://[^@]+@", "https://****@", message)
@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

🟡 Playwright Results — all passed (26 flaky)

✅ 3597 passed · ❌ 0 failed · 🟡 26 flaky · ⏭️ 207 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 456 0 1 2
🟡 Shard 2 638 0 5 32
🟡 Shard 3 644 0 4 26
🟡 Shard 4 623 0 4 47
🟡 Shard 5 607 0 2 67
🟡 Shard 6 629 0 10 33
🟡 26 flaky test(s) (passed on retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/BulkImport.spec.ts › Database (shard 2, 1 retry)
  • Features/ChangeSummaryBadge.spec.ts › Automated badge should appear on entity description with Automated source (shard 2, 1 retry)
  • Features/DataProductPersonaCustomization.spec.ts › Data Product - customization should work (shard 2, 1 retry)
  • Features/DataQuality/BundleSuiteBulkOperations.spec.ts › Add test case to existing Bundle Suite (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Flow/SchemaTable.spec.ts › schema table test (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Pipeline (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Chart (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Verify domain data products count includes subdomain data products (shard 4, 1 retry)
  • Pages/ExploreTree.spec.ts › Verify Database and Database Schema available in explore tree (shard 5, 1 retry)
  • Pages/Glossary.spec.ts › Add and Remove Assets (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Table (shard 6, 2 retries)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/ProfilerConfigurationPage.spec.ts › Non admin user (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Admin soft & hard delete and restore user (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@ulixius9 ulixius9 merged commit e873c9d into main Apr 14, 2026
56 checks passed
@ulixius9 ulixius9 deleted the fix/pr-27236-review-comments branch April 14, 2026 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants