Conversation
| ).fetchall() | ||
| query = REDSHIFT_PARTITION_DETAILS | ||
| if schema_name: | ||
| query += f" AND \"schema\" = '{schema_name}'" |
There was a problem hiding this comment.
⚠️ Security: SQL injection via string formatting in get_partition_details
The new schema_name parameter is concatenated directly into the SQL query via f-string interpolation:
query += f" AND \"schema\" = '{schema_name}'"While the schema_name value typically comes from Redshift's own metadata catalog (and is thus unlikely to contain malicious content in normal usage), this is still a SQL injection vector. A schema name containing a single quote (e.g., foo'; DROP TABLE ...--) would break the query or allow injection.
Note that the existing codebase has similar patterns in _get_schema_column_info and _get_all_relation_info, so this is consistent with the current style. However, for new code, it's better to use parameterized queries.
The method query_table_names_and_types in the same file already demonstrates the correct pattern using sql.text() with named parameters (line 236-246). The same approach should be used here.
Suggested fix: Use a parameterized query approach. Since the base query is a raw string constant, wrap the full query with sql.text() and use a bind parameter:
query = REDSHIFT_PARTITION_DETAILS
if schema_name:
query += " AND \"schema\" = :schema_name"
results = self.connection.execute(
statement=sql.text(query),
{"schema_name": schema_name}
).fetchall()Suggested fix:
query = REDSHIFT_PARTITION_DETAILS
if schema_name:
query += " AND \"schema\" = :schema_name"
results = self.connection.execute(
statement=sql.text(query),
{"schema_name": schema_name} if schema_name else {},
).fetchall()
Was this helpful? React with 👍 / 👎
🛡️ TRIVY SCAN RESULT 🛡️ Target:
|
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
libpam-modules |
CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
libpam-modules-bin |
CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
libpam-runtime |
CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
libpam0g |
CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Java
Vulnerabilities (38)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
com.fasterxml.jackson.core:jackson-core |
CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-core |
CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.16.1 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-databind |
CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
com.fasterxml.jackson.core:jackson-databind |
CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
com.google.code.gson:gson |
CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
com.google.protobuf:protobuf-java |
CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
com.google.protobuf:protobuf-java |
CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
com.google.protobuf:protobuf-java |
CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
com.google.protobuf:protobuf-java |
CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
com.nimbusds:nimbus-jose-jwt |
CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
com.squareup.okhttp3:okhttp |
CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
commons-beanutils:commons-beanutils |
CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
commons-io:commons-io |
CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
dnsjava:dnsjava |
CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
io.airlift:aircompressor |
CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
io.netty:netty-codec-http2 |
CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
io.netty:netty-codec-http2 |
GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
io.netty:netty-handler |
CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
net.minidev:json-smart |
CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
net.minidev:json-smart |
CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
org.apache.avro:avro |
CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
org.apache.avro:avro |
CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
org.apache.derby:derby |
CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
org.apache.ivy:ivy |
CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
org.apache.mesos:mesos |
CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
org.apache.thrift:libthrift |
CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
org.apache.thrift:libthrift |
CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
org.apache.zookeeper:zookeeper |
CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
org.eclipse.jetty:jetty-server |
CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
org.lz4:lz4-java |
CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Node.js
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: Python
Vulnerabilities (21)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
Werkzeug |
CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
aiohttp |
CVE-2025-69223 | 🚨 HIGH | 3.12.12 | 3.13.3 |
aiohttp |
CVE-2025-69223 | 🚨 HIGH | 3.13.2 | 3.13.3 |
apache-airflow |
CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
apache-airflow |
CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
azure-core |
CVE-2026-21226 | 🚨 HIGH | 1.37.0 | 1.38.0 |
cryptography |
CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
google-cloud-aiplatform |
CVE-2026-2472 | 🚨 HIGH | 1.130.0 | 1.131.0 |
google-cloud-aiplatform |
CVE-2026-2473 | 🚨 HIGH | 1.130.0 | 1.133.0 |
jaraco.context |
CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
jaraco.context |
CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
protobuf |
CVE-2026-0994 | 🚨 HIGH | 4.25.8 | 6.33.5, 5.29.6 |
pyasn1 |
CVE-2026-23490 | 🚨 HIGH | 0.6.1 | 0.6.2 |
python-multipart |
CVE-2026-24486 | 🚨 HIGH | 0.0.20 | 0.0.22 |
ray |
CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
starlette |
CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
urllib3 |
CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
urllib3 |
CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
urllib3 |
CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
wheel |
CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
wheel |
CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: usr/bin/docker
Vulnerabilities (4)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
stdlib |
CVE-2025-68121 | 🔥 CRITICAL | v1.25.5 | 1.24.13, 1.25.7, 1.26.0-rc.3 |
stdlib |
CVE-2025-61726 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
stdlib |
CVE-2025-61728 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
stdlib |
CVE-2025-61730 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: /etc/ssl/private/ssl-cert-snakeoil.key
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️ Target:
|
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
com.fasterxml.jackson.core:jackson-core |
CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-core |
CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-core |
GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
com.fasterxml.jackson.core:jackson-databind |
CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
com.fasterxml.jackson.core:jackson-databind |
CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
com.google.code.gson:gson |
CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
com.google.protobuf:protobuf-java |
CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
com.google.protobuf:protobuf-java |
CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
com.google.protobuf:protobuf-java |
CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
com.google.protobuf:protobuf-java |
CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
com.google.protobuf:protobuf-java |
CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
com.nimbusds:nimbus-jose-jwt |
CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
com.squareup.okhttp3:okhttp |
CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
commons-beanutils:commons-beanutils |
CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
commons-io:commons-io |
CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
dnsjava:dnsjava |
CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
io.airlift:aircompressor |
CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
io.netty:netty-codec-http2 |
CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
io.netty:netty-codec-http2 |
GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
io.netty:netty-handler |
CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
net.minidev:json-smart |
CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
net.minidev:json-smart |
CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
org.apache.avro:avro |
CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
org.apache.avro:avro |
CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
org.apache.derby:derby |
CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
org.apache.ivy:ivy |
CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
org.apache.mesos:mesos |
CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
org.apache.thrift:libthrift |
CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
org.apache.thrift:libthrift |
CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
org.apache.zookeeper:zookeeper |
CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
org.eclipse.jetty:jetty-server |
CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
org.lz4:lz4-java |
CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Node.js
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: Python
Vulnerabilities (9)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
apache-airflow |
CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
apache-airflow |
CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
cryptography |
CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
jaraco.context |
CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
starlette |
CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
urllib3 |
CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
urllib3 |
CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
urllib3 |
CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
wheel |
CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: /etc/ssl/private/ssl-cert-snakeoil.key
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/extended_sample_data.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/lineage.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data.json
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data_aut.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage.json
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage_aut.yaml
No Vulnerabilities Found
| # Single-schema cache: invalidate when schema changes | ||
| cached = getattr(self, "_relation_info_cache", None) | ||
| if cached is not None and cached[0] == schema: | ||
| return cached[1] |
There was a problem hiding this comment.
⚠️ Bug: Cache key ignores table_name, may return stale results
The single-schema cache in _get_all_relation_info keys only on schema, but the function also filters by table_name (lines 400-403). If the function is first called with (schema="public", table_name="users"), the cached result contains only the "users" relation. A subsequent call with (schema="public", table_name="orders") would hit the cache and return the stale result (only "users"), missing "orders".
The previous @reflection.cache decorator keyed on all arguments (including table_name), so this is a behavioral regression.
The fix should include table_name in the cache key, or (better) only cache when table_name is None (the common bulk-fetch path), since caching single-table lookups provides little benefit.
Suggested fix:
# Single-schema cache: invalidate when schema or table_name changes
table_name = kw.get("table_name", None)
cached = getattr(self, "_relation_info_cache", None)
if cached is not None and cached[0] == schema and cached[1] == table_name:
return cached[2]
Was this helpful? React with 👍 / 👎
|
Hey guys! |
| def _clear_reflection_cache(self) -> None: | ||
| """Clear the SQLAlchemy inspector's info_cache to release | ||
| cached column / relation data from prior schemas. | ||
|
|
||
| This prevents unbounded memory growth when ingesting many | ||
| schemas, since _get_schema_column_info, get_columns, and | ||
| _get_all_relation_info all use @reflection.cache. | ||
| """ |
There was a problem hiding this comment.
💡 Quality: Stale docstring references removed @reflection.cache
The docstring for _clear_reflection_cache() states:
"since _get_schema_column_info, get_columns, and _get_all_relation_info all use @reflection.cache"
However, this PR specifically removes @reflection.cache from all three of those functions. The docstring should be updated to reflect that the info_cache is being cleared for the remaining @reflection.cache-decorated functions (e.g., get_table_comment, get_view_definition, _get_pg_column_info) and to clean up any residual cached data.
Suggested fix:
def _clear_reflection_cache(self) -> None:
"""Clear the SQLAlchemy inspector's info_cache to release
cached column / relation data from prior schemas.
This prevents unbounded memory growth when ingesting many
schemas by clearing residual cached data from remaining
@reflection.cache-decorated functions (e.g., get_table_comment).
"""
Was this helpful? React with 👍 / 👎
🔍 CI failure analysis for 51b740d: SonarCloud Quality Gate failed due to insufficient code coverage (6.1%) on new code, below the required ≥20% threshold. Unit test and Playwright E2E failures from previous runs are unrelated to this PR's Redshift changes.IssueThe Root CauseCode coverage on new code is 6.1%, below the required threshold of ≥20%. The PR introduces new methods and logic in the Redshift connector without corresponding test coverage:
DetailsTo pass the Quality Gate, unit tests should be added covering:
Previously observed unrelated failures (may or may not still be present):
Code Review
|
| Auto-apply | Compact |
|
|
Was this helpful? React with 👍 / 👎 | Gitar
|


Describe your changes:
Fixes #20649
This PR optimizes memory usage in the Redshift connector during metadata ingestion by preventing unbounded memory growth when processing multiple schemas. The changes replace global SQLAlchemy reflection caches with per-schema caching and move partition detail loading from database-level to schema-level scope.
Summary by Gitar
@reflection.cachedecorators fromget_columns,_get_schema_column_info, and_get_all_relation_infoto prevent cross-schema cache accumulation_clear_reflection_cache()method to clear SQLAlchemy inspector cache before processing each schema_get_schema_column_infoand_get_all_relation_infothat auto-invalidates when schema changesget_partition_details()to accept optionalschema_nameparameter for targeted queriesquery_table_names_and_types()result.close()calls to release database cursorsThis will update automatically on new commits.
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>