[SS-52 | SS-53 | SS-51] COPY FROM s3 testing #35313

Merged
patrickwwbutler merged 10 commits into MaterializeInc:main from patrickwwbutler:patrick/copy-from-s3-qa
Mar 13, 2026

patrickwwbutler commented Mar 3, 2026

Updates the platform-check, parallel-workload, and testdrive COPY TO S3 tests so that they now round-trip through both COPY TO S3 and COPY FROM S3.

Adds a new zippy test for the COPY TO S3 / COPY FROM S3 round trip.
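
As an illustration of what a round-trip check verifies, here is a minimal Python sketch. It does not use Materialize or S3: a dict stands in for the bucket, and `copy_to`, `copy_from`, and `roundtrip_ok` are hypothetical helpers that only mimic exporting rows as CSV and reading them back. The actual tests in this PR run COPY statements against a real object store and may compare results differently.

```python
import csv
import io

# Hypothetical in-memory stand-in for an S3 bucket: object key -> CSV text.
bucket: dict[str, str] = {}


def copy_to(rows: list[tuple], key: str) -> None:
    """Mimic an export: serialize rows as CSV under the given key."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    bucket[key] = buf.getvalue()


def copy_from(key: str) -> list[tuple]:
    """Mimic an import: parse the CSV object back into rows of strings."""
    return [tuple(row) for row in csv.reader(io.StringIO(bucket[key]))]


def roundtrip_ok(rows: list[tuple], key: str) -> bool:
    """Export, re-import, and compare ignoring row order."""
    copy_to(rows, key)
    # CSV carries no types, so compare against the stringified source rows.
    expected = sorted(tuple(str(v) for v in row) for row in rows)
    return sorted(copy_from(key)) == expected


source_rows = [(1, "a"), (2, "b"), (3, "c,with comma")]
print(roundtrip_ok(source_rows, "zippy/example"))  # True
```

The comparison sorts both sides because an export followed by an import gives no ordering guarantee in this sketch; only the set of rows, with duplicates, is checked.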

Motivation

https://linear.app/materializeinc/project/copy-from-s3-da125e5e04e1/issues

github-actions bot commented Mar 3, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

patrickwwbutler changed the title from "Patrick/copy from s3 qa" to "[SS-52 | SS-53 | SS-51] Basic COPY FROM s3 testing" Mar 4, 2026
patrickwwbutler requested review from a team and def- March 4, 2026 15:58
patrickwwbutler marked this pull request as ready for review March 4, 2026 15:58
patrickwwbutler requested a review from a team as a code owner March 4, 2026 15:58

def- left a comment

Not necessarily as part of this PR, but it would be valuable to have it in data-ingest, bounded-memory and feature-benchmark too. Mostly to make sure large amounts of data work, memory usage stays low and performance stays good (and is good already) respectively.

patrickwwbutler requested a review from def- March 9, 2026 18:36

patrickwwbutler commented Mar 9, 2026

> Not necessarily as part of this PR, but it would be valuable to have it in data-ingest, bounded-memory and feature-benchmark too. Mostly to make sure large amounts of data work, memory usage stays low and performance stays good (and is good already) respectively.

Added feature-benchmark and bounded-memory. I'll leave data-ingest for another PR, since it seems to require a bit more surrounding infrastructure.

patrickwwbutler changed the title from "[SS-52 | SS-53 | SS-51] Basic COPY FROM s3 testing" to "[SS-52 | SS-53 | SS-51] COPY FROM s3 testing" Mar 9, 2026
patrickwwbutler force-pushed the patrick/copy-from-s3-qa branch from ea10277 to 5ac021c March 9, 2026 20:34

def- left a comment

Fresh nightly triggered: https://buildkite.com/materialize/nightly/builds/15543
Edit: Parallel Workload is red because there is no COPY TO S3. Your code should also account for it being empty, but it probably has to be disabled until we can figure out that other bug: Cannot choose from an empty sequence
Feature benchmark also failed:

> COPY copy_from_s3_src TO 's3://copytos3/benchmark/copy_from_s3/18265b9b-de1c-410a-bf4b-b1247b50b6d7'
17:1: error: executing query failed: db error: ERROR: dispatch failure: other: Custom endpoint `minio:9000` was not a valid URI: URL scheme must be HTTP or HTTPS (found minio)
     |
   2 | > CREATE SECRET copy ... [rest of line truncated for security]
   6 |     SECRET ACCESS KE ... [rest of line truncated for security]
  16 |
  17 | > COPY copy_from_s3_src TO 's3://copytos3/benchmark/copy_from_s3/18265b9b-de1c-410a-bf4b-b1247b50b6d7'
     | ^
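
The error above says the custom endpoint was given as `minio:9000`, with no `http://` or `https://` scheme, so `minio` was read as the scheme. A minimal sketch of a guard a test harness could apply before building the connection follows. `normalize_endpoint` is a hypothetical helper, not part of the Materialize test framework, and defaulting to plain HTTP is an assumption that suits a local MinIO container.

```python
def normalize_endpoint(endpoint: str, default_scheme: str = "http") -> str:
    """Return the endpoint with an explicit http(s) scheme.

    Hypothetical helper: a bare host:port gets default_scheme prepended,
    an endpoint that already has an http(s) scheme is returned unchanged,
    and any other explicit scheme is rejected.
    """
    lowered = endpoint.lower()
    if lowered.startswith(("http://", "https://")):
        return endpoint
    if "://" in endpoint:
        raise ValueError(f"unsupported scheme in endpoint: {endpoint!r}")
    return f"{default_scheme}://{endpoint}"


print(normalize_endpoint("minio:9000"))         # http://minio:9000
print(normalize_endpoint("http://minio:9000"))  # http://minio:9000
```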

Zippy looks like either a test bug or correctness issue:

source: /var/lib/buildkite-agent/builds/hetzner-x86-64-8cpu-16gb-aa260bc3/materialize/nightly/misc/python/materialize/zippy/copy_s3_actions.py:76
6:1: error: non-matching rows: expected:
[["0", "0", "1", "1"]]
got:
[["7001", "7001", "1", "1"]]
Poor diff:
- 0 0 1 1
+ 7001 7001 1 1
     |
   2 | > CREATE SECRET zipp ... [rest of line truncated for security]
   3 | > CREATE CONNECTION  ... [rest of line truncated for security]
   5 | > COPY INTO zippy_s3_staging_24 FROM 's3://copytos3/zippy/20' (FORMAT CSV, AWS CONNECTION = zippy_aws_conn_24)
   6 | > SELECT MIN(f1), MAX(f1), COUNT(f1), COUNT(DISTINCT f1) FROM zippy_s3_staging_24
     | ^

I'd suggest running and iterating on tests locally first for faster turnaround times.
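
The `Cannot choose from an empty sequence` message quoted above matches the `IndexError` that Python's `random.choice` raises when given an empty sequence. A minimal sketch of a guard, assuming the workload picks a previously exported S3 path at random, could look like this. `pick_exported_path` and `exported_paths` are hypothetical names, not taken from the Parallel Workload code, and the example path is made up.

```python
import random
from typing import Optional, Sequence


def pick_exported_path(
    rng: random.Random, exported_paths: Sequence[str]
) -> Optional[str]:
    """Pick a random earlier export, or None when nothing was exported yet."""
    if not exported_paths:
        # rng.choice would raise IndexError here; let the caller skip instead.
        return None
    return rng.choice(exported_paths)


rng = random.Random(0)
print(pick_exported_path(rng, []))  # None
print(pick_exported_path(rng, ["s3://copytos3/example/1"]))  # s3://copytos3/example/1
```

A caller that receives `None` can skip the COPY FROM action for that iteration rather than failing the whole run.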

patrickwwbutler force-pushed the patrick/copy-from-s3-qa branch from 5ac021c to a867917 March 11, 2026 17:44

patrickwwbutler commented

Triggered a nightly with all modules/tests here: https://buildkite.com/materialize/nightly/builds/15556/steps/canvas

patrickwwbutler commented

@def- It seems like all the nightly pipeline failures are unrelated to these changes, with almost all of them being due to this unrecognized parameter enable_replica_targeted_materialized_views. Not sure if this is expected, or perhaps just some configuration issue with my PR?

def- force-pushed the patrick/copy-from-s3-qa branch from a0ef4c6 to 01cd8bb March 12, 2026 14:30

def- commented Mar 12, 2026

True, rebased on main, should be fine now: https://buildkite.com/materialize/nightly/builds/15572
Edit: New try: https://buildkite.com/materialize/nightly/builds/15574


def- left a comment

Test failures look unrelated, thanks!

patrickwwbutler merged commit 3288bbe into MaterializeInc:main Mar 13, 2026
321 of 324 checks passed

2 participants