[SS-44] Add COPY FROM s3 Docs #35301

Merged
maheshwarip merged 7 commits into MaterializeInc:main from patrickwwbutler:patrick/copy-from-docs on Mar 6, 2026

@patrickwwbutler
Contributor

Updates the documentation for the `COPY FROM` mzsql command to include information and syntax for the new `COPY FROM s3` feature.

Motivation

https://linear.app/materializeinc/issue/SS-44/write-user-facing-docs-for-copy-from-s3-statement-csv

Description

Adds a new syntax file for `COPY FROM` s3/URL, adds a tab to the SQL command reference page, and adds information on how to use it.

@patrickwwbutler patrickwwbutler requested a review from a team March 2, 2026 15:27
@patrickwwbutler patrickwwbutler requested a review from a team as a code owner March 2, 2026 15:27
@github-actions
Contributor

github-actions bot commented Mar 2, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@patrickwwbutler patrickwwbutler requested a review from kay-kim March 2, 2026 15:27
@patrickwwbutler patrickwwbutler changed the title [SS-44] COPY FROM s3 Docs [SS-44] Add COPY FROM s3 Docs Mar 2, 2026
Contributor

@martykulma martykulma left a comment

Nice! The doc should also enumerate the S3 bucket/object ACLs needed for MZ to read the data.

@maheshwarip
Contributor

@kay-kim would you be able to review these today? aiming to launch to customers tomorrow. if you're stretched thin, let me know and I can review

@kay-kim
Contributor

kay-kim commented Mar 5, 2026

I'll take a look when I get into the office. Wanted to print out the draft freshness guide, so needed home printer.

Read* | [`s3:GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) | Grants permission to retrieve an object from a bucket.

{{< note >}}
*Read - The `s3:GetObject` Read action is only required if you wish to perform bulk imports into Materialize using [`COPY FROM s3`](/sql/copy-from/).
Contributor

Q: This is the sink-to-S3 page (using COPY TO s3) ... Why are we including COPY FROM s3 here?

Contributor Author

I figured it wouldn't hurt to add it, since this page is the main source of info on how to set up AWS connections. I added the asterisk to note that it's only required if you want to use COPY FROM.

Contributor

Heh ... it'll be confusing for users who are sinking to S3 ... because it'll make people go "is this me? Will this tutorial make me do a COPY FROM step later?"

It might be that we instead need an ingest data > Bulk copy page. Will let @maheshwarip decide (doesn't actually need to be a blocker but something on the backlog). Yeah ... our connections page is shenanigans.

Contributor

After discussing: we can do the tutorial when we do Parquet, and for now we'll go with just the reference page. As such, I would just revert any changes to this tutorial.

@patrickwwbutler patrickwwbutler requested a review from kay-kim March 5, 2026 19:56

| Read | [`s3:GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) | Grants permission to retrieve an object from a bucket. |
| List | [`s3:ListBucket`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) | Grants permission to list some or all of the objects in a bucket. |

As we are not writing to the bucket, we do not need any write permissions, only read and list.
Contributor

We don't need this sentence.

### S3 Bucket IAM Policies

To prepare your S3 bucket for bulk import, follow the instructions in the [Amazon S3 Sink guide](/serve-results/sink/s3),
but, in your IAM policy, instead allow these actions:
Contributor

Yeah ... I would just specify that you need to allow these in your IAM policy. Depending on product's answer w.r.t. a tutorial, we can point people to that whenever that's done.

Contributor

Since no tutorial for now, just say you need to allow the following in your IAM policy.
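
For readers following this thread, the two actions under discussion (`s3:GetObject` and `s3:ListBucket`) could be expressed in an IAM policy roughly as sketched below. This is a minimal illustration and not text from the PR: the helper name `read_only_bucket_policy` and the bucket name `example-import-bucket` are hypothetical. It relies on the standard IAM behavior that `s3:ListBucket` is granted on the bucket ARN while `s3:GetObject` is granted on object ARNs.

```python
import json


def read_only_bucket_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document that grants only list and read access.

    `bucket` and `prefix` are placeholders chosen for this sketch.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    # Object-level actions apply to keys inside the bucket, hence the "/*".
    object_arn = f"{bucket_arn}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # List: enumerate the objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Read: retrieve individual objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [object_arn],
            },
        ],
    }


# Hypothetical bucket name, for illustration only.
policy = read_only_bucket_policy("example-import-bucket")
print(json.dumps(policy, indent=2))
```

No write actions (such as `s3:PutObject`) appear in the sketch, which matches the read-and-list-only requirement described in the hunk above.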

@maheshwarip
Contributor

> Heh ... it'll be confusing for users who are sinking to S3 ... because it'll make people go "is this me? Will this tutorial make me do a COPY FROM step later?" It might be that we instead need an ingest data > Bulk copy page. Will let @maheshwarip decide (doesn't actually need to be a blocker but something on the backlog). Yeah ... our connections page is shenanigans.

I can't reply to the parent comment for some reason. It is odd, but let's not block this PR. We can add the bulk copy page later.

@maheshwarip
Contributor

@kay-kim ready for review!

@maheshwarip maheshwarip merged commit d8e89b3 into MaterializeInc:main Mar 6, 2026
10 checks passed
antiguru pushed a commit to antiguru/materialize that referenced this pull request Mar 26, 2026
Co-authored-by: Pranshu Maheshwari <pranshu.maheshwari@materialize.com>