Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature(v3): Write: partial writes write good lines and reject bad li… #5169

Merged
merged 10 commits into from
Dec 13, 2023

Conversation

jstirnaman
Copy link
Contributor

@jstirnaman jstirnaman commented Oct 10, 2023

…nes #5160

  • Documents the proposed partial writes feature for Serverless.
  • Updates Troubleshoot writes and API docs.

Closes #5160

Copy link
Contributor

@wiedld wiedld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for doing these updates!

Made some suggestions, mostly to:

  • Help delineate the products (serverless vs dedicated) different options -- because only dedicated will have this be configurable.
  • Provide two different examples of 400 errors. (Hopefully the user can then "type" the difference.) Altho I'm unclear how best to do this. 🤷🏼‍♀️

api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Show resolved Hide resolved
@jstirnaman
Copy link
Contributor Author

@wiedld Thanks for the helpful suggestions here!

Copy link
Contributor

@wiedld wiedld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving.
With the caveat of please wait until change is deployed before merging. 🙏🏼

Copy link
Collaborator

@sanderson sanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor suggestion. Looks great!

Copy link
Contributor

@wiedld wiedld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes requested based on updated error code. TY!

api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-dedicated/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Outdated Show resolved Hide resolved
api-docs/cloud-serverless/ref.yml Outdated Show resolved Hide resolved
#5160

- Documents the proposed partial writes feature for Serverless and Dedicated.
- If some points are written, returns 201 status code.
- Updates Troubleshoot writes and API docs.
#5160

- Documents the proposed partial writes feature for Serverless and Dedicated.
- If some points are written, returns 201 status code.
- Updates Troubleshoot writes and API docs.
Copy link
Member

@jacobmarble jacobmarble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 201 HTTP response code is not what v2 products return on partial errors. We return 201 because of an idiosyncrasy in an upstream Rust library.

Before we merge this, I'd like to see if we can correct this behavior to avoid (1) customer surprises and (2) more surprises when we eventually fix this.

Relevant comment: https://github.com/influxdata/influxdb_iox/issues/5226#issuecomment-1850759520

examples:
partialWriteErrorWithRejectedPoints:
summary: Partial write rejects points with syntax errors
value:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jayclifford345 201 (partial write) and 400 (batch failed) response body properties.

@jstirnaman jstirnaman marked this pull request as ready for review December 13, 2023 17:29
@jstirnaman
Copy link
Contributor Author

@wiedld @sanderson Please give this Serverless docs PR a look. I moved Dedicated changes to a separate branch.

api-docs/cloud-serverless/ref.yml Show resolved Hide resolved
Copy link
Collaborator

@sanderson sanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment about writing a tag and field with the same key, but nothing that should hold this PR up. Everything else looks great.

- an invalid timestamp
- a schema conflict with existing tags or fields in the database

A schema conflict can occur when points that fall within the same partition (default partitioning is measurement and day) as existing bucket data have a different data type for an existing field.
Copy link
Collaborator

@sanderson sanderson Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Points with a tag and field that use the same key will also be rejected. I think these would be considered a "schema conflict," but I don't know if they'll get a line protocol parsing error or something else. Maybe something to note here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Points with a tag and field that use the same key will also be rejected. I think these would be considered a "schema conflict," but I don't know if they'll get a line protocol parsing error or something else. Maybe something to note here.

Good catch! @wiedld Do you know when duplicate keys get detected? If not, I'll test post-change and update docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be a schema conflict per line, and therefore the same rules apply (if in partial versus non-partial writes).

Copy link
Member

@jacobmarble jacobmarble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't merge this.

We shouldn't publicly document that partial write errors return 201.

  • This is inconsistent with our V2 API.
  • We will fix this behavior so that 204 is returned instead.
  • Documenting will only confuse customers going forward.

@jacobmarble
Copy link
Member

Here's the 201/204 fix for Serverless:
https://github.com/influxdata/idpe/pull/18398

@wiedld
Copy link
Contributor

wiedld commented Dec 13, 2023

Here's the 201/204 fix for Serverless: influxdata/idpe#18398

We are adding a change for Serverless, to make the 201 => 204.
The dedicated cluster will still have the 204 & 204.

@jacobmarble
Copy link
Member

The dedicated cluster will still have the 204 & 204.

I don't understand this. Typo?

@jstirnaman
Copy link
Contributor Author

jstirnaman commented Dec 13, 2023

Here's the 201/204 fix for Serverless: influxdata/idpe#18398

@wiedld @jacobmarble To be clear, a partial write response for Serverless will have status 204 No Content, but will have a response body?

@@ -14601,8 +14680,9 @@ tags:
|  Code  | Status | Description |
|:-----------:|:------------------------ |:--------------------- |
| `200` | Success | |
| `204` | Success. No content | InfluxDB doesn't return data for the request. For example, a successful write request returns `204` status code, acknowledging that data is written and queryable. |
| `400` | Bad request | InfluxDB can't parse the request due to an incorrect parameter or bad syntax. For _writes_, the error may indicate one of the following problems: <ul><li>Line protocol is malformed. The response body contains the first malformed line in the data and indicates what was expected. For partial writes, the number of points written and the number of points rejected are also included.</li><li>`Authorization` header is missing or malformed or the API token doesn't have permission for the operation.</li></ul> |
| `201` | Created | Successfully created a resource. The response body may contain details. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With Jacob's change, the 201 will not exist in serverless (but will still exist in dedicated).

| `400 "Bad request"` | `message` contains the first malformed line | If request data is malformed |
| HTTP response code | Response body | Description |
| :-------------------------------| :--------------------------------------------------------------- | :------------- |
| `201 "Created"` | error details about rejected points, up to 100 points: `line` contains the first rejected line, `message` describes rejected points | If InfluxDB ingested some or all of the data |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

204

@jstirnaman jstirnaman merged commit 90948e6 into master Dec 13, 2023
2 checks passed
@jstirnaman jstirnaman deleted the feature-5160-partial-writes branch December 13, 2023 22:20
jstirnaman added a commit that referenced this pull request Dec 15, 2023
jstirnaman added a commit that referenced this pull request Dec 15, 2023
jstirnaman added a commit that referenced this pull request Jan 17, 2024
jstirnaman added a commit that referenced this pull request Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write: partial writes write good lines and reject bad lines
4 participants