
Add delete syntax #13674

Closed · wants to merge 3 commits

Conversation

@adarshsanjeev (Contributor) commented Jan 16, 2023:

This is currently a draft PR.

This PR aims to improve the syntax for commonly used INSERT and REPLACE queries by adding new syntax that is more in line with the intention of the user.

Currently, when there is a need to delete all rows which match a certain condition, the user needs to run:

REPLACE INTO datasource
SELECT * FROM datasource
WHERE NOT <condition>
PARTITIONED BY <granularity>
[CLUSTERED BY <columns>]

A much more intuitive format would be:

DELETE FROM datasource
WHERE <condition>
PARTITIONED BY <granularity>
[CLUSTERED BY <columns>]

There are no new capabilities being added here; this is simply better syntax for existing queries.
Therefore, it is important to note that this is not a 'true' DELETE as in standard SQL. DELETE would still trigger a reingest of the datasource, filtering out any rows that match the condition. There is some scope for improvement in the future depending on the condition (such as directly dropping a segment if the condition aligns with the partition boundaries).

The partitioning and clustering are still required because DELETE still performs a reingest of the table behind the scenes, but this requirement might go away once the catalog PR is merged.
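
As a concrete illustration (the datasource name and filter column here are hypothetical, and clause details such as OVERWRITE and CLUSTERED BY are elided), deleting all rows with a null country would change from

REPLACE INTO wikipedia
SELECT * FROM wikipedia
WHERE NOT (countryName IS NULL)
PARTITIONED BY DAY

to

DELETE FROM wikipedia
WHERE countryName IS NULL
PARTITIONED BY DAY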


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@paul-rogers (Contributor) commented:

This is a change to SQL syntax. SQL syntax changes typically involve much discussion, as we want to make the syntax as SQL-compatible as possible, but we usually need some Druid-specific bits. While this is a draft, please add to the description the specific syntax being proposed.

@paul-rogers (Contributor) left a review comment:

Before this PR is finalized, please add documentation to the reference.md page in the MSQ docs. That doc helps the user, but it also helps us reviewers understand the intent of this PR.

At present, there is no description of the intent of this PR: I had to reverse engineer the meaning from the code. That wasn't too hard in this case (though I can't quite figure out why we need partitioning or clustering to do a delete...) Still, it will help other reviewers when the time comes.

If I guessed correctly, the goal is to do a REPLACE but with no data? We delete data as we do in REPLACE but we replace it with nothing?

In that case, there should be no SELECT in the DELETE statement. That means there is nothing for Calcite to validate. So, we'd have to validate the DELETE statement itself.

Recall that, in SQL, DELETE is normally:

DELETE FROM table_name WHERE condition

That should be sufficient, assuming the WHERE condition is a time range. The user doesn't need to tell us the partitioning (which doesn't really matter) nor the clustering (we'll delete all secondary partitions in the time range.)
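
For illustration (the datasource name and time bounds here are hypothetical), such a time-range delete might look like:

DELETE FROM wikipedia
WHERE __time >= TIMESTAMP '2023-01-01 00:00:00' AND __time < TIMESTAMP '2023-01-08 00:00:00'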

This is why we need a bit of discussion about syntax before we get too deep into implementation. Thanks!

On the following parser grammar lines (PARTITIONED BY clause):

    ]
    [
        <PARTITIONED> <BY>
        partitionedBy = PartitionGranularity()
Contributor:

Note that this syntax is changing a bit in this PR: #13686

Contributor (author):

Thanks for the heads up. I'm looking forward to the catalog changes; I think they would make the delete syntax (and all the others, actually) a lot more "SQL-like", since we could drop PARTITIONED BY as a necessary part of the query.

On the following parser grammar lines (CLUSTERED BY clause):

    ]
    [
        <CLUSTERED> <BY>
        clusteredBy = ClusterItems()
Contributor:

Do we need clustering for delete? What does it mean to cluster deletions?

Contributor (author):

Since, in the backend, DELETE triggers a reingest of the datasource, the above is the clustering for that reingest. I think this would be improved with the catalog.

Contributor:

If our code that does delete needs partitioning, then that code is, I would assert, broken and needs to be fixed. Let's not add SQL syntax that exposes partitioning which isn't actually needed.

Let's imagine some cases. I create segments with 1-hour partitioning. Then, I create more segments, in the same time period, with 1 day partitioning. Now, I realize I messed up and want to delete this entire group of segments. Quick: which partitioning do I use? Hour? Or Day? Or, do I need different delete tasks, one for the hour segments, one for Day? Is this the user experience we want?

Often, just writing down the proposed syntax, and thinking through the user experience, can save us lots of back-and-forth at the level of individual lines of code.

So, I propose we either fix the need for partitioning in the reingest, or we give it a fake partitioning. Remember: there is no new data to partition.
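
One possible shape of the "fake partitioning" idea (purely illustrative, with a hypothetical datasource name): since no new rows are produced, a statement such as

DELETE FROM wikipedia
WHERE countryName IS NULL

could be treated internally as if the equivalent REPLACE had been written with PARTITIONED BY ALL, so the user never has to supply a granularity.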

@paul-rogers added the Design Review and Area - MSQ (for multi stage queries - https://github.com/apache/druid/issues/12262) labels on Jan 18, 2023
@paul-rogers (Contributor) left a review comment:

Thanks again for adding DELETE functionality.

It is clever to try to reuse REPLACE to do a delete. However, REPLACE needs a source of data, while DELETE does not; REPLACE needs partitioning and clustering, whereas DELETE does not.

I suspect we'll need to do a bit of work on the MSQ side to explicitly support a DELETE operation. In this operation, there will be no native query (because we're not loading data). That is, this PR needs to do more than just "add delete syntax"; it probably also has to "add delete support to MSQ."

Suggestion: sketch out a design in either an Apache issue, or in a comment in this PR. Then, we can work out the overall approach before we dive into discussing specific lines of code.


@paul-rogers (Contributor) commented:

Yet another suggestion. We have the WHERE clause: WHERE __time >= ? AND __time < ?, with two parameters we'll call start and end. This can actually help us infer the time partitioning. The time interval has to be a multiple of one of our supported time partitions, so perhaps we can do a bit of math to work out the partitioning if we need it, or at least a partitioning. For example, if end - start is 3 hours, infer that partitioning is 1 hour. I suspect that, even if the stored partitioning is DAY and the delete says it is HOUR, it will still work. The result is that the user doesn't have to tell Druid what time partitioning Druid itself used to store the segments.

For clustering, it is probably OK to omit clustering and just "cluster" all deleted segments into one big (empty) cluster.
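
A sketch of the idea (the datasource name and timestamps here are hypothetical): given

DELETE FROM wikipedia
WHERE __time >= TIMESTAMP '2023-01-01 00:00:00' AND __time < TIMESTAMP '2023-01-01 03:00:00'

the three-hour interval is an exact multiple of HOUR, so the engine could behave as if PARTITIONED BY HOUR had been written, without the user having to supply it.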

github-actions bot commented Feb 1, 2024:

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

The github-actions bot added the stale label on Feb 1, 2024
github-actions bot commented Mar 1, 2024:

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

The github-actions bot closed this pull request on Mar 1, 2024
Labels: Area - Documentation, Area - MSQ (for multi stage queries - https://github.com/apache/druid/issues/12262), Design Review, stale