Support for "ceiling" filtering/partitioning #604

Open
rulle-io opened this issue Mar 30, 2023 · 3 comments

Comments

@rulle-io
Contributor

We dump a DB table daily and would like each dump to include rows dated on or before the partition date (but not later), so that the dumps are reproducible and deterministic.

So, is it possible to achieve a configuration like the one below?
Parameters: "--table=some_table --partition=2027-07-31 --partitionColumn=col" + some other option (?)
=>
SQL: "SELECT * FROM some_table WHERE 1=1 AND col < '2027-08-01'"

P.S. It is probably possible to achieve this with a user-provided SQL file plus some parsing of the partition value,
but it would be much simpler to use dedicated parameter(s).
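
To make the requested behaviour concrete, here is a minimal Python sketch of how the "ceiling" bound could be derived from the partition date. It is only an illustration of the desired SQL, not the tool's implementation or API; the function name `ceiling_query` and its parameters are hypothetical, and daily partitions are assumed.

```python
from datetime import date, timedelta


def ceiling_query(table: str, partition: str, partition_column: str) -> str:
    """Build a query that keeps rows dated on or before the partition day.

    Hypothetical helper for illustration only. Identifiers are interpolated
    directly, so they are assumed to come from trusted configuration.
    """
    # Exclusive upper bound: the day after the partition date (daily period assumed).
    upper_bound = date.fromisoformat(partition) + timedelta(days=1)
    # No lower bound is added, which is what makes this a "ceiling" filter.
    return (
        f"SELECT * FROM {table} WHERE 1=1 "
        f"AND {partition_column} < '{upper_bound.isoformat()}'"
    )


print(ceiling_query("some_table", "2027-07-31", "col"))
```

With the values from the example above, this prints the same statement as the desired SQL line.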

@rulle-io
Contributor Author

@labianchin ?

@labianchin
Collaborator

Yes. --partitionColumn=col is the way to do it. No need for user-provided SQL. There are several examples in internal use at Spotify. I can share some if needed.

@rulle-io
Contributor Author

rulle-io commented Apr 5, 2023

Yes, I actually tried all(?) the combinations of the parameters
["--partitionColumn", "--partition", "--minPartitionPeriod", "--partitionPeriod"] without success.

So please provide an example.
