depr(python): Rename write_database parameter if_exists to if_table_exists #12783

Merged


alexander-beedie
Collaborator

@alexander-beedie alexander-beedie commented Nov 29, 2023

Closes #12779.

Deprecates if_exists in favour of if_table_exists, clarifying what it is that exists.
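A rename like this is usually shipped with a shim that still accepts the old keyword and emits a warning. The sketch below is a minimal, stdlib-only illustration of that pattern. The `rename_parameter` decorator and the stand-in `write_database` function are hypothetical names used for illustration; this is not the actual Polars implementation.

```python
import functools
import warnings


def rename_parameter(old, new):
    """Hypothetical helper: accept keyword `old`, warn, and forward it as `new`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(f"pass either {old!r} or {new!r}, not both")
                warnings.warn(
                    f"`{old}` is deprecated; use `{new}` instead",
                    DeprecationWarning,
                    stacklevel=2,
                )
                # Forward the value under the new, unambiguous name.
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_parameter("if_exists", "if_table_exists")
def write_database(table_name, *, if_table_exists="fail"):
    # Stand-in for the real method (assumption): just report what was requested.
    return table_name, if_table_exists
```

With a shim of this shape, existing calls that pass `if_exists=...` keep working but warn, while new code uses `if_table_exists=...` directly.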

As per the linked Issue, it's not unreasonable that somebody may think they are replacing some existing data in a table. Given that the downside of getting this wrong is potentially deleting an entire table, it's good to remove the potential for ambiguity...
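To make the risk concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `write_table` helper is hypothetical: it only mimics the table-level semantics described above (assuming "replace" means dropping and recreating the whole table) and does not call Polars.

```python
import sqlite3


def write_table(conn, table, rows, if_table_exists="fail"):
    """Hypothetical writer mimicking table-level 'fail' / 'replace' / 'append'."""
    if if_table_exists not in {"fail", "replace", "append"}:
        raise ValueError(f"unknown option: {if_table_exists!r}")
    exists = (
        conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        is not None
    )
    if exists and if_table_exists == "fail":
        raise ValueError(f"table {table!r} already exists")
    if exists and if_table_exists == "replace":
        # The whole table goes, not just the rows that overlap the new data.
        conn.execute(f'DROP TABLE "{table}"')
        exists = False
    if not exists:
        conn.execute(f'CREATE TABLE "{table}" (id INTEGER, value TEXT)')
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
write_table(conn, "items", [(1, "a"), (2, "b"), (3, "c")])

# A caller hoping to "replace" only row 2 loses rows 1 and 3 as well.
write_table(conn, "items", [(2, "B")], if_table_exists="replace")
remaining = conn.execute("SELECT id, value FROM items").fetchall()
```

After the second call, `remaining` holds only the newly written row, which is the data-loss scenario the rename is meant to make harder to hit by accident.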


@stinodego: as a follow-up we should probably make the write_database "if_table_exists" and write_delta "mode" parameters consistent, given that they have the same options/behaviour, but use different param/value names 🤔 (Can be left to a subsequent PR).

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars labels Nov 29, 2023
@stinodego
Member

> @stinodego: as a follow-up we should probably make the write_database "if_table_exists" and write_delta "mode" parameters consistent, given that they have the same options/behaviour, but use different param/value names 🤔 (Can be left to a subsequent PR).

I'm not sure about that. mode is pretty well established in Spark/Delta Lake docs. I don't want to change that.
https://docs.delta.io/latest/delta-batch.html

@alexander-beedie
Collaborator Author

alexander-beedie commented Nov 30, 2023

> I'm not sure about that. mode is pretty well established in Spark/Delta Lake docs. I don't want to change that. https://docs.delta.io/latest/delta-batch.html

If it's well established already then that's fair, but it does seem prone to the same ambiguity that this PR addresses - and if you're going to write via polars then there is a reasonable argument that we should line these up on our side (as a consistent API is one of our selling points ;))

Parameter correspondence:

| write_database ("if_table_exists") | write_delta ("mode") |
| --- | --- |
| replace | overwrite |
| append | append |
| fail | error |
| n/a | ignore |

You can see that the parameters line up exactly in terms of what they do and how they behave (with an extra "ignore" option for write_delta). Anyway, let's have a think and come back to it later 👌
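For reference, the correspondence listed above can be written as a small lookup. This is only a sketch of the mapping discussed in this thread; the names `IF_TABLE_EXISTS_TO_DELTA_MODE`, `to_delta_mode`, and `to_if_table_exists` are hypothetical and are not part of Polars.

```python
# Value correspondence as listed in the comment above.
IF_TABLE_EXISTS_TO_DELTA_MODE = {
    "replace": "overwrite",
    "append": "append",
    "fail": "error",
}

# Reverse lookup; "ignore" is deliberately absent because it has no
# write_database equivalent.
DELTA_MODE_TO_IF_TABLE_EXISTS = {
    mode: option for option, mode in IF_TABLE_EXISTS_TO_DELTA_MODE.items()
}


def to_delta_mode(if_table_exists):
    """Translate a write_database option into the matching write_delta mode."""
    try:
        return IF_TABLE_EXISTS_TO_DELTA_MODE[if_table_exists]
    except KeyError:
        raise ValueError(f"unknown if_table_exists value: {if_table_exists!r}") from None


def to_if_table_exists(mode):
    """Translate a write_delta mode into the matching write_database option."""
    try:
        return DELTA_MODE_TO_IF_TABLE_EXISTS[mode]
    except KeyError:
        raise ValueError(f"no write_database equivalent for mode: {mode!r}") from None
```

The one-sided "ignore" entry is the only place where the two option sets differ, so a shared vocabulary would need to decide how to handle it.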

@alexander-beedie alexander-beedie merged commit 4c50e41 into pola-rs:main Nov 30, 2023
23 checks passed
@alexander-beedie alexander-beedie deleted the write-database-if-exists-param branch November 30, 2023 09:34
@stinodego stinodego changed the title feat(python): improve/deprecate write_database "if_exists" param name as "if_table_exists" instead feat(python): Rename write_database parameter if_exists to if_table_exists Dec 15, 2023
@stinodego stinodego changed the title feat(python): Rename write_database parameter if_exists to if_table_exists depr(python): Rename write_database parameter if_exists to if_table_exists Dec 15, 2023
@stinodego stinodego removed the enhancement New feature or an improvement of an existing feature label Dec 15, 2023
@github-actions github-actions bot added the deprecation Add a deprecation warning to outdated functionality label Dec 15, 2023
Labels
deprecation Add a deprecation warning to outdated functionality python Related to Python Polars
Development

Successfully merging this pull request may close these issues.

Ambiguity in DataFrame.write_database(if_exists='replace') could lead to data loss.
2 participants