Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: releasing CLI to PyPI #9452

Closed
wants to merge 5 commits into from
Closed

Conversation

MohamedAbdeen21
Copy link
Contributor

@MohamedAbdeen21 MohamedAbdeen21 commented Mar 4, 2024

Which issue does this PR close?

Closes #9294.

Rationale for this change

Documentation states that datafusion-cli can be installed through pip; however, that is not the case. This adds a workflow to deploy datafusion-cli (and only the CLI, not the python module) to PyPI , allowing installation using pip.

What changes are included in this PR?

This is meant to be a minimal CI, to be reviewed and edited as necessary (handle licensing, tag names, maybe maintainers prefer manual releases, etc)

CC @andygrove as he's involved in arrow-datafusion-python workflows and can have some comments.

A final note: builds for 32-bit systems fail because of src/main.rs:304 which expects 1 << 40 to be usize. This causes cargo to panic and abort builds.

Are these changes tested?

You can try testing the deployment yourself using pip install datafusion-cli. I thought I changed the name and the repo to reference my fork, but apparently not.

I'll remove it from PyPI once the PR is ready to merge, to be redeployed by the maintainers.

Are there any user-facing changes?

Should be able to use pip to install datafusion-cli

@MohamedAbdeen21 MohamedAbdeen21 changed the title Draft CI for CLI release to PyPI CI: releasing CLI to PyPI Mar 5, 2024
@MohamedAbdeen21
Copy link
Contributor Author

Hi @alamb

Can you please give this a quick look?

@andygrove
Copy link
Member

Thanks @MohamedAbdeen21. I will review this in the next day or two.

@andygrove
Copy link
Member

@MohamedAbdeen21 My first question on this is why would we not do this from the arrow-datafusion-python repo? Many (perhaps most) of the maintainers of this crate are not interested in Python support and this would add an extra maintenance burden here.

@MohamedAbdeen21
Copy link
Contributor Author

Hey @andygrove this is not an issue of Python support, this is making DF more accessible to users through a more popular package manager i.e. pip, and it just happens that the easiest way to do so is through maturin.

Can we do it without maturin and the whole python bindings thing? probably, but that will be harder to maintain

Can we move this to arrow-datafusion-python? maybe, but I don't see what difference that makes given that there is no rust/python code involved in the deployment process.

@MohamedAbdeen21
Copy link
Contributor Author

Maybe once we require to manually bind some parts of the CLI before releasing we will have to move it to the python repo, but as for now I see no point in doing so

@andygrove
Copy link
Member

Hey @andygrove this is not an issue of Python support, this is making DF more accessible to users through a more popular package manager i.e. pip, and it just happens that the easiest way to do so is through maturin.

My point is that DataFusion is a Rust project and datafusion-cli is already available via cargo, which is the default packaging manager for Rust. If we want to use Python packaging for DataFusion, it seems logical to do that in the DataFusion Python repository.

@MohamedAbdeen21
Copy link
Contributor Author

MohamedAbdeen21 commented Mar 29, 2024

Logically it makes sense, but my point is that it doesn't have to be; as long as it can be deployed from the source why go through an extra layer of indirection? I'm looking at pip as just another package manager, not specifically Python's package manager.

Feel free to close this PR and re-open it in the Python repo, I'm happy to have DF on pip regardless of where it's deployed from.

I'll also remove the test from PyPI once the PR is closed.

@MohamedAbdeen21
Copy link
Contributor Author

Removed the release from PyPI and closing this issue, to be re-opened in the Python repo by the maintainers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

datafusion-cli not installed
2 participants