Add support for a python client #1744
Comments
@philippefutureboy Hey Philippe! I believe it'll be a great contribution! I think the community would benefit if you prepared a proposal of what you're going to implement, and we'll try to get some attention on it. What do you think?
@philippefutureboy Hey! Thanks for sharing! I'd really like to learn more about the possible use cases for the data science community you're thinking about.
Hey @paveltiunov & @igorlukanin!
@paveltiunov: Noted! I'll do some research on my side to see what I can come up with.
@igorlukanin: I don't have a clear use case for the data science community. However, to me Cube.js is a marvelous tool to increase accessibility to your data, and I see it as a consumer/provider that inserts itself at some point in a data pipeline. At Arthur, Cube.js is our second-to-last layer before we present data to the client: we use a layer of Python on top to reconcile different data sources and prep the final data that allows us to generate our reports. To us the primary value of Cube.js is that it provides an API that is really easy to interact with and that abstracts away the complexity of SQL via reusable, templatable, approachable queries. Think of it like an ORM for OLAP and analytics. Hence, our primary use for a client like this one is interoperability with the Python ecosystem, to have a smooth transition to pandas and friends.
In order to better design the Python client to match the needs of the Python data science/analytics/engineering community, I will be starting a few threads in major Python data projects. I'll update this message with links to the discussions in the various communities as I create them:
Pandas/PyData: https://groups.google.com/g/pydata/c/YpJVfI5HT20/m/9ik0iH1HBgAJ
Please come and say hi if you are coming from one of these threads! :) 👋
Hey there! I've looked into the ecosystem and the code of your @cubejs/client-core package a bit, and here are my conclusions so far:
These led to the following questions for the team:
I'll read up on the async model as well as the implementation of the BQ library this week, and then the API should start to take shape. Thanks!
Hope there is a similar realization
Data processing is generally done by back-end engineers, who may not be very familiar with JS. The simplest way to support different languages would be a wrapper around the REST API. In the data field, Python is currently popular with data scientists, data analysts, developers, and others, so it would be great if Cube.js could enter this ecosystem.
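To make the "wrapper for REST API" idea concrete, here is a minimal sketch of what such a wrapper could look like in Python. It is not an existing Cube.js client: the class name `CubeRestClient`, its methods, and the injectable `fetch` parameter are all hypothetical, and it assumes the REST API exposes a `load` endpoint that takes a JSON-encoded `query` parameter, expects the token in the `Authorization` header, and returns rows under a `data` key.

```python
import json
import urllib.parse
import urllib.request


class CubeRestClient:
    """Hypothetical thin wrapper around a Cube.js REST endpoint."""

    def __init__(self, base_url, token, fetch=None):
        # base_url is assumed to look like "http://host:4000/cubejs-api/v1".
        self.base_url = base_url.rstrip("/")
        self.token = token
        # `fetch` takes (url, headers) and returns the response body as a str.
        # It is injectable so the client can be exercised without a server.
        self._fetch = fetch or self._http_get

    @staticmethod
    def _http_get(url, headers):
        request = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")

    def build_url(self, query):
        # The query dict is serialized to JSON and passed as a URL parameter.
        encoded = urllib.parse.urlencode({"query": json.dumps(query)})
        return f"{self.base_url}/load?{encoded}"

    def load(self, query):
        # Assumption: the response is a JSON object with rows under "data".
        body = self._fetch(self.build_url(query), {"Authorization": self.token})
        return json.loads(body).get("data", [])
```

The returned value is a plain list of dicts, which is the shape `pandas.DataFrame` accepts directly, so a pandas hand-off would need no extra conversion layer.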
If you are interested in working on this issue, please leave a comment below and we will be happy to assign the issue to you.
Any news about the Python client? It seems like Cube could be very helpful for data science and data engineering, and could be used like a feature store.
As an alternative, the SQL API (https://cube.dev/docs/backend/sql/#sql-api) can be tried instead. It should work with SQLAlchemy already.
Description
Cube.js would greatly benefit from having a Python client to ease adoption by the data science community. I'm opening this issue because I am interested in championing the development of such a client, starting at the end of February, with delivery by the end of March at the latest.
Python is a different ecosystem from the JS ecosystem, and as such I'd suggest that a redesign of the API be considered to better integrate in the ecosystem/be more pythonic. As for what these changes imply, I do not have a clear opinion yet. Here are a few places to get started:
`object[key]`, a bit like Proxy in JS), etc.). Leveraging these would be key to a successful integration.
I am no Python expert; I only have ~6-8 months of experience in a data engineering, scripting & data science context, but I'd be happy to contribute to the best of my knowledge to the design of this API. To make this a resounding success, however, I'd suggest getting the opinion of some experienced Python data scientists, as your future customers, on the API they would like to use.
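As an illustration of the `object[key]` hook mentioned above, here is a small sketch of how a result container could lean on Python's special methods. The `ResultSet` class and its behaviour are hypothetical and are not part of any existing Cube.js API; the sketch only shows how `__getitem__`, `__len__` and `__iter__` let an object behave like a native sequence.

```python
class ResultSet:
    """Hypothetical container for rows returned by a query."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        # Enables len(result).
        return len(self._rows)

    def __iter__(self):
        # Enables `for row in result`.
        return iter(self._rows)

    def __getitem__(self, key):
        # An int or slice selects rows; a string selects one column.
        if isinstance(key, (int, slice)):
            return self._rows[key]
        if isinstance(key, str):
            return [row[key] for row in self._rows]
        raise TypeError(f"unsupported key type: {type(key).__name__}")


result = ResultSet([
    {"Orders.status": "new", "Orders.count": 3},
    {"Orders.status": "shipped", "Orders.count": 5},
])
counts = result["Orders.count"]  # column access by member name
first_row = result[0]            # row access by position
```

Overloading `__getitem__` for both rows and columns is a design choice borrowed loosely from how dataframe-style libraries feel to use; a real client might prefer separate, explicit accessors.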
It would also be possible (minus a few exceptions) to simply port the API as is, but I think there's much more power in integrating with the language and ecosystem. In the end, it's the team's prerogative to make a choice and I'll be happy to contribute either way :)
Cheers!
Philippe