Add support for a python client #1744

philippefutureboy · 2021-01-10T02:14:25Z

Description

Cube.js would greatly benefit from having a Python client to ease adoption by the data science community. I'm opening this issue because I am interested in championing the development of such a client, starting at the end of February, with delivery by the end of March at the latest.

Python is a different ecosystem from the JS ecosystem, and as such I'd suggest that a redesign of the API be considered so that it integrates better with the ecosystem and is more Pythonic. I do not yet have a clear opinion on what these changes imply. Here are a few places to get started:

Refactor to support Pythonic syntax & constructs. In Python, everything is customizable, which makes it a language with great potential for, and many examples of, powerful APIs. Every class in Python implements a subset of the data model hooks, which give objects of that class specific behaviors such as operator logic (addition, multiplication, gte), key access (object[key], a bit like Proxy in JS), etc. Leveraging these would be key to a successful integration.
Refactor to integrate properly with Python's approach to concurrency (synchronous vs. asyncio, futures, etc.).
Integrate implementation with the data science ecosystem libraries (numpy, pandas, matplotlib, plotly, etc.).
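To make the data model hooks mentioned above concrete, here is a minimal sketch of how operator overloading and key access could build a filter. The `Cube` and `Member` classes are hypothetical names invented for illustration, not part of any existing client; the only thing taken from Cube.js is the `member` / `operator` / `values` shape of a query filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical handle for a cube member such as "Orders.amount"."""

    name: str

    def _filter(self, operator, value):
        # Mirrors the {member, operator, values} filter shape of a Cube.js query.
        return {"member": self.name, "operator": operator, "values": [str(value)]}

    def __ge__(self, value):
        # `member >= x` becomes a "gte" filter instead of a boolean.
        return self._filter("gte", value)

    def __lt__(self, value):
        # `member < x` becomes an "lt" filter.
        return self._filter("lt", value)


class Cube:
    """Hypothetical handle for a cube; key access returns its members."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # `cube["amount"]` resolves to the fully qualified member name.
        return Member(f"{self.name}.{key}")


orders = Cube("Orders")
amount_filter = orders["amount"] >= 100
```

With this sketch, `amount_filter` is a plain dictionary that could be placed in the `filters` list of a query. Whether a real client should overload comparison operators this way is a design question the thread leaves open.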

I am no Python expert; I only have ~6-8 months of experience in a data engineering, scripting & data science context, but I'd be happy to contribute to the best of my knowledge to the design of this API. To make this a resounding success, however, I'd suggest getting opinions from experienced Python data scientists, as your future customers, on the API they would like to use.

It would also be possible (with a few exceptions) to simply port the API as is, but I think there's much more power in integrating with the language and ecosystem. In the end, it's the team's prerogative to make a choice and I'll be happy to contribute either way :)

Cheers!
Philippe


paveltiunov · 2021-01-12T06:57:40Z

@philippefutureboy Hey Philippe! I believe it'll be a great contribution! I think the community would benefit if you prepared a proposal of what you're going to implement, and we'll try to draw some attention to it. What do you think?

igorlukanin · 2021-01-13T20:51:19Z

@philippefutureboy Hey! Thanks for sharing! I'd really like to learn more about the possible use cases for the data science community you're thinking about.

philippefutureboy · 2021-01-14T23:56:30Z

Hey @paveltiunov & @igorlukanin!

@paveltiunov: Noted! I'll do some research on my side to see what I can come up with.

@igorlukanin: I don't have a clear use case for the data science community. However, to me Cube.js is a marvelous tool to increase accessibility to your data, and I see it as a consumer/provider that inserts itself at some point in a data pipeline. At Arthur, Cube.js is our second-to-last layer before we present data to the client - we use a layer of Python on top to reconcile different data sources and prep the final data that allows us to generate our reports. To us the primary value of Cube.js is to provide an API that is really easy to interact with and that abstracts away the complexity of SQL via reusable, templatable, approachable queries. Think of it like an ORM for OLAP and analytics. Hence, our primary use for a client like this one is interoperability with the Python ecosystem, to have a smooth transition to pandas and friends.

philippefutureboy · 2021-02-17T17:45:00Z

To better design the Python client to match the needs of the Python data science/analytics/engineering community, I will be starting a few threads in major Python data projects. I'll update this message with links to the discussions in the various communities as I create them:

Pandas/PyData: https://groups.google.com/g/pydata/c/YpJVfI5HT20/m/9ik0iH1HBgAJ
Airflow: https://app.slack.com/client/TCQ18L22Z/CCY359SCV/thread/CCY359SCV-1609933583.050900
TensorFlow: Dropped
Plotly: plotly/dash#1573

Please come and say hi if you are coming from one of these threads! :) 👋
Also feel free to suggest other projects that might be receptive to contributing here :)

philippefutureboy · 2021-03-14T19:26:07Z

Hey there!
Little update on my side:

I've looked into the ecosystem and your @cubejs/client-core package code a bit, and here are my conclusions so far:

I think that the google.cloud.bigquery client library could be a very good template to get started from
I'd most likely implement ResultSet as a wrapper around a pandas DataFrame
The concurrency/async model in python is fairly different from that of JS, which will require some modification in how the library's API is exposed. I need to investigate more how the concurrent.futures, asyncio stdlib packages and google.cloud.bigquery client library manage asynchronicity.
My understanding of google.cloud.bigquery client library's way to manage asynchronicity is to first emit a call to the BQ API, then wait until the .result() method is called to fetch the answer. The initial call is non-blocking, while the .result() call is.
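The call-then-`.result()` shape described above can be sketched with the standard library's `concurrent.futures`. This is only an illustration of the pattern, not how google.cloud.bigquery is implemented; `Client`, `QueryJob`, and `fake_load` are hypothetical names, and `fake_load` stands in for a real network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_load(query):
    """Stand-in for a network call; a real client would send an HTTP request here."""
    time.sleep(0.05)
    return [{"Orders.count": 42}]


class QueryJob:
    """Hypothetical handle returned immediately by Client.query()."""

    def __init__(self, future):
        self._future = future

    def done(self):
        # Non-blocking check of whether the result is available yet.
        return self._future.done()

    def result(self, timeout=None):
        # Blocks until the result is available (or the timeout expires).
        return self._future.result(timeout=timeout)


class Client:
    """Hypothetical client that runs each query on a worker thread."""

    def __init__(self, load=fake_load, max_workers=4):
        self._load = load
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def query(self, query):
        # Returns right away; the work happens in the background.
        return QueryJob(self._pool.submit(self._load, query))

    def close(self):
        self._pool.shutdown(wait=True)


client = Client()
job = client.query({"measures": ["Orders.count"]})  # non-blocking
rows = job.result(timeout=5)                        # blocking
client.close()
```

A thread pool is just one way to get this behavior; an `asyncio`-based variant would expose the same two-step shape through `await` instead of `.result()`.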

These led to the following questions for the team:

Does Cube.js support an initial non-blocking call then a follow-up polling call to get the result?
Does Cube.js support loading the resultset via a cursor, row by row?

I'll read up on the async model as well as the impl of the BQ library this week and then the API should start to form.

Thanks!

qianxuanyon · 2021-08-17T07:53:27Z

I hope there will be a similar implementation:
https://github.com/DataBrewery/cubes

qianxuanyon · 2021-08-17T08:04:43Z

Data processing is generally done by back-end developers, who may not be very familiar with JS.

The simplest way to support different languages would be a wrapper around the REST API.

At present, Python is the more popular language in the data field among data scientists, data analysts, developers, etc., so it would be better if Cube.js could enter this ecosystem.
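As a rough idea of what such a REST wrapper could look like, here is a minimal sketch that only builds the request for the Cube.js `/cubejs-api/v1/load` endpoint using the standard library. The function name `build_load_request`, the base URL, and the token placeholder are assumptions made for illustration; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_load_request(base_url, token, query):
    """Build (but do not send) a GET request for the REST load endpoint.

    `base_url` and `token` are placeholders supplied by the caller.
    """
    # The query is passed as a JSON-encoded `query` URL parameter.
    params = urlencode({"query": json.dumps(query)})
    return Request(
        f"{base_url}/cubejs-api/v1/load?{params}",
        headers={"Authorization": token},
    )


request = build_load_request(
    "http://localhost:4000",          # assumed local deployment
    "<API_TOKEN>",                    # placeholder, not a real token
    {"measures": ["Orders.count"]},   # example query; the cube name is made up
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON body; error handling, retries, and conversion to pandas are left out of this sketch.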

github-actions · 2022-07-28T14:13:07Z

If you are interested in working on this issue, please leave a comment below and we will be happy to assign the issue to you.
If this is the first time you are contributing a Pull Request to Cube.js, please check our contribution guidelines.
You can also post any questions while contributing in the #contributors channel in the Cube.js Slack.

oleg-savko · 2022-11-30T15:28:14Z

Any news about the Python client?

It seems like Cube could be very helpful for data science and data engineering, and could be used like a feature store.

paveltiunov · 2022-12-29T02:47:08Z

As an alternative, the SQL API (https://cube.dev/docs/backend/sql/#sql-api) can be tried instead. It should already work with SQLAlchemy.

paveltiunov added the enhancement New feature proposal label Jan 12, 2021

philippefutureboy mentioned this issue Mar 14, 2021

[Feature Request] Plotly Dash & Cube.js: A match made in heaven! plotly/dash#1573

Closed

ivan-vdovin added the help wanted Community contributions are welcome. label Jul 28, 2022
