Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rtorre/ch56013/client for stream download of bq datasets #1512

Merged

Conversation

rafatower
Copy link
Contributor

This adds a client and functions to stream download of datasets. See https://app.clubhouse.io/cartoteam/story/56013/client-for-stream-download-of-bq-datasets

There are things missing and/or to be added outside of this repo:

  • the endpoint is hardcoded to run against local do server enrichment branch
  • it uses a new requests session instead of the customary AuthAPIClient, no easy way otherwise to run against local server.
  • The api_key is hardcoded.

Eventually to be moved to carto-python (at present it is as simple as moving the file). We don't do it yet not to get distracted with packaging, dependencies, etc. until the interfaces settle.

To run the test (in a virtualenv and with do server running locally):

pip install -e .[extra]
pytest tests/e2e/data/services/test_bq_datasets.py

Mind that this is a PoC and that we need to progress.

@rafatower
Copy link
Contributor Author

Note end-to-end tests don't seem to be executed in CI

@oleurud
Copy link
Contributor

oleurud commented Jan 22, 2020

Note end-to-end tests don't seem to be executed in CI

Yes, it is on purpose, but we should change it

@@ -1,7 +1,9 @@
from .geocoding import Geocoding
from .isolines import Isolines
from .bq_datasets import BQUserDataset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very important since we are going to move it to another repo, but I would move BQUserDataset to clients folder. The idea of this one is to be used for data services

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I wrongly took it as "services offered" to CF's users.

class BQUserDataset:

@staticmethod
def name(name_id):
Copy link
Contributor

@oleurud oleurud Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, in CF we are based on functions instead of classes, but in carto-python we are class-based.

So, in here in CF I would spec something like:

from not_sure_folder_and_file import read_bigquery

dataFrame = read_bigquery('dataset_name')

but for carto-python, the current one makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked offline about the python api.

To summarize and for the record: On the one hand, this function(input) -> output adapts better to existing recommendations. OTOH, it is not fully possible in all cases without "guessing" the schema.

Let's do the following: let's stick to the proposed python api (private, undocumented), complete a full iteration (putting all the components together to work), then let's consider possible code reorgs or alternate API's.

@rafatower rafatower merged commit 642b0a0 into enrichment Jan 22, 2020
@rafatower rafatower deleted the rtorre/ch56013/client-for-stream-download-of-bq-datasets branch January 22, 2020 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants