Doapi metadata enrichment #1575

Merged
merged 145 commits into from
Mar 11, 2020
3989a16
Scaffolding for stream download of datasets/tables from BQ
Jan 20, 2020
b6260c2
First implementation of stream download client
Jan 20, 2020
b5bbd9d
Fix flake8 styling
Jan 21, 2020
8cf9993
Add test for stream download
Jan 22, 2020
642b0a0
Merge pull request #1512 from CartoDB/rtorre/ch56013/client-for-strea…
Jan 22, 2020
dc02d2f
Bootstrap the PR
dgaubert Jan 22, 2020
1b021d0
Linter
dgaubert Jan 22, 2020
d744330
Add client for dataset creation
Jan 22, 2020
cf9b703
Follow flake8 stylistic changes
Jan 22, 2020
635f599
Draft upload dataset
dgaubert Jan 22, 2020
adc6033
Hound
dgaubert Jan 22, 2020
8a08267
Rename method
dgaubert Jan 23, 2020
2f3598c
Linter
dgaubert Jan 23, 2020
08b7b43
Fix bad body content with requests's metadata
dgaubert Jan 23, 2020
d2ec7b4
Implement '.upload_file_object()' method
dgaubert Jan 23, 2020
13e3cec
Make it compatible with python3.5
Jan 23, 2020
7b18a71
Fix bug: take the attribute and not the function
Jan 23, 2020
4022e80
Implement client to create an import job in BQ
dgaubert Jan 23, 2020
9b7681e
Add method to get the status of an import
dgaubert Jan 23, 2020
2ba0aff
Implement method to wait for job completion
dgaubert Jan 23, 2020
0e23372
Linter
dgaubert Jan 23, 2020
a9da167
Implement method that creates, uploads, and imports a dataset and wai…
dgaubert Jan 23, 2020
50767b9
Unnecessay else statement
dgaubert Jan 23, 2020
d16889b
'Failed' is alse a valid job terminal status
dgaubert Jan 23, 2020
2d8aa0e
Use built-in assertion
dgaubert Jan 23, 2020
a634142
Use keywords params
dgaubert Jan 23, 2020
4a3aa9e
Use built-in assertions
dgaubert Jan 23, 2020
be5d9f3
Linter
dgaubert Jan 23, 2020
8c58ee5
Merge pull request #1514 from CartoDB/rtorre/ch56009/client-for-bq-da…
Jan 24, 2020
9065ebd
Improve the test by checking created dataset
Jan 24, 2020
17fdb63
Merge pull request #1515 from CartoDB/client-for-bq-dataset-creation-…
Jan 24, 2020
9429ed9
Fix bad condition
dgaubert Jan 24, 2020
bd073e7
Merge branch 'enrichment' into dgaubert/ch56011/client-for-bq-dataset…
dgaubert Jan 24, 2020
188939c
Fix TODO in e2e tests
dgaubert Jan 24, 2020
07d27c9
Keep trailing comma
dgaubert Jan 24, 2020
afea287
Reuse upload_file_object method
dgaubert Jan 24, 2020
41473fa
Linter
dgaubert Jan 24, 2020
7507b44
Follow style usage conventions
dgaubert Jan 24, 2020
618f1e3
Merge pull request #1513 from CartoDB/dgaubert/ch56011/client-for-bq-…
dgaubert Jan 24, 2020
6e8ccde
Draft client for points enrichment [ch56016]
dgaubert Jan 28, 2020
036bd1a
No need to expose the class
dgaubert Jan 28, 2020
a3f01d5
No need to import the class
dgaubert Jan 28, 2020
4779e79
Missing project and dataset
dgaubert Jan 28, 2020
047ec20
Use keyword args and old format way
dgaubert Jan 29, 2020
e8eebef
Adapt to latest changes with do-api
dgaubert Jan 29, 2020
15ae8b8
Perform the whole cycle
dgaubert Jan 29, 2020
03f8100
Fix test to check the whole workflow to enrich a dataset
dgaubert Jan 30, 2020
3e7c5bb
Linter
dgaubert Jan 30, 2020
6b40b5a
Wait 1 second between calls
dgaubert Jan 30, 2020
1bc5704
Implement test for polygons enrichment
dgaubert Jan 30, 2020
19a75b3
Merge branch 'develop' into enrichment
dgaubert Jan 30, 2020
e258248
Merge branch 'enrichment' into dgaubert/ch56016/client-points-enrichment
dgaubert Jan 30, 2020
e5839c5
Move fixtures to their own files
dgaubert Jan 30, 2020
e0801ae
Linter
dgaubert Jan 30, 2020
dba71d6
Missing sleep while pooling for status
dgaubert Feb 3, 2020
461c626
Merge pull request #1516 from CartoDB/dgaubert/ch56016/client-points-…
dgaubert Feb 3, 2020
7b609a0
Use DO API for public enrichment methods
dgaubert Feb 6, 2020
fb0e863
Use credentials
dgaubert Feb 6, 2020
a9984e3
Adapt Catalog to DO-Metadata API
juanrmn Feb 7, 2020
a0a5059
Linter
dgaubert Feb 7, 2020
6a91eeb
fix linter errors
juanrmn Feb 7, 2020
2ac4716
Remove temporary comment
dgaubert Feb 7, 2020
3aa6edd
Use credentials while testing
dgaubert Feb 7, 2020
abf1c78
Fail test if env variable is None
dgaubert Feb 7, 2020
d89861f
Skip test optional
dgaubert Feb 7, 2020
029580b
Be able to use credential while testing
dgaubert Feb 7, 2020
64689fe
Merge branch 'develop' of github.com:CartoDB/cartoframes into juanra/…
juanrmn Feb 10, 2020
2a32f35
pytyon 3.5 fixes
juanrmn Feb 10, 2020
b8346b8
Custom base_url in credentials
dgaubert Feb 10, 2020
6419f7a
Linter
dgaubert Feb 10, 2020
0f09f7b
bring back enrichment service
dgaubert Feb 10, 2020
02abc26
Typo
dgaubert Feb 10, 2020
2b17b43
Encupsulate enrichment into its service
dgaubert Feb 10, 2020
fe9d130
Linter
dgaubert Feb 10, 2020
d4a91b1
Mark method as private
dgaubert Feb 10, 2020
a17cabd
Improve naming
dgaubert Feb 10, 2020
fc737c7
Deleted unneeded test suites
dgaubert Feb 10, 2020
20ac1e0
Do not use mutable data structures for argument defaults.
dgaubert Feb 11, 2020
ad22ae4
Linter
dgaubert Feb 11, 2020
83fbc18
adapt geographies_gdf method to DO Metadata API
juanrmn Feb 11, 2020
a50a2f6
Be able to send filters and aggregations for enrichment
dgaubert Feb 11, 2020
2e1f136
Better default arguments
dgaubert Feb 12, 2020
c10073f
Flip condition
dgaubert Feb 12, 2020
aaa4ed4
Merge pull request #1535 from CartoDB/dgaubert/ch58149/finish-enrichm…
dgaubert Feb 12, 2020
de4a788
Merge pull request #1530 from CartoDB/enrichment-integration-do-api
dgaubert Feb 12, 2020
fed891d
Change endpoint enrichment do api
dgaubert Feb 12, 2020
7d5f02b
user subscriptions bugfix. And tests
juanrmn Feb 13, 2020
d2bfab0
python 3.5 fix
juanrmn Feb 13, 2020
ee8d557
fix py35 tests...
juanrmn Feb 13, 2020
ff716e9
Merge pull request #1531 from CartoDB/juanra/ch57069/metadata-api-int…
oleurud Feb 14, 2020
f485600
Merge pull request #1538 from CartoDB/enrichment-endpoint-change
dgaubert Feb 14, 2020
b8c317c
Possible solution to WKT/WKB transformation to geojson
Feb 14, 2020
22b97ff
Please hound and avoid mutability issue in default params
Feb 14, 2020
916cd49
Merge pull request #1542 from CartoDB/enrichment-upload-fix-to-json
Feb 14, 2020
8fd4332
set new url setting
juanrmn Feb 17, 2020
cf9f385
Use new base path for DO enrichment [ch58421]
dgaubert Feb 17, 2020
2373cad
Better funcion name
dgaubert Feb 17, 2020
9f070a3
Merge pull request #1544 from CartoDB/dgaubert/ch58421/new-url-for-api
dgaubert Feb 17, 2020
b5cb8f8
Don't use None as default value for aggregation
dgaubert Feb 17, 2020
c176d46
Merge pull request #1546 from CartoDB/dgaubert/ch57208/cartoframes-en…
dgaubert Feb 18, 2020
5f5c059
Raise error when a job has failed
dgaubert Feb 20, 2020
3914afa
Merge pull request #1548 from CartoDB/dgaubert/ch59562/provide-insigh…
dgaubert Feb 21, 2020
371995d
Be more tolerant with aggregation results while testing
dgaubert Feb 21, 2020
d01fa6b
Merge pull request #1553 from CartoDB/dgaubert/ch59461/cartoframe-s-e…
dgaubert Feb 21, 2020
db863ec
Merge pull request #1543 from CartoDB/juanra/ch58421/update-do-metada…
oleurud Feb 26, 2020
2bd1357
removing db datasets
oleurud Mar 5, 2020
aca860b
using carto-python from GH branch
oleurud Mar 5, 2020
ed5e9b9
add pyrestcli
oleurud Mar 5, 2020
526c24d
travis using carto-python from branch
oleurud Mar 5, 2020
9c8fd87
rm tmp stuff about carto-python version
oleurud Mar 5, 2020
f2a90be
rm bq tests
oleurud Mar 5, 2020
1ea27f3
using new naming DODataset
oleurud Mar 5, 2020
d7fd768
order by in download to ensure e2e tests
oleurud Mar 6, 2020
48be07a
Merge pull request #1573 from CartoDB/oleurud/ch61421/integrate-do-cl…
oleurud Mar 6, 2020
ac4bd93
rename do_dataset stuff
oleurud Mar 6, 2020
c179f6b
Merge branch 'oleurud/ch58426/move-client-from-cf-to-carto-python' of…
oleurud Mar 6, 2020
95804e2
Merge pull request #1574 from CartoDB/enrichment
Mar 6, 2020
f0dea4b
e2e do_dataset tests recovered and working
oleurud Mar 6, 2020
5fb6c13
cornflake8
oleurud Mar 6, 2020
6cbc90a
Merge pull request #1572 from CartoDB/oleurud/ch58426/move-client-fro…
Mar 9, 2020
f179cf4
Merge remote-tracking branch 'origin/develop' into doapi-metadata-enr…
Mar 9, 2020
5bfd3e5
Refactor RepoClient using DODataset
Jesus89 Mar 9, 2020
c306ae0
Use default do user in repo_client
Jesus89 Mar 9, 2020
3e591a0
Using user credentials to fetch catalog entities datasets/geographies
Jesus89 Mar 9, 2020
06b4944
Add repo_client unit tests
Jesus89 Mar 9, 2020
f4b4da5
Using list in filter_id to force returning a list
Jesus89 Mar 9, 2020
53a9d5a
Update discover_dataset example
Jesus89 Mar 9, 2020
05b1ecb
Merge pull request #1576 from CartoDB/jarroyo/ch61524/move-metadata-a…
Jesus89 Mar 10, 2020
be513c2
Raise min version of carto to 1.9.1
Mar 10, 2020
9985832
Fix for duplicate/wrong columns when reusing enrichment object
Mar 10, 2020
ed5b5d9
Fix typo
Jesus89 Mar 10, 2020
1024649
Update carto-python to 1.9.1
Jesus89 Mar 10, 2020
ae7cab1
Remove None entities in repo_client
Jesus89 Mar 10, 2020
3578e23
Add unit test
Jesus89 Mar 10, 2020
1d9326c
Merge pull request #1583 from CartoDB/jarroyo/ch61957/catalogerror-wh…
Mar 10, 2020
5593421
Add type filter to get_subscription_ids
Jesus89 Mar 10, 2020
397bd5d
Merge pull request #1581 from CartoDB/rtorre/ch61906/error-500-field-…
Mar 10, 2020
3a4cd9d
Merge pull request #1584 from CartoDB/jarroyo/ch61957/catalogerror-wh…
Jesus89 Mar 10, 2020
69dc563
support for staging tests
oleurud Mar 11, 2020
3740c67
refactoring _add_subscription_ids
oleurud Mar 11, 2020
f1722ef
enrichment details
oleurud Mar 11, 2020
2c29562
more details from CR
oleurud Mar 11, 2020
aa60646
lint
oleurud Mar 11, 2020
b453dc4
fix mock reference
oleurud Mar 11, 2020
89280aa
Merge pull request #1586 from CartoDB/doapi-metadata-enrichment-details
oleurud Mar 11, 2020
1 change: 1 addition & 0 deletions .gitignore
@@ -22,6 +22,7 @@ _build
 # Distribution / packaging
 .Python
 env/
+.venv/
 build/
 develop-eggs/
 dist/
2 changes: 1 addition & 1 deletion .travis.yml
@@ -9,4 +9,4 @@ install:
script:
- tox
after_success:
- coveralls
- coveralls
2 changes: 1 addition & 1 deletion cartoframes/__init__.py
@@ -5,7 +5,7 @@


 # Check installed packages versions
-check_package('carto', '>=1.8.3')
+check_package('carto', '>=1.9.1')
 check_package('pandas', '>=0.23.0')
 check_package('geopandas', '>=0.6.0')

14 changes: 6 additions & 8 deletions cartoframes/data/observatory/catalog/dataset.py
@@ -4,7 +4,7 @@
 from shapely import wkt

 from .entity import CatalogEntity
-from .repository.dataset_repo import get_dataset_repo
+from .repository.dataset_repo import get_dataset_repo, DATASET_TYPE
 from .repository.geography_repo import get_geography_repo
 from .repository.variable_repo import get_variable_repo
 from .repository.variable_group_repo import get_variable_group_repo
@@ -17,8 +17,6 @@
 from ....utils.utils import get_credentials, check_credentials, check_do_enabled
 from ....exceptions import DOError

-DATASET_TYPE = 'dataset'
-

 class Dataset(CatalogEntity):
     """A Dataset represents the metadata of a particular dataset in the catalog.
@@ -375,7 +373,7 @@ def _join_geographies_geodataframes(geographies_gdf1, geographies_gdf2):
         return join_gdf['id'].unique()

     @check_do_enabled
-    def to_csv(self, file_path, credentials=None, limit=None):
+    def to_csv(self, file_path, credentials=None, limit=None, order_by=None):
         """Download dataset data as a local csv file. You need Data Observatory enabled in your CARTO
         account, please contact us at support@carto.com for more information.

@@ -402,10 +400,10 @@ def to_csv(self, file_path, credentials=None, limit=None):
             raise DOError('You are not subscribed to this Dataset yet. '
                           'Please, use the subscribe method first.')

-        self._download(_credentials, file_path, limit)
+        self._download(_credentials, file_path, limit, order_by)

     @check_do_enabled
-    def to_dataframe(self, credentials=None, limit=None):
+    def to_dataframe(self, credentials=None, limit=None, order_by=None):
         """Download dataset data as a pandas.DataFrame. You need Data Observatory enabled in your CARTO
         account, please contact us at support@carto.com for more information.

@@ -434,7 +432,7 @@ def to_dataframe(self, credentials=None, limit=None):
             raise DOError('You are not subscribed to this Dataset yet. '
                           'Please, use the subscribe method first.')

-        return self._download(_credentials, limit=limit)
+        return self._download(_credentials, limit=limit, order_by=order_by)

     @check_do_enabled
     def subscribe(self, credentials=None):
@@ -468,7 +466,7 @@ def subscribe(self, credentials=None):

         """
         _credentials = get_credentials(credentials)
-        _subscribed_ids = subscriptions.get_subscription_ids(_credentials)
+        _subscribed_ids = subscriptions.get_subscription_ids(_credentials, DATASET_TYPE)

         if self.id in _subscribed_ids:
             utils.display_existing_subscription_message(self.id, DATASET_TYPE)
4 changes: 3 additions & 1 deletion cartoframes/data/observatory/catalog/entity.py
@@ -109,7 +109,7 @@ def _get_print_id(self):

         return self.id

-    def _download(self, credentials, file_path=None, limit=None):
+    def _download(self, credentials, file_path=None, limit=None, order_by=None):
         if not self._is_available_in('bq'):
             raise DOError('{} is not ready for Download. Please, contact us for more information.'.format(self))

@@ -126,6 +126,8 @@ def _download(self, credentials, file_path=None, limit=None):
         column_names = bq_client.get_table_column_names(project, dataset, table)

         query = 'SELECT * FROM `{}`'.format(full_remote_table_name)
+        if order_by:
+            query = '{} ORDER BY {}'.format(query, order_by)
         if limit:
             query = '{} LIMIT {}'.format(query, limit)

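The query assembly in `_download` can be read in isolation as a small string builder. The following is a minimal sketch of that logic only; `build_download_query` is a hypothetical helper name that does not appear in the PR, and the table name is made up for illustration. Like the diff, it interpolates `order_by` directly into the SQL text, so it assumes the value comes from a trusted caller.

```python
def build_download_query(table, order_by=None, limit=None):
    """Build a SELECT with optional ORDER BY and LIMIT clauses.

    Hypothetical helper mirroring the string formatting shown in the diff.
    """
    query = 'SELECT * FROM `{}`'.format(table)
    if order_by:
        # ORDER BY is appended first so that it precedes LIMIT in the SQL text.
        query = '{} ORDER BY {}'.format(query, order_by)
    if limit:
        query = '{} LIMIT {}'.format(query, limit)
    return query


print(build_download_query('project.dataset.table', order_by='geoid', limit=10))
```

Without an explicit ordering, a `LIMIT` query is free to return any subset of rows, which is presumably why the "order by in download to ensure e2e tests" commit added the parameter.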
14 changes: 6 additions & 8 deletions cartoframes/data/observatory/catalog/geography.py
@@ -1,15 +1,13 @@
 from .entity import CatalogEntity
 from .repository.dataset_repo import get_dataset_repo
-from .repository.geography_repo import get_geography_repo
+from .repository.geography_repo import get_geography_repo, GEOGRAPHY_TYPE
 from .repository.constants import GEOGRAPHY_FILTER
 from . import subscription_info
 from . import subscriptions
 from . import utils
 from ....utils.utils import get_credentials, check_credentials, check_do_enabled
 from ....exceptions import DOError

-GEOGRAPHY_TYPE = 'geography'
-

 class Geography(CatalogEntity):
     """A Geography represents the metadata of a particular geography dataset in the catalog.
@@ -178,7 +176,7 @@ def get_all(cls, filters=None, credentials=None):
         return cls._entity_repo.get_all(filters, credentials)

     @check_do_enabled
-    def to_csv(self, file_path, credentials=None, limit=None):
+    def to_csv(self, file_path, credentials=None, limit=None, order_by=None):
         """Download geography data as a local csv file. You need Data Observatory enabled in your CARTO
         account, please contact us at support@carto.com for more information.

@@ -205,10 +203,10 @@ def to_csv(self, file_path, credentials=None, limit=None):
             raise DOError('You are not subscribed to this Geography yet. '
                           'Please, use the subscribe method first.')

-        self._download(_credentials, file_path, limit)
+        self._download(_credentials, file_path, limit, order_by)

     @check_do_enabled
-    def to_dataframe(self, credentials=None, limit=None):
+    def to_dataframe(self, credentials=None, limit=None, order_by=None):
         """Download geography data as a pandas.DataFrame. You need Data Observatory enabled in your CARTO
         account, please contact us at support@carto.com for more information.

@@ -237,7 +235,7 @@ def to_dataframe(self, credentials=None, limit=None):
             raise DOError('You are not subscribed to this Geography yet. '
                           'Please, use the subscribe method first.')

-        return self._download(_credentials, limit=limit)
+        return self._download(_credentials, limit=limit, order_by=order_by)

     @check_do_enabled
     def subscribe(self, credentials=None):
@@ -270,7 +268,7 @@ def subscribe(self, credentials=None):

         """
         _credentials = get_credentials(credentials)
-        _subscribed_ids = subscriptions.get_subscription_ids(_credentials)
+        _subscribed_ids = subscriptions.get_subscription_ids(_credentials, GEOGRAPHY_TYPE)

         if self.id in _subscribed_ids:
             utils.display_existing_subscription_message(self.id, GEOGRAPHY_TYPE)
@@ -21,9 +21,6 @@ def _get_entity_class(cls):
         return Category

     def _get_rows(self, filters=None):
-        if filters is not None and COUNTRY_FILTER in filters.keys():
-            return self.client.get_categories_joined_datasets(filters)
-
         return self.client.get_categories(filters)

     def _map_row(self, row):
14 changes: 7 additions & 7 deletions cartoframes/data/observatory/catalog/repository/constants.py
@@ -1,7 +1,7 @@
-CATEGORY_FILTER = 'category_id'
-COUNTRY_FILTER = 'country_id'
-DATASET_FILTER = 'dataset_id'
-GEOGRAPHY_FILTER = 'geography_id'
-PROVIDER_FILTER = 'provider_id'
-VARIABLE_FILTER = 'variable_id'
-VARIABLE_GROUP_FILTER = 'variable_group_id'
+CATEGORY_FILTER = 'category'
+COUNTRY_FILTER = 'country'
+DATASET_FILTER = 'dataset'
+GEOGRAPHY_FILTER = 'geography'
+PROVIDER_FILTER = 'provider'
+VARIABLE_FILTER = 'variable'
+VARIABLE_GROUP_FILTER = 'variable_group'
@@ -2,7 +2,7 @@
 from .entity_repo import EntityRepository


-_COUNTRY_ID_FIELD = 'country_id'
+_COUNTRY_ID_FIELD = 'id'
 _ALLOWED_FILTERS = [CATEGORY_FILTER]


23 changes: 12 additions & 11 deletions cartoframes/data/observatory/catalog/repository/dataset_repo.py
@@ -1,11 +1,11 @@
-from .constants import CATEGORY_FILTER, COUNTRY_FILTER, GEOGRAPHY_FILTER, PROVIDER_FILTER, VARIABLE_FILTER
+from .constants import CATEGORY_FILTER, COUNTRY_FILTER, GEOGRAPHY_FILTER, PROVIDER_FILTER
 from .entity_repo import EntityRepository
-from ..entity import CatalogList

+DATASET_TYPE = 'dataset'

 _DATASET_ID_FIELD = 'id'
 _DATASET_SLUG_FIELD = 'slug'
-_ALLOWED_FILTERS = [CATEGORY_FILTER, COUNTRY_FILTER, GEOGRAPHY_FILTER, PROVIDER_FILTER, VARIABLE_FILTER]
+_ALLOWED_FILTERS = [CATEGORY_FILTER, COUNTRY_FILTER, GEOGRAPHY_FILTER, PROVIDER_FILTER]

Member: Why is VARIABLE_FILTER removed here?

Contributor: I don't remember the reason. I'm not sure whether we decided that it makes no sense, but it would be better to hear @juanrmn's opinion.

Member: It's been a while, but I think I removed it because datasets do not have a variable attribute, so, if I'm not wrong, it would have failed with the previous code anyway.

Also, including this filter in the DO API would require a fairly heavy join between datasets and variables, I think. But please correct me if I'm wrong.

Member: OK then. Let's keep it as it is and change it later if necessary.
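
To make the thread concrete: `_ALLOWED_FILTERS` acts as an allow-list of filter keys for the dataset repository. The sketch below shows one way such a list might be applied. It is an assumption for illustration only, since the real `_get_filters` method is not shown in this diff; `split_filters` and `ALLOWED_DATASET_FILTERS` are hypothetical names, and the key strings are the new values from `constants.py`.

```python
# Hypothetical allow-list, using the new filter key values from constants.py.
ALLOWED_DATASET_FILTERS = ['category', 'country', 'geography', 'provider']


def split_filters(filters, allowed):
    """Separate requested filters into accepted and rejected keys.

    Illustrative only: the repository's actual filtering code is not part
    of this diff and may behave differently (for example, by raising).
    """
    filters = filters or {}
    accepted = {key: value for key, value in filters.items() if key in allowed}
    rejected = sorted(key for key in filters if key not in allowed)
    return accepted, rejected


accepted, rejected = split_filters(
    {'country': 'usa', 'variable': 'population'},
    ALLOWED_DATASET_FILTERS,
)
print(accepted)
print(rejected)
```

With `variable` no longer in the list, a request that filters datasets by variable would not reach the API in this sketch, which matches the reasoning in the thread that datasets carry no variable attribute to filter on.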



 def get_dataset_repo():
@@ -18,10 +18,16 @@ def __init__(self):
         super(DatasetRepository, self).__init__(_DATASET_ID_FIELD, _ALLOWED_FILTERS, _DATASET_SLUG_FIELD)

     def get_all(self, filters=None, credentials=None):
+        if credentials is not None:
+            filters = self._add_subscription_ids(filters, credentials, DATASET_TYPE)
+            if filters is None:
+                return []
+
+        # Using user credentials to fetch entities
         self.client.set_user_credentials(credentials)
-        response = self._get_filtered_entities(filters)
-        self.client.set_user_credentials(None)
-        return response
+        entities = self._get_filtered_entities(filters)
+        self.client.reset_user_credentials()
+        return entities

     @classmethod
     def _get_entity_class(cls):
@@ -56,10 +62,5 @@ def _map_row(self, row):
             'available_in': self._normalize_field(row, 'available_in')
         }

-    def get_datasets_for_geographies(self, geographies):
-        rows = self.client.get_datasets_for_geographies(geographies)
-        normalized_data = [self._get_entity_class()(self._map_row(row)) for row in rows]
-        return CatalogList(normalized_data)
-

 _REPO = DatasetRepository()
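
The new `get_all` sets the user credentials on a shared client, fetches, and then resets them. If the fetch raised, the reset line would not run and the credentials would stay on the module-level client. The sketch below shows a `try`/`finally` variant of the same pattern. It is not the PR's code: `FakeClient` is a hypothetical stand-in for the repository client, and `fetch_with_credentials` is an illustrative name.

```python
class FakeClient:
    """Hypothetical stand-in for the repository client (not the real RepoClient)."""

    def __init__(self):
        self.user_credentials = None

    def set_user_credentials(self, credentials):
        self.user_credentials = credentials

    def reset_user_credentials(self):
        self.user_credentials = None


def fetch_with_credentials(client, fetch, filters=None, credentials=None):
    """Run fetch(filters) with credentials set, always resetting them afterwards."""
    client.set_user_credentials(credentials)
    try:
        return fetch(filters)
    finally:
        # Runs on success and on error, so credentials never leak to the next call.
        client.reset_user_credentials()


def failing_fetch(filters):
    raise RuntimeError('catalog unavailable')


client = FakeClient()
try:
    fetch_with_credentials(client, failing_fetch, credentials='user-credentials')
except RuntimeError:
    pass
print(client.user_credentials)
```

Whether this matters in practice depends on how the shared client is reused between calls, which the diff does not show.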
20 changes: 14 additions & 6 deletions cartoframes/data/observatory/catalog/repository/entity_repo.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
from .repo_client import RepoClient
from ..entity import CatalogList, is_slug_value
from .....exceptions import CatalogError
from ..subscriptions import get_subscription_ids


def check_catalog_connection(method):
Expand Down Expand Up @@ -65,9 +66,9 @@ def _get_filters(self, filters):

def _get_id_filter(self, id_):
if self.slug_field is not None and is_slug_value(id_):
return {self.slug_field: id_}
return {self.slug_field: [id_]}

return {self.id_field: id_}
return {self.id_field: [id_]}

def _get_id_list_filters(self, id_list):
if self.slug_field is None:
Expand All @@ -91,16 +92,23 @@ def _get_id_list_filters(self, id_list):

return filters

def _add_subscription_ids(self, filters, credentials, entity_type):
ids = get_subscription_ids(credentials, entity_type)

if not isinstance(ids, list) or len(ids) == 0:
return None

filters = filters or {}
filters['id'] = ids
return filters

@classmethod
def _to_catalog_entity(cls, result):
return cls._get_entity_class()(result)

@classmethod
def _normalize_field(cls, row, field):
if field in row:
return row[field]

return None
return row.get(field, None)

@classmethod
@abstractmethod
Expand Down
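The contract of `_add_subscription_ids` is easy to miss in diff form: it returns `None` when the user has no subscriptions, which the callers turn into an empty result, and otherwise it returns the filters with an `id` entry added. Below is a standalone sketch of that behaviour, assuming the subscription ids have already been fetched. `add_subscription_ids` is a hypothetical free function, and unlike the diff it copies the incoming dict rather than reusing it, so the caller's filters are left unchanged.

```python
def add_subscription_ids(filters, subscription_ids):
    """Restrict filters to subscribed ids, or return None if there are none.

    Hypothetical standalone version; the ids are passed in instead of being
    fetched with credentials as in the PR.
    """
    if not isinstance(subscription_ids, list) or len(subscription_ids) == 0:
        return None

    # Copy so the caller's dict is not modified (the diff reuses a non-empty dict).
    merged = dict(filters or {})
    merged['id'] = subscription_ids
    return merged


user_filters = {'country': 'usa'}
print(add_subscription_ids(user_filters, ['dataset_a', 'dataset_b']))
print(add_subscription_ids(user_filters, []))
print(user_filters)
```

Returning `None` rather than an empty dict lets the caller distinguish "no subscriptions, so return nothing" from "no filters, so return everything".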
29 changes: 16 additions & 13 deletions cartoframes/data/observatory/catalog/repository/geography_repo.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,15 @@
from cartoframes.auth import Credentials
from geopandas import GeoDataFrame

from .....utils.geom_utils import set_geometry
from .constants import COUNTRY_FILTER, CATEGORY_FILTER, PROVIDER_FILTER
from .entity_repo import EntityRepository

from .....io.carto import read_carto

GEOGRAPHY_TYPE = 'geography'

_GEOGRAPHY_ID_FIELD = 'id'
_GEOGRAPHY_SLUG_FIELD = 'slug'
_ALLOWED_FILTERS = [COUNTRY_FILTER, CATEGORY_FILTER, PROVIDER_FILTER]

_DO_CREDENTIALS = Credentials('do-metadata', 'default_public')


def get_geography_repo():
return _REPO
Expand All @@ -23,20 +21,23 @@ def __init__(self):
super(GeographyRepository, self).__init__(_GEOGRAPHY_ID_FIELD, _ALLOWED_FILTERS, _GEOGRAPHY_SLUG_FIELD)

def get_all(self, filters=None, credentials=None):
if credentials is not None:
filters = self._add_subscription_ids(filters, credentials, GEOGRAPHY_TYPE)
if filters is None:
return []

# Using user credentials to fetch entities
self.client.set_user_credentials(credentials)
response = self._get_filtered_entities(filters)
self.client.set_user_credentials(None)
return response
entities = self._get_filtered_entities(filters)
self.client.reset_user_credentials()
return entities

@classmethod
def _get_entity_class(cls):
from cartoframes.data.observatory.catalog.geography import Geography
return Geography

def _get_rows(self, filters=None):
if filters is not None and (COUNTRY_FILTER in filters.keys() or CATEGORY_FILTER in filters.keys()):
return self.client.get_geographies_joined_datasets(filters)

return self.client.get_geographies(filters)

def _map_row(self, row):
Expand All @@ -59,8 +60,10 @@ def _map_row(self, row):
}

def get_geographies_gdf(self):
query = 'select id, geom_coverage as the_geom from geographies_public where geom_coverage is not null'
return read_carto(query, _DO_CREDENTIALS)
data = self.client.get_geographies({'get_geoms_coverage': True})
gdf = GeoDataFrame(data, crs='epsg:4326')
set_geometry(gdf, col='geom_coverage', inplace=True)
return gdf


_REPO = GeographyRepository()