Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO] Use methods instead of properties for discovery #979

Merged
merged 14 commits into from
Sep 16, 2019

Conversation

esloho
Copy link
Contributor

@esloho esloho commented Sep 12, 2019

Closes #961

Besides the conversion of properties into methods, this PR also includes:

  • Raising an error if no entity is found in the server for a given id (instead of returning None)
  • Renaming of entities and repositories methods for a more Python-like flavor (less verbose)
  • Use of an internal instance for every Repository class to be used commonly instead of creating a new instance for each single use
  • Addition of a new notebook to be used as example for the discovery part (in examples/07_data_observatory/discovery.ipynb ). Still a work in progress, but it's a start (and useful for local testing while developing).

Copy link
Contributor

@alrocar alrocar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super clean implementation! ❤️

Left some minor comments about the messaging and some others to agree on the API design.

@@ -18,12 +18,11 @@ def _constructor_expanddim(self):
return Categories

@staticmethod
def get_by_id(category_id):
return get_category_repo().get_by_id(category_id)
def by_id(category_id):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is just syntax sugar but let's agree on this API design decision.

I liked the get_* version for a couple of reasons:

  • In general I see functions as verbs (where reasonable)
  • You have all the get functions under the same umbrella. It's easier if you use an auto-complete system to write get and see what different options you have to get Categories
  • There are other frameworks (such as Django) that follow that convention get_by_*

The case for countries() and categories() is a bit different, those are known classes and since they can't just be properties I'm OK with keeping them as nouns

Thoughts? (not sure if we changed it for any other reason)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like those reasons. I changed back all the methods names 👍

result = self.client.get_categories('id', category_id)

if len(result) == 0:
return None
raise CartoException('The id does not correspond with any existing category in Data Observatory.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a comment here to take into account in the rest of classes

I'd use a subclass of CartoException for this, something like ObjectDoesNotExistError or similar. So we can let the user to just catch CartoException or doing a more fine grained exception handling.

Also, I'd replace in Data Observatory by in the catalog. I'd try to add some hint on what to do next, like "you can list the available list of categories IDs by calling to Category.all()" (or whatever)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an extension of CartoException (which I see is defined in the CARTO's boilerplate) or an unrelated one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm I don't have a strong opinion about that, maybe we can have a generic DiscoveryException or similar, unrelated from CartoException which tend to be more related to server side errors.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a dedicated exception DiscoveryException in a new module exceptions at the root package. If we create new ones in the future then we could create a package exceptions with modules for each group. Does it sounds OK?

"source": [
"# Discoverying content of interest in the Data Observatory\n",
"\n",
"CARTO's Data Observatory is a powerful tool for exploring the available datasets in our data lake. Through the discovery API you would be able to navigate through the datasets and their properties, thus knowing in advanced which sources may be of interest for you before even requesting access to them. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd write this as:

Suggested change
"CARTO's Data Observatory is a powerful tool for exploring the available datasets in our data lake. Through the discovery API you would be able to navigate through the datasets and their properties, thus knowing in advanced which sources may be of interest for you before even requesting access to them. "
"The Discovery API is a powerful tool for exploring the available datasets in our Data Observatory catalog. Through the Discovery API you would be able to navigate through the datasets and their variables, thus knowing in advanced which sources may be of interest for you before even requesting access to them. "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I changed all the mentions in the notebook.

@@ -0,0 +1,253 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this to a 07_discovery or 07_catalog folder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@esloho esloho assigned alrocar and unassigned esloho Sep 13, 2019
@@ -0,0 +1,56 @@
import pandas as pd

from cartoframes.data.observatory.repository.provider_repo import get_provider_repo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just realized according to our convention we forgot to use relative imports in this PR.

@@ -0,0 +1,253 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just ran the notebook and it seems it's outdated (still using properties for countries or datasets or variables. Please try to run it locally to fix those errors.

result = self.client.get_categories('id', category_id)

if len(result) == 0:
return None
raise CartoException('The id does not correspond with any existing category in Data Observatory.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm I don't have a strong opinion about that, maybe we can have a generic DiscoveryException or similar, unrelated from CartoException which tend to be more related to server side errors.

What do you think?

Copy link
Contributor

@alrocar alrocar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's re-run and fix the notebook, seems it got outdated.

🏓

@alrocar alrocar assigned esloho and unassigned alrocar Sep 13, 2019
@esloho esloho assigned alrocar and unassigned esloho Sep 16, 2019
@esloho esloho requested a review from alrocar September 16, 2019 12:36
Copy link
Contributor

@alrocar alrocar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VAMOS!! 🚀

@alrocar alrocar removed their assignment Sep 16, 2019
@alrocar alrocar merged commit 18f0be2 into develop Sep 16, 2019
@alrocar alrocar deleted the feature/961-properties branch September 16, 2019 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data Discovery - properties
3 participants