-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/do discovery #949
Feature/do discovery #949
Conversation
I see tests are in red (also we already merged the namespaces refactor to the Thanks! |
Tests are green, weeeeee!! |
try: | ||
from unittest.mock import Mock, patch | ||
except ImportError: | ||
from mock import Mock, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'mock.Mock' imported but unused
try: | ||
from unittest.mock import Mock, patch | ||
except ImportError: | ||
from mock import Mock, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'mock.Mock' imported but unused
try: | ||
from unittest.mock import Mock, patch | ||
except ImportError: | ||
from mock import Mock, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'mock.Mock' imported but unused
try: | ||
from unittest.mock import Mock, patch | ||
except ImportError: | ||
from mock import Mock, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'mock.Mock' imported but unused
try: | ||
from unittest.mock import Mock, patch | ||
except ImportError: | ||
from mock import Mock, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'mock.Mock' imported but unused
import pandas as pd | ||
|
||
from cartoframes.data.observatory.repository.category_repo import get_category_repo | ||
from cartoframes.data.observatory.repository.dataset_repo import get_dataset_repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we want the API methods to be static we can't have the repositories instances in these classes, that's why I created those helper methods, but I'm not sure if I like it xD
Would be better to import the Repo instead and create a private instance for the module? Something like:
from cartoframes.data.catalog.repository.category_repo import CategoryRepository
__category_repo = CategoryRepository()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are going to use the Repository pattern to switch to a different backend API when needed, what of these options would be easier to replace when we have a different backend API?
- The wrapper functions
- Instantiate the repository class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the helper. Less code and less coupling.
what of these options would be easier to replace when we have a different backend API?
I think, we could change to a different backend just changing the RepoClient
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upss turns out I'm the author of this pull request, so I can just comment.
👏 Great work so far!
I added a bunch of comments mostly of aesthetic stuff. I'd do this:
- Let's try to change to relative paths to follow our library convention.
- The rest of my comments, don't need to be tackled in this PR, otherwise it's going to be a never ending task.
I'd merge this one (after the relative paths thing being changed) so we can start integrating with the enrichment functions and create a new issue with some of my suggestions (the ones we decide to tackle).
import pandas as pd | ||
|
||
from cartoframes.data.observatory.repository.category_repo import get_category_repo | ||
from cartoframes.data.observatory.repository.dataset_repo import get_dataset_repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are going to use the Repository pattern to switch to a different backend API when needed, what of these options would be easier to replace when we have a different backend API?
- The wrapper functions
- Instantiate the repository class
def _to_categories(results): | ||
from cartoframes.data.observatory.category import Categories | ||
|
||
return Categories([CategoryRepository._to_category(result) for result in results]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to my previous comment, I see we have two ways of creating a Category
instance, am I right?
CategoryRepository._to_category(result)
or using the constructor.
If that's true, I'd try to standardize that.
This comment applies to the rest of repositories.
discovery API.ipynb
Outdated
@@ -0,0 +1,1171 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Github does not deal well with changes in notebook files, I think it's better to commit the notebooks without the outputs. So let's remove them, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, we added it here only for sharing and feedback gathering but I guess there's no need of that anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's useful. Let's remove them then before merging.
_GEOGRAPHY_FIELD_ID = 'id' | ||
|
||
|
||
class Geography(pd.Series): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing in the data model, that Geography
has a geom_coverage
column of type GEOMETRY
. How are we dealing with this type of column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not, panda is dealing with it. We don't deal with columns anymore. Does it sound right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 It's not important right now, but let's take into account in the future, since if we want to do some spatial operation, we'd need to map Geography as a geodataframe instead of a dataframe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first comment about singleton
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks cool :) I have added some comments and one more here: maybe we could add a parent class of the models and another one of the repositories, avoiding repeat some methods (JAVA vision :) ), but probably I would not do it now in this PR
import pandas as pd | ||
|
||
from cartoframes.data.observatory.repository.category_repo import get_category_repo | ||
from cartoframes.data.observatory.repository.dataset_repo import get_dataset_repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the helper. Less code and less coupling.
what of these options would be easier to replace when we have a different backend API?
I think, we could change to a different backend just changing the RepoClient
.
return Category({ | ||
'id': result['id'], | ||
'name': result['name'] | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapping the classes, we could have a problem with previous versions in the future: if we release a version doing the mapping, and after that, we change a column from the database, the released version will fail.
As we want to have all the catalog public, we could put the responsibility of the API properties in the backend and making the CARTOframes client transparent
|
||
@staticmethod | ||
def _to_country(result): | ||
from cartoframes.data.observatory.country import Country |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer doing the import
on the top of the file. Are you trying to avoid the circular pain
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see my comment above. This is the only way I found to make it work without changing the API. I think it's a design flaw due to mixing model and behavior :(
class TestCategoryRepo(unittest.TestCase): | ||
|
||
def setUp(self): | ||
RepoClient.get_categories = Mock(return_value=[db_category1, db_category2]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation is not a multiple of four
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 LGTM!
I cannot approve the PR because I created it, but it's approved.
Let's clean the notebook output before merging, please.
Closes #960