New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
style(mypy): enforcing mypy typing for superset.connectors module #9824
style(mypy): enforcing mypy typing for superset.connectors module #9824
Conversation
9939433
to
507086a
Compare
Oh boy, gonna need to take some sedatives before this 😄 Thanks for taking this on @john-bodley , I'll review shortly. |
Codecov Report
@@ Coverage Diff @@
## master #9824 +/- ##
==========================================
- Coverage 70.88% 66.33% -4.55%
==========================================
Files 585 586 +1
Lines 30432 30670 +238
Branches 3152 3158 +6
==========================================
- Hits 21571 20346 -1225
- Misses 8749 10143 +1394
- Partials 112 181 +69
Continue to review full report at Codecov.
|
@villebro I started out thinking this would be a somewhat mindless/trivial change to kill some time whilst sheltering-in-place. It definitely became more complex/tedious than I initially had hoped. |
368dc7e
to
9cea662
Compare
9cea662
to
c9c7a59
Compare
superset/connectors/sqla/models.py
Outdated
@@ -841,6 +858,7 @@ def get_sqla_query( # sqla | |||
|
|||
where_clause_and = [] | |||
having_clause_and: List = [] | |||
assert filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this could be None
per the function signature and thus line #862 would throw an exception, however to the best of my knowledge this hasn't happened and thus filter
is probably never None
.
""" | ||
Delete function logic, override to implement diferent logic | ||
deletes the record with primary_key = primary_key | ||
|
||
:param primary_key: | ||
record primary key to delete | ||
""" | ||
item = self.datamodel.get(primary_key, self._base_filters) | ||
item = self.datamodel.get(primary_key, self._base_filters) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because FAB isn't typed and there's no stubs it seems simpler to just ignore these.
self.x_metric = form_data.get("x") | ||
self.y_metric = form_data.get("y") | ||
self.z_metric = form_data.get("size") | ||
self.x_metric = form_data["x"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I perceive these are all required otherwise utils.get_metric_name(...)
etc. would return None
and cause issues when trying to index the Pandas DataFrame.
c9c7a59
to
8142a6c
Compare
eef8562
to
a7e5c32
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass
superset/connectors/base/models.py
Outdated
@@ -93,7 +93,7 @@ class BaseDatasource( | |||
update_from_object_fields: List[str] | |||
|
|||
@declared_attr | |||
def slices(self): | |||
def slices(self) -> relationship: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be RelationshipProperty
. See: https://docs.sqlalchemy.org/en/13/orm/relationship_api.html#sqlalchemy.orm.relationship
superset/connectors/base/models.py
Outdated
def select_star(self) -> str: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs to be Optional[str]
or alternatively raise NotImplementedError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro a NotImplementedError
causes havoc for Druid hence why this is pass
. I made this an Optional[str]
.
superset/connectors/druid/models.py
Outdated
@@ -449,7 +460,7 @@ def get_perm(self) -> Optional[str]: | |||
|
|||
@classmethod | |||
def import_obj(cls, i_metric: "DruidMetric") -> "DruidMetric": | |||
def lookup_obj(lookup_metric: DruidMetric) -> Optional[DruidMetric]: | |||
def lookup_obj(lookup_metric: "DruidMetric") -> Optional["DruidMetric"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how a nested function referencing its parent method's class is handled by mypy, but if this was working before, we should probably be ok leaving this without quotes?
superset/typing.py
Outdated
DbapiDescriptionRow = Tuple[ | ||
str, str, Optional[str], Optional[str], Optional[int], Optional[int], bool | ||
] | ||
DbapiDescription = Union[List[DbapiDescriptionRow], Tuple[DbapiDescriptionRow, ...]] | ||
DbapiResult = List[Union[List[Any], Tuple[Any, ...]]] | ||
FilterValue = Union[float, int, str] | ||
FilterValues = Union[FilterValue, List[FilterValue], Tuple[FilterValue]] | ||
Granularity = Union[str, Dict[str, Union[str, float]]] | ||
Metric = Union[Dict[str, str], str] | ||
QueryObject = Dict[str, Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably call this QueryObjectDict
to distinguish from superset.common.query_object.QueryObject
.
@@ -558,8 +569,8 @@ def get_perm(self) -> str: | |||
obj=self | |||
) | |||
|
|||
def update_from_object(self, obj): | |||
return NotImplementedError() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄
@@ -817,7 +829,7 @@ def granularity( | |||
"year": "P1Y", | |||
} | |||
|
|||
granularity: Dict[str, Union[str, float]] = {"type": "period"} | |||
granularity = {"type": "period"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought this turns into a Dict[str, str]
unless being explicitly being defined as Granularity
, and thus causing problems below when numeric values are added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro I guess it differs from the Granularity
type which I defined previously (which adds to the confusion). When including the typing here it causes a conflict with the return type as it can also be a string per line 812.
A couple of learnings from having typed a bunch of Python code:
Optional[...]
values are difficult to handle and thus should be avoided if possible. It seems they're often used by laziness.- Mixed types, i.e., those requiring a
Union[...]
normally indicates poor quality and unnecessarily complex code.
I think the return type of granularity
is a clear indication of (2). Also I believe the true return type should be Union[Dict[str, Union[str, float]], str]
but mypy
complains on this.
superset/connectors/druid/models.py
Outdated
@@ -1240,7 +1268,7 @@ def run_query( # druid | |||
qry["limit"] = row_limit | |||
client.scan(**qry) | |||
elif (IS_SIP_38 and columns) or ( | |||
not IS_SIP_38 and len(groupby) == 0 and not having_filters | |||
not IS_SIP_38 and (not groupby or len(groupby) == 0) and not having_filters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if a simple falsy check here would be more pythonic: not groupby
? I can't think of an edge case for which it wouldn't give the correct results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro thanks. I agree and made this change elsewhere.
superset/connectors/druid/models.py
Outdated
|
||
order_by = None | ||
order_by = None # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we define this as order_by: Optional[Metric]
, we can probably omit the ignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro the issue here is that order_by
has already been defined on line 1291,
order_by = utils.get_metric_name(timeseries_limit_metric)
I declared the type as order_by: Optional[str] = None
.
a7e5c32
to
dac1337
Compare
66c673b
to
867d57f
Compare
Codecov Report
@@ Coverage Diff @@
## master #9824 +/- ##
==========================================
- Coverage 71.22% 71.21% -0.01%
==========================================
Files 585 585
Lines 30828 30864 +36
Branches 3237 3237
==========================================
+ Hits 21957 21981 +24
- Misses 8762 8774 +12
Partials 109 109
Continue to review full report at Codecov.
|
867d57f
to
8c4c77a
Compare
8c4c77a
to
eef2c96
Compare
@@ -985,11 +988,11 @@ def is_adhoc_metric(metric) -> bool: | |||
) | |||
|
|||
|
|||
def get_metric_name(metric): | |||
return metric["label"] if is_adhoc_metric(metric) else metric | |||
def get_metric_name(metric: Metric) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro the code works if the metric
is None
(this can happen when a key exists in the form-data but the value is None
) even though the Metric
type is explicitly Union[Dict[str, str], str]
. Generally I think we should strive to ensure only valid inputs are provided to functions (otherwise the logic can get quite complex), i.e., I prefer,
if metric:
name = get_metric_name(metric)
...
rather than,
name = get_metric_name(metric)
if name:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
eef2c96
to
b66071f
Compare
6c67fde
to
3c40dd7
Compare
@villebro would you mind taking another pass at this PR? All the CI tasks are now passing. |
Sure thing,@john-bodley , I'll review during the weekend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One non blocking comment/question, apart from that looking very good.
superset/typing.py
Outdated
FlaskResponse = Union[ | ||
flask.wrappers.Response, | ||
werkzeug.wrappers.Response, | ||
Base, | ||
Union[Base, Status], | ||
Union[Base, Status, Headers], | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we already have a wrapping Union
here, couldn't this be expressed as Union[..., Base, Status, Headers]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that makes sense. Unrelated comment, it's really strange how practically none of the major libraries have implemented type annotations yet. Yes, stubs exist for many, but that doesn't really help when reading code.
@@ -985,11 +988,11 @@ def is_adhoc_metric(metric) -> bool: | |||
) | |||
|
|||
|
|||
def get_metric_name(metric): | |||
return metric["label"] if is_adhoc_metric(metric) else metric | |||
def get_metric_name(metric: Metric) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
3c40dd7
to
fb2bcaa
Compare
LGTM after the |
fb2bcaa
to
eeb2729
Compare
Co-authored-by: John Bodley <john.bodley@airbnb.com>
@@ -256,8 +259,6 @@ def parse_human_datetime(s: Optional[str]) -> Optional[datetime]: | |||
>>> year_ago_1 == year_ago_2 | |||
True | |||
""" | |||
if not s: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FOUND IT. @john-bodley
Co-authored-by: John Bodley <john.bodley@airbnb.com>
Co-authored-by: John Bodley <john.bodley@airbnb.com>
SUMMARY
Adding
mypy
type enforcement to thesuperset.connectors
module.Note this PR was much more involved/complex than I had originally thought given the dependency on other modules which were untyped. Additionally the overall code quality is quite poor; large complex functions, numerous variables which were either optional or with mixed types (
Union
), variable redefinition, etc. meant that the logic was quite hard to grok.Note this PR may introduce some regressions due to the necessary restructuring, however I sense long-term benefits of having the entire repo typed outweighs the short-term concerns regarding possibly (minor) regressions.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TEST PLAN
CI.
ADDITIONAL INFORMATION