Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDK MyPy #3175

Merged
merged 16 commits into from
May 4, 2021
Merged

CDK MyPy #3175

merged 16 commits into from
May 4, 2021

Conversation

davinchia
Copy link
Contributor

@davinchia davinchia commented May 3, 2021

What

Before we write tests, run MyPy on the base_python module in order to ensure consistent interfaces/types throughout the module.

Have added comments where-ever necessary to explain the error.

Since there the jsonschema module doesn't have a typeshed definition, I've instructed MyPy to ignore it.

How

mypy airbyte_cdk/base_python shows no errors.

➜  python git:(davinchia/cdk-mypy) ✗ mypy airbyte_cdk/base_python
Success: no issues found in 22 source files

Note, I attempted to fix the singer module and didn't get far since the interfaces have changed quite drastically. We should look at this during our refactor.

""" Override to define additional parameters """
payload = {
payload: MutableMapping[str, Any] = {
Copy link
Contributor Author

@davinchia davinchia May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had to define this as the return type is a Mapping and MyPy threw and error as we are modifying this struct before returning it.

@@ -79,7 +79,7 @@ def as_airbyte_stream(self) -> AirbyteStream:

if self.supports_incremental:
stream.source_defined_cursor = self.source_defined_cursor
stream.supported_sync_modes.append(SyncMode.incremental)
stream.supported_sync_modes.append(SyncMode.incremental) # type: ignore
Copy link
Contributor Author

@davinchia davinchia May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ignored as it's I'm fairly certain this is assigned. MyPy isn't detecting and assumes the field is None for some reason.

cursor_field: List[str] = None,
stream_slice: Mapping[str, Any] = None,
Copy link
Contributor Author

@davinchia davinchia May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this previously did not match the Stream interface's read_records method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

return ConnectorSpecification.parse_obj(json.loads(raw_spec))

def check(self, logger: AirbyteLogger, config: json) -> AirbyteConnectionStatus:
def check(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These function signatures previously did not match the classes implementing them. This now matches.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

@@ -46,7 +46,7 @@
class BaseSource(Source):
"""Base source that designed to work with clients derived from BaseClient"""

client_class: Type[BaseClient] = None
client_class: Type[BaseClient]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the class type hint shouldn't be assigned to

@@ -116,7 +116,7 @@ def discover(self, logger: AirbyteLogger, config_container) -> AirbyteCatalog:

def read(
self, logger: AirbyteLogger, config_container: ConfigContainer, catalog_path: str, state_path: str = None
) -> Generator[AirbyteMessage, None, None]:
) -> Iterator[AirbyteMessage, None, None]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match the Source interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make it simpler by changing both to Iterable[AirbyteMessage]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeap. this is from the existing interface type.

@davinchia
Copy link
Contributor Author

davinchia commented May 3, 2021

/test connector=source-appstore-singer

🕑 source-appstore-singer https://github.com/airbytehq/airbyte/actions/runs/806507873
✅ source-appstore-singer https://github.com/airbytehq/airbyte/actions/runs/806507873

@davinchia davinchia requested a review from sherifnada May 3, 2021 11:08
@davinchia davinchia marked this pull request as ready for review May 3, 2021 11:18
@davinchia davinchia removed the request for review from jrhizor May 3, 2021 11:18
Copy link
Contributor

@ChristopheDuong ChristopheDuong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy does really help going through the code, Nice!

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! I think there is a breaking change in http.py.

Also, where is MyPy installed? Shouldn't it be in setup.py?

airbyte-cdk/python/README.md Outdated Show resolved Hide resolved
args = {"sync_mode": SyncMode.full_refresh, "cursor_field": configured_stream.cursor_field}
for slices in stream_instance.stream_slices(**args):
for record in stream_instance.read_records(stream_slice=slices, **args):
for slices in stream_instance.stream_slices(sync_mode=SyncMode.full_refresh, cursor_field=configured_stream.cursor_field):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why make this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matter of taste, but I find it is more readable like this:

records = stream_instance.read_records(stream_slice=slices, sync_mode=SyncMode.full_refresh, cursor_field=configured_stream.cursor_field)
for record in records:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➜  python git:(davinchia/cdk-mypy) ✗ mypy airbyte_cdk/base_python
airbyte_cdk/base_python/cdk/abstract_source.py:170: error: Argument 1 to "stream_slices" of "Stream" has incompatible type "**Dict[str, Union[SyncMode, List[str], None]]"; expected "SyncMode"
airbyte_cdk/base_python/cdk/abstract_source.py:170: error: Argument 1 to "stream_slices" of "Stream" has incompatible type "**Dict[str, Union[SyncMode, List[str], None]]"; expected "Optional[List[str]]"
airbyte_cdk/base_python/cdk/abstract_source.py:170: error: Argument 1 to "stream_slices" of "Stream" has incompatible type "**Dict[str, Union[SyncMode, List[str], None]]"; expected "Optional[Mapping[str, Any]]"
Found 3 errors in 1 file (checked 22 source files)

MyPy doesn't like kwargs, since it can't be sure that the entries actually exists and are bound with the right type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable to make it here

headers=dict(self.request_headers(**args), **self.authenticator.get_auth_header()),
params=self.request_params(**args),
json=self.request_body_json(**args),
path=self.path(args),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems incorrect. **args and args are different. The former denests keyword args from a dictionary, the latter passes a dict as an argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I figured this out - the header and request_body json functions do not accept an Optional with their first parameter, which is causing this error:

➜  python git:(davinchia/cdk-mypy) ✗ mypy airbyte_cdk/base_python
airbyte_cdk/base_python/cdk/streams/http.py:218: error: Argument 1 to "request_params" of "HttpStream" has incompatible type "**Dict[str, Optional[Mapping[str, Any]]]"; expected "Mapping[str, Any]"
airbyte_cdk/base_python/cdk/streams/http.py:219: error: Argument 1 to "request_body_json" of "HttpStream" has incompatible type "**Dict[str, Optional[Mapping[str, Any]]]"; expected "Mapping[str, Any]"
Found 2 errors in 1 file (checked 22 source files)

I think the function signature should continue being specific (instead of accepting Optional), so I'm going to break out the arguments. What do you think?

airbyte-cdk/python/airbyte_cdk/base_python/integration.py Outdated Show resolved Hide resolved
return ConnectorSpecification.parse_obj(json.loads(raw_spec))

def check(self, logger: AirbyteLogger, config: json) -> AirbyteConnectionStatus:
def check(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

@@ -116,7 +116,7 @@ def discover(self, logger: AirbyteLogger, config_container) -> AirbyteCatalog:

def read(
self, logger: AirbyteLogger, config_container: ConfigContainer, catalog_path: str, state_path: str = None
) -> Generator[AirbyteMessage, None, None]:
) -> Iterator[AirbyteMessage, None, None]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make it simpler by changing both to Iterable[AirbyteMessage]?

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is one more breaking change in http.py but otherwise lgtm

@davinchia davinchia changed the title Davinchia/cdk mypy CDK MyPy May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants