Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Source Github: return AirbyteMessage if max retry exeeded for 202 status code #32679

Merged
merged 10 commits into from
Nov 23, 2023

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: ef69ef6e-aa7f-4af1-a01d-ef775033524e
dockerImageTag: 1.5.3
dockerImageTag: 1.5.4
dockerRepository: airbyte/source-github
documentationUrl: https://docs.airbyte.com/integrations/sources/github
githubIssueLabel: source-github
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@

import pendulum
import requests
from airbyte_cdk.models import SyncMode
from airbyte_cdk.models import AirbyteLogMessage, AirbyteMessage, Level, SyncMode
from airbyte_cdk.models import Type as MessageType
from airbyte_cdk.sources.streams.availability_strategy import AvailabilityStrategy
from airbyte_cdk.sources.streams.http import HttpStream
from airbyte_cdk.sources.streams.http.exceptions import DefaultBackoffException
Expand Down Expand Up @@ -1606,8 +1607,13 @@ def read_records(self, stream_slice: Mapping[str, Any] = None, **kwargs) -> Iter
yield from super().read_records(stream_slice=stream_slice, **kwargs)
except HTTPError as e:
if e.response.status_code == requests.codes.ACCEPTED:
self.logger.info(f"Syncing `{self.__class__.__name__}` stream isn't available for repository `{repository}`.")
yield
yield AirbyteMessage(
type=MessageType.LOG,
log=AirbyteLogMessage(
level=Level.INFO,
message=f"Syncing `{self.__class__.__name__}` " f"stream isn't available for repository `{repository}`.",
),
)
else:
raise e

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,11 +10,11 @@
import pytest
import requests
import responses
from airbyte_cdk.models import SyncMode
from airbyte_cdk.models import ConfiguredAirbyteCatalog, SyncMode
from airbyte_cdk.sources.streams.http.exceptions import BaseBackoffException, UserDefinedBackoffException
from requests import HTTPError
from responses import matchers
from source_github import constants
from source_github import SourceGithub, constants
from source_github.streams import (
Branches,
Collaborators,
Expand Down Expand Up @@ -1369,21 +1369,50 @@ def test_stream_contributor_activity_parse_empty_response(caplog):

@responses.activate
def test_stream_contributor_activity_accepted_response(caplog):
repository_args = {
"page_size_for_large_streams": 20,
"repositories": ["airbytehq/airbyte"],
}
stream = ContributorActivity(**repository_args)
responses.add(
responses.GET,
"https://api.github.com/repos/airbytehq/test_airbyte?per_page=100",
json={"full_name": "airbytehq/test_airbyte"},
status=200,
)
responses.add(
responses.GET,
"https://api.github.com/repos/airbytehq/test_airbyte?per_page=100",
json={"full_name": "airbytehq/test_airbyte", "default_branch": "default_branch"},
status=200,
)
responses.add(
responses.GET,
"https://api.github.com/repos/airbytehq/test_airbyte/branches?per_page=100",
json={},
status=200,
)
resp = responses.add(
responses.GET,
"https://api.github.com/repos/airbytehq/airbyte/stats/contributors",
"https://api.github.com/repos/airbytehq/test_airbyte/stats/contributors?per_page=100",
body="",
status=202,
)

source = SourceGithub()
configured_catalog = {
"streams": [
{
"stream": {"name": "contributor_activity", "json_schema": {}, "supported_sync_modes": ["full_refresh"],"source_defined_primary_key": [["id"]]},
"sync_mode": "full_refresh",
"destination_sync_mode": "overwrite"
}
]
}
catalog = ConfiguredAirbyteCatalog.parse_obj(configured_catalog)
config = {"access_token": "test_token", "repository": "airbytehq/test_airbyte"}
logger_mock = MagicMock()

with patch("time.sleep", return_value=0):
list(read_full_refresh(stream))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's reproduce this bug in a test so we would not miss it again

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test reproduces this "bug", None is not returning now from read records so read results now can be correctly parsed in https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/utils/record_helper.py#L14.

records = list(source.read(config=config, logger=logger_mock, catalog=catalog, state={}))

assert records[2].log.message == "Syncing `ContributorActivity` stream isn't available for repository `airbytehq/test_airbyte`."
assert resp.call_count == 6
assert "Syncing `ContributorActivity` stream isn't available for repository `airbytehq/airbyte`." in caplog.messages


@responses.activate
Expand Down
3 changes: 2 additions & 1 deletion docs/integrations/sources/github.md
Original file line number Diff line number Diff line change
Expand Up @@ -193,7 +193,8 @@ Your token should have at least the `repo` scope. Depending on which streams you

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1.5.3 | 2023-10-23 | [31702](https://github.com/airbytehq/airbyte/pull/31702) | Base image migration: remove Dockerfile and use the python-connector-base image |
| 1.5.4 | 2023-11-20 | [32679](https://github.com/airbytehq/airbyte/pull/32679) | Return AirbyteMessage if max retry exeeded for 202 status code |
| 1.5.3 | 2023-10-23 | [31702](https://github.com/airbytehq/airbyte/pull/31702) | Base image migration: remove Dockerfile and use the python-connector-base image |
| 1.5.2 | 2023-10-13 | [31386](https://github.com/airbytehq/airbyte/pull/31386) | Handle `ContributorActivity` continuous `ACCEPTED` response |
| 1.5.1 | 2023-10-12 | [31307](https://github.com/airbytehq/airbyte/pull/31307) | Increase backoff_time for stream `ContributorActivity` |
| 1.5.0 | 2023-10-11 | [31300](https://github.com/airbytehq/airbyte/pull/31300) | Update Schemas: Add date-time format to fields |
Expand Down