Source GitHub: handle ContributorActivity continuous ACCEPTED response #31386

Merged
merged 3 commits into from Oct 13, 2023
2 changes: 1 addition & 1 deletion airbyte-integrations/connectors/source-github/Dockerfile
@@ -12,5 +12,5 @@ RUN pip install .
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

-LABEL io.airbyte.version=1.5.1
+LABEL io.airbyte.version=1.5.2
LABEL io.airbyte.name=airbyte/source-github
@@ -5,7 +5,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: ef69ef6e-aa7f-4af1-a01d-ef775033524e
-dockerImageTag: 1.5.1
+dockerImageTag: 1.5.2
maxSecondsBetweenMessages: 5400
dockerRepository: airbyte/source-github
githubIssueLabel: source-github
@@ -1600,6 +1600,17 @@ def parse_response(
response, stream_state=stream_state, stream_slice=stream_slice, next_page_token=next_page_token
)

+    def read_records(self, stream_slice: Mapping[str, Any] = None, **kwargs) -> Iterable[Mapping[str, Any]]:
+        repository = stream_slice.get("repository", "")
+        try:
+            yield from super().read_records(stream_slice=stream_slice, **kwargs)
+        except HTTPError as e:
+            if e.response.status_code == requests.codes.ACCEPTED:
+                self.logger.info(f"Syncing `{self.__class__.__name__}` stream isn't available for repository `{repository}`.")
+                yield
+            else:
+                raise e


class IssueTimelineEvents(GithubStream):
"""
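Note on the `read_records` override above: the pattern is "wrap the parent generator, swallow one specific HTTP status, re-raise everything else". A minimal standalone sketch of that pattern follows. It uses only the standard library, so `FakeResponse`, `FakeHTTPError`, and `read_records_skipping_accepted` are hypothetical stand-ins for `requests.Response`, `HTTPError`, and the stream method; they are not part of the connector. One deliberate difference, which is an assumption of this sketch and not a claim about the merged code: the diff uses a bare `yield` (which emits a single `None`), whereas the sketch simply returns without emitting anything.

```python
import logging
from typing import Any, Callable, Iterable, Iterator, Mapping

logger = logging.getLogger("contributor_activity_sketch")

ACCEPTED = 202  # same numeric value as requests.codes.ACCEPTED


class FakeResponse:
    """Hypothetical stand-in for requests.Response; only status_code is modelled."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code


class FakeHTTPError(Exception):
    """Hypothetical stand-in for an HTTPError that carries its response."""

    def __init__(self, response: FakeResponse) -> None:
        super().__init__(f"HTTP {response.status_code}")
        self.response = response


def read_records_skipping_accepted(
    fetch: Callable[[], Iterable[Mapping[str, Any]]],
    repository: str,
) -> Iterator[Mapping[str, Any]]:
    """Yield records from fetch(); treat a 202 error as 'no data for this repository'."""
    try:
        yield from fetch()
    except FakeHTTPError as e:
        if e.response.status_code == ACCEPTED:
            logger.info("Syncing stream isn't available for repository `%s`.", repository)
            return  # end the generator without emitting a placeholder record
        raise  # any other status is still an error


def always_accepted() -> Iterable[Mapping[str, Any]]:
    """Simulates the API answering 202 until retries are exhausted."""
    raise FakeHTTPError(FakeResponse(ACCEPTED))


def server_error() -> Iterable[Mapping[str, Any]]:
    """Simulates a failure that must not be swallowed."""
    raise FakeHTTPError(FakeResponse(500))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(list(read_records_skipping_accepted(always_accepted, "airbytehq/airbyte")))
```

Because the function is a generator, the error only surfaces once the caller starts iterating, which is why the test in this PR wraps the read in `list(...)`.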
@@ -1368,7 +1368,6 @@ def test_stream_contributor_activity_parse_empty_response(caplog):


@responses.activate
-@patch("time.sleep", return_value=0)
def test_stream_contributor_activity_accepted_response(caplog):
repository_args = {
"page_size_for_large_streams": 20,
@@ -1381,9 +1380,10 @@ def test_stream_contributor_activity_accepted_response(caplog):
body="",
status=202,
)
-    with pytest.raises(UserDefinedBackoffException):
+    with patch("time.sleep", return_value=0):
         list(read_full_refresh(stream))
     assert resp.call_count == 6
+    assert "Syncing `ContributorActivity` stream isn't available for repository `airbytehq/airbyte`." in caplog.messages


@responses.activate
9 changes: 4 additions & 5 deletions docs/integrations/sources/github.inapp.md
@@ -1,12 +1,13 @@
## Prerequisites

-- Access to a Github repository
+- List of GitHub Repositories (and access for them in case they are private)

## Setup guide

1. Name your source.
2. Click `Authenticate your GitHub account` or use a [Personal Access Token](https://github.com/settings/tokens) for Authentication. For Personal Access Tokens, refer to the list of required [permissions and scopes](https://docs.airbyte.com/integrations/sources/github#permissions-and-scopes).
-3. **Start date** Enter the date you'd like to replicate data from.
+3. **GitHub Repositories** - Enter a list of GitHub organizations or repositories.
+4. (Optional) **Start date** Enter the date you'd like to replicate data from.

These streams will only sync records generated on or after the **Start Date**:

@@ -16,8 +17,6 @@ The **Start Date** does not apply to the streams below and all data will be sync

`assignees`, `branches`, `collaborators`, `issue_labels`, `organizations`, `pull_request_commits`, `pull_request_stats`, `repositories`, `tags`, `teams`, `users`

-4. **GitHub Repositories** - Enter a space-delimited list of GitHub organizations or repositories.

Example of a single repository:
```
airbytehq/airbyte
@@ -32,7 +31,7 @@ airbytehq/*
```
Repositories which have a misspelled name, do not exist, or have the wrong name format will return an error.

-5. (Optional) **Branch** - Enter a space-delimited list of GitHub repository branches to pull commits for, e.g. `airbytehq/airbyte/master`. If no branches are specified for a repository, the default branch will be pulled. (e.g. `airbytehq/airbyte/master airbytehq/airbyte/my-branch`).
+5. (Optional) **Branch** - Enter a list of GitHub repository branches to pull commits for, e.g. `airbytehq/airbyte/master`. If no branches are specified for a repository, the default branch will be pulled. (e.g. `airbytehq/airbyte/master airbytehq/airbyte/my-branch`).
6. (Optional) **Max requests per hour** - The GitHub API allows for a maximum of 5000 requests per hour (15,000 for Github Enterprise). You can specify a lower value to limit your use of the API quota.
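
To put the **Max requests per hour** quotas in perspective, they can be converted into an average spacing between requests. The sketch below is only arithmetic under the assumption that requests are spread evenly across the hour; it is not a description of how the connector throttles, and `min_seconds_between_requests` is a hypothetical helper name.

```python
def min_seconds_between_requests(max_requests_per_hour: int) -> float:
    """Average spacing needed to stay within an hourly quota (even-spread assumption)."""
    if max_requests_per_hour <= 0:
        raise ValueError("max_requests_per_hour must be positive")
    return 3600 / max_requests_per_hour


if __name__ == "__main__":
    # Quotas quoted in the setup guide: 5000/hour, 15,000/hour for GitHub Enterprise.
    for label, quota in (("GitHub API", 5000), ("GitHub Enterprise", 15000)):
        spacing = min_seconds_between_requests(quota)
        print(f"{label}: {quota} requests/hour -> one request every {spacing:.2f} s")
```

Setting a lower value than the quota widens that spacing proportionally, for example 1000 requests per hour corresponds to one request every 3.6 seconds on average.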

### Incremental Sync Methods
1 change: 1 addition & 0 deletions docs/integrations/sources/github.md
@@ -164,6 +164,7 @@ The GitHub connector should not run into GitHub API limitations under normal usa

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1.5.2 | 2023-10-13 | [31386](https://github.com/airbytehq/airbyte/pull/31386) | Handle `ContributorActivity` continuous `ACCEPTED` response |
| 1.5.1 | 2023-10-12 | [31307](https://github.com/airbytehq/airbyte/pull/31307) | Increase backoff_time for stream `ContributorActivity` |
| 1.5.0 | 2023-10-11 | [31300](https://github.com/airbytehq/airbyte/pull/31300) | Update Schemas: Add date-time format to fields |
| 1.4.6 | 2023-10-04 | [31056](https://github.com/airbytehq/airbyte/pull/31056) | Migrate spec properties' `repository` and `branch` type to \<array\> |