
Github Async Client #22734

Draft

HadhemiDD wants to merge 15 commits into master from hs/async-client

Conversation

@HadhemiDD (Contributor)

What does this PR do?

Build the dispatcher github async client.

The GitHubAsyncClient is a non-blocking HTTP client that can make multiple GitHub API requests concurrently using Python's async/await syntax.

Sync (current github manager) vs Async (GitHubAsyncClient)

| Aspect | Sync (GitHubManager) | Async (GitHubAsyncClient) |
| --- | --- | --- |
| Blocking | Yes: waits for each request | No: continues while waiting |
| Concurrency | Sequential only | Multiple requests at once |
| Speed | Slower (3 requests = 3× time) | Faster (3 requests ≈ 1× time) |
| Syntax | response = client.get(url) | response = await client.get(url) |
| Context | with | async with |
| Use case | Simple, one-at-a-time requests | Bulk operations, monitoring |
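The speed row can be sketched with a toy example. This is not the actual GitHubAsyncClient API; fake_get is a hypothetical stand-in for an awaited HTTP call such as httpx.AsyncClient.get:

```python
import asyncio
import time


async def fake_get(url: str) -> str:
    # Stand-in for an awaited HTTP call; the 0.1 s sleep simulates latency.
    await asyncio.sleep(0.1)
    return f"response for {url}"


async def fetch_all(urls: list[str]) -> list[str]:
    # All requests are in flight at once, so total wall time is roughly
    # one request's latency, not len(urls) times it.
    return await asyncio.gather(*(fake_get(u) for u in urls))


def timed_fetch() -> tuple[int, float]:
    start = time.perf_counter()
    results = asyncio.run(fetch_all(["/a", "/b", "/c"]))
    return len(results), time.perf_counter() - start
```

Three simulated 0.1 s requests complete in roughly 0.1 s total, illustrating the "3 requests ≈ 1× time" claim.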

Motivation

Jira card

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged


codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.27%. Comparing base (af090e4) to head (0c6fca2).
⚠️ Report is 33 commits behind head on master.



datadog-official bot commented Mar 6, 2026

⚠️ Tests


⚠️ Warnings

🧪 1 Test failed

test_e2e from test_apache.py (Datadog)
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-check-apikey.sh: executing... 
[cont-init.d] 01-check-apikey.sh: exited 0.
[cont-init.d] 50-ci.sh: executing... 
[cont-init.d] 50-ci.sh: exited 0.
[cont-init.d] 50-ecs-managed.sh: executing... 
...

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 0c6fca2

@AAraKKe (Contributor) left a comment

Heyo! I have been taking a look at the code. There are still plenty of things we need to improve; some of them may be miscommunication on the design, or simply not yet finished since this is a draft. Below are the comments I have been building with the help of Claude, to make sure I was not forgetting anything, since I was mainly focused on the client design part. We can follow up on this later when you are back! Thanks!!


Hi @HadhemiDD, this is a first quick review powered by Claude Code. This uses a team of agents reviewing different factors of the code changes such as code quality, functionality, correctness and other aspects I find important. Rules are tailored following my personal recommendations and the review has been first approved by me.

Please take a look at the comments and decide whether they should be implemented or not. When deciding not to implement a comment, make sure to say why; I will be reviewing both the code and your comments personally. This is a first iteration trying to catch the most important things.

My Feedback Legend

Here's a quick guide to the prefixes I use in my comments:

praise: no action needed, just celebrate!
note: just a comment/information, no need to take any action.
question: I need clarification or I'm seeking to understand your approach.
nit: A minor, non-blocking issue (e.g., style, typo). Feel free to ignore.
suggestion: I'm proposing an improvement. This is optional but recommended.
request: A change I believe is necessary before this can be merged.

The only blocking comments are request; any other type of comment can be applied at the discretion of the developer.

"datadog-checks-dev[cli]~=35.6",
"hatch>=1.8.1",
"httpx",
"respx",

Request: respx is a test-mocking library for httpx with no production use. It is added to [project].dependencies, which installs it in every environment where ddev is deployed. Move it to a test/dev dependency group (e.g. [tool.hatch.envs.default.dependencies] in hatch.toml). Check how pytest-asyncio is handled to follow the same pattern.

from ddev.config.file import ConfigFileWithOverrides

# Common GitHub API pagination keys
PAGINATION_KEYS = frozenset(

Request: PAGINATION_KEYS hardcodes GitHub API response field names (artifacts, workflow_runs, releases, etc.) inside the generic HTTP plumbing layer. This is domain knowledge — it belongs in each endpoint method, not in the HTTP infrastructure. It exists solely to support _find_list_key, which guesses the paginated list key at runtime. Once endpoint methods call model_validate on known response shapes, there's no guessing needed: list_workflow_run_artifacts already knows it wants response["artifacts"]. Remove PAGINATION_KEYS and _find_list_key entirely.


pagination: PaginationInfo = Field(default_factory=PaginationInfo)

# Store raw response data for backward compatibility

Request: GitHubResponse has a fundamental design problem: _raw_data and _headers are private class-level variables, not Pydantic fields. Pydantic v2 completely ignores names starting with _ as model fields — they are not validated, not per-instance, and _headers: dict[str, str] = {} is a shared mutable default across all instances that haven't gone through from_response. The from_response classmethod works by mutating instance state after construction, bypassing Pydantic entirely.

The fix is to make data and headers proper Pydantic fields. data should be typed data: T, not data: T | None, globally. If an endpoint returns no body (204), use GitHubResponse[None]. If a body may or may not be present, encode that in T itself (e.g. GitHubResponse[WorkflowRun | None]). Universal nullability defeats the generic.

class GitHubResponse[T](BaseModel):
    data: T
    headers: dict[str, str] = Field(default_factory=dict)
    pagination: PaginationInfo = Field(default_factory=PaginationInfo)

With this, from_response, _raw_data, _headers, and the data/headers properties all disappear — construction becomes a direct GitHubResponse(data=..., headers=..., pagination=...).


@classmethod
def from_response(cls, data: T | None, headers: dict[str, str], pagination: PaginationInfo):
"""Create response from raw data."""

Suggestion: from_response is a @classmethod with no return type annotation. Per AGENTS.md, all new methods must be type-hinted. The correct return type is Self (Python 3.11+):

from typing import Self

@classmethod
def from_response(cls, data: T | None, headers: dict[str, str], pagination: PaginationInfo) -> Self:
    ...

Note: This finding is a no-op if the GitHubResponse redesign at line 57 is implemented — that change removes from_response entirely. Fix line 57 first.

return instance


class WorkflowRun(BaseModel):

Request: WorkflowRun, Workflow, Artifact, ArtifactsResponse, and IssueComment are all defined here but never instantiated anywhere — the endpoint methods all return GitHubResponse[dict[str, Any]] and call self.request(...) directly without calling model_validate. These models are completely dead code right now.

This is the core unimplemented design gap: each endpoint method must call model_validate on the raw dict it gets back from request before wrapping it in GitHubResponse. See the comment at lines 399-461 for the full picture.

class TestAsyncGitHubClient:
"""Tests for the async GitHub client."""

@pytest.mark.asyncio

Request: pytest-asyncio is not listed as a test dependency anywhere in ddev/pyproject.toml or ddev/hatch.toml. The @pytest.mark.asyncio decorator on lines 13, 19, and 46 requires pytest-asyncio to be installed — without it, pytest either skips the coroutine silently or raises an unknown-mark warning. Add pytest-asyncio to [tool.hatch.envs.default.dependencies] in hatch.toml and set asyncio_mode = "auto" in [tool.pytest.ini_options] in pyproject.toml.
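A sketch of the suggested configuration, assuming the usual hatch/pytest file layout (the exact section names in ddev's actual hatch.toml and pyproject.toml may differ):

```toml
# hatch.toml — hatch drops the [tool.hatch.] prefix in this file
[envs.default]
dependencies = [
  "pytest-asyncio",
]

# pyproject.toml — auto mode removes the need for @pytest.mark.asyncio
[tool.pytest.ini_options]
asyncio_mode = "auto"
```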

Suggestion: test_context_manager creates a real httpx.AsyncClient without respx_mock. While no network request is made in this test, the open client is socket-capable. Add respx_mock to prevent any accidental outbound connections and match the other tests in the file.

"""Tests for the async GitHub client."""

@pytest.mark.asyncio
async def test_context_manager(self):

Request: The four endpoint methods that are the core deliverables — create_workflow_dispatch, get_workflow_run, list_workflow_run_artifacts, create_issue_comment — have zero test coverage. URL construction, request body formatting, and query parameter handling are untested. list_workflow_run_artifacts has a branching path (auto_paginate=True vs False) that also needs coverage.

Request: iter_pages and request_all_pages are completely untested. The existing test_pagination calls client.request() once and checks header parsing — it never follows a next page. request_all_pages has multiple branches (single-page fallback, list response, dict-wrapped response) none of which are exercised.

Request: No error path tests. At minimum: mocked 4xx/5xx response verifying httpx.HTTPStatusError propagates, and a test that constructing the client without a token raises ValueError.

Suggestion: test_context_manager only checks client._client is not None mid-context. The meaningful guarantee is that client._client is None after the async with block exits. Add a post-exit assertion.

Suggestion: Parametrize the pagination test to cover all four link relations (next, prev, first, last) and the no-Link-header case (all fields None). The no-header branch in _extract_pagination is currently unreachable by any test.
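The post-exit assertion can be sketched against a stand-in with the lifecycle the review describes. FakeAsyncClient here is hypothetical, not the real AsyncGitHubClient:

```python
import asyncio


class FakeAsyncClient:
    # Hypothetical stand-in mirroring the described lifecycle:
    # _client is created on __aenter__ and reset to None on __aexit__.
    def __init__(self) -> None:
        self._client: object | None = None

    async def __aenter__(self) -> "FakeAsyncClient":
        self._client = object()
        return self

    async def __aexit__(self, *exc: object) -> None:
        self._client = None


async def check_lifecycle() -> tuple[bool, bool]:
    async with FakeAsyncClient() as client:
        inside = client._client is not None
    # The meaningful guarantee: the client is torn down after exit.
    after = client._client is None
    return inside, after
```

Both assertions should hold: the client exists inside the block and is gone after it.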

async with AsyncGitHubClient(token='test_token') as client:
assert client._client is not None

@pytest.mark.asyncio

Nit: owner, repo = 'DataDog', 'integrations-core' is repeated in every test method. Extract into a module-level constant or pytest fixture to avoid duplication.

assert client._client is not None

@pytest.mark.asyncio
async def test_generic_request(self, respx_mock):

Suggestion: test_generic_request tests client.request(...) directly — an internal HTTP method, not a public endpoint. Per the design intent, request is plumbing; the public API surface is the typed endpoint methods. Replace or supplement this test with endpoint-level tests (get_workflow_run, create_issue_comment, etc.) that assert on specific Pydantic model fields.

assert result.headers['x-ratelimit-remaining'] == '4999'

@pytest.mark.asyncio
async def test_pagination(self, respx_mock):

Suggestion: test_pagination only asserts next_url and last_url, leaving prev_url and first_url untested. _extract_pagination maps all four Link relation types — a regression on first or prev would go undetected. Parametrize to cover all four relations individually plus the no-Link-header case.
