51 changes: 38 additions & 13 deletions .codex/skills/codex-issue-digest/SKILL.md
@@ -7,7 +7,7 @@ description: Run a GitHub issue digest for openai/codex by feature-area labels,

## Objective

Produce a concise, insight-oriented digest of `openai/codex` issues for the requested feature-area labels over the previous 24 hours by default. Honor a different duration when the user asks for one, for example "past week" or "48 hours".
Produce a headline-first, insight-oriented digest of `openai/codex` issues for the requested feature-area labels over the previous 24 hours by default. Honor a different duration when the user asks for one, for example "past week" or "48 hours". Default to a summary-only response; include details only when requested.

Include only issues that currently have `bug` or `enhancement` plus at least one requested owner label. If the user asks for all areas or all labels, collect `bug`/`enhancement` issues across all labels.

@@ -29,21 +29,46 @@ python3 .codex/skills/codex-issue-digest/scripts/collect_issue_digest.py --label
Use `--window "past week"` or `--window-hours 168` when the user asks for a non-default duration. Use `--all-labels` when the user says all areas or all labels.

2. Use the JSON as the source of truth. It includes new issues, new issue comments, new reactions/upvotes, current labels, current reaction counts, model-ready `summary_inputs`, and detailed `digest_rows`.
3. Start the report with `## Summary`, then `## Details`.
4. In `## Summary`, write skim-first headlines:
- Lead with the most important fact or judgment. Do not start with aggregate counts unless the aggregate itself is the story.
- Make the first 1-3 bullets answer "what should owners pay attention to right now?"
- Bold only the critical insight phrase in each high-priority bullet, for example `**GPT-5.5 context is the dominant pressure point**`.
- Keep summary bullets short enough to scan in about 20 seconds.
- Put broad stats near the end of the summary, after the owner-relevant takeaways.
- Say clearly when there is nothing significant to act on.
- Call out any areas or themes receiving lots of user attention.
3. Choose the output mode from the user's request:
- Default mode: start the report with `## Summary` and do not emit `## Details`.
- Details-upfront mode: if the user asks for details, a table, a full digest, "include details", or similar, start with `## Summary`, then include `## Details`.
- Follow-up details mode: if the user asks for more detail after a summary-only digest, produce `## Details` from the existing collector JSON when it is still available; otherwise rerun the collector.
4. In `## Summary`, write a headline-first executive summary:
- The first nonblank line under `## Summary` must be a single-line headline or judgment, not a bullet. It should be useful even if the reader stops there.
- On quiet days, prefer exactly: `No major issues reported by users.` Use this when there are no elevated rows, no newly repeated theme, and nothing that needs owner action.
- When users are surfacing notable issues, make the headline name the count or theme, for example `Two issues are being surfaced by users:`.
- Immediately under an active headline, list only the issues or themes driving attention, ordered by importance. Start each line with the row's `attention_marker` when present, then a concise owner-readable description and inline issue refs.
- Treat `🔥🔥` as headline-worthy and `🔥` as elevated. Do not add fire emoji yourself; only copy the row's `attention_marker`.
- Keep any extra summary detail after the headline to 1-3 terse lines, only when it adds a decision-relevant caveat, repeated theme, or owner action.
- Do not include routine counts, broad stats, or low-signal table summaries in `## Summary` unless they change the headline. Put metadata and optional counts in `## Details` or the footer.
- In default mode, end the report with a concise prompt such as `Want details? I can expand this into the issue table.` Keep this separate from the summary headline so the headline stays clean.
- Cluster and name themes yourself from `summary_inputs`; the collector intentionally does not hard-code issue categories.
- Use a cluster only when the issues genuinely share the same product problem. If several issues merely share a broad platform or label, describe them individually.
- Do not omit a repeated theme just because its individual issues fall below the details table cutoff. Several similar reports should be called out as a repeated customer concern.
- For single-issue rows, summarize the concern directly instead of calling it a cluster.
- Use inline numbered issue links from each relevant row's `ref_markdown`.
5. In `## Details`, include a compact table only when useful:
- Example quiet summary:

```markdown
## Summary
No major issues reported by users.

Source: collector v4, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Want details? I can expand this into the issue table.
```

- Example active summary:

```markdown
## Summary
Two issues are being surfaced by users:
🔥🔥 Terminal launch hangs on startup [1](https://github.com/openai/codex/issues/123)
🔥 Resume switches model providers unexpectedly [2](https://github.com/openai/codex/issues/456)

Source: collector v4, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Want details? I can expand this into the issue table.
```
5. In `## Details`, when details are requested, include a compact table only when useful:
- Prefer rows from `digest_rows`; include a `Refs` column using each row's `ref_markdown`.
- Keep the table short; omit low-signal rows when the summary already covers them.
- Use compact columns such as marker, area, type, description, interactions, and refs.
@@ -52,7 +77,7 @@ Use `--window "past week"` or `--window-hours 168` when the user asks for a non-
6. Use the JSON `attention_marker` exactly. It is empty for normal rows, `🔥` for elevated rows, and `🔥🔥` for very high-attention rows. The actual cutoffs are in `attention_thresholds`.
7. Use inline numbered references where a row or bullet points to issues, for example `Compaction bugs [1](https://github.com/openai/codex/issues/123), [2](https://github.com/openai/codex/issues/456)`. Do not add a separate footnotes section.
8. Label `interactions` as `Interactions`; it counts posts/comments/reactions during the requested window, not unique people.
9. Mention the collector `script_version`, repo checkout `git_head`, and time window in the digest footer or final line.
9. Mention the collector `script_version`, repo checkout `git_head`, and time window in one compact source line. In default mode, put this before the details prompt so the final line still asks whether the user wants details. In details-upfront mode, it can be the footer.
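The default (summary-only) mode described in steps 3, 4, and 9 can be sketched roughly as follows. This is a simplified illustration, not the skill's implementation: `render_summary` and the `description` field are hypothetical names, only `attention_marker` and `ref_markdown` come from the collector JSON, the headline uses a numeral rather than a spelled-out count, and repeated-theme clustering is left out.

```python
QUIET_HEADLINE = "No major issues reported by users."
DETAILS_PROMPT = "Want details? I can expand this into the issue table."


def render_summary(rows, source_line):
    """Render a summary-only digest from rows (hypothetical helper).

    Each row is assumed to carry `attention_marker`, `ref_markdown`, and a
    hand-written `description` (the last one is an assumption for this sketch).
    """
    # Only rows the collector already marked drive the headline; markers are
    # copied verbatim and never invented here.
    elevated = [row for row in rows if row.get("attention_marker")]
    # Two-flame rows sort ahead of one-flame rows; ties keep input order.
    elevated.sort(key=lambda row: len(row["attention_marker"]), reverse=True)

    lines = ["## Summary"]
    if not elevated:
        lines.append(QUIET_HEADLINE)
    else:
        noun = "issue is" if len(elevated) == 1 else "issues are"
        lines.append(f"{len(elevated)} {noun} being surfaced by users:")
        for row in elevated:
            lines.append(
                f"{row['attention_marker']} {row['description']} {row['ref_markdown']}"
            )
    # Source line comes before the details prompt so the prompt stays last.
    lines += ["", source_line, DETAILS_PROMPT]
    return "\n".join(lines)
```

Keeping the source line ahead of the details prompt matches the ordering rule in step 9 for default mode.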

## Reaction Handling

@@ -64,7 +89,7 @@ GitHub issue search is still seeded by issue `updated_at`, so a purely reaction-

## Attention Markers

The collector scales attention markers by the requested time window. The baseline is 10 human user interactions for `🔥` and 20 for `🔥🔥` over 24 hours; longer or shorter windows scale those cutoffs linearly and round up. For example, a one-week report uses 70 and 140 interactions. Human user interactions are human-authored new issue posts, human-authored new comments, and human reactions created during the window, including upvotes. Bot posts and bot reactions are excluded. In prose, explain this as high user interaction rather than naming the emoji.
The collector scales attention markers by the requested time window. The baseline is 5 human user interactions for `🔥` and 10 for `🔥🔥` over 24 hours; longer or shorter windows scale those cutoffs linearly and round up. For example, a one-week report uses 35 and 70 interactions. Human user interactions are human-authored new issue posts, human-authored new comments, and human reactions created during the window, including upvotes. Bot posts and bot reactions are excluded. In prose, explain this as high user interaction rather than naming the emoji.
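As a worked illustration of the linear scaling and round-up rule, here is a minimal sketch using the new baseline of 5 and 10. The names `scaled_thresholds` and `marker_for` are hypothetical stand-ins and may not match how the collector's `attention_thresholds_for_window` and `attention_marker_for` are written.

```python
import math

BASE_WINDOW_HOURS = 24.0
ELEVATED_BASE = 5    # baseline for one flame over 24 hours
VERY_HIGH_BASE = 10  # baseline for two flames over 24 hours


def scaled_thresholds(window_hours):
    """Scale the 24-hour baselines linearly and round up (sketch)."""
    scale = window_hours / BASE_WINDOW_HOURS
    return {
        "elevated": math.ceil(ELEVATED_BASE * scale),
        "very_high": math.ceil(VERY_HIGH_BASE * scale),
    }


def marker_for(interactions, thresholds):
    """Return the marker for a human-interaction count (sketch)."""
    if interactions >= thresholds["very_high"]:
        return "🔥🔥"
    if interactions >= thresholds["elevated"]:
        return "🔥"
    return ""


week = scaled_thresholds(168)      # 35 and 70
half_day = scaled_thresholds(12)   # 2.5 rounds up to 3; 5 stays 5
```

The half-day case shows why rounding up matters: 2.5 interactions is not reachable, so the cutoff becomes 3.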

## Freshness

14 changes: 10 additions & 4 deletions .codex/skills/codex-issue-digest/scripts/collect_issue_digest.py
@@ -11,12 +11,12 @@
from pathlib import Path
from urllib.parse import quote

SCRIPT_VERSION = 2
SCRIPT_VERSION = 4
QUALIFYING_KIND_LABELS = ("bug", "enhancement")
REACTION_KEYS = ("+1", "-1", "laugh", "hooray", "confused", "heart", "rocket", "eyes")
BASE_ATTENTION_WINDOW_HOURS = 24.0
ONE_ATTENTION_INTERACTION_THRESHOLD = 10
TWO_ATTENTION_INTERACTION_THRESHOLD = 20
ONE_ATTENTION_INTERACTION_THRESHOLD = 5
TWO_ATTENTION_INTERACTION_THRESHOLD = 10
ALL_LABEL_PHRASES = {"all", "all areas", "all labels", "all-areas", "all-labels", "*"}


@@ -305,6 +305,7 @@ def search_issue_numbers(queries, limit):
numbers = {}
for query in queries:
page = 1
seen_for_query = 0
while True:
payload = gh_json(
[
@@ -315,6 +316,10 @@ def search_issue_numbers(queries, limit):
"-f",
f"q={query}",
"-f",
"sort=updated",
"-f",
"order=desc",
"-f",
"per_page=100",
"-f",
f"page={page}",
@@ -331,7 +336,8 @@ def search_issue_numbers(queries, limit):
number = item.get("number")
if isinstance(number, int):
numbers[number] = str(item.get("updated_at") or "")
if len(items) < 100 or len(numbers) >= limit:
seen_for_query += 1
if len(items) < 100 or seen_for_query >= limit:
break
page += 1
ordered = sorted(
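The effect of moving from a global `len(numbers) >= limit` check to a per-query counter can be sketched in isolation. This is an illustrative reduction, not the script's code: `collect_numbers` and the `fetch_page` callable are hypothetical (the real function calls `gh_json`), and incrementing the counter only for integer issue numbers is an assumption, since the diff's indentation is not visible here.

```python
def collect_numbers(queries, limit, fetch_page, page_size=100):
    """Page through each query, applying `limit` per query (sketch).

    `fetch_page(query, page)` is a hypothetical callable returning a list of
    issue dicts; it stands in for the real search call.
    """
    numbers = {}
    for query in queries:
        page = 1
        seen_for_query = 0  # reset per query so earlier queries cannot starve later ones
        while True:
            items = fetch_page(query, page)
            for item in items:
                number = item.get("number")
                if isinstance(number, int):
                    numbers[number] = str(item.get("updated_at") or "")
                    seen_for_query += 1
            # Stop on a short page (no more results) or once this query hit its cap.
            if len(items) < page_size or seen_for_query >= limit:
                break
            page += 1
    return numbers
```

With the earlier global check, a first query that filled `numbers` up to `limit` would stop every later query after its first page; the per-query counter avoids that.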
@@ -51,6 +51,77 @@ def test_normalize_requested_labels_accepts_all_area_phrases():
)


def test_search_issue_numbers_requests_updated_sort(monkeypatch):
calls = []

def fake_gh_json(args):
calls.append(args)
return {
"items": [
{"number": 1, "updated_at": "2026-04-25T00:00:00Z"},
]
}

monkeypatch.setattr(collect_issue_digest, "gh_json", fake_gh_json)

assert collect_issue_digest.search_issue_numbers(["query"], limit=10) == [1]
assert "-f" in calls[0]
assert "sort=updated" in calls[0]
assert "order=desc" in calls[0]


def test_search_issue_numbers_applies_limit_per_query(monkeypatch):
calls = []

def fake_gh_json(args):
calls.append(args)
query = next(
value.removeprefix("q=") for value in args if value.startswith("q=")
)
page = int(
next(
value.removeprefix("page=")
for value in args
if value.startswith("page=")
)
)
base = 10_000 if query == "first" else 20_000
offset = (page - 1) * 100
return {
"items": [
{
"number": base + offset + idx,
"updated_at": f"2026-04-25T00:{idx:02d}:00Z",
}
for idx in range(100)
]
}

monkeypatch.setattr(collect_issue_digest, "gh_json", fake_gh_json)

collect_issue_digest.search_issue_numbers(["first", "second"], limit=150)

queried_pages = [
(
next(
value.removeprefix("q=") for value in args if value.startswith("q=")
),
next(
value.removeprefix("page=")
for value in args
if value.startswith("page=")
),
)
for args in calls
]
assert queried_pages == [
("first", "1"),
("first", "2"),
("second", "1"),
("second", "2"),
]


def test_summarize_issue_keeps_new_comments_and_reaction_signals():
since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since")
until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until")
@@ -227,19 +298,19 @@ def test_parse_duration_hours_accepts_common_phrases():

def test_attention_thresholds_scale_by_window_length():
one_day = collect_issue_digest.attention_thresholds_for_window(24)
assert one_day["elevated"] == 10
assert one_day["very_high"] == 20
assert one_day["elevated"] == 5
assert one_day["very_high"] == 10

half_day = collect_issue_digest.attention_thresholds_for_window(12)
assert half_day["elevated"] == 5
assert half_day["very_high"] == 10
assert half_day["elevated"] == 3
assert half_day["very_high"] == 5

week = collect_issue_digest.attention_thresholds_for_window(168)
assert week["elevated"] == 70
assert week["very_high"] == 140
assert collect_issue_digest.attention_marker_for(69, week) == ""
assert collect_issue_digest.attention_marker_for(107, week) == "🔥"
assert collect_issue_digest.attention_marker_for(140, week) == "🔥🔥"
assert week["elevated"] == 35
assert week["very_high"] == 70
assert collect_issue_digest.attention_marker_for(34, week) == ""
assert collect_issue_digest.attention_marker_for(35, week) == "🔥"
assert collect_issue_digest.attention_marker_for(70, week) == "🔥🔥"


def test_fetch_comments_uses_since_filter_and_page_cap(monkeypatch):
@@ -300,7 +371,7 @@ def test_attention_markers_count_human_user_interactions():
"user": {"login": f"user-{idx}"},
"body": "same here",
}
for idx in range(9)
for idx in range(4)
]
comments.append(
{
@@ -322,8 +393,8 @@ def test_attention_markers_count_human_user_interactions():
comment_chars=100,
)

assert summary["user_interactions"] == 10
assert summary["activity"]["new_human_comments"] == 9
assert summary["user_interactions"] == 5
assert summary["activity"]["new_human_comments"] == 4
assert summary["attention"] is True
assert summary["attention_level"] == 1
assert summary["attention_marker"] == "🔥"
@@ -337,7 +408,7 @@ def test_attention_markers_count_human_user_interactions():
"user": {"login": f"extra-user-{idx}"},
"body": "also seeing this",
}
for idx in range(11)
for idx in range(100, 106)
)

summary = collect_issue_digest.summarize_issue(
@@ -350,7 +421,7 @@ def test_attention_markers_count_human_user_interactions():
comment_chars=100,
)

assert summary["user_interactions"] == 20
assert summary["user_interactions"] == 10
assert summary["attention_level"] == 2
assert summary["attention_marker"] == "🔥🔥"
