Skip to content

Add gm tool for pulling all issues from a repo and filtering client-side-ish to only ones that had a given label at any point#39524

Merged
iansltx merged 15 commits intomainfrom
gm-former-p0-p1
Feb 23, 2026
Merged

Add gm tool for pulling all issues from a repo and filtering client-side-ish to only ones that had a given label at any point#39524
iansltx merged 15 commits intomainfrom
gm-former-p0-p1

Conversation

@iansltx
Copy link
Copy Markdown
Contributor

@iansltx iansltx commented Feb 7, 2026

This absolutely slams GitHub rate limits, but one API request per page of issues plus one API request per issue is the only sure way to get this data, so it is what it is. May need to add a "pick up where you left off" feature but this is at least a starting point.

… gm tool

OpenCode + Kimi K2.5 Free. Prompts:

* Add a tool that finds all issues created since a given date that had a given label at any point (not just that currently have that label).
* Add a --verbose option that shows as issues are pulled and evaluated
* If the existing logic is case sensitive on label name, make it case-insensitive.
* Run timeline requests in parallel with configurable concurrency, default 10.
* When in non-verbose mode, indicate how many issues are being pulled, then print "." for each issue retrieved to show progress.
* On the results table, indicate whether the issue still has the specified label.
* Rename the "HAS LABEL NOW" column to "NOW?" and use ✔️ and ❌ instead of "yes" and "no"
* Make sure progress display (anything before the table display, regardless of verbosity) gets piped to STDERR rather than STDOUT.
* Revise table headers to be Sentence case, and display them in bold in the terminal
* Revert the bolding change and make sure column headers are aligned with column data.
* For cases where there are 1000 or mor issues, paginate results until we've pulled the entire issue list.
* I just got an exit status 1. Something broke with the tool.
* To get beyond the 1000 issue limit, which we need to do, use the `gh api` command with the `--paginate` flag instead of `gh issues`
* In verbose mode, show the gh command(s) run when calling GetIssuesCreatedSinceWithLabel
…gress indication

OpenCode + Kimi K2.5; prompts:

* Turns out, GitHub `since` isn't what we want. Instead of using `--paginate` whe listing issues, use the `page=` parameter manually, then break out of the pagination loop once we retrieve an issue created prior to our threshold. If a page includes issues both before and after our cutoff, only add to our search the issues within the page that happen after the cutoff.
* Output a "." to stderr for each page in the list retrieved.
* Remove the "running command" logging
OpenCode + Kimi K2.5; prompt: Add a flag to check issues in fleetdm/confidential rather than fleetdm/fleet
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.34%. Comparing base (de55ecf) to head (14ffe76).
⚠️ Report is 259 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #39524      +/-   ##
==========================================
+ Coverage   66.27%   66.34%   +0.06%     
==========================================
  Files        2439     2450      +11     
  Lines      195446   196716    +1270     
  Branches     8615     8615              
==========================================
+ Hits       129532   130511     +979     
- Misses      54186    54380     +194     
- Partials    11728    11825      +97     
Flag Coverage Δ
backend 68.14% <ø> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

OpenCode + Kimi K2.5; prompt: On the github-manage tool for the historical issue labels command, add an argument that skips pulling timeline information for any issue number >= the specified number.
@iansltx iansltx marked this pull request as ready for review February 11, 2026 02:00
@iansltx iansltx requested a review from a team as a code owner February 11, 2026 02:00
@iansltx iansltx changed the title Add gm tool for pulling all issues from a repo and filtering client-side to only ones that had a given label at any point Add gm tool for pulling all issues from a repo and filtering client-side-ish to only ones that had a given label at any point Feb 11, 2026
Comment thread tools/github-manage/pkg/ghapi/issues.go Outdated
CreatedAt string `json:"created_at"`
UpdatedAt string `json:"updated_at"`
Body string `json:"body"`
User APIUser `json:"user"`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the duplication here? You can just use / modify the Issue type in models.go

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried doing that and the hangup was that the ID field on the API response here is an int, where everywhere else assumes that it's a string. So in order to unserialize from the API we need an intermediate struct that ignores the ID field. I just pulled on that thread for a little while and my conclusion was that changing the Issue struct would be pretty disruptive to parts of the codebase that get used more often.

Things we could do here without expending too much effort:

  1. Move the struct inside the one place it's used so we're not polluting the top level namespace with a one-off type
  2. Add hierarchy to the models-based Issue type, adding a parent type that doesn't have ID (and then deserialize to that).

Any preference on the above, or do you have another option you'd like to see here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gkarr@gkarr-m5-pro:~/projects/fleet-release$ gh issue view 39640 --json id,number,title
{
"id": "I_kwDOEnd7fs7p2oIS",
"number": 39640,
"title": "fleetctl apply for a team w/ only Android apps doesn't need a VPP token"
}
id is most definitely a string and should remain so.
This struct omits the id entirely...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id !== number

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Problem is, the GitHub issues list API has numeric IDs:

…/Code/CH/fleet gm-former-p0-p1 $ gh api repos/fleetdm/fleet/issues -X GET -f state=all -f per_page=5 -f page=1 -f sort=created -f direction=desc
[
  {
    "url": "https://api.github.com/repos/fleetdm/fleet/issues/39717",
    "repository_url": "https://api.github.com/repos/fleetdm/fleet",
    "labels_url": "https://api.github.com/repos/fleetdm/fleet/issues/39717/labels{/name}",
    "comments_url": "https://api.github.com/repos/fleetdm/fleet/issues/39717/comments",
    "events_url": "https://api.github.com/repos/fleetdm/fleet/issues/39717/events",
    "html_url": "https://github.com/fleetdm/fleet/pull/39717",
    "id": 3928787789,
    "node_id": "PR_kwDOEnd7fs7DH2wP",
    "number": 39717,

I have to use gh api here rather than gh issues because

  1. I need to manually manage pagination, because
  2. the since filter on the issues list API endpoint is based on update time rather than create time, and
  3. the way to do that is to page through results until you hit one that's older than the limit, and then stop.

Additionally, gh issues only pulls 1000 issues, which would be fine if we could filter by label timeline, but doesn't work in this case because we need to filter by data that only exists on a per-issue endpoint. And --paginate on the API side doesn't work either because we want to stop pagination early (not that that matters in this case because we still have the id-is-an-int problem).

Also looks like GraphQL wouldn't resolve this particular issue either without more types, and even then we couldn't filter by timeline events, so at best we'd be hydrating timeline data on the list call rather than having to go out and get it. So there aren't any quick wins here, particularly if we're trying to get rid of an almost-duplicate type struct.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any problem with making a type "StringOrInt" then writing a json marshaller to handle both and store the value as string ultimately?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diminishing returns on fiddly dev stuff, basically. Given that the code has served its purpose, I'm having to balance nontrivial changes (which a custom marshaller and type seem to qualify as that unless I'm missing something) here with other items on my plate so if I need to dive back in it'll probably be next week before that happens.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@georgekarrv Hacked at this a bit more just now and not only would we need to change the ID type on issues, but ID type on author and label would need to be modified (both of those are strings), and created-at/updated-at timestamps are named differently between APIs so we need a separate struct anyway to get those fields deserialized properly (one API uses snake-case, one uses camel-case).

Just pushed a commit that makes the requisite changes to use struct inheritence to solve this, but it's a pretty sizable diff and makes code/tests elsewhere more painful, so I'd just as soon revert it out.

The alternate approach with deserialization can be partially accomplished with GLM-5 and the following prompt on Opencode, on top of f09c8716e600b0d78d062a32aac2fc0bf70be2ec:

In tools/github-manage, run ./gm issues-with-historical-label P0 --since 2026-01-01. It will error with an unmarshalling issue. Using custom unmarshalling, resolve this issue while maintaining existing functionality (e.g. gm issues should still work).

Catch with that is it breaks pagination on the issues_with_historical_label command because of the aforementioned case consistency issue, so while the above gets us a smaller diff it doesn't solve the problem entirely.

If you want me to go down the path of custom (un)marhsallers let me know. Just seems like a lot to avoid declaring an intermediate struct in one place (which, again, perfectly fine with me if that struct gets pulled inside the function body to avoid confusion).

Comment thread tools/github-manage/pkg/ghapi/issues.go Outdated
Comment thread tools/github-manage/pkg/ghapi/models.go Outdated
Comment on lines 24 to 56
type IssueWithoutID struct {
Typename string `json:"__typename"`
Number int `json:"number"`
Title string `json:"title"`
Body string `json:"body"`
Author Author `json:"author"`
Assignees []Author `json:"assignees"`
CreatedAt string `json:"created_at"`
UpdatedAt string `json:"updated_at"`
State string `json:"state"`
Labels []Label `json:"labels"`
Milestone *Milestone `json:"milestone,omitempty"`
Estimate int `json:"estimate,omitempty"` // Custom field for estimate
Status string `json:"status,omitempty"` // Custom field for status
PullRequest *PullRequestInfo `json:"pull_request,omitempty"` // If present, this is a PR not an issue
}

type Issue struct {
Typename string `json:"__typename"`
ID string `json:"id"`
Number int `json:"number"`
Title string `json:"title"`
Body string `json:"body"`
Author Author `json:"author"`
Assignees []Author `json:"assignees"`
CreatedAt string `json:"createdAt"`
UpdatedAt string `json:"updatedAt"`
State string `json:"state"`
Labels []Label `json:"labels"`
Milestone *Milestone `json:"milestone,omitempty"`
Estimate int `json:"estimate,omitempty"` // Custom field for estimate
Status string `json:"status,omitempty"` // Custom field for status
IssueWithoutID
ID string `json:"id"`
CreatedAt string `json:"createdAt"`
UpdatedAt string `json:"updatedAt"`
}

// HasLabel checks if an issue currently has the specified label (case-insensitive)
func (issue Issue) HasLabel(labelName string) bool {
for _, label := range issue.Labels {
if strings.EqualFold(label.Name, labelName) {
return true
}
}
return false
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I'm not terribly a fan of. how do the duplicate CreatedAt's work?

I would also prefer for the default used everywhere to remain Issue and IssueWithID be the exception.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Child struct tag overrides parent struct tag.

Issue with having WithID be the exception is the main codebase (not my stuff) actually uses string ID rather heavily, so we can't swap to a version of the struct in that code without the ID field.

@lukeheath
Copy link
Copy Markdown
Member

@iansltx @georgekarrv IMO we don't need to be so particular with internal tooling. I'd understand with product code we'll live with for a long time, but this is an internal tool used occasionally, so I don't think it's a good use of time to spend two weeks in PR review for this or spend cycles worrying about not duplicating anything in any files.

My question is: Does the command work? If so, approve and merge it.

@iansltx
Copy link
Copy Markdown
Contributor Author

iansltx commented Feb 23, 2026

Gonna back out the last two commits to minimize the diff while keeping the functionality. ETA on that a few hours.

@iansltx iansltx marked this pull request as ready for review February 23, 2026 18:29
@iansltx
Copy link
Copy Markdown
Contributor Author

iansltx commented Feb 23, 2026

Rebased out the further-reaching changes and privatized structs.

georgekarrv
georgekarrv previously approved these changes Feb 23, 2026
Copy link
Copy Markdown
Member

@georgekarrv georgekarrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to tests made me think this was modifying more than just the incoming command. Looks like most other workflows are still the same.

@iansltx
Copy link
Copy Markdown
Contributor Author

iansltx commented Feb 23, 2026

@georgekarrv Sorry, one more time. I thought I had pushed removing the struct split but it got hung up locally. With the push I just did the diff/files changed list is massively smaller now.

@iansltx iansltx merged commit 04a9f1a into main Feb 23, 2026
35 of 37 checks passed
@iansltx iansltx deleted the gm-former-p0-p1 branch February 23, 2026 21:18
@georgekarrv
Copy link
Copy Markdown
Member

@iansltx @georgekarrv IMO we don't need to be so particular with internal tooling. I'd understand with product code we'll live with for a long time, but this is an internal tool used occasionally, so I don't think it's a good use of time to spend two weeks in PR review for this or spend cycles worrying about not duplicating anything in any files.

My question is: Does the command work? If so, approve and merge it.

I don't think it's a valid sentiment to hold for a tool. I find having clean and legible DRY code helps maintain this tool for myself and others. As a piece of code I put a lot of effort into I would critique this just like any other code I review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants