feat: add model card generator workflow and configuration #646
Summary of Changes
Hello @ilopezluna, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request establishes the foundational configuration for an automated AI model card generation system. It defines an intelligent agent capable of researching model information from external sources and structuring it into a standardized Markdown document. This enhancement streamlines the process of creating consistent and informative documentation for AI models hosted on Docker Hub, ensuring clarity and accuracy for users.
Code Review
This pull request introduces a new workflow and agent configuration for automatically generating AI model cards. The agent configuration in .github/agents/model-card-generator.yaml is well-structured with a detailed prompt, template, and example. My review focuses on improving the correctness, security, and consistency of this configuration. I've identified a few issues: a likely typo in the Anthropic model name, a contradictory instruction in the agent's prompt that could lead to incorrect output, and an overly permissive shell command that should be removed for security reasons. The suggested changes will help ensure the agent runs correctly and securely.
models:
  sonnet:
    provider: anthropic
    model: claude-sonnet-4-5
## Important Guidelines

- Omit namespace for all Docker references (e.g., `{model_name}` instead of `aistaging/{model_name}`)
There's a contradiction in the instructions regarding the use of namespaces in Docker references. This guideline instructs to omit the namespace, but the template (e.g., lines 53, 76, 82) and the example (e.g., lines 145, 172, 177) consistently include a namespace (like aistaging/ or ai/). This inconsistency will likely confuse the model and lead to incorrect model card generation. The guideline should be updated to align with the template and example.
- Always include the namespace for Docker references (e.g., `aistaging/{model_name}`)
permissions:
  allow:
    - shell:cmd=curl *
    - shell:cmd=cat *
The agent is granted permission to use cat *, which allows it to read any file on the runner's filesystem. This is an unnecessary and overly broad permission that violates the principle of least privilege. The agent's instructions do not require the use of cat, and the filesystem toolset already provides a safer read_file tool for this purpose. To enhance security, this permission should be removed.
… and provider options
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Hey - I've found 2 issues, and left some high level feedback:
- The `reference_urls` input is described as comma-separated, but the agent prompt simply interpolates the raw string; consider updating the agent instructions or preprocessing in the workflow so the agent reliably receives one URL per line to avoid malformed `curl` commands.
- In `.github/agents/model-card-generator.yaml`, the instruction "do no attempt to create it" has a typo and could be clarified (e.g., "do not attempt to create it") to avoid confusion in future edits or reuse.
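The preprocessing mentioned in the first point could look roughly like the sketch below. It is only an illustration: the function name `split_reference_urls` is hypothetical and not part of the PR, and it assumes URLs are separated by commas or newlines and do not themselves contain commas.

```python
def split_reference_urls(raw: str) -> list[str]:
    """Turn a comma- or newline-separated string into a clean list of URLs.

    Hypothetical helper: assumes no URL contains a literal comma.
    """
    urls: list[str] = []
    # Treat newlines like commas so both input styles are accepted.
    for chunk in raw.replace("\n", ",").split(","):
        url = chunk.strip()
        # Skip empty entries (trailing commas) and duplicates.
        if url and url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    raw_input_value = "https://huggingface.co/org/model, https://example.com/blog ,"
    # One URL per line, ready to be interpolated into the agent prompt.
    print("\n".join(split_reference_urls(raw_input_value)))
```

A workflow step could run something like this before invoking the agent, so the prompt always receives one URL per line regardless of how the input was typed.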
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `reference_urls` input is described as comma-separated, but the agent prompt simply interpolates the raw string; consider updating the agent instructions or preprocessing in the workflow so the agent reliably receives one URL per line to avoid malformed `curl` commands.
- In `.github/agents/model-card-generator.yaml`, the instruction "do no attempt to create it" has a typo and could be clarified (e.g., "do not attempt to create it") to avoid confusion in future edits or reuse.
## Individual Comments
### Comment 1
<location> `.github/agents/model-card-generator.yaml:24-33` </location>
<code_context>
+ Do not fetch HTML as the content is too large, use the Hugging Face API.
</code_context>
<issue_to_address>
**issue:** Clarify the apparent conflict between avoiding HTML and using curl on arbitrary reference URLs.
The current wording bans HTML fetching but the workflow later tells the agent to `curl` arbitrary reference URLs, including non‑HF sites (blogs, papers, etc.). This makes it unclear when HTML is allowed vs. when the HF API must be used. Please adjust the text to clearly distinguish Hugging Face model pages (where the API should be used instead of HTML) from other reference URLs (where HTML or other formats may be necessary) so the behavior is explicit and consistent.
</issue_to_address>
### Comment 2
<location> `.github/workflows/generate-model-card.yml:65` </location>
<code_context>
+
+ ---
+ Please review the generated content for accuracy before merging.
+ branch: model-card/${{ inputs.repository }}
+ base: main
+ labels: model-card,automated
</code_context>
<issue_to_address>
**suggestion:** Sanitize the repository input before using it in a branch name.
Because `inputs.repository` is used directly in the branch name, values with spaces, `..`, or extra `/` characters could produce invalid or unintended refs. Consider validating it (e.g., allowing only `[A-Za-z0-9._-]+`) or normalizing it by replacing invalid characters with `-` before composing the branch name.
```suggestion
branch: model-card/${{ replace(replace(replace(inputs.repository, ' ', '-'), '/', '-'), '..', '-') }}
```
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Do not fetch HTML as the content is too large, use the Hugging Face API.

You'll receive these inputs:
```
**Repository:** aistaging/${{ inputs.repository }}
**Model name:** ${{ inputs.repository }}
**Reference URLs to research:**
${{ inputs.reference_urls }}
```
issue: Clarify the apparent conflict between avoiding HTML and using curl on arbitrary reference URLs.
The current wording bans HTML fetching but the workflow later tells the agent to curl arbitrary reference URLs, including non‑HF sites (blogs, papers, etc.). This makes it unclear when HTML is allowed vs. when the HF API must be used. Please adjust the text to clearly distinguish Hugging Face model pages (where the API should be used instead of HTML) from other reference URLs (where HTML or other formats may be necessary) so the behavior is explicit and consistent.
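One way to make that distinction explicit is to decide per URL what should actually be fetched. The sketch below is an assumption-laden illustration, not the PR's behavior: `fetch_target` and `NON_MODEL_PREFIXES` are hypothetical names, it only handles `owner/name` style model pages, and it assumes the public `https://huggingface.co/api/models/{repo_id}` endpoint is the "Hugging Face API" the prompt refers to.

```python
from urllib.parse import urlparse

HF_HOSTS = {"huggingface.co", "www.huggingface.co"}
# Top-level Hugging Face paths that are not model repositories (assumed, not exhaustive).
NON_MODEL_PREFIXES = {"datasets", "spaces", "docs", "blog", "papers", "api", "collections"}


def fetch_target(url: str) -> str:
    """Return the URL the agent should fetch for a given reference URL.

    Hugging Face model pages are rewritten to the JSON API endpoint;
    every other URL (blogs, papers, vendor docs) is returned unchanged.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    is_model_page = (
        parsed.netloc.lower() in HF_HOSTS
        and len(parts) >= 2
        and parts[0] not in NON_MODEL_PREFIXES
    )
    if is_model_page:
        # Keep only owner/name; drop suffixes such as /tree/main.
        return f"https://huggingface.co/api/models/{parts[0]}/{parts[1]}"
    return url


if __name__ == "__main__":
    for ref in ("https://huggingface.co/org/model", "https://example.com/blog/post"):
        print(ref, "->", fetch_target(ref))
```

With a rule like this written into the prompt or applied in the workflow, the "no HTML" instruction would apply only to Hugging Face model pages, and other references would still be fetched as given.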
---
Please review the generated content for accuracy before merging.
branch: model-card/${{ inputs.repository }}
suggestion: Sanitize the repository input before using it in a branch name.
Because `inputs.repository` is used directly in the branch name, values with spaces, `..`, or extra `/` characters could produce invalid or unintended refs. Consider validating it (e.g., allowing only `[A-Za-z0-9._-]+`) or normalizing it by replacing invalid characters with `-` before composing the branch name.
Suggested change:
- branch: model-card/${{ inputs.repository }}
+ branch: model-card/${{ replace(replace(replace(inputs.repository, ' ', '-'), '/', '-'), '..', '-') }}
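As far as I can tell, the GitHub Actions expression syntax does not document a `replace` function, so the suggested expression may not evaluate as written. The normalization described in the comment could instead be done in a step before the branch name is composed. The sketch below shows the logic only; `sanitize_ref_component` is a hypothetical name, and it does not cover every Git ref rule (for example, names ending in `.lock`).

```python
import re


def sanitize_ref_component(value: str) -> str:
    """Normalize a user-supplied repository name for use in a branch name.

    Hypothetical helper: keeps only [A-Za-z0-9._-], replacing anything else with '-'.
    """
    # Replace every run of disallowed characters (spaces, '/', etc.) with '-'.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "-", value.strip())
    # '..' is not allowed inside a Git ref name.
    cleaned = re.sub(r"\.{2,}", "-", cleaned)
    # Collapse repeated dashes and trim separators from both ends.
    cleaned = re.sub(r"-{2,}", "-", cleaned).strip("-.")
    if not cleaned:
        raise ValueError("repository name is empty after sanitization")
    return cleaned


if __name__ == "__main__":
    print("model-card/" + sanitize_ref_component("my model/v1..2"))
```

Rejecting invalid input outright (failing the workflow when the value does not match `[A-Za-z0-9._-]+`) is the stricter alternative; normalizing is more forgiving but can map two different inputs to the same branch.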
This pull request introduces a new automated workflow for generating AI model cards for Docker Hub repositories. The workflow leverages a custom agent configuration and integrates with Anthropic's Claude model to research reference URLs and produce detailed, standardized model documentation in Markdown format. The most important changes are grouped below:
Model Card Generation Agent and Template
Added `.github/agents/model-card-generator.yaml`, which defines a new agent that uses the `claude-sonnet-4-5` model from Anthropic to generate model cards. The agent includes detailed instructions, a strict Markdown template, and guidelines for factual, Docker-centric model documentation. It also specifies allowed shell and filesystem tool access for research and file writing.
GitHub Actions Workflow for Automation
Added `.github/workflows/generate-model-card.yml`, a new reusable workflow that: