[Skill Submission] skill-generalizer #29

Merged

whw merged 1 commit into main from skill/skill-generalizer-1774383425799 on Mar 24, 2026
Conversation

@everyskill-bot
Contributor

New Skill Submission

Skill: skill-generalizer
Submitted by: Brandon Gell
Reason: Transforms team-built, use-case-specific skills into generalized, onboardable skills that any Plus One bot can adopt. Updated to use separated onboarding state (memory/.onboarded-skills.md) and preserve message formatting during generalization.
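As a sketch of what the separated onboarding state mentioned above might look like in practice (the `memory/.onboarded-skills.md` path comes from the description; the helper function names are hypothetical, not part of the submitted skill):

```python
from pathlib import Path

# Path taken from the skill description; everything else is illustrative.
ONBOARDED_FILE = Path("memory/.onboarded-skills.md")


def mark_onboarded(skill_name: str) -> None:
    """Record a skill as onboarded, creating the state file if it doesn't exist."""
    ONBOARDED_FILE.parent.mkdir(parents=True, exist_ok=True)
    existing = ONBOARDED_FILE.read_text().splitlines() if ONBOARDED_FILE.exists() else []
    entry = f"- {skill_name}"
    if entry not in existing:  # keep the list deduplicated
        with ONBOARDED_FILE.open("a") as f:
            f.write(entry + "\n")


def is_onboarded(skill_name: str) -> bool:
    """Check the state file for a previously onboarded skill."""
    if not ONBOARDED_FILE.exists():
        return False
    return f"- {skill_name}" in ONBOARDED_FILE.read_text().splitlines()
```

Keeping onboarding state in a separate file (rather than inside SKILL.md itself) is what lets the generalizer rewrite SKILL.md freely without losing track of which skills a bot has already adopted.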


This PR was auto-generated from skills.every.to (agent-api).

AI security review will run automatically.

@everyskill-bot everyskill-bot bot requested a review from a team as a code owner March 24, 2026 20:17

cloudflare-workers-and-pages bot commented Mar 24, 2026

Deploying everyskill with Cloudflare Pages

Latest commit: b800d4e
Status: ✅  Deploy successful!
Preview URL: https://cc5a0e90.everyskill.pages.dev
Branch Preview URL: https://skill-skill-generalizer-1774-yost.everyskill.pages.dev

View logs

@github-actions

Frontier Model Security Review

| Agent | Verdict |
| --- | --- |
| Claude Opus 4.6 | PASS |
| GPT-5.2 | WARN |

Final Verdict: WARN


Claude Opus 4.6 Analysis

Verdict: PASS. Issues: none.

Reasoning: This skill is a meta-skill that instructs an LLM how to transform team-specific skills into generalized, onboardable ones. The review found no security issues:

1. No prompt injection — there are no hidden instructions, "ignore previous" patterns, or encoded commands. The skill's instructions are straightforward and transparent about their purpose.
2. No dangerous tool usage — the skill does not invoke Bash, execute shell commands, or access sensitive system files. It only reads and writes SKILL.md files and a memory/.onboarded-skills.md state file, which are within the expected skill workspace.
3. No data exfiltration — there are no network calls to external URLs, no reading of credentials or secrets, and no instructions to send data anywhere outside the system.
4. No social engineering — the skill does not attempt to manipulate users or other skills into bypassing security controls. It explicitly instructs not to ask users for technical details like API keys or JSON schemas during onboarding.
5. No overly permissive tool usage — no tools are declared or invoked. The skill operates purely as LLM instructions for content transformation.

The file write operations (updating SKILL.md and memory/.onboarded-skills.md) are standard for the Plus One skill framework and scoped to the skill's own directory. The onboarding behavior rules about not showing internal dialogue are good security hygiene, not obfuscation.

GPT-5.2 Analysis

Verdict: WARN.

Issues:
- Prompt injection (medium): the skill contains strong meta-instructions that can override typical assistant safety/UX constraints (e.g., "Your job is...", "Work through these steps", "Never show internal dialogue"). While not classic "ignore previous" injection, it is a coercive instruction set that could conflict with platform policies or auditing expectations.
- Dangerous tool usage / state mutation (medium): the skill explicitly instructs writing to and creating files: "Add `{skill-name}` to `memory/.onboarded-skills.md` (create the file if it doesn't exist)", "Fill in the user's answers directly in the relevant sections below", and "update this SKILL.md file directly". This is persistent state modification and could be abused if the agent has filesystem write tools.
- Overly permissive operational behavior (medium): the skill instructs to "Create the cron job" and "update or recreate the cron" based on user input without specifying validation, least privilege, sandboxing, or constraints. Scheduling plus file writes can be leveraged for persistence/abuse if paired with code execution tools.
- Data exfiltration risk, indirect (low): the onboarding/test steps encourage fetching content and running skills; depending on the generalized skill being processed, this could lead to network calls. This skill itself does not specify external endpoints, but it encourages generating skills that may fetch from arbitrary user-provided sources (e.g., RSS URLs), which can become SSRF-like if not constrained at the platform level.

Reasoning: No explicit malicious payloads, encoded commands, credential theft, or direct instructions to access sensitive files are present. However, the skill promotes persistent file edits, onboarding state tracking via local memory files, and cron creation/update without guardrails. These capabilities are not inherently malicious but increase attack surface and could be abused if the agent has write/scheduler tools. Recommended mitigations: add constraints and validation (allowed paths, a denylist of sensitive files, no arbitrary command execution, cron schedule validation, network egress restrictions/SSRF protections), and clarify that any filesystem/scheduler changes must be performed through controlled, audited platform APIs rather than raw shell/file tools.
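A minimal sketch of the path-allowlist mitigation the reviewer recommends (the allowed roots and function name here are hypothetical; a real implementation would come from the platform's tooling layer):

```python
from pathlib import Path

# Hypothetical allowlist: the skill's own directory tree and its memory store.
ALLOWED_ROOTS = [Path("memory"), Path("skills")]


def is_write_allowed(target: str) -> bool:
    """Return True only if target resolves inside an allowed root.

    Resolving first defeats `../` traversal, symlink-free relative escapes,
    and absolute paths pointing at sensitive files.
    """
    resolved = Path(target).resolve()
    for root in ALLOWED_ROOTS:
        try:
            resolved.relative_to(root.resolve())
            return True
        except ValueError:
            continue
    return False
```

The key design point is checking the *resolved* path rather than the raw string, so `memory/../.ssh/id_rsa` is rejected even though it starts with an allowed prefix.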

Frontier model review complete. Human approval still required.

@brandongell force-pushed the skill/skill-generalizer-1774383425799 branch from 32deba9 to b800d4e on March 24, 2026 at 20:21
@github-actions

Frontier Model Security Review

| Agent | Verdict |
| --- | --- |
| Claude Opus 4.6 | PASS |
| GPT-5.2 | WARN |

Final Verdict: WARN


Claude Opus 4.6 Analysis

Verdict: PASS. Issues: none.

Reasoning: This skill is a meta-skill that instructs an LLM how to transform team-specific skills into generalized, onboardable ones. The review finds no security issues:

1. No prompt injection — there are no hidden instructions, "ignore previous" directives, or encoded commands. The skill straightforwardly describes a generalization process.
2. No dangerous tool usage — the skill does not invoke Bash, execute shell commands, or access sensitive system files. It only reads/writes its own skill markdown files and a memory state file (memory/.onboarded-skills.md), which are within its expected operational scope.
3. No data exfiltration — there are no network calls to external URLs, no reading of credentials, API keys, or tokens. The skill explicitly instructs to REMOVE hardcoded API keys and credentials from source skills during generalization.
4. No social engineering — the skill is transparent about what it does and does not attempt to manipulate users or other systems.
5. No overly permissive tools — the skill does not declare or request any tool permissions. It operates purely as prompt/instruction content for an LLM.

The skill's purpose is benign: it is a template/process guide for converting specific skills into general-purpose ones, with sensible onboarding flows. File writes are limited to the skill's own SKILL.md and a state tracking file in the memory directory, which is standard for the Plus One bot framework.

GPT-5.2 Analysis

Verdict: WARN.

Issues:
- Prompt injection (low): the skill contains imperative meta-instructions about how the assistant should behave and what to output (e.g., enforcing onboarding behavior, file-writing steps). Not covert, but these could conflict with platform policies if interpreted as higher-priority than system/developer instructions.
- Dangerous tool usage / persistence (medium): the skill instructs the agent to "update this SKILL.md file directly", "fill in the user's answers directly", "add {skill-name} to memory/.onboarded-skills.md (create the file if it doesn't exist)", and to "Create the cron job" / "update or recreate the cron". These are operational changes and persistence mechanisms that could be risky if tools are available without safeguards.
- Sensitive file access guidance (low): the skill references internal state file paths (memory/.onboarded-skills.md, memory/*.json). While it also says not to show these to the user, it still directs reading/writing internal files, which should be tightly permissioned.
- Overly permissive allowed-tools, implicit (medium): the spec assumes capabilities like filesystem writes and cron management but does not constrain them (no mention of sandboxing, path allowlists, cron restrictions, or validation). If paired with broad shell/tool access, this becomes a meaningful risk surface.

Reasoning: No clear hidden prompt-injection payloads (no encoded/obfuscated commands, no "ignore previous" style attacks), no explicit exfiltration instructions, and no direct harmful shell commands. However, the skill explicitly directs persistent modifications (writing files, creating/updating cron jobs) and assumes privileged operational tooling without guardrails. This is not inherently malicious, but it increases risk if executed in an environment with powerful tools and insufficient policy enforcement. Recommended mitigations: restrict write locations to the skill's own directory and a dedicated state store, require user confirmation before persistence/cron changes, enforce cron allowlists/rate limits, and ensure tool permissions prevent reading secrets or arbitrary files.
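One way the cron-schedule-validation mitigation could look, as a hedged sketch (the 5-field format check is standard crontab syntax; the "no every-minute schedules" floor is an assumed policy, not something the review specifies):

```python
import re

# One crontab field: "*", a number, a range, optional "/step", and comma lists.
CRON_FIELD = r"(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*"
# A full expression is exactly five such fields separated by single spaces.
CRON_RE = re.compile(rf"^{CRON_FIELD}( {CRON_FIELD}){{4}}$")


def validate_cron(expr: str) -> bool:
    """Accept only well-formed 5-field crontab expressions.

    Rejects anything that isn't pure cron syntax (so shell payloads can't
    ride along) and, as an assumed rate-limit policy, rejects schedules
    whose minute field is a bare "*" (i.e., every-minute execution).
    """
    if not CRON_RE.match(expr):
        return False
    if expr.split()[0] == "*":  # assumed floor: coarser than every minute
        return False
    return True
```

Validating the expression shape before it ever reaches a scheduler API is the cheap half of the mitigation; pairing it with confirmation prompts and a scheduler allowlist covers the rest.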

Frontier model review complete. Human approval still required.

@whw merged commit 2b5b056 into main on Mar 24, 2026
5 checks passed