fix: allow hyphens in collection, field, and taxonomy slugs #83
antoineVIVIES wants to merge 7 commits into emdash-cms:main
5116d21 to 99fef63
🦋 Changeset detected
Latest commit: e604ffd
The changes in this PR will be included in the next version bump. This PR includes changesets to release 9 packages.
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
fb1f0a8 to 17bfe19
WordPress plugins register custom post types with hyphens (e.g. `elementor-hf`), but EmDash collection slugs require `[a-z][a-z0-9_]*`. This caused a crash in the prepare step with SchemaError: INVALID_SLUG. Sanitize slugs by replacing invalid characters with underscores across all three import phases: analyze, prepare, and execute.
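A minimal sketch of the sanitization this commit describes, assuming the shape quoted in the review further down the thread. `INVALID_SLUG_CHARS` and the `"imported"` fallback appear in that review; the exact `LEADING_NON_ALPHA` pattern is an assumption, and the real `sanitizeSlug` in `analyze.ts` may differ.

```typescript
// Characters outside the EmDash slug alphabet (pattern quoted in the review).
const INVALID_SLUG_CHARS = /[^a-z0-9_]/g;
// ASSUMPTION: strip everything before the first lowercase letter,
// because a valid slug must start with [a-z].
const LEADING_NON_ALPHA = /^[^a-z]+/;

function sanitizeSlug(raw: string): string {
  const sanitized = raw
    .toLowerCase()
    .replace(INVALID_SLUG_CHARS, "_")
    .replace(LEADING_NON_ALPHA, "");
  // Nothing usable left (e.g. digits only): fall back to a fixed name.
  if (!sanitized) return "imported";
  return sanitized;
}

console.log(sanitizeSlug("elementor-hf")); // elementor_hf
console.log(sanitizeSlug("123")); // imported
```

Lowercasing first means uppercase input is normalized rather than replaced with underscores.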
17bfe19 to 2e2a1f0
|
I think instead, EmDash should allow dashes in slugs. Could you change that instead? |
|
@ascorbic sure will do that soon |
Update slug validation from /^[a-z][a-z0-9_]*$/ to /^[a-z][a-z0-9_-]*$/ across all validation layers (schema registry, API schemas, MCP server, seed validation, SQL identifier validation, admin UI, Cloudflare sandbox). WordPress import no longer converts hyphens to underscores — post types like `elementor-hf` and `shop-order` pass through natively. Fixes emdash-cms#79
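As a quick illustration of what this pattern change accepts, the sketch below runs both regexes from the commit message against a few sample slugs. The `compareSlug` helper and the sample list are hypothetical and exist only for this example.

```typescript
// Both patterns are copied from the commit message above.
const OLD_SLUG_PATTERN = /^[a-z][a-z0-9_]*$/;
const NEW_SLUG_PATTERN = /^[a-z][a-z0-9_-]*$/;

// Hypothetical helper: report how each pattern treats a slug.
function compareSlug(slug: string): { oldOk: boolean; newOk: boolean } {
  return {
    oldOk: OLD_SLUG_PATTERN.test(slug),
    newOk: NEW_SLUG_PATTERN.test(slug),
  };
}

for (const slug of ["posts", "elementor-hf", "shop-order", "-leading", "9lives"]) {
  const { oldOk, newOk } = compareSlug(slug);
  console.log(`${slug}: old=${oldOk} new=${newOk}`);
}
```

Only the character class after the first letter changes, so slugs that start with a hyphen or a digit are rejected by both patterns.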
|
Updated per @ascorbic's feedback — instead of sanitizing hyphens to underscores during WordPress import, EmDash now allows hyphens natively in collection, field, and taxonomy slugs. What changed: slug validation updated from |
Pull request overview
This PR updates EmDash slug validation to natively allow hyphens in collection, field, and taxonomy slugs (fixing WordPress import crashes when plugins use hyphenated post type slugs).
Changes:
- Expanded slug validation regexes across core schema/seed validation, API schemas/handlers, MCP schemas, and admin UI validation to accept `-`.
- Updated WordPress import logic to preserve hyphens (and added regression tests for slug sanitization/mapping).
- Added a changeset to publish the patch release.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/core/tests/unit/import/wordpress-slug-sanitization.test.ts | Adds regression tests ensuring WordPress slugs preserve hyphens and sanitize only truly invalid chars. |
| packages/core/src/seed/validate.ts | Updates seed validation to allow hyphens in collection/field slugs and updates error messages. |
| packages/core/src/schema/registry.ts | Allows hyphens in schema registry slug validation and updates the thrown error message. |
| packages/core/src/mcp/server.ts | Updates MCP Zod schemas/descriptions to accept hyphenated slugs. |
| packages/core/src/database/validate.ts | Broadens “safe SQL identifier” validation to allow hyphens and updates docs/error text. |
| packages/core/src/astro/routes/api/import/wordpress/prepare.ts | Sanitizes collection slugs defensively during prepare. |
| packages/core/src/astro/routes/api/import/wordpress/execute.ts | Sanitizes collection slugs defensively during execute. |
| packages/core/src/astro/routes/api/import/wordpress/analyze.ts | Introduces sanitizeSlug, allows hyphens, and applies sanitization to unknown post types. |
| packages/core/src/api/schemas/taxonomies.ts | Updates taxonomy name/collection slug validation to accept hyphens. |
| packages/core/src/api/schemas/common.ts | Updates shared slugPattern to accept hyphens. |
| packages/core/src/api/handlers/taxonomies.ts | Updates handler-side taxonomy name validation/message to accept hyphens. |
| packages/cloudflare/src/sandbox/bridge.ts | Updates sandbox collection-name validation regex to accept hyphens. |
| packages/admin/src/components/TaxonomyManager.tsx | Updates admin-side taxonomy name validation + HTML pattern/title/help text to accept hyphens. |
| .changeset/tidy-turkeys-drum.md | Adds release note for allowing hyphens in slugs and fixing WP import crash (#79). |
      /** Regex to validate collection names (prevent SQL injection) */
    - const COLLECTION_NAME_REGEX = /^[a-z][a-z0-9_]*$/;
    + const COLLECTION_NAME_REGEX = /^[a-z][a-z0-9_-]*$/;
Allowing hyphens here makes `ec_${collection}` an invalid/unquoted SQL identifier in the raw D1 queries below (e.g. `SELECT * FROM ec_${collection} ...`). A collection like `my-type` will be parsed as `ec_my` minus `type` and fail at runtime. Either keep this regex underscore-only, or update all interpolated table references to be properly quoted/escaped (e.g. `"ec_${collection}"`).
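To make the failure mode concrete, here is a small sketch of the two query shapes the comment contrasts. `quoteIdentifier`, `selectAllUnquoted`, and `selectAllQuoted` are hypothetical helpers written for this illustration, not EmDash functions; the quoting follows the standard SQL rule of wrapping the name in double quotes and doubling any embedded double quote.

```typescript
// Hypothetical helper: quote an identifier the standard SQL way.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Unquoted interpolation, as in the raw queries the review refers to.
function selectAllUnquoted(collection: string): string {
  return `SELECT * FROM ec_${collection}`;
}

// Same query with the table name quoted as a single identifier.
function selectAllQuoted(collection: string): string {
  const table = quoteIdentifier(`ec_${collection}`);
  return `SELECT * FROM ${table}`;
}

console.log(selectAllUnquoted("my-type")); // SELECT * FROM ec_my-type
console.log(selectAllQuoted("my-type")); // SELECT * FROM "ec_my-type"
```

In the unquoted form the SQL parser sees `ec_my - type`, which is the runtime failure described above. Quoting only helps if every call site does it consistently.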
      /**
       * Pattern for safe SQL identifiers.
    -  * Must start with a lowercase letter, followed by lowercase letters, digits, or underscores.
    +  * Must start with a lowercase letter, followed by lowercase letters, digits, underscores, or hyphens.
       */
    - const IDENTIFIER_PATTERN = /^[a-z][a-z0-9_]*$/;
    + const IDENTIFIER_PATTERN = /^[a-z][a-z0-9_-]*$/;
`validateIdentifier` is documented/used as a guard for interpolating identifiers into raw SQL, but allowing `-` is not safe unless every usage consistently quotes identifiers. For example, `search/fts-manager.ts` builds FTS column lists and trigger statements from field slugs without quoting; a slug like `my-field` will produce invalid SQL. Consider reverting `IDENTIFIER_PATTERN` to exclude hyphens (and rely on a separate slug validator), or audit/update all raw-SQL call sites to quote/escape identifiers before allowing `-`.
packages/core/src/schema/registry.ts
Outdated
      // Regex patterns for schema registry
    - const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_]*$/;
    + const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_-]*$/;
Now that field slugs may contain hyphens, any feature that interpolates field slugs into unquoted SQL identifiers can break. Concretely, the FTS manager builds FTS5 column lists / trigger bodies from `field.slug` without quoting, so searchable fields like `my-field` will fail when enabling search. Either ensure those SQL fragments quote identifiers, or disallow hyphens specifically for searchable-field slugs.
    - const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_-]*$/;
    + const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_]*$/;
Overlapping PRs
This PR modifies files that are also changed by other open PRs:
This may cause merge conflicts or duplicated work. A maintainer will coordinate.
|
Awaiting a WordPress export that can replicate the bug, for testing. |
|
I'm sorry – Copilot is right. My suggestion to allow dashes is flawed, because it means the names are no longer SQL-safe. Sorry for the runaround, but can we switch back to the previous implementation? |
|
@ascorbic sure Will go back to previous fix asap |
This reverts commit 4132572.
…production Includes realistic custom post types from Elementor (elementor-hf), WooCommerce (shop-order, shop-coupon), and ACF (acf-field-group).
|
@ascorbic Reverted back to the original approach: hyphens are sanitized to underscores during WordPress import, slug validation stays at
@BenjaminPrice Added a test fixture for reproduction: It's a minimal WXR export containing hyphenated custom post types from common plugins:
Plus a standard |
BenjaminPrice left a comment
One edge case to consider.
That said, I was able to reproduce the bug and confirm that this PR does indeed fix it.
        .toLowerCase()
        .replace(INVALID_SLUG_CHARS, "_")
        .replace(LEADING_NON_ALPHA, "");
      if (!sanitized) return "imported";
The `INVALID_SLUG_CHARS` regex `/[^a-z0-9_]/g` is applied after `.toLowerCase()`, so it correctly handles uppercase. However, a slug consisting entirely of digits (e.g. `123`) or underscores/hyphens produces `"imported"`. If two different post types both degenerate to `"imported"`, their content would merge into a single collection. This is an edge case but could cause data loss during import of unusual WP exports.
Perhaps we need to add some randomization to the slug, or we can get more sophisticated and keep a map of all sanitized slugs and append an incrementing number on collision.
…o the same value If multiple WordPress post types degenerate to the same slug (e.g. numeric-only types both becoming "imported"), append a numeric suffix (_1, _2, etc.) to prevent merging unrelated content into one collection.
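A minimal sketch of the suffixing described in this commit message, assuming the first post type keeps the bare slug and later collisions get `_1`, `_2`, and so on. `uniqueSlug` and `assignCollectionSlugs` are hypothetical names; the actual implementation in the PR may track collisions differently.

```typescript
// Hypothetical helper: return `base`, or `base_N` if `base` is already taken.
function uniqueSlug(base: string, used: Set<string>): string {
  let candidate = base;
  let n = 1;
  while (used.has(candidate)) {
    candidate = `${base}_${n}`;
    n += 1;
  }
  used.add(candidate);
  return candidate;
}

// Assign one collection slug per already-sanitized post type slug, in order.
function assignCollectionSlugs(sanitized: string[]): string[] {
  const used = new Set<string>();
  return sanitized.map((slug) => uniqueSlug(slug, used));
}

console.log(assignCollectionSlugs(["imported", "imported", "elementor_hf", "imported"]));
// [ 'imported', 'imported_1', 'elementor_hf', 'imported_2' ]
```

Because the suffix uses an underscore and digits, every generated name still matches `/^[a-z][a-z0-9_]*$/`.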
|
@BenjaminPrice Good catch on the slug collision edge case — fixed. |
Fixes #79
Problem
WordPress plugins (Elementor, WooCommerce, ACF, etc.) register custom post types with hyphens in their slugs (e.g. `elementor-hf`, `shop-order`, `acf-field-group`). EmDash collection slugs previously required `[a-z][a-z0-9_]*`, so the import crashed with `SchemaError: INVALID_SLUG`.
Fix
Instead of sanitizing hyphens away during import, allow hyphens natively in EmDash slugs. The slug validation pattern is updated from `/^[a-z][a-z0-9_]*$/` to `/^[a-z][a-z0-9_-]*$/` across all validation layers:
- `database/validate.ts`: SQL identifier validation
- `schema/registry.ts`: collection/field slug validation
- `api/schemas/common.ts`: shared Zod slug pattern
- `api/schemas/schema.ts`: collection/field API schemas
- `api/schemas/taxonomies.ts`: taxonomy API schemas
- `api/handlers/taxonomies.ts`: taxonomy name validation
- `mcp/server.ts`: MCP tool schemas
- `seed/validate.ts`: seed file validation
- `admin/TaxonomyManager.tsx`: admin UI validation + HTML pattern
- `cloudflare/sandbox/bridge.ts`: sandbox collection name validation
- `analyze.ts`: `INVALID_SLUG_CHARS` updated to allow hyphens
Test plan
- `wordpress-slug-sanitization.test.ts`: 15 tests pass, verifying hyphens are preserved