fix: allow hyphens in collection, field, and taxonomy slugs #83

Open
antoineVIVIES wants to merge 7 commits into emdash-cms:main from antoineVIVIES:fix/wordpress-import-hyphenated-slugs

@antoineVIVIES antoineVIVIES commented Apr 2, 2026

Fixes #79

Problem

WordPress plugins (Elementor, WooCommerce, ACF, etc.) register custom post types with hyphens in their slugs (e.g. elementor-hf, shop-order, acf-field-group). EmDash collection slugs previously required [a-z][a-z0-9_]*, so the import crashed with SchemaError: INVALID_SLUG.

Fix

Instead of sanitizing hyphens away during import, allow hyphens natively in EmDash slugs. The slug validation pattern is updated from /^[a-z][a-z0-9_]*$/ to /^[a-z][a-z0-9_-]*$/ across all validation layers:

  • database/validate.ts — SQL identifier validation
  • schema/registry.ts — collection/field slug validation
  • api/schemas/common.ts — shared Zod slug pattern
  • api/schemas/schema.ts — collection/field API schemas
  • api/schemas/taxonomies.ts — taxonomy API schemas
  • api/handlers/taxonomies.ts — taxonomy name validation
  • mcp/server.ts — MCP tool schemas
  • seed/validate.ts — seed file validation
  • admin/TaxonomyManager.tsx — admin UI validation + HTML pattern
  • cloudflare/sandbox/bridge.ts — sandbox collection name validation
  • WordPress import analyze.ts: INVALID_SLUG_CHARS updated to allow hyphens
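
For illustration, a minimal sketch of how the two patterns quoted above treat the example post types. Only the two regular expressions come from this description; the constant names and the `classify` helper are hypothetical and are not taken from the EmDash source.

```typescript
// Patterns quoted in the PR description; the constant names are illustrative.
const OLD_SLUG_PATTERN = /^[a-z][a-z0-9_]*$/;
const NEW_SLUG_PATTERN = /^[a-z][a-z0-9_-]*$/;

interface SlugCheck {
  slug: string;
  passesOld: boolean;
  passesNew: boolean;
}

// Hypothetical helper: reports whether a slug matches each pattern.
function classify(slug: string): SlugCheck {
  return {
    slug,
    passesOld: OLD_SLUG_PATTERN.test(slug),
    passesNew: NEW_SLUG_PATTERN.test(slug),
  };
}

for (const slug of ["posts", "elementor-hf", "shop-order", "acf-field-group", "9lives"]) {
  console.log(classify(slug));
}
```

The hyphenated slugs pass only the second pattern, `posts` passes both, and `9lives` fails both because each pattern requires a leading lowercase letter.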

Test plan

  • Updated wordpress-slug-sanitization.test.ts — 15 tests pass, verifying hyphens are preserved
  • Lint clean (0 diagnostics)
  • Typecheck passes

@antoineVIVIES antoineVIVIES force-pushed the fix/wordpress-import-hyphenated-slugs branch 4 times, most recently from 5116d21 to 99fef63 Compare April 2, 2026 07:27

changeset-bot bot commented Apr 3, 2026

🦋 Changeset detected

Latest commit: e604ffd

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 9 packages
| Name | Type |
| --- | --- |
| emdash | Patch |
| @emdash-cms/cloudflare | Patch |
| @emdash-cms/plugin-ai-moderation | Patch |
| @emdash-cms/plugin-atproto | Patch |
| @emdash-cms/plugin-audit-log | Patch |
| @emdash-cms/plugin-color | Patch |
| @emdash-cms/plugin-embeds | Patch |
| @emdash-cms/plugin-forms | Patch |
| @emdash-cms/plugin-webhook-notifier | Patch |

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@antoineVIVIES
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 3, 2026
@antoineVIVIES antoineVIVIES force-pushed the fix/wordpress-import-hyphenated-slugs branch 3 times, most recently from fb1f0a8 to 17bfe19 Compare April 3, 2026 00:54
WordPress plugins register custom post types with hyphens (e.g.
`elementor-hf`), but EmDash collection slugs require `[a-z][a-z0-9_]*`.
This caused a crash in the prepare step with SchemaError: INVALID_SLUG.

Sanitize slugs by replacing invalid characters with underscores across
all three import phases: analyze, prepare, and execute.
@ascorbic
Collaborator

ascorbic commented Apr 5, 2026

I think instead, EmDash should allow dashes in slugs. Could you change that instead?

@antoineVIVIES
Author

@ascorbic Sure, will do that soon.

Update slug validation from /^[a-z][a-z0-9_]*$/ to /^[a-z][a-z0-9_-]*$/
across all validation layers (schema registry, API schemas, MCP server,
seed validation, SQL identifier validation, admin UI, Cloudflare sandbox).

WordPress import no longer converts hyphens to underscores — post types
like `elementor-hf` and `shop-order` pass through natively.

Fixes emdash-cms#79
@antoineVIVIES antoineVIVIES changed the title from "fix: sanitize WordPress post type slugs during import" to "fix: allow hyphens in collection, field, and taxonomy slugs" Apr 5, 2026
Copilot AI review requested due to automatic review settings April 5, 2026 13:59
@antoineVIVIES
Author

Updated per @ascorbic's feedback — instead of sanitizing hyphens to underscores during WordPress import, EmDash now allows hyphens natively in collection, field, and taxonomy slugs.

What changed: slug validation updated from [a-z][a-z0-9_]* to [a-z][a-z0-9_-]* across all 10 validation points (schema registry, API schemas, MCP server, seed validation, SQL identifier validation, admin UI, Cloudflare sandbox). WordPress post types like elementor-hf and shop-order now pass through as-is.

Copilot AI left a comment

Pull request overview

This PR updates EmDash slug validation to natively allow hyphens in collection, field, and taxonomy slugs (fixing WordPress import crashes when plugins use hyphenated post type slugs).

Changes:

  • Expanded slug validation regexes across core schema/seed validation, API schemas/handlers, MCP schemas, and admin UI validation to accept -.
  • Updated WordPress import logic to preserve hyphens (and added regression tests for slug sanitization/mapping).
  • Added a changeset to publish the patch release.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| packages/core/tests/unit/import/wordpress-slug-sanitization.test.ts | Adds regression tests ensuring WordPress slugs preserve hyphens and sanitize only truly invalid chars. |
| packages/core/src/seed/validate.ts | Updates seed validation to allow hyphens in collection/field slugs and updates error messages. |
| packages/core/src/schema/registry.ts | Allows hyphens in schema registry slug validation and updates the thrown error message. |
| packages/core/src/mcp/server.ts | Updates MCP Zod schemas/descriptions to accept hyphenated slugs. |
| packages/core/src/database/validate.ts | Broadens “safe SQL identifier” validation to allow hyphens and updates docs/error text. |
| packages/core/src/astro/routes/api/import/wordpress/prepare.ts | Sanitizes collection slugs defensively during prepare. |
| packages/core/src/astro/routes/api/import/wordpress/execute.ts | Sanitizes collection slugs defensively during execute. |
| packages/core/src/astro/routes/api/import/wordpress/analyze.ts | Introduces sanitizeSlug, allows hyphens, and applies sanitization to unknown post types. |
| packages/core/src/api/schemas/taxonomies.ts | Updates taxonomy name/collection slug validation to accept hyphens. |
| packages/core/src/api/schemas/common.ts | Updates shared slugPattern to accept hyphens. |
| packages/core/src/api/handlers/taxonomies.ts | Updates handler-side taxonomy name validation/message to accept hyphens. |
| packages/cloudflare/src/sandbox/bridge.ts | Updates sandbox collection-name validation regex to accept hyphens. |
| packages/admin/src/components/TaxonomyManager.tsx | Updates admin-side taxonomy name validation + HTML pattern/title/help text to accept hyphens. |
| .changeset/tidy-turkeys-drum.md | Adds release note for allowing hyphens in slugs and fixing WP import crash (#79). |

Comment on lines 14 to 16
 /** Regex to validate collection names (prevent SQL injection) */
-const COLLECTION_NAME_REGEX = /^[a-z][a-z0-9_]*$/;
+const COLLECTION_NAME_REGEX = /^[a-z][a-z0-9_-]*$/;

Copilot AI Apr 5, 2026

Allowing hyphens here makes ec_${collection} an invalid/unquoted SQL identifier in the raw D1 queries below (e.g. SELECT * FROM ec_${collection} ...). A collection like my-type will be parsed as ec_my minus type and fail at runtime. Either keep this regex underscore-only, or update all interpolated table references to be properly quoted/escaped (e.g. "ec_${collection}").
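
To make the failure mode concrete, here is a small sketch of the two options mentioned above. `quoteIdentifier` is a hypothetical helper, not an existing EmDash or D1 API; it assumes SQLite-style double-quoted identifiers, with any embedded double quote doubled.

```typescript
// Hypothetical helper: wraps an identifier in double quotes and doubles
// any embedded double quote, as SQLite expects for quoted identifiers.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

const collection = "my-type";

// Unquoted interpolation: the hyphen is parsed as a minus operator.
const unquoted = `SELECT * FROM ec_${collection}`;

// Quoted interpolation: the whole table name is a single identifier.
const quoted = `SELECT * FROM ${quoteIdentifier(`ec_${collection}`)}`;

console.log(unquoted); // SELECT * FROM ec_my-type
console.log(quoted);   // SELECT * FROM "ec_my-type"
```

Quoting only helps if every raw-SQL call site uses it consistently, which is why keeping the regex underscore-only is the lower-risk option.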

Comment on lines 11 to 16
 /**
  * Pattern for safe SQL identifiers.
- * Must start with a lowercase letter, followed by lowercase letters, digits, or underscores.
+ * Must start with a lowercase letter, followed by lowercase letters, digits, underscores, or hyphens.
  */
-const IDENTIFIER_PATTERN = /^[a-z][a-z0-9_]*$/;
+const IDENTIFIER_PATTERN = /^[a-z][a-z0-9_-]*$/;

Copilot AI Apr 5, 2026

validateIdentifier is documented/used as a guard for interpolating identifiers into raw SQL, but allowing - is not safe unless every usage consistently quotes identifiers. For example, search/fts-manager.ts builds FTS column lists and trigger statements from field slugs without quoting; a slug like my-field will produce invalid SQL. Consider reverting IDENTIFIER_PATTERN to exclude hyphens (and rely on a separate slug validator), or audit/update all raw-SQL call sites to quote/escape identifiers before allowing -.

 // Regex patterns for schema registry
-const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_]*$/;
+const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_-]*$/;
Copilot AI Apr 5, 2026

Now that field slugs may contain hyphens, any feature that interpolates field slugs into unquoted SQL identifiers can break. Concretely, the FTS manager builds FTS5 column lists / trigger bodies from field.slug without quoting, so searchable fields like my-field will fail when enabling search. Either ensure those SQL fragments quote identifiers, or disallow hyphens specifically for searchable-field slugs.

Suggested change
-const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_-]*$/;
+const SLUG_VALIDATION_PATTERN = /^[a-z][a-z0-9_]*$/;

@github-actions
Contributor

github-actions bot commented Apr 5, 2026

Overlapping PRs

This PR modifies files that are also changed by other open PRs:

This may cause merge conflicts or duplicated work. A maintainer will coordinate.

@BenjaminPrice
Contributor

Awaiting a WordPress export that can replicate the bug, for testing.

@ascorbic
Collaborator

ascorbic commented Apr 6, 2026

I'm sorry – Copilot is right. My suggestion to allow dashes is flawed, because it means the names are no longer SQL-safe. Sorry for the runaround, but can we switch back to the previous implementation?

@antoineVIVIES
Author

@ascorbic Sure, will go back to the previous fix ASAP.

…production

Includes realistic custom post types from Elementor (elementor-hf),
WooCommerce (shop-order, shop-coupon), and ACF (acf-field-group).
@github-actions github-actions bot added size/L and removed size/M labels Apr 7, 2026
@antoineVIVIES
Author

@ascorbic Reverted back to the original approach — hyphens are sanitized to underscores during WordPress import, slug validation stays at [a-z][a-z0-9_]*.

@BenjaminPrice Added a test fixture for reproduction: hyphenated-post-types-export.xml

It's a minimal WXR export containing hyphenated custom post types from common plugins:

  • Elementor: elementor-hf (header/footer templates)
  • WooCommerce: shop-order, shop-coupon
  • ACF: acf-field-group

Plus a standard post as a control. You can use this to test the import flow — without the fix, the elementor-hf / shop-order / etc. post types crash with SchemaError: INVALID_SLUG during the prepare step. With the fix, they get sanitized to elementor_hf, shop_order, etc.
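
A minimal sketch of the sanitization described above, not the code from this PR. The names `sanitizeSlug`, `INVALID_SLUG_CHARS`, and `LEADING_NON_ALPHA` appear elsewhere in this PR, but the `LEADING_NON_ALPHA` pattern and the overall function shape here are assumptions made for illustration.

```typescript
// Anything outside the allowed slug alphabet (applied after lowercasing).
const INVALID_SLUG_CHARS = /[^a-z0-9_]/g;
// Assumed pattern: strips leading characters until the first lowercase letter.
const LEADING_NON_ALPHA = /^[^a-z]+/;

// Sketch: map a WordPress post type name onto the [a-z][a-z0-9_]* slug rule.
function sanitizeSlug(raw: string): string {
  const sanitized = raw
    .toLowerCase()
    .replace(INVALID_SLUG_CHARS, "_")
    .replace(LEADING_NON_ALPHA, "");
  // Fall back to a fixed name when nothing usable is left.
  return sanitized || "imported";
}

console.log(sanitizeSlug("elementor-hf"));    // elementor_hf
console.log(sanitizeSlug("shop-order"));      // shop_order
console.log(sanitizeSlug("acf-field-group")); // acf_field_group
console.log(sanitizeSlug("123"));             // imported
```

Under these assumptions every result matches `/^[a-z][a-z0-9_]*$/`, so slug validation itself does not need to change.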

Contributor

@BenjaminPrice BenjaminPrice left a comment

One edge case to consider.

That said, I was able to reproduce the bug and confirm that this PR does indeed fix it.

.toLowerCase()
.replace(INVALID_SLUG_CHARS, "_")
.replace(LEADING_NON_ALPHA, "");
if (!sanitized) return "imported";
Contributor

The INVALID_SLUG_CHARS regex /[^a-z0-9_]/g is applied after .toLowerCase(), so it correctly handles uppercase. However, a slug consisting entirely of digits (e.g. 123) or underscores/hyphens produces "imported". If two different post types both degenerate to "imported", their content would merge into a single collection. This is an edge case but could cause data loss during import of unusual WP exports.

Perhaps we need to add some randomization to the slug, or we can get more sophisticated and keep a map of all sanitized slugs, appending an incrementing numeric suffix on collision.

…o the same value

If multiple WordPress post types degenerate to the same slug (e.g.
numeric-only types both becoming "imported"), append a numeric suffix
(_1, _2, etc.) to prevent merging unrelated content into one collection.
@antoineVIVIES
Author

@BenjaminPrice Good catch on the slug collision edge case — fixed. analyzeWxr now tracks seen collection slugs and appends a numeric suffix (_1, _2, etc.) when multiple post types sanitize to the same value. This prevents silent data merging.
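
The suffixing idea can be sketched roughly as follows. This is not the `analyzeWxr` implementation, which is not shown in this thread; `dedupeSlugs` is a hypothetical helper that takes already-sanitized slugs and assumes suffixes start at `_1`.

```typescript
// Hypothetical helper: returns slugs in input order, suffixing repeats
// with _1, _2, ... so that no two entries share a collection slug.
function dedupeSlugs(sanitized: string[]): string[] {
  const seen = new Set<string>();
  return sanitized.map((base) => {
    let candidate = base;
    let n = 1;
    // Keep incrementing until the candidate is unused, so a generated
    // suffix never collides with a slug that is already taken.
    while (seen.has(candidate)) {
      candidate = `${base}_${n}`;
      n += 1;
    }
    seen.add(candidate);
    return candidate;
  });
}

// Three post types that all degenerated to "imported", plus one normal slug.
console.log(dedupeSlugs(["imported", "shop_order", "imported", "imported"]));
// yields: imported, shop_order, imported_1, imported_2
```

Checking candidates against the full set of seen slugs, rather than counting repeats per base, also covers the case where a real post type already sanitizes to something like `imported_1`.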

Collaborator

@ascorbic ascorbic left a comment

Great! Thanks

@ascorbic ascorbic enabled auto-merge (squash) April 8, 2026 13:46
@pkg-pr-new

pkg-pr-new bot commented Apr 8, 2026

@emdash-cms/admin

npm i https://pkg.pr.new/@emdash-cms/admin@83

@emdash-cms/auth

npm i https://pkg.pr.new/@emdash-cms/auth@83

@emdash-cms/blocks

npm i https://pkg.pr.new/@emdash-cms/blocks@83

@emdash-cms/cloudflare

npm i https://pkg.pr.new/@emdash-cms/cloudflare@83

emdash

npm i https://pkg.pr.new/emdash@83

create-emdash

npm i https://pkg.pr.new/create-emdash@83

@emdash-cms/gutenberg-to-portable-text

npm i https://pkg.pr.new/@emdash-cms/gutenberg-to-portable-text@83

@emdash-cms/x402

npm i https://pkg.pr.new/@emdash-cms/x402@83

@emdash-cms/plugin-ai-moderation

npm i https://pkg.pr.new/@emdash-cms/plugin-ai-moderation@83

@emdash-cms/plugin-atproto

npm i https://pkg.pr.new/@emdash-cms/plugin-atproto@83

@emdash-cms/plugin-audit-log

npm i https://pkg.pr.new/@emdash-cms/plugin-audit-log@83

@emdash-cms/plugin-color

npm i https://pkg.pr.new/@emdash-cms/plugin-color@83

@emdash-cms/plugin-embeds

npm i https://pkg.pr.new/@emdash-cms/plugin-embeds@83

@emdash-cms/plugin-forms

npm i https://pkg.pr.new/@emdash-cms/plugin-forms@83

@emdash-cms/plugin-webhook-notifier

npm i https://pkg.pr.new/@emdash-cms/plugin-webhook-notifier@83

commit: e604ffd

Development

Successfully merging this pull request may close these issues.

WordPress import crashes on collections with hyphens in slug (e.g. Elementor elementor-hf)

4 participants