Exploring Native Paywall Support in EmDash Core #1467

CacheMeOwside · 2026-06-14T18:49:54Z

CacheMeOwside
Jun 14, 2026

For most publishers, paywalled content is the core business model. Large media organizations depend on subscriptions as their primary source of revenue, instead of relying entirely on advertising.

That's why most major publishing platforms have some solution for content gating, whether it's built into the platform itself (Ghost, Substack) or provided through a plugin ecosystem (WordPress). The idea is the same in both cases: show a preview to everyone, and deliver the full content only to readers who are paying or otherwise authorized.

What I take from open ecosystems

WordPress shows both sides of a very open, plugin-friendly platform. The openness is a real strength. There is a plugin for almost anything, and that ecosystem is a big part of why WordPress won.

But that same openness has a cost when it reaches critical areas, and paywalls are one example. Poor development practices, and an incomplete understanding of how the system actually serves content, lead to incorrect implementations. A common practice is gating content in the browser: the full article is still sent in the response and only hidden from view (with the CSS display: none declaration). Because the paywalled content is still in the response, it stays retrievable, and a service like smry.ai can access it.

Here are a few live examples of paywalls on popular WordPress websites where tools like smry.ai can easily access the paywalled content. Feel free to give it a try:

Indian Express: https://indianexpress.com/article/upsc-current-affairs/upsc-key-pm-modis-france-visit-brain-eating-amoeba-assam-nagaland-pact-10739182/
Small Boats Monthly: https://smallboatsmonthly.com/article-categories/adventure-narrative/ (this website has a metered paywall, where the first article is free and subsequent ones are paid)
The Mercury News: https://www.mercurynews.com/2026/05/21/what-to-watch-creepy-boroughs-cashes-in-on-its-all-star-cast/

To be fair to some of these sites, they are not all necessarily poor implementations. There is a genuine structural tension at play. Search engines expect the full article to be present so they can index and rank it. The moment a website tries to serve full content only to verified search engine bots while showing a preview to everyone else, it risks being flagged for cloaking, which carries ranking penalties.

So sites are pushed toward putting the full content in the response for everyone (and then hiding it on the client side using CSS), which is exactly what makes it retrievable by tools like smry.ai. In other words, the leak is often not a bug in the site's paywall but a side effect of staying on the right side of search engine policy.

This official Google article on using schema.org JSON-LD to mark up paywalled content was my starting point for researching the SEO side of this. The key is that this approach (exposing full content to crawlers without exposing it to everyone else) works securely only if we can reliably verify a request genuinely comes from a crawler rather than a spoofed user-agent.
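For reference, the structured data that guidance describes could be generated from the post itself. The following is a minimal sketch, assuming a hypothetical `PaywalledPost` shape and a `.paywall` CSS class wrapping the gated section; neither is an existing EmDash API. The markup only labels the gated region for crawlers. It does not protect anything by itself.

```typescript
// Hypothetical post shape for illustration; not an EmDash type.
interface PaywalledPost {
  headline: string;
  url: string;
  datePublished: string; // ISO 8601 date
  isProtected: boolean;
}

// Builds schema.org JSON-LD for an article. For protected posts, both the
// article and the gated section are marked as not accessible for free.
function buildPaywallJsonLd(
  post: PaywalledPost,
  gatedSelector = ".paywall", // assumed class name on the gated wrapper
): Record<string, unknown> {
  const base: Record<string, unknown> = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    headline: post.headline,
    url: post.url,
    datePublished: post.datePublished,
  };
  if (!post.isProtected) return base;
  return {
    ...base,
    isAccessibleForFree: false,
    hasPart: {
      "@type": "WebPageElement",
      isAccessibleForFree: false,
      cssSelector: gatedSelector,
    },
  };
}

// Serializes for a <script type="application/ld+json"> tag. Escaping "<"
// keeps a headline containing "</script>" from closing the tag early.
function toJsonLdScript(data: Record<string, unknown>): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```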

I'd like to invite SEO experts in the community to share insights on the best practices for keeping paywalled articles secure without hurting SEO rankings.

Bugs are a normal part of software and no system can fully avoid them. But there is a difference between a cosmetic bug and a failure in a critical area. Anything that touches security, revenue, access control, or reader privacy deserves a higher bar, because the cost of getting it wrong is money and trust. One way to raise that bar is to be deliberate about what the CMS core owns versus what is handed to third-party developers. The critical guarantees belong in core, designed carefully and tested once, instead of being re-implemented (often insecurely) by every plugin.

The proposal - Native paywall support in EmDash Core

Core gates. Plugins and external providers do everything else.
EmDash Core owns one guarantee: protected content is never delivered to a reader who is not authorized to see it. Everything else stays with third-party plugins or external services: authentication, authorization, subscriptions, memberships, metering, and payment processing. Core provides the gate and a contract that plugins and providers build on. Core does not become an identity provider or a billing platform.

This keeps the part that is easy to get wrong, and expensive when you do (not leaking content), in one well-tested place, while leaving the parts that vary a lot between publishers (how you log readers in, how you charge them) fully open.

Design considerations at a high level

In order to build a paywall capability, here are some points I think need consideration. This is a starting list, not a spec. Under some points I have noted a candidate approach to keep things concrete, but these are illustrative starting points, not decisions. Delivery modes are covered in their own section below. Please feel free to add anything I may have missed.

1. Where gating happens

When EmDash itself enforces the gate (Mode A in the "Delivery Modes" section), the enforcement should sit where content is fetched, i.e., at the data layer, not in the template that renders it. If a post is protected, every path that can return its body or an excerpt of it must respect that: the rendered page, the content API, search results, and any feeds or other content surfaces a site exposes, including ones that plugins add. Gating only the rendered page leaves the rest open. (In Mode B, this does not apply in the same way, since EmDash hands full content plus markers to an external proxy that does the gating.)

In practice this points to a reader-aware content fetch that returns the preview or the full body based on access, or a content gateway that every read passes through, so the decision is made once for all of those surfaces.
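To make the "decision is made once" idea concrete, here is a rough sketch of a content gateway. Every name in it (`StoredPost`, `PaywallReader`, `AccessCheck`, `readPost`, `buildFeedItems`) is hypothetical and only illustrates the shape; it is not an existing EmDash API.

```typescript
type Access = "full" | "preview";

// Hypothetical stored shape: the preview is whatever precedes the paywall break.
interface StoredPost {
  id: string;
  isProtected: boolean;
  previewBody: string;
  fullBody: string;
}

// id === null means an anonymous reader.
interface PaywallReader {
  id: string | null;
}

// The access decision is injected, so the gateway itself carries no policy.
type AccessCheck = (reader: PaywallReader, post: StoredPost) => Access;

interface DeliveredPost {
  id: string;
  body: string;
  access: Access;
}

// Single choke point: pages, the content API, search, and feeds would all
// call this instead of reading fullBody directly.
function readPost(post: StoredPost, reader: PaywallReader, check: AccessCheck): DeliveredPost {
  if (!post.isProtected) return { id: post.id, body: post.fullBody, access: "full" };
  const access = check(reader, post);
  const body = access === "full" ? post.fullBody : post.previewBody;
  return { id: post.id, body, access };
}

// Example surface: a feed built through the same gateway. Feed consumers are
// treated as anonymous here, so protected posts only contribute previews.
function buildFeedItems(posts: StoredPost[], check: AccessCheck): DeliveredPost[] {
  const anonymous: PaywallReader = { id: null };
  return posts.map((post) => readPost(post, anonymous, check));
}
```

Because the feed goes through `readPost`, a plugin that adds a new surface gets the same behavior as long as it uses the gateway rather than the raw store.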

2. The authoring primitive

Publishers need a simple way to mark where the free preview ends and protected content begins. A single in-content marker (a "paywall break" block) can be the source of truth that drives previews, feeds, search, and SEO. Since EmDash content is Portable Text, this fits naturally as a paywall break block in the body: everything before it is the preview, everything after is protected. Ghost does the same thing today with a paywall card that marks the cut point.
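As an illustration, splitting a Portable Text body at such a marker is a small operation. This sketch assumes a hypothetical block type name, `paywallBreak`, and treats only the first marker as the cut point.

```typescript
// Minimal Portable Text block shape: every block carries a `_type`.
interface PortableTextBlock {
  _type: string;
  [key: string]: unknown;
}

// Hypothetical block type name for the marker.
const PAYWALL_BREAK = "paywallBreak";

interface SplitBody {
  preview: PortableTextBlock[];
  protectedPart: PortableTextBlock[];
  hasBreak: boolean;
}

// Everything before the first marker is the preview; everything after it is
// protected. The marker itself (and any stray later markers) is dropped.
function splitAtPaywallBreak(blocks: PortableTextBlock[]): SplitBody {
  const index = blocks.findIndex((block) => block._type === PAYWALL_BREAK);
  if (index === -1) {
    return { preview: blocks, protectedPart: [], hasBreak: false };
  }
  return {
    preview: blocks.slice(0, index),
    protectedPart: blocks.slice(index + 1).filter((block) => block._type !== PAYWALL_BREAK),
    hasBreak: true,
  };
}
```

What to do with a post that is flagged as protected but has no marker is a policy choice left to the caller. Treating the whole body as protected would be the fail-safe option.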

3. Caching

The public preview of posts should be cacheable and shared freely, since it is the same for everyone. Unlocked or reader-specific responses must never enter a shared cache. Concretely, the anonymous preview could go out as public with an s-maxage, while unlocked responses use private, no-store.
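A small sketch of that header choice follows. The `ResponseFacts` shape and the 300-second default are assumptions for illustration only.

```typescript
// Hypothetical facts about a response that core would know when sending it.
interface ResponseFacts {
  postIsProtected: boolean;
  deliveredFullBody: boolean;
}

// Picks a Cache-Control value: anything containing unlocked protected content
// is kept out of shared caches; everything else may be cached by the CDN.
function cacheControlFor(facts: ResponseFacts, sharedMaxAgeSeconds = 300): string {
  const readerSpecific = facts.postIsProtected && facts.deliveredFullBody;
  if (readerSpecific) return "private, no-store";
  return `public, s-maxage=${sharedMaxAgeSeconds}`;
}
```

For example, `cacheControlFor({ postIsProtected: true, deliveredFullBody: false })` yields a shared-cacheable preview, while the same post delivered in full yields `private, no-store`.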

There also needs to be a clear way to invalidate cached responses when a post changes, which could hang off the existing content lifecycle hooks. Two cases matter, for different reasons:

If a post is edited, the cached copy is just stale, which is annoying but not dangerous.
If a post flips between free and paid, it becomes a security issue: a post that used to be public may have its full body sitting in a shared cache, and unless that copy is purged when the post turns paid, the CDN keeps serving the whole article to anonymous readers regardless of the new paywall. So a visibility change has to trigger a purge, or the paywall leaks through the cache.
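The two cases above could be told apart in a lifecycle hook roughly like this. The `PostSnapshot` shape, the `CdnClient` interface, and the hook name `handlePostUpdated` are all hypothetical; the real hook signature would come from EmDash.

```typescript
// Hypothetical before/after view of a post at update time.
interface PostSnapshot {
  id: string;
  isProtected: boolean;
  bodyHash: string;
}

type CacheAction = "none" | "refresh" | "purge-now";

// A visibility flip is the security case and must purge immediately.
// A plain edit only leaves a stale copy, so a normal refresh is enough.
function cacheActionForChange(before: PostSnapshot, after: PostSnapshot): CacheAction {
  if (before.isProtected !== after.isProtected) return "purge-now";
  if (before.bodyHash !== after.bodyHash) return "refresh";
  return "none";
}

// Hypothetical CDN client; real deployments would wrap their provider's API.
interface CdnClient {
  purge(postId: string): void;
  revalidate(postId: string): void;
}

function handlePostUpdated(before: PostSnapshot, after: PostSnapshot, cdn: CdnClient): CacheAction {
  const action = cacheActionForChange(before, after);
  if (action === "purge-now") cdn.purge(after.id);
  if (action === "refresh") cdn.revalidate(after.id);
  return action;
}
```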

4. The entitlement contract (the decision)

This is the decision layer of the gate. Core needs a clear interface it can call to ask "can this reader see this?" and get back allow, deny, or preview. The plugin or external provider makes the decision and supplies the reader identity, since core has no reader accounts of its own. Core enforces the result. When the answer is unclear or the check fails, the safe default is to keep content locked.

There are two situations to support, and the difference is just when the reader's access becomes known:

Sometimes the server already knows the reader while it is building the page, for example a plugin that reads a login cookie, so core can ask the plugin during rendering and serve the full body or the preview based on the answer.
Other times the reader only signs in later, in the browser, after the page has loaded, which is how a provider like Piano works. The server already sent the preview because it did not know the reader yet, so the protected part has to be delivered after the sign-in finishes. How that delivery happens is the next point. Core should support both.
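One possible shape for that contract, with the fail-closed default applied on the core side, is sketched below. The interface and function names are hypothetical, and the choice of `deny` as the locked fallback is an assumption (`preview` would also keep the protected part locked).

```typescript
type Decision = "allow" | "deny" | "preview";

interface EntitlementRequest {
  postId: string;
  // Opaque reader credential supplied by the plugin or provider; core does
  // not interpret it. null means no reader identity is available.
  readerToken: string | null;
}

// Hypothetical contract a plugin or external provider would implement.
interface EntitlementProvider {
  check(request: EntitlementRequest): Promise<Decision>;
}

const VALID_DECISIONS: readonly string[] = ["allow", "deny", "preview"];
const LOCKED_FALLBACK: Decision = "deny";

// Core-side wrapper: any error or unrecognized answer keeps content locked.
async function decide(provider: EntitlementProvider, request: EntitlementRequest): Promise<Decision> {
  try {
    const result: unknown = await provider.check(request);
    if (typeof result === "string" && VALID_DECISIONS.includes(result)) {
      return result as Decision;
    }
    return LOCKED_FALLBACK;
  } catch {
    return LOCKED_FALLBACK;
  }
}
```

The same `decide` call would serve both situations: during rendering when the reader is already known, and from a later request once a browser-side sign-in has completed.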

5. Delivering protected content without leaking it (the mechanism)

This is the mechanism layer: given a decision, how does the protected content actually reach the page? The rule underneath it is simple: protected content should never be sent hidden in the initial response and revealed with client code.

When the reader is already known at request time (a session cookie the server understands), Astro Server Islands (server:defer) look like a good fit: the page shell and preview stay cacheable while the protected region is rendered per request on the server behind the access check. EmDash does not use Server Islands today, so this would be new ground, but it is built for this shape of problem.

When the reader only signs in later in the browser (the Piano case), server islands do not fit, because they fetch automatically on page load, before that sign-in exists. There are two options here: reload the page so the server re-renders it now that the reader is known (the simplest approach, and what Ghost does), or fetch just the protected fragment from a dedicated endpoint after sign-in completes (fetch-after-auth). Either way the server runs the same access check.
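The fetch-after-auth option could look roughly like the handler below. It is written against plain objects rather than a specific framework, and `FragmentDeps` stands in for whatever access check and renderer core would actually provide; all names are hypothetical.

```typescript
interface FragmentRequest {
  postId: string;
  sessionToken: string | null;
}

interface FragmentResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

interface FragmentDeps {
  // Assumed to be the same access check that page rendering uses.
  canReadFull(postId: string, sessionToken: string | null): boolean;
  // Rendered HTML of the protected part, or undefined for an unknown post.
  renderProtectedHtml(postId: string): string | undefined;
}

// Dedicated endpoint logic: the browser calls this after sign-in completes.
// The access check runs before anything about the post is revealed, and the
// response is never allowed into a shared cache.
function handleProtectedFragment(request: FragmentRequest, deps: FragmentDeps): FragmentResponse {
  const headers: Record<string, string> = { "Cache-Control": "private, no-store" };
  if (!deps.canReadFull(request.postId, request.sessionToken)) {
    return { status: 403, headers, body: "" };
  }
  const html = deps.renderProtectedHtml(request.postId);
  if (html === undefined) {
    return { status: 404, headers, body: "" };
  }
  return {
    status: 200,
    headers: { ...headers, "Content-Type": "text/html; charset=utf-8" },
    body: html,
  };
}
```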

6. SEO and discoverability

To be discussed.

7. Abuse and rate limits

Even with the gate working, a reader with a valid account could use it to bulk-download the whole archive, so rate-limiting should make that impractical. This can lean on whatever the deployment already has at its CDN, proxy, or host layer, rather than being built from scratch.
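Where no CDN or proxy limiter is available, the idea can be illustrated with a small in-process, fixed-window counter per reader. This is only a sketch: the function name and parameters are made up, and a single-process counter would not hold up across multiple instances.

```typescript
type Limiter = (readerId: string) => boolean;

// Allows at most `limit` protected reads per reader in each window.
// The clock is injectable so the behavior can be tested deterministically.
function createReaderRateLimiter(
  limit: number,
  windowMs: number,
  now: () => number = () => Date.now(),
): Limiter {
  const windows = new Map<string, { start: number; count: number }>();
  return (readerId: string): boolean => {
    const t = now();
    const current = windows.get(readerId);
    if (current === undefined || t - current.start >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(readerId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  };
}
```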

8. Internationalization

Content is per locale. This needs a decision: whether "protected" is set per translation or shared across a translation group, so that a forgotten locale is not an accidental leak.
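To illustrate the "shared across a translation group" option, the sketch below treats a group as protected if any translation is, and lists the locales whose own flag disagrees so an editor could be warned. The `LocaleVariant` shape is hypothetical, and this shows one of the two options rather than a recommendation.

```typescript
// Hypothetical per-locale record. `undefined` means the flag was never set
// for that translation, which is the "forgotten locale" case.
interface LocaleVariant {
  locale: string;
  isProtected: boolean | undefined;
}

// Fail-safe group rule: one protected translation protects the whole group.
function isGroupProtected(variants: LocaleVariant[]): boolean {
  return variants.some((variant) => variant.isProtected === true);
}

// Locales whose own flag differs from the group result, for editor warnings.
function findInconsistentLocales(variants: LocaleVariant[]): string[] {
  const groupProtected = isGroupProtected(variants);
  return variants
    .filter((variant) => variant.isProtected !== groupProtected)
    .map((variant) => variant.locale);
}
```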

9. Editor experience

The editor should make inserting the paywall break easy (a slash command or toolbar button) and show a warning when a protected post has no preview defined. There is also an open question about what "preview" should show for a paywalled post: the full content as the authenticated editor sees it, the anonymous non-subscriber view (the free portion plus the wall), or both via a toggle. This is a UX choice rather than a security gate: the editor is already authenticated, so showing them the anonymous view is showing less, not a bypass.
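The warning could be driven by a small validation step like the one below. The warning names and the `paywallBreak` block type are assumptions for illustration.

```typescript
// Only the block type matters for this check.
interface EditorBlock {
  _type: string;
}

type EditorWarning = "missing-paywall-break" | "empty-preview" | "break-in-free-post";

// Returns the warnings an editor UI could surface before publishing.
function paywallWarnings(
  isProtected: boolean,
  blocks: EditorBlock[],
  breakType = "paywallBreak", // hypothetical marker block type
): EditorWarning[] {
  const index = blocks.findIndex((block) => block._type === breakType);
  const warnings: EditorWarning[] = [];
  if (isProtected && index === -1) warnings.push("missing-paywall-break");
  if (isProtected && index === 0) warnings.push("empty-preview");
  if (!isProtected && index !== -1) warnings.push("break-in-free-post");
  return warnings;
}
```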

Delivery modes

Two delivery models are widely used, and I think core should support both and let the publisher pick the one that matches their setup.

Mode A: EmDash core enforces. Core consults the entitlement contract (plugin or external provider) and returns either the preview or the full content, so protected content never reaches an unauthorized reader. This fits plugin-based setups, and also providers like Piano that authenticate readers in the browser and then unlock content for them.
Mode B: an external service enforces. Large enterprise publishers often put an external paywall provider such as Zephr in front of their site. Zephr runs as a reverse proxy ahead of the origin and does the gating itself. It relies on section markers placed in the HTML to know which regions to strip for each reader. In this mode, EmDash core does not gate. It outputs the full content with well defined region markers, and the external provider removes protected regions per request based on the reader's access.

Because these are different models, core needs to let the publisher select the delivery mode rather than assume one.
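As a sketch of what selecting a mode might mean at render time: the `DeliveryMode` values and the `data-paywall-region` attribute below are invented for illustration. In particular, the marker format is not Zephr's actual syntax, which would need to be taken from the provider's documentation.

```typescript
type DeliveryMode = "core-enforced" | "external-proxy";

interface RenderInput {
  previewHtml: string;
  protectedHtml: string;
  readerHasAccess: boolean; // only meaningful in core-enforced mode
}

function renderBody(mode: DeliveryMode, input: RenderInput): string {
  if (mode === "external-proxy") {
    // Mode B: the full content goes out with region markers, and the proxy
    // in front of the origin strips protected regions per reader.
    return `${input.previewHtml}<div data-paywall-region="protected">${input.protectedHtml}</div>`;
  }
  // Mode A: core itself withholds the protected part from unauthorized readers.
  return input.readerHasAccess ? input.previewHtml + input.protectedHtml : input.previewHtml;
}
```

One consequence worth noting for Mode B is that the origin emits full content to every request, so it presumably has to be reachable only through the proxy.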

Next Steps

This is a high-level idea, and I'd love the community's input on this proposal. Is anything missing in the considerations above? Are there delivery modes that real publishers depend on that aren't covered? Are there better ways to scope what belongs in core versus what belongs to plugins and providers? Suggestions and ideas from the maintainers, community, plugin authors, SEO experts, and anyone who has run a paywalled site would be very welcome.

ascorbic · 2026-06-18T19:02:58Z

ascorbic
Jun 18, 2026
Maintainer

I just want to say this is really good, @CacheMeOwside. Thanks. I will get back to you with more feedback soon when I've had more time to look at it.

1 reply

CacheMeOwside Jun 25, 2026
Author

Thanks so much, @ascorbic! Absolutely no rush on the feedback. I'm happy to dig into any area that would benefit from more detail.

ascorbic · 2026-06-29T11:39:06Z

ascorbic
Jun 29, 2026
Maintainer

Thanks @CacheMeOwside
I've been looking at this some more. I think overall the proposal is solid, and it does make sense for it to be in core, and to expose plugin hooks. There are a few things that I think need to be addressed.

User accounts

I think there are a lot of potential issues in using the same mechanism for editor and subscriber accounts, in security, flexibility and integration. I know we do currently have a subscriber role, but I think we need to think hard about whether we use this as the primary method for subscriber accounts. If we don't go with that, how should subscribers be represented? Should it delegate entirely to external systems (via plugins) or should there be something in core? How does this interact with user comments?

Bot detection

As you identified, there is a real conflict between paywalls and web crawlers. How do we handle crawler detection, so that we can serve full content to crawlers without making the paywall easy to bypass? This is something that Cloudflare can do, but we don't want to tie a solution to Cloudflare – it should be a generic system that can have signals from plugins etc.
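One way to picture a generic, provider-neutral system is a set of pluggable signals that each vote on a request, combined with a conservative rule. Everything below is a hypothetical sketch: the `CrawlerSignal` interface, the combination rule, and the allowlist-based example signal are illustrations, not an existing EmDash or Cloudflare API. A real signal would more likely use reverse DNS checks or published crawler IP ranges.

```typescript
type CrawlerVerdict = "verified" | "rejected" | "unknown";

interface RequestFacts {
  userAgent: string;
  ip: string;
}

// Hypothetical plugin-provided signal.
interface CrawlerSignal {
  name: string;
  evaluate(facts: RequestFacts): CrawlerVerdict;
}

// Conservative combination: any rejection wins, and a request counts as a
// crawler only if at least one signal positively verified it.
function classifyCrawler(facts: RequestFacts, signals: CrawlerSignal[]): CrawlerVerdict {
  let verified = false;
  for (const signal of signals) {
    const verdict = signal.evaluate(facts);
    if (verdict === "rejected") return "rejected";
    if (verdict === "verified") verified = true;
  }
  return verified ? "verified" : "unknown";
}

// Example signal: a user-agent claim is trusted only when the source IP is on
// an allowlist. A matching user-agent from another IP is treated as spoofed.
function allowlistSignal(name: string, uaPattern: RegExp, allowedIps: ReadonlySet<string>): CrawlerSignal {
  return {
    name,
    evaluate(facts: RequestFacts): CrawlerVerdict {
      if (!uaPattern.test(facts.userAgent)) return "unknown";
      return allowedIps.has(facts.ip) ? "verified" : "rejected";
    },
  };
}
```

Under this rule, "unknown" requests would be served like any anonymous reader, so a missing or failing signal never unlocks content.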

x402 integration

We have first-class x402 support to allow AI tools to identify themselves. Can we combine this with paywalls? Should we?

Thanks again for doing this. It's an important contribution.

2 replies

CacheMeOwside Jun 29, 2026
Author

Thanks for taking another look, @ascorbic, and for the feedback! I'm really glad the overall direction makes sense.

I'll dig into all three points you mentioned and get back to you soon.

CacheMeOwside Jun 30, 2026
Author

User accounts

"I think there are a lot of potential issues in using the same mechanism for editor and subscriber accounts, in security, flexibility and integration."

WordPress user management, roles and capabilities work quite differently from EmDash. In WordPress, roles are essentially collections of capabilities, and those capabilities are extensible at runtime. For example, a plugin can introduce a capability such as can_access_premium_content and assign it to any role.

The "Subscriber" label in both systems can be misleading because it suggests a connection to a subscription or membership model. In WordPress, however, subscriber is primarily just a predefined role with a particular set of permissions. By default, frontend read access is generally the same for all roles unless additional capabilities or restrictions are introduced.

In EmDash, roles are fixed and set in code, which makes them more rigid. That could become limiting as we add features or integrate with external services that expect fine-grained capabilities rather than a small set of predefined roles.

In fact, as I type this, I think introducing custom capabilities into EmDash could be a valuable enhancement. It would give third-party developers a more flexible permission model and make it easier to support use cases such as premium content access, integrations, or other feature-specific permissions without overloading the meaning of existing roles like "Subscriber."

"I know we do currently have a subscriber role, but I think we need to think hard about whether we use this as the primary method for subscriber accounts."

If we move toward a capabilities-based model, we could introduce core capabilities that grant access to specific areas of the CMS: for example, permissions for viewing comments or accessing restricted content. This would also give plugin developers greater flexibility, allowing features such as paywalls or membership systems to define and manage their own capabilities instead of relying on a dedicated "Subscriber" role as the primary representation of subscriber accounts.
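To make the idea more tangible, here is a very small sketch of what a runtime-extensible capability registry might look like. Nothing here reflects EmDash's current role implementation: `CmsUser`, `CapabilityRegistry`, and `createCapabilityRegistry` are hypothetical names.

```typescript
interface CmsUser {
  id: string;
  roles: string[];
}

interface CapabilityRegistry {
  // Lets core or a plugin attach a capability to a role at runtime.
  grant(role: string, capability: string): void;
  // True if any of the user's roles carries the capability.
  can(user: CmsUser, capability: string): boolean;
}

function createCapabilityRegistry(): CapabilityRegistry {
  const byRole = new Map<string, Set<string>>();
  return {
    grant(role: string, capability: string): void {
      const capabilities = byRole.get(role) ?? new Set<string>();
      capabilities.add(capability);
      byRole.set(role, capabilities);
    },
    can(user: CmsUser, capability: string): boolean {
      return user.roles.some((role) => byRole.get(role)?.has(capability) === true);
    },
  };
}
```

A paywall plugin could then call `grant("premium_member", "can_access_premium_content")` for a role of its own, and the gate would ask `can(user, "can_access_premium_content")` instead of checking for a "Subscriber" role.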

"If we don't go with that, how should subscribers be represented? Should it delegate entirely to external systems (via plugins) or should there be something in core?"

I've seen WordPress setups where plugins introduce custom user roles and custom capabilities to meet the specific requirements of the external service they integrate with. A capabilities model would give external systems more power.

CC: @ascorbic

rmakk · 2026-06-30T18:40:02Z

rmakk
Jun 30, 2026

This is a great writeup. I've been tackling this challenge on some client work and arrived at a plugin solution for handling "memberships". I felt that mixing EmDash users and readers into the same bucket wasn't the correct way forward, so I pivoted away from opening a discussion proposing an implementation in EmDash core. Payment integrations are also necessary to drive these, and it is hard not to be opinionated about them. Let me know if you'd like to check out my plugin. I'm modeling it on WallKit.

0 replies
