Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -206,6 +206,10 @@ gem 'unicorn', require: false
gem 'puma', require: false
gem 'rbtrace', require: false

# required for feed importing and embedding
gem 'ruby-readability', require: false
gem 'simple-rss', require: false

# perftools only works on 1.9 atm
group :profile do
# travis refuses to install this, instead of fuffing, just avoid it for now
Expand Down
7 changes: 7 additions & 0 deletions Gemfile_rails4.lock
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,7 @@ GEM
fspath (2.0.5)
given_core (3.1.1)
sorcerer (>= 0.3.7)
guess_html_encoding (0.0.9)
handlebars-source (1.1.2)
hashie (2.0.5)
highline (1.6.20)
Expand Down Expand Up @@ -309,6 +310,9 @@ GEM
rspec-mocks (~> 2.14.0)
ruby-hmac (0.4.0)
ruby-openid (2.3.0)
ruby-readability (0.5.7)
guess_html_encoding (>= 0.0.4)
nokogiri (>= 1.4.2)
sanitize (2.0.6)
nokogiri (>= 1.4.4)
sass (3.2.12)
Expand Down Expand Up @@ -337,6 +341,7 @@ GEM
celluloid (>= 0.14.1)
ice_cube (~> 0.11.0)
sidekiq (~> 2.15.0)
simple-rss (1.3.1)
simplecov (0.7.1)
multi_json (~> 1.0)
simplecov-html (~> 0.7.1)
Expand Down Expand Up @@ -466,6 +471,7 @@ DEPENDENCIES
rinku
rspec-given
rspec-rails
ruby-readability
sanitize
sass
sass-rails
Expand All @@ -474,6 +480,7 @@ DEPENDENCIES
sidekiq (= 2.15.1)
sidekiq-failures
sidetiq (>= 0.3.6)
simple-rss
simplecov
sinatra
slim
Expand Down
27 changes: 27 additions & 0 deletions app/assets/javascripts/embed.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
/* global discourseUrl */
/* global discourseEmbedUrl */
(function() {

var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);
Comment on lines +5 to +12
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for DOM manipulation.

The code assumes discourse-comments element exists and doesn't handle cases where it might be missing.

Add defensive programming:

   var comments = document.getElementById('discourse-comments'),
       iframe = document.createElement('iframe');
+  
+  if (!comments) {
+    console.error('Element with id "discourse-comments" not found');
+    return;
+  }
+  
   iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
   iframe.id = 'discourse-embed-frame';
   iframe.width = "100%";
   iframe.frameBorder = "0";
   iframe.scrolling = "no";
   comments.appendChild(iframe);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);
var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
if (!comments) {
console.error('Element with id "discourse-comments" not found');
return;
}
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);
🤖 Prompt for AI Agents
In app/assets/javascripts/embed.js around lines 5 to 12, the code assumes the
element with id 'discourse-comments' exists without checking. To fix this, add a
conditional check to verify that the 'discourse-comments' element is not null
before attempting to create and append the iframe. If the element is missing,
avoid executing the iframe creation and appending logic to prevent runtime
errors.



function postMessageReceived(e) {
if (!e) { return; }
if (discourseUrl.indexOf(e.origin) === -1) { return; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Security vulnerability: Weak origin validation.

The current origin validation using indexOf is insufficient and can be bypassed. An attacker could use a malicious domain like evil-discourseUrl.com to pass this check.

Use proper origin validation:

-    if (discourseUrl.indexOf(e.origin) === -1) { return; }
+    var allowedOrigin = new URL(discourseUrl).origin;
+    if (e.origin !== allowedOrigin) { return; }
🤖 Prompt for AI Agents
In app/assets/javascripts/embed.js at line 17, the origin validation uses
indexOf which can be bypassed by malicious domains containing the discourseUrl
as a substring. Replace this check with a strict equality comparison or use URL
parsing to verify that e.origin exactly matches the expected discourseUrl origin
to ensure proper security validation.


if (e.data) {
if (e.data.type === 'discourse-resize' && e.data.height) {
iframe.height = e.data.height + "px";
}
}
}
window.addEventListener('message', postMessageReceived, false);

})();
69 changes: 69 additions & 0 deletions app/assets/stylesheets/embed.css.scss
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
//= require ./vendor/normalize
//= require ./common/foundation/base

article.post {
border-bottom: 1px solid #ddd;

.post-date {
float: right;
color: #aaa;
font-size: 12px;
margin: 4px 4px 0 0;
}

.author {
padding: 20px 0;
width: 92px;
float: left;

text-align: center;

h3 {
text-align: center;
color: #4a6b82;
font-size: 13px;
margin: 0;
}
}

.cooked {
padding: 20px 0;
margin-left: 92px;

p {
margin: 0 0 1em 0;
}
}
}

header {
padding: 10px 10px 20px 10px;

font-size: 18px;

border-bottom: 1px solid #ddd;
}

footer {
font-size: 18px;

.logo {
margin-right: 10px;
margin-top: 10px;
}

a[href].button {
margin: 10px 0 0 10px;
}
}

.logo {
float: right;
max-height: 30px;
}

a[href].button {
background-color: #eee;
padding: 5px;
display: inline-block;
}
34 changes: 34 additions & 0 deletions app/controllers/embed_controller.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
class EmbedController < ApplicationController
skip_before_filter :check_xhr
skip_before_filter :preload_json
before_filter :ensure_embeddable

layout 'embed'

def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)

if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end

discourse_expires_in 1.minute
end
Comment on lines +8 to +20
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add explicit rendering and improve error handling.

The method logic is sound but has a few areas for improvement:

  1. Missing explicit render call when topic_id is found
  2. Cache expiration is set regardless of content state
  3. No error handling for malformed embed_url parameter

Apply this diff to improve the method:

 def best
   embed_url = params.require(:embed_url)
+  
+  # Validate URL format
+  begin
+    URI.parse(embed_url)
+  rescue URI::InvalidURIError
+    render json: { error: 'Invalid embed URL' }, status: 400
+    return
+  end
+  
   topic_id = TopicEmbed.topic_id_for_embed(embed_url)

   if topic_id
     @topic_view = TopicView.new(topic_id, current_user, {best: 5})
+    render 'best'
+    discourse_expires_in 1.minute
   else
     Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
     render 'loading'
   end
-
-  discourse_expires_in 1.minute
 end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
discourse_expires_in 1.minute
end
def best
embed_url = params.require(:embed_url)
# Validate URL format
begin
URI.parse(embed_url)
rescue URI::InvalidURIError
render json: { error: 'Invalid embed URL' }, status: 400
return
end
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
render 'best'
discourse_expires_in 1.minute
else
Jobs.enqueue(
:retrieve_topic,
user_id: current_user.try(:id),
embed_url: embed_url
)
render 'loading'
end
end
🤖 Prompt for AI Agents
In app/controllers/embed_controller.rb around lines 8 to 20, add an explicit
render call when topic_id is found to ensure a response is returned. Move the
discourse_expires_in call inside the conditional branches so cache expiration is
set only when appropriate. Add error handling for malformed embed_url by
rescuing parameter errors and rendering an error or fallback response
accordingly.


private

def ensure_embeddable
raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host

response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end
Comment on lines +24 to +31
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Strengthen embedding security controls.

The security validation is functional but could be more robust:

  1. X-Frame-Options: ALLOWALL is overly permissive and defeats clickjacking protection
  2. Referer validation can be bypassed and should be supplemented with additional checks
  3. The fallback to empty string for nil referer could mask legitimate validation failures

Consider this more secure approach:

 def ensure_embeddable
   raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
-  raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host
+  
+  referer = request.referer
+  raise Discourse::InvalidAccess.new('referer required for embedding') if referer.blank?
+  raise Discourse::InvalidAccess.new('invalid referer host') if URI(referer).host != SiteSetting.embeddable_host

-  response.headers['X-Frame-Options'] = "ALLOWALL"
+  # Only allow framing from the configured embeddable host
+  response.headers['X-Frame-Options'] = "ALLOW-FROM https://#{SiteSetting.embeddable_host}"
+  response.headers['Content-Security-Policy'] = "frame-ancestors https://#{SiteSetting.embeddable_host}"
 rescue URI::InvalidURIError
-  raise Discourse::InvalidAccess.new('invalid referer host')
+  raise Discourse::InvalidAccess.new('invalid referer URL')
 end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def ensure_embeddable
raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host
response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end
def ensure_embeddable
raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
referer = request.referer
raise Discourse::InvalidAccess.new('referer required for embedding') if referer.blank?
raise Discourse::InvalidAccess.new('invalid referer host') if URI(referer).host != SiteSetting.embeddable_host
# Only allow framing from the configured embeddable host
response.headers['X-Frame-Options'] = "ALLOW-FROM https://#{SiteSetting.embeddable_host}"
response.headers['Content-Security-Policy'] = "frame-ancestors https://#{SiteSetting.embeddable_host}"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer URL')
end
🤖 Prompt for AI Agents
In app/controllers/embed_controller.rb around lines 24 to 31, the current
embedding security is weak because it sets 'X-Frame-Options' to 'ALLOWALL',
which disables clickjacking protection, and the referer validation is easily
bypassed. To fix this, replace 'ALLOWALL' with a stricter header like
'SAMEORIGIN' or use Content Security Policy frame-ancestors directive to
restrict embedding to trusted hosts. Also, improve referer validation by
checking for presence and validity explicitly instead of defaulting to an empty
string, and consider adding additional verification such as checking an
authentication token or origin header to strengthen security.



end
24 changes: 24 additions & 0 deletions app/jobs/regular/retrieve_topic.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
require_dependency 'email/sender'
require_dependency 'topic_retriever'

module Jobs

# Asynchronously retrieve a topic from an embedded site
class RetrieveTopic < Jobs::Base

def execute(args)
raise Discourse::InvalidParameters.new(:embed_url) unless args[:embed_url].present?

user = nil
if args[:user_id]
user = User.where(id: args[:user_id]).first
end

TopicRetriever.new(args[:embed_url], no_throttle: user.try(:staff?)).retrieve
end

end

end


41 changes: 41 additions & 0 deletions app/jobs/scheduled/poll_feed.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#
# Creates and Updates Topics based on an RSS or ATOM feed.
#
require 'digest/sha1'
require_dependency 'post_creator'
require_dependency 'post_revisor'
require 'open-uri'

module Jobs
class PollFeed < Jobs::Scheduled
recurrence { hourly }
sidekiq_options retry: false

def execute(args)
poll_feed if SiteSetting.feed_polling_enabled? &&
SiteSetting.feed_polling_url.present? &&
SiteSetting.embed_by_username.present?
end

def feed_key
@feed_key ||= "feed-modified:#{Digest::SHA1.hexdigest(SiteSetting.feed_polling_url)}"
end

def poll_feed
user = User.where(username_lower: SiteSetting.embed_by_username.downcase).first
return if user.blank?

require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical security vulnerability: Unsafe use of open() with user-controlled URL.

Using open() directly with SiteSetting.feed_polling_url creates a serious security risk. The open() method can execute local files if the URL starts with | or access local files with file:// protocol, potentially leading to command injection or sensitive file disclosure.

Replace with a safer HTTP client:

-      require 'simple-rss'
-      rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
+      require 'simple-rss'
+      require 'net/http'
+      
+      uri = URI.parse(SiteSetting.feed_polling_url)
+      raise "Invalid feed URL protocol" unless %w[http https].include?(uri.scheme)
+      
+      response = Net::HTTP.get_response(uri)
+      raise "Failed to fetch feed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
+      
+      rss = SimpleRSS.parse(response.body)

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Brakeman (7.0.2)

[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/

(File Access)

🤖 Prompt for AI Agents
In app/jobs/scheduled/poll_feed.rb at line 29, the use of open() with
SiteSetting.feed_polling_url is unsafe and can lead to security vulnerabilities.
Replace open() with a safer HTTP client like Net::HTTP or an HTTP gem to fetch
the feed content securely. Use the HTTP client to perform a GET request to the
URL, then pass the response body to SimpleRSS.parse instead of directly opening
the URL.


rss.items.each do |i|
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//

content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
end
end
Comment on lines +24 to +38
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add comprehensive error handling for feed parsing.

The poll_feed method lacks error handling for several failure scenarios that could cause the job to crash silently or behave unexpectedly.

Add proper error handling:

 def poll_feed
   user = User.where(username_lower: SiteSetting.embed_by_username.downcase).first
   return if user.blank?

+  begin
     require 'simple-rss'
     # ... (use safer HTTP client as suggested above)
     
     rss.items.each do |i|
       url = i.link
       url = i.id if url.blank? || url !~ /^https?\:\/\//
+      next if url.blank?

       content = CGI.unescapeHTML(i.content.scrub)
       TopicEmbed.import(user, url, i.title, content)
     end
+  rescue => e
+    Rails.logger.error "Feed polling failed: #{e.message}"
+    # Optionally notify administrators
+  end
 end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def poll_feed
user = User.where(username_lower: SiteSetting.embed_by_username.downcase).first
return if user.blank?
require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
rss.items.each do |i|
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//
content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
end
end
def poll_feed
user = User.where(username_lower: SiteSetting.embed_by_username.downcase).first
return if user.blank?
begin
require 'simple-rss'
# ... (use safer HTTP client as suggested above)
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
rss.items.each do |i|
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//
next if url.blank?
content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
end
rescue => e
Rails.logger.error "Feed polling failed: #{e.message}"
# Optionally notify administrators
end
end
🧰 Tools
🪛 Brakeman (7.0.2)

[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/

(File Access)

🤖 Prompt for AI Agents
In app/jobs/scheduled/poll_feed.rb around lines 24 to 38, the poll_feed method
currently does not handle errors that may occur during feed fetching or parsing,
which can cause silent failures. Wrap the feed fetching and parsing logic in a
begin-rescue block to catch exceptions such as network errors or parsing errors.
Log or handle these exceptions appropriately to prevent the job from crashing
silently and to aid in debugging issues with the feed.


end
end
9 changes: 9 additions & 0 deletions app/models/post.rb
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,10 @@ def self.types
@types ||= Enum.new(:regular, :moderator_action)
end

def self.cook_methods
@cook_methods ||= Enum.new(:regular, :raw_html)
end

def self.find_by_detail(key, value)
includes(:post_details).where(post_details: { key: key, value: value }).first
end
Expand Down Expand Up @@ -124,6 +128,11 @@ def post_analyzer
end

def cook(*args)
# For some posts, for example those imported via RSS, we support raw HTML. In that
# case we can skip the rendering pipeline.
return raw if cook_method == Post.cook_methods[:raw_html]

# Default is to cook posts
Plugin::Filter.apply(:after_post_cook, self, post_analyzer.cook(*args))
end

Expand Down
82 changes: 82 additions & 0 deletions app/models/topic_embed.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
require_dependency 'nokogiri'

class TopicEmbed < ActiveRecord::Base
belongs_to :topic
belongs_to :post
validates_presence_of :embed_url
validates_presence_of :content_sha1

# Import an article from a source (RSS/Atom/Other)
def self.import(user, url, title, contents)
return unless url =~ /^https?\:\/\//

contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"

embed = TopicEmbed.where(embed_url: url).first
content_sha1 = Digest::SHA1.hexdigest(contents)
post = nil

# If there is no embed, create a topic, post and the embed.
if embed.blank?
Topic.transaction do
creator = PostCreator.new(user, title: title, raw: absolutize_urls(url, contents), skip_validations: true, cook_method: Post.cook_methods[:raw_html])
post = creator.create
if post.present?
TopicEmbed.create!(topic_id: post.topic_id,
embed_url: url,
content_sha1: content_sha1,
post_id: post.id)
end
end
else
post = embed.post
# Update the topic if it changed
if content_sha1 != embed.content_sha1
revisor = PostRevisor.new(post)
revisor.revise!(user, absolutize_urls(url, contents), skip_validations: true, bypass_rate_limiter: true)
embed.update_column(:content_sha1, content_sha1)
end
end

post
end
Comment on lines +10 to +42
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor complex method and address security concerns.

This method has high complexity (ABC size 24.78/23) and mixes multiple responsibilities. Additionally, there are security concerns with HTML injection in the footer.

Security Issue: The I18n interpolation includes raw HTML which could lead to XSS:

-contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{url}'>#{url}</a>")}</small>\n"
+safe_url = ERB::Util.html_escape(url)
+contents << "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{safe_url}'>#{safe_url}</a>")}</small>\n"

Complexity Reduction: Consider extracting methods:

def self.import(user, url, title, contents)
  return unless valid_url?(url)
  
  contents_with_footer = add_source_footer(url, contents)
  content_sha1 = Digest::SHA1.hexdigest(contents_with_footer)
  embed = find_embed(url)
  
  if embed.blank?
    create_new_embed(user, url, title, contents_with_footer, content_sha1)
  else
    update_existing_embed(user, embed, contents_with_footer, content_sha1)
  end
end

private

def self.valid_url?(url)
  url =~ /\Ahttps?:\/\//
end

def self.add_source_footer(url, contents)
  safe_url = ERB::Util.html_escape(url)
  contents + "\n<hr>\n<small>#{I18n.t('embed.imported_from', link: "<a href='#{safe_url}'>#{safe_url}</a>")}</small>\n"
end
🧰 Tools
🪛 RuboCop (1.76.1)

[convention] 10-42: Assignment Branch Condition size for import is too high. [<7, 23, 6> 24.78/23]

(Metrics/AbcSize)

🤖 Prompt for AI Agents
In app/models/topic_embed.rb around lines 10 to 42, the import method is overly
complex and includes a security risk by interpolating raw HTML into the I18n
translation, which can lead to XSS vulnerabilities. Refactor the method by
extracting helper methods such as valid_url? to check the URL format,
add_source_footer to safely append the footer with escaped URLs, find_embed to
retrieve existing embeds, and separate methods to create or update embeds.
Ensure that URLs are HTML-escaped before interpolation to prevent injection
attacks.


def self.import_remote(user, url, opts=nil)
require 'ruby-readability'

opts = opts || {}
doc = Readability::Document.new(open(url).read,
tags: %w[div p code pre h1 h2 h3 b em i strong a img],
attributes: %w[href src])

TopicEmbed.import(user, url, opts[:title] || doc.title, doc.content)
end
Comment on lines +44 to +53
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical security vulnerability: SSRF attack vector.

The open(url).read call is a serious security vulnerability that allows Server-Side Request Forgery (SSRF) attacks. An attacker could provide URLs to internal services or file:// schemes.

Replace with a safer HTTP client with proper restrictions:

 def self.import_remote(user, url, opts=nil)
   require 'ruby-readability'
+  require 'net/http'
+  require 'uri'

   opts = opts || {}
-  doc = Readability::Document.new(open(url).read,
+  
+  # Validate URL and fetch safely
+  uri = URI.parse(url)
+  raise ArgumentError, 'Only HTTP/HTTPS URLs allowed' unless %w[http https].include?(uri.scheme)
+  raise ArgumentError, 'Invalid host' if uri.host.nil?
+  
+  # Prevent SSRF to internal networks
+  raise ArgumentError, 'Access to private networks not allowed' if private_network?(uri.host)
+  
+  response = fetch_url_safely(uri)
+  doc = Readability::Document.new(response,
                                       tags: %w[div p code pre h1 h2 h3 b em i strong a img],
                                       attributes: %w[href src])

   TopicEmbed.import(user, url, opts[:title] || doc.title, doc.content)
 end

+private
+
+def self.private_network?(host)
+  # Add logic to check for private IP ranges, localhost, etc.
+  %w[localhost 127.0.0.1 ::1].include?(host) ||
+  host.match(/\A(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.)/)
+end
+
+def self.fetch_url_safely(uri)
+  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
+                  open_timeout: 10, read_timeout: 30) do |http|
+    request = Net::HTTP::Get.new(uri)
+    response = http.request(request)
+    raise "HTTP error: #{response.code}" unless response.code == '200'
+    response.body
+  end
+end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def self.import_remote(user, url, opts=nil)
require 'ruby-readability'
opts = opts || {}
doc = Readability::Document.new(open(url).read,
tags: %w[div p code pre h1 h2 h3 b em i strong a img],
attributes: %w[href src])
TopicEmbed.import(user, url, opts[:title] || doc.title, doc.content)
end
def self.import_remote(user, url, opts=nil)
require 'ruby-readability'
require 'net/http'
require 'uri'
opts = opts || {}
# Validate URL and fetch safely
uri = URI.parse(url)
raise ArgumentError, 'Only HTTP/HTTPS URLs allowed' unless %w[http https].include?(uri.scheme)
raise ArgumentError, 'Invalid host' if uri.host.nil?
# Prevent SSRF to internal networks
raise ArgumentError, 'Access to private networks not allowed' if private_network?(uri.host)
response = fetch_url_safely(uri)
doc = Readability::Document.new(response,
tags: %w[div p code pre h1 h2 h3 b em i strong a img],
attributes: %w[href src])
TopicEmbed.import(user, url, opts[:title] || doc.title, doc.content)
end
private
def self.private_network?(host)
# Simple checks for localhost and common private IP ranges
%w[localhost 127.0.0.1 ::1].include?(host) ||
host.match(/\A(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.)/)
end
def self.fetch_url_safely(uri)
Net::HTTP.start(uri.host, uri.port,
use_ssl: uri.scheme == 'https',
open_timeout: 10, read_timeout: 30) do |http|
request = Net::HTTP::Get.new(uri)
response = http.request(request)
raise "HTTP error: #{response.code}" unless response.code == '200'
response.body
end
end
🤖 Prompt for AI Agents
In app/models/topic_embed.rb around lines 44 to 53, the use of open(url).read
introduces a critical SSRF vulnerability by allowing arbitrary URL access.
Replace this with a safer HTTP client like Net::HTTP or an HTTP gem that
enforces URL validation and restricts schemes to http/https only. Implement
checks to prevent access to internal IP ranges or local files, and handle
timeouts and errors securely to mitigate SSRF risks.


# Convert any relative URLs to absolute. RSS is annoying for this.
def self.absolutize_urls(url, contents)
uri = URI(url)
prefix = "#{uri.scheme}://#{uri.host}"
prefix << ":#{uri.port}" if uri.port != 80 && uri.port != 443

fragment = Nokogiri::HTML.fragment(contents)
fragment.css('a').each do |a|
href = a['href']
if href.present? && href.start_with?('/')
a['href'] = "#{prefix}/#{href.sub(/^\/+/, '')}"
end
end
fragment.css('img').each do |a|
src = a['src']
if src.present? && src.start_with?('/')
a['src'] = "#{prefix}/#{src.sub(/^\/+/, '')}"
end
end

fragment.to_html
end

def self.topic_id_for_embed(embed_url)
TopicEmbed.where(embed_url: embed_url).pluck(:topic_id).first
end

end
30 changes: 30 additions & 0 deletions app/views/embed/best.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
<header>
<%- if @topic_view.posts.present? %>
<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- end if %>
Comment on lines +3 to +6
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add security attributes to external links.

Links that open in new tabs should include rel="noopener" to prevent the new page from accessing the window.opener property, which is a security risk.

Apply this diff:

-    <%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
+    <%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank', rel: 'noopener') %>
   <%- else %>
-    <%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
+    <%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank', rel: 'noopener') %>

Also apply the same fix to lines 14, 25, and 26.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- end if %>
<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank', rel: 'noopener') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank', rel: 'noopener') %>
<%- end if %>
🤖 Prompt for AI Agents
In app/views/embed/best.html.erb around lines 3 to 6, 14, 25, and 26, the
external links that open in new tabs use target="_blank" but lack the
rel="noopener" attribute, which is a security risk. Update all these link_to
helper calls to include rel: "noopener" in their HTML options to prevent the new
page from accessing window.opener.


<%= link_to(image_tag(SiteSetting.logo_url, class: 'logo'), Discourse.base_url) %>
</header>

<%- if @topic_view.posts.present? %>
<%- @topic_view.posts.each do |post| %>
<article class='post'>
<%= link_to post.created_at.strftime("%e %b %Y"), post.url, class: 'post-date', target: "_blank" %>
<div class='author'>
<img src='<%= post.user.small_avatar_url %>'>
<h3><%= post.user.username %></h3>
</div>
<div class='cooked'><%= raw post.cooked %></div>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Security concern: Raw HTML rendering without sanitization.

The <%= raw post.cooked %> line renders HTML content directly without sanitization, which could introduce XSS vulnerabilities if the content isn't properly validated upstream.

Verify that the post.cooked content is properly sanitized before rendering. If this is for trusted embedded content only, ensure proper validation occurs in the TopicEmbed.import or Post.cook methods. Consider adding a comment explaining why raw HTML is safe here.

Let me check how the post content is processed:


🏁 Script executed:

#!/bin/bash
# Check how post.cooked content is processed and if it's sanitized
ast-grep --pattern 'def cook($$$) { $$$ }'
rg -A 10 "cook.*raw_html"
rg -A 5 "TopicEmbed.*import"

Length of output: 5095


Action Required: Sanitize embedded HTML and harden link‐security

It turns out that post.cooked is rendered with the :raw_html cook method (via PostCreator in TopicEmbed.import), so no HTML sanitization is applied before it’s sent to the view. Rendering it verbatim can introduce XSS if the source isn’t fully trusted. In addition, we still need to tighten up link security, remove inline styles, and localize dates:

• app/views/embed/best.html.erb (line 19)

  • Replace
    <%= raw post.cooked %>
    with something like:
    <%= sanitize(post.cooked,
         tags: %w[a p img ul ol li strong em blockquote],
         attributes: %w[href src title alt rel target]) %>
    or ensure upstream that only safe HTML is passed here.
    • Every <a target="_blank"> in this template needs a rel="noopener noreferrer".
    • Remove any inline style="…" attributes and move them into your CSS.
    • Replace hard-coded date formatting (e.g. post.created_at.strftime(...)) with I18n—e.g.
<%= l(post.created_at, format: :short) %>
🤖 Prompt for AI Agents
In app/views/embed/best.html.erb at line 19, replace the raw rendering of
post.cooked with a sanitized version using the sanitize helper, allowing only
safe tags and attributes to prevent XSS. Ensure all <a> tags with
target="_blank" include rel="noopener noreferrer" to improve link security.
Remove any inline style attributes from the HTML and move those styles into CSS
files. Also, replace any hard-coded date formatting with the I18n localization
helper l(), using a suitable format like :short for dates.

<div style='clear: both'></div>
</article>
<%- end %>

<footer>
<%= link_to(I18n.t('embed.continue'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%= link_to(image_tag(SiteSetting.logo_url, class: 'logo'), Discourse.base_url) %>
</footer>

<% end %>

Loading