Skip to content

Commit

Permalink
Use Rails helper strip_tags to remove potentially dangerous tags from…
Browse files Browse the repository at this point in the history
… url, fetch_url and title attributes of Feed model.

Those three attributes are entered by the user, and therefore cannot be trusted. We want to sanitize them removing any dangerous markup or scripts: in particular those three attributes should always contain plain text (URLs and the feed title), never HTML markup.

With this commit a Rails helper is used to strip tags; before this the Sanitize gem was used. The problem with the Sanitize gem is that it assumes it operates on HTML markup and it HTML-encodes any reserved characters, forcing us to do an unencodeHTML afterwards; this can cause problems with some URLs. For example this URL:

http://www.drwindows.de/external.php?do=rss&type=newcontent&sectionid=1&days=120&count=10

the "&sectionid" part is problematic, because "§" is the url-encoding of the § character. The encoding-unencoding code gets confused and the output is a different URL which no longer points to the feed.

The fix is not to treat as HTML during sanitization of attributes that do not contain HTML (most importantly, URLs). The strip_tags helper outputs plain text without ever HTML-encoding the strings.

This fixes a bug in which the above sample URL could not be subscribed by FeedBunch.

Pending: extend this change to other models (e.g. entry URLs). Move to the new Subscriber class the code that sanitizes HTML fragments (e.g. entry bodies). Remove any configurations for the Sanitize gem that are no longer used, if any.
  • Loading branch information
amatriain committed Nov 24, 2016
1 parent 7df0ed7 commit 1e6b6e4
Show file tree
Hide file tree
Showing 2 changed files with 25 additions and 8 deletions.
13 changes: 5 additions & 8 deletions app/models/feed.rb
Expand Up @@ -4,6 +4,7 @@
require 'schedule_manager'
require 'feed_blacklister'
require 'url_normalizer'
require 'sanitizer'

##
# Feed model. Each instance of this model represents a single feed (Atom, RSS...) to which users can be suscribed.
Expand Down Expand Up @@ -199,7 +200,7 @@ def single_user_folder(folder)
# Various operations before each validation:
# - fix any encoding problems, converting to utf-8 if necessary
# - set default values for missing attributes
# - sanitize values, removing script tags from entry bodies etc.
# - sanitize values, removing script tags from titles etc.
# - encode any invalid characters in url and fetch_url
# - check if the feed url or fetch_url is blacklisted, and if so a BlacklistedUrlError is raised

Expand Down Expand Up @@ -242,13 +243,9 @@ def default_values
# the update only for that attribute and keep the old value.

def sanitize_attributes
config = Feedbunch::Application.config.restricted_sanitizer

self.title = Sanitize.fragment(self.title, config)&.strip

# Unescape HTML entities in the URL escaped by the sanitizer
self.fetch_url = CGI.unescapeHTML(Sanitize.fragment(self.fetch_url, config)&.strip)
self.url = CGI.unescapeHTML(Sanitize.fragment(self.url, config)&.strip)
self.title = Sanitizer.sanitize_plaintext self.title
self.fetch_url = Sanitizer.sanitize_plaintext self.fetch_url
self.url = Sanitizer.sanitize_plaintext self.url

# URLs must be valid http or https
if self.fetch_url_was.present? && (self.fetch_url =~ URI::regexp(%w{http https})).nil?
Expand Down
20 changes: 20 additions & 0 deletions lib/sanitizer.rb
@@ -0,0 +1,20 @@
##
# Class with methods related to sanitizing user input to remove potentially malicious input.

class Sanitizer

##
# Sanitize the passed string, removing any tags to leave only plain text. This avoids users entering malicious
# markup in text fields (e.g. URL field in subscribe popup).
#
# Receives as argument a string.
#
# If a nil or empty string is passed, returns nil.

def self.sanitize_plaintext(unsanitized_text)
# Check that the passed string contains something
return nil if unsanitized_text.blank?
sanitized_text = ActionController::Base.helpers.strip_tags(unsanitized_text)&.strip
return sanitized_text
end
end

0 comments on commit 1e6b6e4

Please sign in to comment.