From 1e6b6e435faf51521354cbc7a76fc4ab51fefefa Mon Sep 17 00:00:00 2001 From: Alfredo Amatriain Date: Thu, 24 Nov 2016 19:43:25 +0100 Subject: [PATCH] Use Rails helper strip_tags to remove potentially dangerous tags from url, fetch_url and title attributes of Feed model. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Those three attributes are entered by the user, and therefore cannot be trusted. We want to sanitize them removing any dangerous markup or scripts: in particular those three attributes should always contain plain text (URLs and the feed title), never HTML markup. With this commit a Rails helper is used to strip tags; before this the Sanitize gem was used. The problem with the Sanitize gem is that it assumes it operates on HTML markup and it HTML-encodes any reserved characters, forcing us to do an unencodeHTML afterwards; this can cause problems with some URLs. For example this URL: http://www.drwindows.de/external.php?do=rss&type=newcontent§ionid=1&days=120&count=10 the "§ionid" part is problematic, because "§" is the url-encoding of the ยง character. The encoding-unencoding code gets confused and the output is a different URL which no longer points to the feed. The fix is not to treat as HTML during sanitization of attributes that do not contain HTML (most importantly, URLs). The strip_tags helper outputs plain text without ever HTML-encoding the strings. This fixes a bug in which the above sample URL could not be subscribed by FeedBunch. Pending: extend this change to other models (e.g. entry URLs). Move to the new Subscriber class the code that sanitizes HTML fragments (e.g. entry bodies). Remove any configurations for the Sanitize gem that are no longer used, if any. --- app/models/feed.rb | 13 +++++-------- lib/sanitizer.rb | 20 ++++++++++++++++++++ 2 files changed, 25 insertions(+), 8 deletions(-) create mode 100644 lib/sanitizer.rb diff --git a/app/models/feed.rb b/app/models/feed.rb index e056879af..b8e1a2e25 100644 --- a/app/models/feed.rb +++ b/app/models/feed.rb @@ -4,6 +4,7 @@ require 'schedule_manager' require 'feed_blacklister' require 'url_normalizer' +require 'sanitizer' ## # Feed model. Each instance of this model represents a single feed (Atom, RSS...) to which users can be suscribed. @@ -199,7 +200,7 @@ def single_user_folder(folder) # Various operations before each validation: # - fix any encoding problems, converting to utf-8 if necessary # - set default values for missing attributes - # - sanitize values, removing script tags from entry bodies etc. + # - sanitize values, removing script tags from titles etc. # - encode any invalid characters in url and fetch_url # - check if the feed url or fetch_url is blacklisted, and if so a BlacklistedUrlError is raised @@ -242,13 +243,9 @@ def default_values # the update only for that attribute and keep the old value. def sanitize_attributes - config = Feedbunch::Application.config.restricted_sanitizer - - self.title = Sanitize.fragment(self.title, config)&.strip - - # Unescape HTML entities in the URL escaped by the sanitizer - self.fetch_url = CGI.unescapeHTML(Sanitize.fragment(self.fetch_url, config)&.strip) - self.url = CGI.unescapeHTML(Sanitize.fragment(self.url, config)&.strip) + self.title = Sanitizer.sanitize_plaintext self.title + self.fetch_url = Sanitizer.sanitize_plaintext self.fetch_url + self.url = Sanitizer.sanitize_plaintext self.url # URLs must be valid http or https if self.fetch_url_was.present? && (self.fetch_url =~ URI::regexp(%w{http https})).nil? diff --git a/lib/sanitizer.rb b/lib/sanitizer.rb new file mode 100644 index 000000000..edf046d03 --- /dev/null +++ b/lib/sanitizer.rb @@ -0,0 +1,20 @@ +## +# Class with methods related to sanitizing user input to remove potentially malicious input. + +class Sanitizer + + ## + # Sanitize the passed string, removing any tags to leave only plain text. This avoids users entering malicious + # markup in text fields (e.g. URL field in subscribe popup). + # + # Receives as argument a string. + # + # If a nil or empty string is passed, returns nil. + + def self.sanitize_plaintext(unsanitized_text) + # Check that the passed string contains something + return nil if unsanitized_text.blank? + sanitized_text = ActionController::Base.helpers.strip_tags(unsanitized_text)&.strip + return sanitized_text + end +end