Change text extraction in PlainTextFormatter to be faster (#26727)
ClearlyClaire committed Sep 5, 2023
1 parent 2b0cabe commit 3d72b38
Showing 1 changed file with 5 additions and 8 deletions.
13 changes: 5 additions & 8 deletions app/lib/plain_text_formatter.rb
@@ -1,9 +1,7 @@
 # frozen_string_literal: true

 class PlainTextFormatter
-  include ActionView::Helpers::TextHelper
-
-  NEWLINE_TAGS_RE = /(<br \/>|<br>|<\/p>)+/.freeze
+  NEWLINE_TAGS_RE = %r{(<br />|<br>|</p>)+}

   attr_reader :text, :local

@@ -18,7 +16,10 @@ def to_s
     if local?
       text
     else
-      html_entities.decode(strip_tags(insert_newlines)).chomp
+      node = Nokogiri::HTML.fragment(insert_newlines)
+      # Elements that are entirely removed with our Sanitize config
+      node.xpath('.//iframe|.//math|.//noembed|.//noframes|.//noscript|.//plaintext|.//script|.//style|.//svg|.//xmp').remove
+      node.text.chomp
     end
   end

@@ -27,8 +28,4 @@ def to_s
   def insert_newlines
     text.gsub(NEWLINE_TAGS_RE) { |match| "#{match}\n" }
   end
-
-  def html_entities
-    HTMLEntities.new
-  end
 end
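
Note on the `NEWLINE_TAGS_RE` change: the two spellings of the pattern appear to differ only in literal syntax (`%r{}` avoids escaping the slashes), so `insert_newlines` should behave the same before and after. The sketch below is a minimal, standalone illustration of that step only. `OLD_NEWLINE_TAGS_RE` and the free-standing `insert_newlines(text, pattern)` helper are hypothetical names introduced for this example; in the class itself the method is an instance method that reads `text`, and the Nokogiri-based extraction is not reproduced here.

```ruby
# Old and new spellings of the pattern, copied from the diff.
# OLD_NEWLINE_TAGS_RE is a name made up for this comparison.
OLD_NEWLINE_TAGS_RE = /(<br \/>|<br>|<\/p>)+/
NEWLINE_TAGS_RE = %r{(<br />|<br>|</p>)+}

# Standalone stand-in for PlainTextFormatter#insert_newlines (assumption:
# takes the text as an argument instead of reading an attribute).
# Each run of line-breaking tags is kept and followed by a single "\n".
def insert_newlines(text, pattern = NEWLINE_TAGS_RE)
  text.gsub(pattern) { |match| "#{match}\n" }
end

sample = '<p>Hello<br>world</p><p>Bye</p>'
puts insert_newlines(sample)
```

Because the tags are kept in place and only followed by a newline, the later text extraction step is what actually drops the markup. A run of adjacent tags such as `<br><br />` is matched as one group, so it yields one newline rather than two.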
