Skip to content

Commit

Permalink
Improved image stripping.
Browse files Browse the repository at this point in the history
  • Loading branch information
kenpratt committed Apr 19, 2010
1 parent 36033e7 commit 86e7437
Show file tree
Hide file tree
Showing 3 changed files with 27 additions and 1 deletion.
5 changes: 4 additions & 1 deletion lib/wikipedia/page.rb
Original file line number Original file line Diff line number Diff line change
Expand Up @@ -80,7 +80,10 @@ def self.sanitize( s )
# strip internal links # strip internal links
s.gsub!(/\[\[([^\]\|]+?)\|([^\]\|]+?)\]\]/, '\2') s.gsub!(/\[\[([^\]\|]+?)\|([^\]\|]+?)\]\]/, '\2')
s.gsub!(/\[\[([^\]\|]+?)\]\]/, '\1') s.gsub!(/\[\[([^\]\|]+?)\]\]/, '\1')
s.gsub!(/\[\[File:[^\]]+?\]\]/, '') # remove file links completely
# strip images and file links
s.gsub!(/\[\[Image:[^\[\]]+?\]\]/, '')
s.gsub!(/\[\[File:[^\[\]]+?\]\]/, '')


# convert bold/italic to html # convert bold/italic to html
s.gsub!(/'''''(.+?)'''''/, '<b><i>\1</i></b>') s.gsub!(/'''''(.+?)'''''/, '<b><i>\1</i></b>')
Expand Down
16 changes: 16 additions & 0 deletions spec/fixtures/sanitization_samples/Sashimi-raw.txt
Original file line number Original file line Diff line number Diff line change
@@ -0,0 +1,16 @@
{{for|the open software project|Mass spectrometry data format}}
[[Image:Sashimi.jpg|thumb|right|A sashimi dinner set]]
[[Image:Fugu_sashimi.jpg|thumb|right|''Tessa'' ([[Puffer Fish]])]]
[[Image:Salmon sashimi.jpg|thumb|right|A sashimi salmon rose]]
[[Image:Chef preparing octopus sashimi, December 2005.jpg|thumb|In restaurants, sashimi often is prepared at a bar, in view of the patrons.]]
[[Image:Salmon Sashimi with Calamansi.JPG|thumb|250pcx|Salmon sashimi served with [[Calamondin|calamansi]]. The juice can be mixed with the shoyu or applied directly to the sashimi.]]
[[Image:Sashimis.jpg|thumb|250[cx|Assorted sashimi]]
'''Sashimi''' ({{lang-ja|刺身}}, {{IPA-ja|saɕimiꜜ|pron}}; {{IPA-en|səˈʃiːmiː|lang}}) is a [[Japanese cuisine|Japanese delicacy]] primarily consisting of very fresh raw [[seafood]], sliced into thin pieces and served with only a dipping sauce ([[soy sauce]] with [[wasabi]] paste or other condiments such as grated fresh [[ginger]], or [[ponzu]]), depending on the fish, and simple garnishes such as [[perilla|shiso]] and shredded [[daikon]] radish. Dimensions vary depending on the type of item and chef, but are typically about 2.5 cm (1") wide by 4 cm (1.5") long by 0.5 cm (0.2") thick.

The word ''sashimi'' means "pierced body", i.e. "[[wikt:刺身|刺身]] = ''sashimi'' = [[wikt:刺|刺]][[wikt:し|し]] = ''sashi'' (pierced, stuck) and [[wikt:身|身]] = ''mi'' (body, meat).
This word had used from [[Muromachi period]], withheld the word "[[wikt:切る|切る]] = ''Cut'' ,the culinary step, by [[Samurai]] for inauspiciousness.
This word may derive from the culinary practice of sticking the fish's tail and fin to the slices in identifying the fish being eaten.

Another possibility for the name could come from the traditional method of harvesting. 'Sashimi Grade' fish is caught by individual handline, and as soon as the fish is landed, its brain is pierced with a sharp spike, killing it instantly, then placed in slurried ice. This spiking is called the [[Ike jime]] process. Because the flesh thus contains minimal lactic acid from the fish dying slowly, it will keep fresh on ice for about 10 days without turning white, or otherwise degrading. {{Fact|date=October 2007}}

The word ''sashimi'' has been integrated into the [[English language]] and is often used to refer to other uncooked fish preparations besides the traditional Japanese dish subject of this article. Many non-Japanese conflate sashimi and [[sushi]]; the two dishes are actually distinct and separate. Sushi refers to any dish made with vinegared rice, and while raw fish is one traditional sushi ingredient, many sushi dishes contain seafood that has been cooked, while others have no seafood at all.
7 changes: 7 additions & 0 deletions spec/fixtures/sanitization_samples/Sashimi-sanitized.txt
Original file line number Original file line Diff line number Diff line change
@@ -0,0 +1,7 @@
<p>[[Image:Sashimis.jpg|thumb|250[cx|Assorted sashimi]]
<b>Sashimi</b> (, ; ) is a Japanese delicacy primarily consisting of very fresh raw seafood, sliced into thin pieces and served with only a dipping sauce (soy sauce with wasabi paste or other condiments such as grated fresh ginger, or ponzu), depending on the fish, and simple garnishes such as shiso and shredded daikon radish. Dimensions vary depending on the type of item and chef, but are typically about 2.5 cm (1") wide by 4 cm (1.5") long by 0.5 cm (0.2") thick.</p>
<p>The word <i>sashimi</i> means "pierced body", i.e. "刺身 = <i>sashimi</i> = 刺し = <i>sashi</i> (pierced, stuck) and 身 = <i>mi</i> (body, meat).
This word had used from Muromachi period, withheld the word "切る = <i>Cut</i> ,the culinary step, by Samurai for inauspiciousness.
This word may derive from the culinary practice of sticking the fish's tail and fin to the slices in identifying the fish being eaten.</p>
<p>Another possibility for the name could come from the traditional method of harvesting. 'Sashimi Grade' fish is caught by individual handline, and as soon as the fish is landed, its brain is pierced with a sharp spike, killing it instantly, then placed in slurried ice. This spiking is called the Ike jime process. Because the flesh thus contains minimal lactic acid from the fish dying slowly, it will keep fresh on ice for about 10 days without turning white, or otherwise degrading.</p>
<p>The word <i>sashimi</i> has been integrated into the English language and is often used to refer to other uncooked fish preparations besides the traditional Japanese dish subject of this article. Many non-Japanese conflate sashimi and sushi; the two dishes are actually distinct and separate. Sushi refers to any dish made with vinegared rice, and while raw fish is one traditional sushi ingredient, many sushi dishes contain seafood that has been cooked, while others have no seafood at all.</p>

0 comments on commit 86e7437

Please sign in to comment.