<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Brighter than Gold: Figurative Language in User Generated Comparisons</title>
<link rel="stylesheet" href="style.css">
<link href='https://fonts.googleapis.com/css?family=VT323|Karla:700|Open+Sans:400,300,300italic,400italic|Montserrat:400,700' rel='stylesheet' type='text/css'>
<!--<script src="script.js"></script>-->
</head>
<body>
<header>
<div class="readable">
<h1>Brighter than <span class='gold'>Gold</span></h1>
<h2>Figurative Language in User Generated Comparisons</h2>
</div>
</header>
<div id="content">
<div class="authors"><p><a href="http://vene.ro/">Vlad Niculae</a>
and <a href="http://mpi-sws.org/~cristian">Cristian
Danescu-Niculescu-Mizil</a><br />
EMNLP 2014 paper. [ <a href="niculae14comparisons.pdf">pdf</a> ] [ <a href="#paper">cite</a> ] [ <a href="#dataset">data</a> ] [ <a href="brighter-slides.pdf">slides</a> ]</p></div>
<!--
<ul>
<li>Get the <a href="#">data</a> (contains this <a href=#>readme</a>).</li>
<li>Read the <a href="#">paper</a>.</li>
<li>Cite in <a href="#">BibTeX</a>.</li>
</ul>
-->
<blockquote>
When you spend a large part of your life underground, you develop a very literal mind.
Dwarfs have no use for metaphor and simile. Rocks are hard, the darkness is
dark.
Start messing around with descriptions like that and you’re in big trouble, is their motto.
<footer>—Terry Pratchett, <cite>Guards! Guards!</cite></footer>
</blockquote>
<h3>TL;DR</h3>
<p>
Unlike dwarfs, we humans love trouble, so we study figurative comparisons
(similes) in the wild, from Amazon product reviews. We make available an
<a href="#dataset">annotated dataset</a>.
We manage to detect figurativeness with high accuracy, using linguistic
features.
This puts us in the novel position of being able to investigate the interaction
of figurative language use and social context: we show strong relationships
between figurative language use and both review rating and helpfulness.</p>
<h3>What are comparisons and similes?</h3>
<p>Comparisons are phrases that express the likeness of two things. They
are useful for communicating something potentially new, helping the audience
picture it better and frame it better in relation to something known.</p>
<p>Often, comparisons are not meant to be taken literally. Figurative
comparisons are an important figure of speech called <em>simile</em>. The
difference can be seen in the following examples, paraphrased from Amazon
reviews:</p>
<ul>
<li><span class="topic">Sterling</span> is much cheaper than <span class="vehicle">gold</span>.</li>
<li><span class="topic">Her voice</span> makes this song shine brighter than <span class="vehicle">gold</span>.</li>
</ul>
<p>
There is no simple way to automatically tell whether a comparison is literal or
figurative. The difference between the two is sometimes subtle and subjective,
to the point that even humans find the tricky cases difficult and sometimes
disagree on them. Using linguistic and domain-specific
cues, we manage to get within 10% of human performance on our data.
</p>
<!--
<p>Comparisons are phrases that express the likeness of two things.
They are useful in introducing new concepts in terms
of known, given ones. Here's how Terry Pratchett describes
seeing a particularly ugly dragon for the first time (emphasis ours):</p>
<blockquote>
Something in its ancestry had given it <span class="topic">a pair of eyebrows</span> that were about the same size as <span class="vehicle">its stubby wings</span>, which could never have supported it in the air.
<span class="topic">Its head</span> was the wrong shape, like <span class="vehicle">an anteater</span>.
It had <span class="topic">nostrils</span> like <span class="vehicle">jet intakes</span>.
If it ever managed to get airborne <span class="topic">the things</span> would have the drag of <span class="vehicle">twin parachutes</span>.
<footer>—Terry Pratchett, <cite>Guards! Guards!</cite></footer>
</blockquote>
<p>Each sentence above contains a comparison. While the first, between the dragon's
eyebrows and its wings, is a matter-of-fact, literal comparison in terms of the size,
the next sentences go a bit wild. The rest of the comparisons suggest less
direct likenesses, making them figurative in meaning. Such comparisons
are called <em>similes</em>. </p>
-->
<h3>Why do they matter?</h3>
<p>
People like similes, as shown by the <a
href="https://www.goodreads.com/quotes/tag/simile">popularity of the best ones
on Goodreads</a>! But figurative language is not at all restricted to literature
and poetic language.
It turns out people use it a lot when describing stuff.
We find that
about <strong>30% of the comparisons in Amazon reviews are figurative</strong>.</p>
<p>The use of similes is strongly related to <strong>extreme opinion</strong>
in terms of review ratings:</p>
<div class="figure"><img src="stars.png" alt="Use of figurative comparisons by review star rating" /></div>
<p>Also, the comparisons in reviews found helpful by more people tend to be
more literal:</p>
<div class="figure"><img src="helpful.png" alt="Use of figurative comparisons by review helpfulness" /></div>
<h3 id="dataset">Dataset</h3>
<p><strong>Download the <a href="figurative-comparisons-data.zip">dataset</a></strong>
(contains this <a href="README">readme</a>).</p>
<p>
This dataset contains a collection of 1400 comparisons annotated for
figurativeness together with the context in which they appeared. The
comparisons are extracted mostly from Amazon.com product reviews (1260
comparisons) and from the general web (140 comparisons). </p>
<!-- easter egg: color inspiration for this website:
http://www.colourlovers.com/palette/3428383/Expecto_Patronum -->
<h3 id="paper">Paper</h3>
<p><strong>Download the <a href="niculae14comparisons.pdf">PDF</a></strong>.</p>
<p><strong>BibTeX entry:</strong></p>
<pre class="bibtex">
@inproceedings{niculae14brighter,
author = {Vlad Niculae and Cristian Danescu-Niculescu-Mizil},
title = {{Brighter than gold: Figurative language in user generated comparisons}},
booktitle = {Proceedings of EMNLP},
month = {October},
year = {2014},
}
</pre>
<p><strong>Abstract. </strong>
Comparisons are common linguistic devices used to indicate the likeness of two things.
Often, this likeness is not meant in the literal sense—for example, "I slept like a log" does not imply that logs actually sleep.
In this paper we propose a computational study of figurative comparisons, or <em>similes</em>.
Our starting point is a new large dataset of comparisons extracted from product reviews and annotated for figurativeness.
We use this dataset to characterize figurative language in naturally occurring
comparisons and reveal linguistic patterns indicative of this phenomenon.
We operationalize these insights and apply them to a new task with high relevance to text understanding: distinguishing between figurative and literal comparisons.
Finally, we apply this framework to explore the social context in which figurative language is produced, showing that similes are more likely to accompany opinions showing extreme sentiment, and that they are uncommon in reviews deemed helpful.
</p>
<!--
<h3> Maecenas egestas vehicula </h3>
<p>Vivamus consectetur fringilla lorem, sed viverra felis laoreet eu.
Etiam ac fermentum quam. <a href="http://www.colourlovers.com/palette/3428383/Expecto_Patronum">Nullam luctus varius</a> ligula ut porta. Nulla sed libero
justo. Aenean ut erat ac mauris volutpat pharetra sed porttitor est. Nam
volutpat augue diam, sed mattis orci sodales nec. Praesent congue <a href="unseen">sem tellus</a>,
luctus sagittis quam porttitor in. Sed pulvinar luctus purus id ultrices.</p>
-->
<footer><p>Image credit: <a href="https://flic.kr/p/3mG7va">Kathleen Tyler Conklin</a>. Go <a href="/">home</a>.
<a href="/privacy.html">Privacy policy.</a></p></footer>
<!-- evil trackers so we can see if we're famous yet -->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-47024389-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</div>
</body>
</html>