These are some pretty sweet Twitter accounts! (Too much of a stretch?) [/caption]

Anne Charity Hudley

Dr. Charity Hudley is a professor at the College of William and Mary (Go Tribe!). Her research focuses on language variation, especially the use of varieties such as African American English, in the classroom. If you know any teachers, they might find her two books on language variation in the classroom a useful resource. She and Christine Mallinson have even released an app to go with them!

Michel DeGraff

Dr. Michel DeGraff is a professor at MIT. His research is on Haitian Creole, and he's been very active in advocating for the official recognition of Haitian Creole as a distinct language. If you're not sure what Haitian Creole looks like, go check out his Twitter; many of his tweets are in the language! He's also done some really cool work on using technology to teach low-resource languages.

Nelson Flores

Dr. Nelson Flores is a professor at the University of Pennsylvania. His work focuses on how we create the ideas of race and language, as well as bilingualism/multilingualism and bilingual education. I really enjoy his thought-provoking discussions of recent events on his Twitter account. He also runs a blog, which is a good resource for more in-depth discussion.

Nicole Holliday

Dr. Nicole Holliday is (at the moment) Chau Mellon Postdoctoral Scholar at Pomona College. Her research focuses on language use by biracial speakers. I saw her talk on how speakers use pitch differently depending on who they're talking to at last year's LSA meeting and it was fantastic: I'm really looking forward to seeing her future work! She's also a contributor to Word., an online journal about African American English.

Rupal Patel

Dr. Rupal Patel is a professor at Northeastern University, and also the founder and CEO of VocaliD. Her research focuses on the speech of speakers with developmental disabilities, and how technology can ease communication for them. 
One really cool project she's working on that you can get involved with is The Human Voicebank. This is a collection of voices from all over the world that is used to make custom synthetic voices for those who need them for day-to-day communication. If you've got a microphone and a quiet room you can help out by recording and donating your voice.

John R. Rickford

Last, but definitely not least, is Dr. John Rickford, a professor at Stanford. If you've taken any linguistics courses, you're probably already familiar with his work. He's one of the leading scholars working on African American English and was crucial in bringing research-based evidence to bear on the Ebonics controversy. If you're interested, he's also written a non-academic book on African American English that I would really highly recommend; it even won the American Book Award!]]>
Oxford English Dictionary and the Dictionary of American Regional English. Out of the twenty-two people who responded to my Twitter poll (which was probably mostly other linguists, given my social networks) only one other person said they'd even heard the word and, as I later confirmed, it turned out to be one of my college friends.

So what is this mysterious word that has so far evaded academic inquiry? Ladies, gentlemen and all others, please allow me to introduce you to...

[caption id=attachment_4038 align=aligncenter width=342] Pronounced 'bʌm.pɪs or 'bʌm.pəs. You can hear me say the word and use it in context by listening to this low quality recording.[/caption]

The word means something like fool or incompetent person. To prove that this is actually a real word that people other than me use, I've (very, very laboriously) found some examples from the internet. It shows up in the comments section of this news article:

THAT is why people are voting for Mr Trump, even if he does act sometimes like a Bumpus.

I also found it in a smattering of public tweets like this one:

If you ever meet my dad, please ask him what a bumpus is

And this one:

Having seen horror of war, one would think, John McCain would run from war. No, he runs to war, to get us involved. What a bumpus.

And, my personal favorite, this one:

because the SUN(in that pic) is wearing GLASSES god karen ur such a bumpus

There's also an Urban Dictionary entry which suggests the definition:

A raucous, boisterous person or thing (usually african-american.)

I'm a little sceptical about the last one, though. Partly because it doesn't line up with my own intuitions (I feel like a bumpus is more likely to be silent than rowdy) and partly because less popular Urban Dictionary entries, especially for words that are also names, are super unreliable.

I also wrote to my parents (Hi mom! Hi dad!) and asked them if they'd used the word growing up, in what contexts, and who they'd learned it from. 
My dad confirmed that he'd heard it growing up (mom hadn't) and had a suggestion for where it might have come from:

I am pretty sure my dad used it - invariably in one of the two phrases [don't be a bumpus or don't stand there like a bumpus].... Bumpass, Virginia is in Louisa County .... Growing up in Norfolk, it could have held connotations of really rural Virginia, maybe, for Dad.

While this is definitely a possibility, I don't know that it's definitely the origin of the word. Bumpass, Virginia, like Bumpass Hell (see this review, which also includes the phrase Don't be a bumpass), was named for an early settler. Interestingly, the college friend mentioned earlier is also from the Tidewater region of Virginia, which leads me to think that the word may have originated there.

My mom offered some other possible origins: that the term might be related to country bumpkin or bump on a log. I think the latter is especially interesting, given that bump on a log and bumpus show up in exactly the same phrase: standing/sitting there like a _______.

She also suggested it might be related to bumpkis or bupkis. This is a possibility, especially since that word is definitely from Yiddish and Norfolk, VA does have a history of Jewish settlement and Yiddish speakers.

The most common usage of Bumpus seems to be in phrases like Bumpus dog or Bumpus hound. I think that this is probably actually a different use, though, and a direct reference to a scene from the movie A Christmas Story:

[youtube https://www.youtube.com/watch?v=pPRdj1Ce4ao]

One final note is that there was a baseball pitcher in the late 1890's who went by the nickname Bumpus: Bumpus Jones. While I can't find any information about where the nickname came from, this post suggests that his family was from Virginia and that he had Powhatan ancestry.

I'm really interested in learning more about this word and its distribution. 
My intuition is that it's mainly used by older, white speakers in the South, possibly centered around the Tidewater region of Virginia.

If you've heard of or used this word, please leave a comment or drop me a line letting me know 1) roughly how old you are, 2) where you grew up and 3) (if you can remember) where you learned it. Feel free to add any other information you feel might be relevant, too! ]]>
gender differences in speech recognition:

Is there a voice recognition product that is focusing on women's voices or allows for configuring for women's voices (or the characteristics of women's voices)?

I don't know of any ASR systems specifically designed for women. But the answer to the second half of your question is yes!

There are two main types of automatic speech recognition, or ASR, systems. The first is speaker independent. These are systems, like YouTube automatic captions or Apple's Siri, that should work equally well across a large number of different speakers. Of course, as many other researchers have found and I corroborated in my own investigation, that's not always the case. A major reason for this is socially-motivated variation between speakers. This is something we all know as language users. You can guess (with varying degrees of accuracy) a lot about someone from just their voice: their sex, whether they're young or old, where they grew up, how educated they are, how formal or casual they're being.

So what does this mean for speech recognition? Well, while different speakers speak in a lot of different ways, individual speakers tend to use less variation. (With the exception of bidialectal speakers, like John Barrowman.) Which brings me nicely to the second type of speech recognition: speaker dependent. These are systems that are designed to work for one specific speaker, and usually to adapt and get more accurate for that speaker over time.

If you read some of my earlier posts, I suggested that the different performance between dialects and genders was due to imbalances in the training data. The nice thing about speaker dependent systems is that the training data is made up of one voice: yours. (Although the system is usually initialized based on some other training set.)

So how can you get a speaker dependent ASR system?

By buying software such as Dragon speech recognition. 
This is probably the most popular commercial speaker-dependent voice recognition software (or at least the one I hear the most about). It does, however, cost real money.

Making your own! If you're feeling inspired, you can make your own personalized ASR system. I'd recommend the CMU Sphinx toolkit; it's free and well-documented. To make your own recognizer, you'll need to build your own language model using text you've written as well as adapt the acoustic model using your recorded speech. The former lets the recognizer know what words you're likely to say, and the latter how you say things. (If you're REALLY gung-ho you can even build your own acoustic model from scratch, but that's pretty involved.)

In theory, the bones of any ASR system should work equally well on any spoken human language. (Sign language recognition is a whole nother kettle of fish.) The difficulty is getting large amounts of (socially stratified) high-quality training data. By feeding a system data without a lot of variation, for example by using only one person's voice, you can usually get more accurate recognition more quickly. ]]>
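To make the "language model" half of that recipe a little more concrete, here is a minimal Python sketch of the underlying idea: count which words tend to follow which in text you've written. This is only a toy illustration, not CMU Sphinx's file format or API; the function names and the sample sentence are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    """Count how often each word follows each other word in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical stand-in for "text you've written".
my_text = "i study speech recognition and i study dialects and i love speech"
bigrams = train_bigram_counts(my_text)
guess = most_likely_next(bigrams, "i")  # "study" follows "i" most often here
```

A real recognizer works with probabilities over a far larger vocabulary, but the principle is the same: the more of your own writing the model has seen, the better its guesses about what you are likely to say next.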
question from Veronica the other day:

Which wavelength someone would use not to hear but feel it on the body as a vibration?

So this would depend on two things. The first is your hearing ability. If you've got no or limited hearing, most of your interaction with sound will be tactile. This is one of the reasons why many Deaf individuals enjoy going to concerts; if the sound is loud enough you'll be able to feel it even if you can't hear it. I've even heard stories about folks who will take balloons to concerts to feel the vibrations better. In this case, it doesn't really depend on the pitch of the sound (how high or low it is), just the volume.

But let's assume that you have typical hearing. In that case, the relationship between pitch, volume and whether you can hear or feel a sound is a little more complex. This is due to something called frequency response. Basically, the human ear is better tuned to hearing some pitches than others. We're really sensitive to sounds in the upper ranges of human speech (roughly 2k to 4k Hz). (The lowest pitch in the vocal signal can actually be much lower [down to around 80 Hz for a really low male voice] but it's less important to be able to hear it because that frequency is also reflected in harmonics up through the entire pitch range of the vocal signal. Most telephones only transmit signals between 300 Hz and 3400 Hz, for example, and it's only really the cut-off at the upper end of the range that causes problems--like making it hard to tell the difference between sh and s.)

[caption id= align=alignnone width=1200] An approximation of human frequency response curves. The line basically shows where we hear things as equally loud. So a 100 Hz sound (like a bass guitar) needs to be ten times as loud as a 1000 Hz sound (like a flute or piccolo) to sound equally loud. [/caption]

The takeaway from all this is that we're not super good at hearing very low sounds. That means they can be very, very loud before we pick up on them. 
If the sound is low enough and loud enough, then the only way we'll be able to sense it is by feeling it.

How low is low enough? Most people can't really hear anything much below 20 Hz (like the lowest note on a really big organ). The older you are and the more you've been exposed to really loud noises in that range, like bass-heavy concerts or explosions, the less you'll be able to pick up on those really low sounds.

What about volume? My guess for what would be sufficiently loud, in this case, is 120+ dB. 120 dB is as loud as a rock concert, and it's possible, although difficult and expensive, to get out of a home speaker set-up. If you have a neighbor listening to really bass-y music or watching action movies with a lot of low, booming sound effects on really expensive speakers, it's perfectly possible that you'd feel those vibrations rather than hearing them. Especially if there are walls between the speakers and you. While mid and high frequency sounds are pretty easy to muffle, low-frequency sounds are much more difficult to soundproof against.

Are there any health risks? The effects of exposure to these types of low-frequency noise are actually something of an active research question. (You may have heard about the brown note, for example.) You can find a review of some of that research here. One comforting note: if you are exposed to a very loud sound below the frequencies you can easily hear--even if it's loud enough to cause permanent damage at much higher frequencies--it's unlikely that you will suffer any permanent hearing loss. That doesn't mean you shouldn't ask your neighbor to turn down the volume, though; for their ears if not for yours!]]>
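The point earlier in this post about harmonics and telephones can be checked with a little arithmetic. The sketch below takes the 80 Hz voice and the 300 Hz to 3400 Hz telephone band mentioned above and lists which harmonics (whole-number multiples of the lowest pitch) survive the band. The function name is made up for illustration, and treating the band edges as a hard cut-off is a simplifying assumption.

```python
def harmonics_in_band(fundamental_hz, low_hz=300, high_hz=3400):
    """List the harmonics (integer multiples) of a fundamental that fall in a band."""
    harmonics = []
    k = 1
    while fundamental_hz * k <= high_hz:
        if fundamental_hz * k >= low_hz:
            harmonics.append(fundamental_hz * k)
        k += 1
    return harmonics

in_band = harmonics_in_band(80)
# Gaps between neighbouring harmonics that made it through the band.
spacing = {b - a for a, b in zip(in_band, in_band[1:])}
```

The 80 Hz component itself is filtered out, but every harmonic that gets through is still exactly 80 Hz away from its neighbours, so the information about the lowest pitch is still carried by the signal.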
my last blog post went up, a couple people wondered if the difference in classification error rates between men and women might be due to pitch, since men tend to have lower voices. I had no idea, so, being experimentally inclined, I decided to find out.

First, I found the longest list of words that I could from the accent tag. Pretty much every video I looked at used a subset of these words.

Aunt, Roof, Route, Wash, Oil, Theater, Iron, Salmon, Caramel, Fire, Water, Sure, Data, Ruin, Crayon, New Orleans, Pecan, Marriage, Both, Again, Probably, Spitting Image, Alabama, Guarantee, Lawyer, Coupon, Mayonnaise, Ask, Potato, Three, Syrup, Cool Whip, Pajamas, Caught, Catch, Naturally, Car, Aluminium, Envelope, Arizona, Waffle, Auto, Tomato, Figure, Eleven, Atlantic, Sandwich, Attitude, Officer, Avocado, Saw, Bandana, Oregon, Twenty, Halloween, Quarter, Muslim, Florida, Wagon

Then I recorded myself reading them at a natural pace, with list intonation. In order to better match the speakers in the other YouTube videos, I didn't go into the lab and break out the good microphones; I just grabbed my gaming headset and used that mic. Then, I used Praat (a free, open source software package for phonetics) to shift the pitch of the whole file up and down 60 Hertz in 20 Hertz intervals. That left me with seven total sound files: the original one, three files that were 20, 40 and 60 Hertz higher and finally three files that were 20, 40 and 60 Hertz lower. You can listen to all the files individually here.

The original recording had a mean of 192 Hz and a median of 183, which means that my voice is slightly lower pitched than average for an American English speaking woman. For reference, Pepiot 2014 found a mean pitch of 210 Hz for female American English speakers. The same paper also lists a mean pitch of 119 Hz for male American English speakers. This means that my lowest pitch manipulation (mean of 132) is still higher than the average American English speaking male. 
I didn't want to go too much lower with my pitch manipulations, though, because the sound files were starting to sound artifact-y and robotic.

Why did I do things this way?

Only using one recording. This lets me control 100% for demographic information. I'm the same person, with the same language background, saying the same words in the same way. If I'd picked a bunch of speakers with different pitches, they'd also have different language backgrounds and voices. Plus I'm not getting effects from using different microphones.

Manipulating pitch both up and down. This was for two reasons. First, it means that the original recording isn't the end-point for the pitch continuum. Second, it means that we can pick apart whether accuracy is a function of pitch or just the file having been manipulated.

Results:

You can check out how well the auto-captions did yourself by checking out this video. Make sure to hit the CC button in the lower left-hand corner.

[youtube https://www.youtube.com/watch?v=eUgrizlV-R4]

The first thing I noticed was that I had really, really good results with the auto captions. Waaayyyy better than any of the other videos I looked at. There were nine errors across 434 tokens, for a total error rate of only 2%, which I'd call pretty much at ceiling. There was maaayybe a slight effect of the pitch manipulation, with higher pitches having slightly higher error rates, as you can see:

BUT there's also sort of a u-shaped curve, which suggests to me that the recognizer is doing worse with the files that have been messed with the most. (Although, weirdly, only the file that had its pitch shifted up by 20 Hz had no errors.) I'm going to go ahead and say that I'm not convinced that pitch is a determining factor.

So why were these captions so much better than the ones I looked at in my last post? It could just be that I was talking very slowly and clearly. 
To check that out, I looked at autocaptions for the most recent video posted by someone who's fairly similar to me in terms of social and vocal characteristics: a white woman who speaks standardized American English with Southern features. Ideally I'd match for socioeconomic class, education and rural/urban background as well, but those are harder to get information about.

I chose Bunny Meyer, who posts videos as Grav3yardgirl. In this video her speech style is fast and conversational, as you can hear for yourself:

[youtube https://www.youtube.com/watch?v=hrLvRRaiMwY]

To make sure I had roughly the same amount of data as I had before, I checked the captions for the first 445 words, which was about two minutes' worth of video (you can check my work here). There was an overall error rate of approximately 8%, if you count skipped words as errors. Which, considering that recognizing words in fast/connected speech is generally more error-prone, is pretty good. It's definitely better than in the videos I analyzed for my last post. It's also a fairly small difference from my careful speech: definitely less than the 13% difference I found for gender.

So it looks like neither the speed of speech nor the pitch is strongly affecting recognition rate (at least for videos captioned recently). There are a couple other things that I think may be going on here that I'm going to keep poking at:

ASR has got better over time. It's totally possible that more women just did the accent tag challenge earlier, and thus had higher error rates because the speech recognition system was older and less good. I'm going to go back and tag my dataset for date, though, and see if that shakes out some of the gender differences.

Being louder may be important, especially in less clear recordings. I used a head-mounted microphone in a quiet room to make my recordings, and I'm assuming that Bunny uses professional recording equipment. 
If you're recording outside or with a device microphone, though, there's going to be a lot more noise. If your voice is louder, and men's voices tend to be, it should be easier to understand in noise. My intuition is that, since there are gender differences in how loud people talk, some of the error may be due to intensity differences in noisy recordings. Although an earlier study found no difference in speech recognition rates for men and women in airplane cockpits, which are very noisy, so who knows? Testing that out will have to wait for another day, though.]]>
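The numbers in this post are easy to re-derive. The short Python sketch below rebuilds the seven pitch conditions and the overall error rate from the figures given above. It assumes that shifting a whole file by a fixed number of Hertz moves its mean pitch by the same amount, which matches the mean of 132 Hz reported for the lowest file. The variable names are only for illustration, and nothing here touches Praat or the audio itself.

```python
ORIGINAL_MEAN_HZ = 192   # mean pitch of the unmodified recording
MALE_MEAN_HZ = 119       # Pepiot 2014 figure quoted in the post

# Shifts of -60 to +60 Hz in 20 Hz steps, including 0 for the original file.
shifts_hz = list(range(-60, 61, 20))
mean_pitch_by_shift = {s: ORIGINAL_MEAN_HZ + s for s in shifts_hz}

# Caption errors over all seven files, as reported in the post.
errors, tokens = 9, 434
error_rate = errors / tokens

# Approximate error rate for the fast, conversational comparison video.
conversational_error_pct = 8
gap_points = conversational_error_pct - 100 * error_rate
```

Even the lowest condition sits above the male average, and the gap between careful and conversational speech works out to about six percentage points, well under the 13-point gender difference from the previous post.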
In my last post, I looked at how Google's automatic speech recognition worked with different dialects. To get this data, I hand-checked annotations for more than 1500 words from fifty different accent tag videos.

Now, because I'm a sociolinguist and I know that it's important to stratify your samples, I made sure I had an equal number of male and female speakers for each dialect. And when I compared performance on male and female talkers, I found something deeply disturbing: YouTube's auto captions consistently performed better on male voices than female voices (t(47) = -2.7, p < 0.01). (You can see my data and analysis here.)

[caption id=attachment_3273 align=alignnone width=800] On average, for each female speaker less than half (47%) of her words were captioned correctly. The average male speaker, on the other hand, was captioned correctly 60% of the time.[/caption]

It's not that there's a consistent but small effect size, either: 13% is a pretty big effect. The Cohen's d was 0.7 which means, in non-math-speak, that if you pick a random man and random woman from my sample, there's an almost 70% chance the transcriptions will be more accurate for the man. That's pretty striking.

What it is not, unfortunately, is shocking. There's a long history of speech recognition technology performing better for men than women:

It's Not You, It's It: Voice Recognition Doesn't Recognize Women (Times, 2011)
Study finding that medical voice-dictation software performs significantly better for men (Roger & Pendharkar 2003)
Paper finding that speech recognition performs worse for women than men, and worse for girls than boys (Nicol et al. 2002)

This is a real problem with real impacts on people's lives. Sure, a few incorrect YouTube captions aren't a matter of life and death. But some of these applications have a lot higher stakes. Take the medical dictation software study. 
The fact that men enjoy better performance than women with these technologies means that it's harder for women to do their jobs. Even if it only takes a second to correct an error, those seconds add up over the days and weeks to a major time sink, time your male colleagues aren't wasting messing with technology. And that's not even touching on the safety implications of voice recognition in cars. So where is this imbalance coming from? First, let me make one thing clear: the problem is not with how women talk. The suggestion that, for example, women could be taught to speak louder, and direct their voices towards the microphone is ridiculous. In fact, women use speech strategies that should make it easier for voice recognition technology to work on women's voices. Women tend to be more intelligible (for people without high-frequency hearing loss), and to talk slightly more slowly. In general, women also favor more standard forms and make less use of stigmatized variants. Women's vowels, in particular, lend themselves to classification: women produce longer vowels which are more distinct from each other than men's are. (Edit 7/28/2016: I have since found two papers by Sharon Goldwater, Dan Jurafsky and Christopher D. Manning where they found better performance for women than men--due to the above factors and different rates of filler words like um and uh.) One thing that may be making a difference is that women also tend not to be as loud, partly as a function of just being smaller, and cepstrals (the fancy math thing what's under the hood of most automatic voice recognition) are sensitive to differences in intensity. This all doesn't mean that women's voices are more difficult; I've trained classifiers on speech data from women and they worked just fine, thank you very much. 
What it does mean is that women's voices are different from men's voices, though, so a system designed around men's voices just won't work as well for women's.

Which leads right into where I think this bias is coming from: unbalanced training sets. Like car crash dummies, voice recognition systems were designed for (and largely by) men. Over two thirds of the authors in the Association for Computational Linguistics Anthology Network are male, for example. Which is not to say that there aren't truly excellent female researchers working in speech technology (Mari Ostendorf and Gina-Anne Levow here at the UW and Karen Livescu at TTI-Chicago spring immediately to mind) but they're outnumbered. And that imbalance seems to extend to the training sets, the annotated speech that's used to teach automatic speech recognition systems what things should sound like. Voxforge, for example, is a popular open source speech dataset that suffers from major gender and per-speaker duration imbalances. I had to get that info from another paper, since Voxforge doesn't have speaker demographics available on their website. And it's not the only popular corpus that doesn't include speaker demographics: neither does the AMI meeting corpus, nor the Numbers corpus. And when I could find the numbers, they weren't balanced for gender. TIMIT, which is the single most popular speech corpus in the Linguistic Data Consortium, is just over 69% male. I don't know what speech database the Google speech recognizer is trained on, but based on the speech recognition rates by gender I'm willing to bet that it's not balanced for gender either.

Why does this matter? It matters because there are systematic differences between men's and women's speech. (I'm not going to touch on the speech of other genders here, since that's a very young research area. If you're interested, the Journal of Language and Sexuality is a good jumping-off point.) 
And machine learning works by making computers really good at dealing with things they've already seen a lot of. If they get a lot of speech from men, they'll be really good at identifying speech from men. If they don't get a lot of speech from women, they won't be that good at identifying speech from women. And it looks like that's the case. Based on my data from fifty different speakers, Google's speech recognition (which, if you remember, is probably the best-performing proprietary automatic speech recognition system on the market) just doesn't work as well for women as it does for men.]]>
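For anyone curious where the "almost 70%" figure earlier in this post comes from: a Cohen's d can be turned into the probability that a randomly chosen member of one group scores higher than a randomly chosen member of the other. The sketch below uses the standard conversion, which assumes both groups are roughly normally distributed with equal variance. That assumption is an approximation for this sample, and the function name is made up for the example.

```python
import math

def probability_of_superiority(cohens_d):
    """Chance that a random draw from the higher-scoring group beats a random
    draw from the other group, assuming normal data with equal variance."""
    z = cohens_d / math.sqrt(2)
    # Standard normal cumulative distribution function, written with erf.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p_man_more_accurate = probability_of_superiority(0.7)
```

With d = 0.7 this comes out at roughly 0.69, which is the "almost 70% chance" quoted above. A d of 0 would give exactly 0.5, i.e. a coin flip.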
on the news) you may have noticed that speech recognition software doesn't generally work very well for you. You can see the sort of thing I'm talking about in this clip:

[youtube https://www.youtube.com/watch?v=5FFRoYhTJQQ]

This clip is a little old, though (2010). Surely voice recognition technology has improved since then, right? I mean, we've got more data and more computing power than ever. Surely somebody's gotten around to making sure that the current generation of voice-recognition software deals equally well with different dialects of English. Especially given that those self-driving cars that everyone's so excited about are probably going to use voice-based interfaces.

To check, I spent some time on YouTube looking at the accuracy of automatic captions for videos of the accent tag challenge, which was developed by Bert Vaux. I picked YouTube automatic captions because they're done with Google's Automatic Speech Recognition technology--which is one of the most accurate commercial systems out there right now.

Data: I picked videos with accents from Maine (U.S.), Georgia (U.S.), California (U.S.), Scotland and New Zealand. I picked these locations because they're pretty far from each other and also have pretty distinct regional accents. All speakers from the U.S. were (by my best guess) white and all looked to be young-ish. I'm not great at judging age, but I'm pretty confident no one was above fifty or so.

What I did: For each location, I checked the accuracy of the automatic captions on the word-list part of the challenge for five male and five female speakers. So I have data for a total of 50 people across 5 dialect regions. For each word in the word list, I marked it as correct if the entire word was correctly captioned on the first try. Anything else was marked wrong. To be fair, the words in the accent tag challenge were specifically chosen because they have a lot of possible variation. 
On the other hand, they're single words spoken in isolation, which is pretty much the best case scenario for automatic speech recognition, so I think it balances out.

Ok, now the part you've all been waiting for: the results. Which dialects fared better and which worse? Does dialect even matter? First the good news: based on my (admittedly pretty small) sample, the effect of dialect is so weak that you'd have to be really generous to call it reliable. A linear model that estimated number of correct classifications based on total number of words, speaker's gender and speaker's dialect area fared only slightly better (p = 0.08) than one that didn't include dialect area. Which is great! No effect means dialect doesn't matter, right?

Weellll, not really. Based on a power analysis, I really should have sampled forty people from each dialect, not ten. Unfortunately, while I love y'all and also the search for knowledge, I'm not going to hand-annotate two hundred YouTube videos for a side project. (If you'd like to add data, though, feel free to branch the dataset on GitHub here. Just make sure to check the URL for the video you're looking at so we don't double dip.)

So while I can't confidently state there is an effect, based on the fact that I'm sort of starting to get one with only a quarter of the amount of data I should be using, I'm actually pretty sure there is one. No one's enjoying stellar performance (there's a reason that they tend to be called AutoCraptions in the Deaf community) but some dialect areas are doing better than others. Look at this chart of accuracy by dialect region:

[caption id=attachment_3239 align=alignnone width=800] Proportion of correctly recognized words by dialect area, color coded by country.[/caption]

There's variation, sure, but in general the recognizer seems to be working best on people from California (which just happens to be where Google is headquartered) and worst on Scottish English. 
The big surprise for me is how well the recognizer works on New Zealand English, especially compared to Scottish English. It's not a function of country population (NZ = 4.4 million, Scotland = 5.2 million). My guess is that it might be due to sample bias in the training sets, especially if, say, there were some '90s TV shows in there; there's a lot of captioned New Zealand English in Hercules, Xena and related spin-offs. There's also a Google outreach team in New Zealand, but not Scotland, so that might be a factor as well.

So, unfortunately, it looks like the lift skit may still be current. ASR still works better for some dialects than others. And, keep in mind, these are all native English speakers! I didn't look at non-native English speakers, but I'm willing to bet the system is also letting them down. Which is a shame. It's a pity that how well voice recognition works for you is still dependent on where you're from. Maybe in another six years I'll be able to write a blog post that says it isn't.]]>
deets are here, but they're fairly boring unless you really care about typography. What's more interesting to me, as someone who studies visual, spoken and written language, is that there are a whole batch of new emoji. And it's led to lots of interesting speculation about, for example, what the most popular new emoji is going to be (tldr: probably the ROFL face. People have a strong preference for using positive face emojis.) This led me to wonder: what obvious lexical gaps are there?

[I]n some cases it is useful to refer to the words that are not part of the vocabulary: the nonexisting words. Instead of referring to nonexisting words, it is common to speak about lexical gaps, since the nonexisting words are indications of holes in the lexicon of the language that could be filled.

Janssen, M. 2012. Lexical Gaps. The Encyclopedia of Applied Linguistics.

This question is pretty easy to answer about emoji--we can just find out what words people are most likely to use when they're complaining about not being able to use emoji. There's even a Twitter bot that collects these kinds of tweets. I decided to do something similar, but with a twist. I wanted to know what kinds of emoji people complain about wanting the most.

Boring technical details 💤

Yesterday, I grabbed 4817 recent tweets that contained both the words no and emoji. (You can find the R script I used for this on my GitHub.) For each tweet, I took the two words occurring directly in front of the word emoji and created a corpus from them using the tm (text mining) package. I tidied up the corpus--removing super-common words like the, making everything lower-case, and so on. (The technical term is cleaning, but I like the sound of tidying better. It sounds like you're getting comfy with your data, not delousing it.) I ranked these words by frequency, or how often they showed up. 
There were 1888 distinct words, but the vast majority (1280) showed up only once. This is completely normal for word frequency data and is modelled by Zipf's law. I then took all words that occurred more than three times and did a content analysis.

Exciting results!

At the end of my content analysis, I arrived at nine distinct categories. I've listed them below, with the most popular four terms from each. One thing I noticed right off is how many of these are emoji that either already exist or are in the Unicode update. To highlight this, I've italicized terms in the list below that don't have an emoji.

animal: shark, giraffe, butterfly, duck
color: orange, red, white, green
face: crying, angry, love, hate
(facial) feature: mustache, redhead, beard, glasses
flag: flag, England, Welsh, pride
food: bacon, avocado, salt, carrot
gesture: peace, finger, middle, crossed
object: rifle, gun, drum, spoon
person: mermaid, pirate, clown, chef

(One note: the rifle is in Unicode 9.0, but isn't an emoji. This has been the topic of some discussion, and is probably why it's so frequent.)

Based on these categories, where are the lexical gaps? The three categories that have the most different items in them are, in order, 1) food, 2) animals and 3) objects. These are also the three categories with the most mentions across all items.

So, given that so many people are talking about emojis for animals, food and objects, why aren't the bulk of emojis in these categories? We can see why this might be by comparing how many different items get mentioned in each category to how many times each item is mentioned.

[caption id=attachment_3051 align=aligncenter width=713] Yeah, people talk about food a lot... but they also talk about a lot of different types of food. 
On the other hand you have categories like colors, which aren't talked about as much but where the same colors come up over and over again.[/caption]

As you can see from the figure above, the most popular categories have a lot of different things in them, but each thing is mentioned relatively rarely. So while there is an impassioned zebra emoji fanbase, it only comes up three times in this dataset. On the other hand, red is fairly common but shows up because of discussion of, among other things, flowers, shoes and hair color. Some categories, like flags, fall in a happy medium--lots of discussion and fairly few suggestions for additions.

Based on this teeny data set, I'd say that if the Unicode Consortium continues to be in charge of emoji standardization it'll have its hands full for quite some time to come. There's a lot of room for growth, and most of it is in food, animals and objects, which all have a lot of possible items, rather than gestures or facial expressions, which have far fewer.]]>
Why do Canadians say 'eh'? [caption id= align=alignnone width=512] Blog post about Canada, eh?[/caption] Fortunately for my curious students, this is actually an active area of inquiry. (It's one of those research questions where there was a flurry of work--in this case in the 1970s--and then a couple of quiet decades followed by a resurgence in interest. The 'eh' renaissance started in the mid-2000s and continues today. For some reason, at least in linguistics, this sort of thing tends to happen a lot. I'll leave discussing why this particular pattern is so common to the sociologists of science.) So what do we know about 'eh'?

Is 'eh' actually Canadian?

'Eh' has quite the pedigree--it's first attested in Middle English and even shows up in Chaucer. Canadian English, however, boasts a more frequent use of 'eh', which can fill the same role as 'right?', 'you know?' or 'innit?' for speakers of other varieties of English.

What does 'eh' mean?

The real thing that makes an 'eh' Canadian, though, is how it's used. Despite some claims to the contrary, 'eh' is far from meaningless. It has a limited number of uses (Elaine Gold identified an even dozen in her 2004 paper), some of which aren't found outside of Canada. Walter Avis described two of these uniquely Canadian uses in his 1972 paper, "So eh? is Canadian, eh" (it's not available anywhere online as far as I can tell):

Narrative use: Used to punctuate a story, in the same way that an American English speaker (south of the border, that is) might use right? or you know? Example: I was walking home from school, eh? I was right by that construction site where there's a big hole in the ground, eh? And I see someone toss a piece of trash right in it.

Miscellaneous/exclamation use: Tacked on to the end of a statement. 
(Although more recent work, presented by Martina Wiltschko and Alex D'Arcy at last year's NWAV, suggests that there's really a limited number of ways to use this type of 'eh' and that they can be told apart by the way the speaker uses pitch.) Example: What a litterbug, eh?

And these uses seem to be running strong. Gold found that use of 'eh' in a variety of contexts has either increased or remained stable since 1980.

That's not to say there's no change going on, though. D'Arcy and Wiltschko found that younger speakers of Canadian English are more likely than older speakers to use 'right?' instead of 'eh?'. Does this mean that 'eh' may be going the way of the dodo or 'sliver' to mean 'splinter' in British English?

Probably not--but it may show up in fewer places than it used to. In particular, in their 2006 study Elaine Gold and Mireille Tremblay found that almost half of their participants feel negatively about the narrative use of 'eh' and only 16% actually used it themselves. This suggests this type of uniquely-Canadian usage may be on its way out.]]>
Nope.[/caption]

"But Rachael," you say, "you're going to grad school in linguistics and having all sorts of fun. Why are you trying to keep me from doing the same thing?" Two big reasons.

The Job Market for Linguistics PhDs

What do you want to do when you get out of grad school? If you're like most people, you'll probably say you want to teach linguistics at the college or university level. What you should know is that this is an increasingly unsustainable career path.

In 1975, 30 percent of college faculty were part-time. By 2011, 51 percent of college faculty were part-time, and another 19 percent were nontenure track, full-time employees. In other words, 70 percent were contingent faculty, a broad classification that includes all nontenure track faculty (NTTF), whether they work full-time or part-time.

More Than Half of College Faculty Are Adjuncts: Should You Care? by Dan Edmonds.

And most of these part-time faculty, or adjuncts, are very poorly paid. This survey from 2015 found that 62% of adjuncts made less than $20,000 a year. This is even more upsetting when you consider that you need a PhD and scholarly publications to even be considered for one of these posts.

("But what about being paid for your research publications?" you ask. "Surely you can make a few bucks by publishing in those insanely expensive academic journals." While I understand where you're coming from--in almost any other professional publishing context it's completely normal to be paid for your writing--authors of academic papers are not paid. Nor are the reviewers. Furthermore, authors are often charged fees by the publishers. One journal I was recently looking at charges $2,900 per article, which is about three times the funding my department gives us for research over our entire degree. Not a scam journal, either--an actual reputable venue for scholarly publication.)

Yes, there are still tenure-track positions available in linguistics, but they are by far the minority. 
What's more, even including adjunct positions, there are still fewer academic posts than graduating linguists with PhDs. It's been that way for a while, too, so even for a not-so-great adjunct position you'll be facing stiff competition. Is it impossible to find a good academic post in linguistics? No. Are the odds in your (or my, or any other current grad student's) favor? Also no. But don't take it from me. In Surviving Linguistics: A Guide for Graduate Students (which I would highly recommend) Monica Macaulay says:[It] is common knowledge that we are graduating more PhDs than there are faculty positions available, resulting in certain disappointment for many... graduates. The solution is to think creatively about job opportunities and keep your options open.As Dr. Macaulay goes on to outline, there are jobs for linguists outside academia. Check out the LSA's Linguistics Beyond Academia special interest group or the Linguists Outside Academia mailing list. There are lots of things you can do with a linguistics degree, from data science to forensic linguistics.That said, there are degrees that will better prepare you for a career than a PhD in theoretical linguistics. A master's degree in Speech Language Pathology (SLP) or Computational Linguistics or Teaching English to Speakers of Other Languages (TESOL) will prepare you for those careers far better than a general PhD.Even if you're 100% dead set on teaching post-secondary students, you should look around and see what linguists are doing outside of universities. Sure, you might win the job-lottery, but at least some of your students probably won't, and you'll want to make sure they can find well-paying, fulfilling work.Grad School is GruelingYes, grad school can absolutely be fun. On a good day, I enjoy it tremendously. But it's also work. (And don't give me any nonsense about it not being real work because you do it sitting down. 
I've had jobs that required hard physical and/or emotional labor, and grad school is exhausting.) I feel like I probably have a slightly better than average work/life balance--partly thanks to my fellowship, which means I have limited teaching duties and don't need a second job any more--and I'm still actively trying to get better about stopping work when I'm tired. I fail, and end up all tearful and exhausted, about once a week.

It's also emotionally draining. Depression runs absolutely rampant among grad students. This 2015 report from Berkeley, for example, found that over two thirds of PhD students in the arts and sciences were depressed. The main reason? Point number one above--the stark realities of the job market. It can be absolutely gutting to see a colleague do everything right, from research to teaching, and end up not having any opportunity to do the job they've been preparing for. Especially since you know the same lies in wait for you.

And doing everything right is pretty Herculean in and of itself. You have to have very strong personal motivation to finish a PhD. Sure, your committee is there to provide oversight and you have drop-dead due dates. But those deadlines are often very far away and, depending on your committee, you may have a lot of independence. That means motivating yourself to work steadily while managing several ongoing projects in parallel (you're publishing papers in addition to writing your dissertation, right?) and not working yourself to exhaustion in the process. Basically you're going to need a big old double helping of executive functioning.

And oh by the way, to be competitive in the job market you'll also need to demonstrate you can teach and perform service for your school/discipline. Add in time to sleep, eat, get at least a little exercise and take breaks (none of which are optional!) and you've got a very full plate indeed. 
Some absolutely iron-willed people even manage all of this while having/raising kids and I have nothing but respect for them.

Main take-away

Whether inside or outside of academia, it's true that a PhD does tend to correlate with higher salary--although the boost isn't as much as you'd get from a related professional degree. BUT in order to get that higher salary you'll need to give up some of your most productive years. My spouse (who also has a bachelor's in linguistics) got a master's degree, found a good job, got promoted and has cultivated a professional social network in the time it's taken me just to get to the point of starting my dissertation.

The opportunity cost of spending five more years (at a minimum--I've heard of people who took more than a decade to finish) in school, probably in your twenties, is very, very high. And my spouse can leave work at work, come home on weekends and just chill. This month I've got four full weekends of either conferences or outreach. Even worse, no matter how hard I try to stamp it out, I've got a tiny little voice in my head that's very quietly screaming "you should be working" literally all the time.

I'm being absolutely real right now: going to grad school for linguistics is a bad investment of your time and labor. I knew that going in--heck, I knew that before I even applied--and I still went in. Why? Because I decided that, for me, it was a worthwhile trade-off. I really like doing research. I really like being part of the scientific community. Grad school is hard, yes, but overall I'm enjoying myself. And even if I don't end up being able to find a job in academia (although I'm still hopeful and still plugging away at it) I really, truly believe that the research I'm doing now is valuable and interesting and, in some small way, helping the world. What can I say? I'm a nerdy idealist.

But this is 100% a personal decision. It's up to you as an individual to decide whether the costs are worth it to you. 
Maybe you'll decide, as I have, that they are. But maybe you won't. And to make that decision you really do need to know what those costs are. I hope I've helped to begin making them clear. One final thought: Not going to grad school doesn't mean you're not smart. In fact, considering everything I've discussed above, it probably means you are.]]>
than the game itself already does.)

One example that I'm pretty excited about is #PronouncingThingsIncorrectly, which is a language game invented by Chaz Smith. Smith is a Viner, Cinema Studies student at the University of Pennsylvania and advocate for sexual assault prevention. But right now, I'm mostly interested in his role as a linguistic innovator. In that role he's invented a new type of language game, which you can see an example of here:

https://vine.co/v/iDFI7Exxtlz

It's been picked up by a lot of other Viners, as well. You can see some additional examples here.

So why is this linguistically interesting? Because, like most other language games, it has rules to it. I don't think Chaz necessarily sat down and came up with them (he could have, but I'd be surprised) but they're there nonetheless. This is a great example of one of the big True Things linguists know about language: even in play, it tends to be structured. This particular game has three structures I noticed right away: vowel harmony, re-syllabification and new stress assignment.

Vowel Harmony

Vowel harmony is where all the vowels in a word tend to sound alike. It's not really a big thing in English, but you may be familiar with it from the nursery rhyme I like to eat Apples and Bananas. Other languages, though, use it all the time: Finnish, Nez Perce, Turkish and Maasai all have vowel harmony.

It's also part of this language game. For example, tide is pronounced so that it rhymes with speedy and tomatoes rhymes with toe so toes. Notice that both words have the same vowel sound throughout. Not all words have the same vowel all the way through, but there's more vowel harmony in the #PronouncingThingsIncorrectly words than there is in the original versions.

Re-syllabification

Syllables are a way of chunking up words--you probably learned about them in school at some point. (If not, I've talked about them before.) But languages break words up in different places. 
And in the game, the boundaries get moved around. We've already seen one example: tide. It's usually one chunk, but in the game it gets split into two: tee.dee. (Linguists like to put periods in the middle of words to show where the syllable boundaries are.)

You might have noticed that tide is spelled with a silent e on the end. My strong intuition is that spelling plays a big role in this word game. (Which is pretty cool! Usually language games like this rely mostly on sounds and not the letters used to write them.) Most words get each of the vowels in their spelling produced separately, which is where a lot of these resyllabifications come from. Two consonants in a row also tend to each get their own syllables. You can see some examples of each below:

Hawaiian -> ha.why.EE.an
Mayonnaise -> may.yon.nuh.ASS.ee
Skittles -> ski.TI.til.ees

New Stress Assignment

English stress assignment (how we pick which syllables in a word get the most emphasis) is a mess. It depends on, among other things, which language we borrowed the word from (words from Latin and words from Old English work differently), whether you can break the word down into smaller meaning bits (like how bats is bat + s) and what part of speech it is (the compact in powder compact and compact car have stress in different places). People have spent entire careers trying to describe it.

In this word game, however, Smith fixes English stress. After resyllabification, almost all words with more than one syllable have stress one syllable in from the right edge:

suc.CESS -> SUC.cess
pe.ROK.side -> pee.rok.SEED.dee
col.OGNE -> col.OG.nee
HON.ey stays the same

But if you've been paying attention, you'll notice that there are some exceptions, like Skittles:

Skittles -> ski.TI.til.ees
Jalapenos -> djuh.LA.pen.os

Why are these ones different? I think it's probably because they're plural, and if the final syllable is plural it doesn't really count. 
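To make the stress pattern concrete, here's a minimal Python sketch of the rule as I've described it: stress goes one syllable in from the right edge, and a final plural syllable is skipped when counting. The function name and the plural flag are made up for illustration, and the syllables are typed in by hand (the sketch doesn't try to do the resyllabification itself).

```python
def assign_stress(syllables, plural=False):
    """Mark the stressed syllable in upper case.

    Stress lands one syllable in from the right edge. If the word is
    plural, the final syllable is treated as not counting, so stress
    shifts one syllable further left.
    """
    counted = len(syllables)
    if plural and counted > 1:
        counted -= 1  # the plural ending "doesn't really count"
    stressed = max(counted - 2, 0)  # second-to-last counted syllable
    return ".".join(
        s.upper() if i == stressed else s.lower()
        for i, s in enumerate(syllables)
    )

print(assign_stress(["suc", "cess"]))                        # SUC.cess
print(assign_stress(["pee", "rok", "seed", "dee"]))          # pee.rok.SEED.dee
print(assign_stress(["col", "og", "nee"]))                   # col.OG.nee
print(assign_stress(["ski", "ti", "til", "ees"], plural=True))   # ski.TI.til.ees
print(assign_stress(["djuh", "la", "pen", "os"], plural=True))   # djuh.LA.pen.os
```

The same two-line rule covers both the regular words and the plural "exceptions", which is the sense in which the game's stress is more systematic than ordinary English stress.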
You can hear some more examples of this in the Vine embedded above:

bubbles -> BOO.buh.lees
drinks -> duh.RIN.uh.kus
bottles -> BOO.teh.less

So what?

Ok, so why is this important or interesting? Well, for one thing it's a great example of how humans can't help but be systematic. This is very informal linguistic play that still manages to be pretty predictable. By investigating this sort of language game we can better characterize what it is to be a human using language.

Secondly, this particular language game shows us some of the pressures on English. While it's my impression that the introduction of vowel harmony is done to be funny (especially since there are other humorous processes at work here--if a word can be pronounced like booty or ass it usually is) I'm really interested in the resyllabification and stress assignment--or is that ree.sill.luh.ah.bee.fee.ca.TEE.oin and STUH.rees ass.see.guh.nuh.MEN.tee? The way they're done in this game is a real improvement over the current way of doing things, at least in terms of being systematic and easy to learn. Who knows? In a couple of centuries maybe we'll all be #PronouncingThingsIncorrectly.]]>
he was speaking Arabic. And this isn't the first time this has happened. Nor the second. These are all, in addition to being deeply disturbing and illegal, examples of linguistic discrimination.

What is linguistic discrimination?

Linguistic discrimination is discrimination based on someone's language use. And it's not restricted to the instances I discussed above: African American English is often discriminated against. For example, Rachel Jeantel's testimony in the Trayvon Martin case was largely dismissed by the white jury--not because of the content of the testimony, but because of her use of African American English, and there is a long history of landlords linguistically profiling and discriminating against African American and Latina/o prospective tenants. Sometimes it's the language itself under attack, as in this letter about American Sign Language, claiming that it's unnecessary for deaf children to learn to sign. (Here's a rebuttal from the Gallaudet Linguistics department, which is a major center of sign language research at the world's only university devoted to deaf and hard of hearing students.) In multilingual countries, it's unfortunately common for speakers of marginalized languages to find themselves denied services in their language.

As I've talked about before, linguistic discrimination can be a way to discriminate against a specific group of people without saying so in so many words. Linguistic discrimination, in addition to being morally repugnant, is illegal in the U.S. under Titles VI and VII of the Civil Rights Act of 1964.

These are important legal protections and the number of people affected by them is huge: There are over 350 different languages spoken in the United States. In Seattle, where I live, over a fifth of people over age five speak a language other than English at home. That's a lot of people! Further, most of these individuals are bilingual or multilingual; 90% of second-generation immigrants speak English. 
And since multilingualism has both neurological benefits for individuals and larger positive impacts on society, I see this as no bad thing. And I'm hardly the only one: how many people that you know are learning or want to learn another language?

Unfortunately, linguistic discrimination threatens this rich diversity, and every person who speaks anything other than the standardized variety of the dominant language.

What can you do?

Don't participate in linguistic discrimination. It can be hard to retrain yourself to reduce the impact of negative stereotypes but, especially if you're in a position of privilege (as I am), it's literally the least you can do. Don't make assumptions about people based on their language use.

Stand up for people who may be facing linguistic discrimination. If you see someone being discriminated against in the workplace (like being given lower performance evaluations for having a non-native accent) point out that this is illegal, and back up people who are being discriminated against.

Be patient with non-native speakers. Appreciate that they've gone through a lot of effort to learn your language. If possible, try and arrange for an interpreter (for face-to-face communication) or translator (for written communications). Sometimes non-native speakers are more comfortable with reading and writing than speaking; offer to communicate through e-mails or other written correspondence. ]]>
here, focused on responses to whether frosting and icing were different things, or different words for the same thing. This post gets a little more in-depth. In the first part, I was just asking people what they thought they said. In the second part, I was asking them to pick words for specific pictures. It's not a perfect design--by asking people what they think they say first I primed them pretty heavily--but it does reveal some interesting patterns of usage.

The main thing I was interested in was this--did people who said frosting and icing were interchangeable for them actually use them as if they were the same? Why is this a good question to ask? Because it turns out that a lot of the time people aren't the best judges of how they use language. Especially if there's some sort of rule about how you're supposed to do it. For example, there's something of a running joke among linguists about how often people will use the passive voice while they're telling people not to! I don't think anyone would intentionally lie about their usage, but it's possible that respondents aren't always doing exactly what they think they are.

I split my dataset into people who said they thought the words frosting and icing meant the same thing and those who thought they were different. In the charts below these groups are labelled same and different respectively. For this stage of analysis, I left out people who weren't sure; there weren't a whole lot of them anyway.

Cupcake

So this picture was a pretty canonical example of what people brought up a lot--it's on a cake, and it's been both whipped and piped. For a lot of people, then, this should be frosting. So what did people say?

The results here were pretty much what I expected. (Whew!) People who thought the words meant different things pretty much all thought this was frosting. And there was a pretty strong difference between the groups. But this still doesn't answer some of my questions. 
Is it the texture that makes it frosting or, as the AP Styleguide suggests, the fact that it's on a cake? After all, you can definitely put buttercream on a cookie, as evinced by Lofthouse.DoughnutsNext I had some doughnuts. A lot of people, when I first started asking around, brought up doughnuts as something that they thought were iced rather than frosted. So what did people say?That does seem to hold true.There was no strong difference between the groups, but there were also a lot of write-in answers. (Glaze was especially popular, which, for the record, is probably what I'd say. ) So there seems to be more variety in what people call doughnut toppings but there is a tendency towards icing.Cake with fondantOk, so this image was a bit of a trick. The cake here is covered in fondant. Which, to me, isn't really frosting or icing. But if it's really being on a cake that makes something frosting, we should see a strong frosting bias from people with a distinction. And that's just not the case. There's also a pretty big difference between the groups here. Interestingly, people who thought frosting and icing are different things were more likely to write in fondant. (Remember that level of baking knowledge had no effect on whether people said there was a difference or not, so it's probably not just specialized knowledge.)Bundt CakeI included this image for a couple of reasons. Again, I'm poking at this on a cake idea. But I also had a lot of people tell me that, for them, the distinction between the words was texture-based. So responses here could have gone two ways: If anything on a cake is frosting, then we'd expect frosting to win. But, if frosting has to be fluffy/whipped, then we'd expect icing to win.And icing wins! This is no surprise, given the written results summarized in my previous post and the responses for the cake pictures above, but for me it really puts the nail in the coffin of the on cakes argument. (Take note, AP Styleguide!) 
Even on this one, though, people with no distinction are much more likely to be able to use frosting.

Sweet Roll

So this is an interesting one. I included it because, for me, cinnamon rolls are synonymous with cream cheese frosting/icing. Since several people I talked to said specifically that cream cheese had to be frosting and not icing, I was expecting a large frosting response on this one.

That was definitely not what I saw, though. (Although people with no distinction were much more likely to be able to say frosting, so I guess I came by it natural.) Most people, and especially people with a distinction, thought it was icing.

Overview

So there are two main takeaways here:

There's a strong difference in usage between people who say that frosting and icing are different things and those who say they aren't. (For most of the pictures, these groups responded significantly differently.)

If there is a difference, it's got everything to do with texture and nothing to do with cake.

That's not to say that these things will always hold true; no one knows better than linguists that language is in a constant state of flux. But for now, these generalizations seem to hold for most of the people surveyed. So if you're going to make a usage distinction between these words, please make one that's based on the actual usage and not some completely made-up rule!

A final note: if you're interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here. ]]>
AP Style tip: Use icing to describe sugar decorations applied to cookies; frosting for cupcakes and cakes. AP Stylebook (@APStylebook) February 22, 2016

This struck me as 1) kind of a petty usage distinction and 2) completely at odds with my personal usage and what I knew about the dialectal research. The Dictionary of American Regional English, for example, notes that Frosting is widespread, but chiefly North, North Midland, West. Icing, on the other hand, is found all over, but less freq North, Pacific. As someone from Virginia but currently living in Seattle, I have no problem using either frosting or icing for a nice buttercream. I'm hardly the only one, either. This baking blog post even says I use lots of different icings to frost cupcakes.

[caption id= align=aligncenter width=512] Frosting or icing, I'll take a dozen.[/caption]

BUT when I posted about this on Twitter, some people replied that they did have a very strong distinction between the two words. And the same thing happened when I brought it up with different groups of friends. A lot of people brought up texture, or that they'd say that some things are frosted and others are iced. This was really fascinating to me, both as a baker and a linguist, so I did what any social scientist would and set out to collect some data to get a better idea of what's going on.

I set up a survey on Google Forms and got 109 responses. First I collected info on where speakers were from, how old they were and how knowledgeable they were about baking. Then I asked them for their general impression of use, and then used pictures to ask what they'd call the sweet topping on a variety of baked goods. To avoid making this blog post absolutely huge, I'm going to split up data discussion. The first half (this one) will look at whether people make a distinction between frosting and icing and whether that's related to any of their social characteristics. 
The second half (I'll link it here when it's done) will focus on responses to specific images.

Are frosting and icing different, or are they different words for the same thing?

The first question I asked people was whether frosting and icing were different, or just different words for the same thing. Most people (over 60%) thought that they were different things, while just over a quarter (27%) thought they were different words for the same thing, and the rest weren't sure. So it does look like there's some difference in how people use these words. But in and of itself, that's not very interesting. What I want to know is this: how do people with different social characteristics use these words? (You may remember that I wrote a while ago that this is the central question in sociolinguistics.)

Region

The first thing I wanted to look at was region. I was expecting to see a pretty big difference here, and I wasn't disappointed. Once I broke down the data by the states people were from, I found a definite pattern: people from the South were far more likely to say that frosting and icing were different words for the same thing. (Virginia isn't really patterning with the rest of the South, here, but that may be due to a bit of sampling bias--I recruited participants through my social network, and a lot of my friends are from Northern Virginia, which tends not to pattern with the South.)

[caption id=attachment_1698 align=alignnone width=814] Most people in the South thought frosting and icing were the same thing, while outside of the South more people thought they were different things. (The darker the blue, the more likely someone from that state was to say that they were different things--black states I didn't get any respondents from.)[/caption]

Why is there a distinction? Honestly, I'm not really sure. My intuition, though, is that people from the South probably have pretty wide exposure to both terms. 
(Since books, TV and movies tend to come from outside of the South, there are plenty of chances to come across other dialectal variants.) However, people from outside the South historically had less exposure to one of the terms--icing--and when they started to come across it they decided that it must refer to something different. As a result, the meanings of both words changed to become more narrow. (This is actually a pretty common process in languages.) I don't have strong evidence for this theory right now, though, so take it with a couple shakes of salt!

Age

Another thing I wanted to look at was whether the age of respondents played a role in how they used these words. If younger respondents seem to use the word differently than older respondents, it might be because there's a change happening in the language. Given time, everyone might end up doing the same thing as the younger people.

[caption id=attachment_1842 align=aligncenter width=502] While it looks like there's a slight tendency for younger participants to say there's a difference between frosting and icing, the effect isn't strong enough to be reliable.[/caption]

I didn't find a strong pattern, though. Again, this might be due to sampling problems, since most of my respondents were roughly the same age (21-30). But it could also be that there's simply not anything to find--that this is neither an ongoing change, nor one where younger people and older people do things differently.

Baking Knowledge

Ok, so it looks like people are varying by region, but not by age... but what about by level of baking knowledge? Maybe you don't care about the difference if you almost never make or eat baked goods. It could be that people who know a lot about baking make a distinction, and it's only people who don't know a beater from a dough hook that are lumping things together.

[caption id=attachment_1850 align=aligncenter width=502] Baking knowledge also isn't closely tied to how people use these words. 
So it's not just that people who don't know a lot about baking say they're the same.[/caption]But that's not what I found. People at all levels of baking knowledge tended to have a pretty even balance between the two uses of the words.CommentsI also collected comments from people, to get more information on what people thought in their own words. Two big themes emerged. One was that the most consistent thing people pointed to as the difference was texture. The other was that people tended to say that one of them was for the cake and the other wasn't... but which one was which was pretty much random.Just under half of the comments mentioned texture. I've compiled some of the differences below, but the general consensus seems to be that frosting is thick, fluffy and soft, while icing is thin and hard. Take note, AP Stylebook!FrostingIcingcreamy or butterysyrupy, like a glazeplasticy lookingspreadsqueezed or pipedthick and creamythin, hardens as it driesthickerthickerclear crust, driedfluffythinthin layer, smooth, glossymore solid, less flowingwatery, gooeystays softhardens once it setsthicker, softerthinner, harderthick, texturedthin, flatSix people did specifically mention how the words could be used for cake toppings in their comments. Two people said cakes could be either frosted or iced, two said that cakes could only be iced, and two said that cakes could only be frosted. Here's an example of an icing is for cakes comment:icing is for cakes! frosting is for all the other deliciousness. usually.And someone who suggests frosting is for cakes:I usually apply the word frosting solely to cakelike goods (cupcakes, regular cake) and then icing to everything else.So... if you are going to claim there's a difference between frosting and icing, pulling the it goes on cakes card is pretty likely to start a fight. You're much safer talking about texture. 
Unless you're in the South, of course; then you can pretty much say what you like.Is there a difference between frosting and icing? It looks like the answer mainly depends on where you are. But there were also some pretty interesting differences between different baked goods, so stay tuned for that part of the analysis.P.S. If you're interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here.]]>
it's been shown to help improve memory.) But one tip that people often share is that listening to white noise can help you concentrate while studying. Being the sort of person I am (read: huge nerd) I decided to set out and see what the research has to say about it.[caption id= align=alignnone width=512] Ok, with the lab report done, we've just got two more twenty-page papers to write before we can sleep. Anyone got some coffee? [/caption] First things first: some noises can definitely be bad for learning. For example, one study which compared schools near major airports (which are a big source of noise pollution) and some which were not found that children who were in the noisier environment had reduced reading comprehension. An earlier, similar study showed that students in classrooms near a very noisy train track did worse academically than those that were not.And noisy environments are bad for concentration, too. One survey of office workers found that 99% of participants were bothered by noises like ringing telephones and conversations, and that the negative effects of these noises didn't fade over time. And we know that some types of speech noise--especially half of a telephone conversation--are incredibly distracting.Ok, so we know that some noise can hurt both learning and concentration... so why fight fire with fire? Wouldn't listening to white noise just be more of the same? Or even worse?Well, not necessarily. The really distracting thing about noise is that it's not predictable. It's pretty easy to tune out a clock ticking because your brain can figure out when it's going to tick again. When a new noise suddenly starts, however, or keeps happening in an unpredictable way, like a faucet dripping juuuust out of rhythm, your attention snaps to it. There's actually a special set of novelty detector neurons that are looking for any new types of sounds that might show up. There are two ways to avoid this happening. 
One is to make sure that all your environmental sounds are ones you can easily ignore... or you can cover them up. And white noise is very effective at covering up other noises.White noise is random noise that covers a wide frequency spectrum, usually 20 to 20,000 Hz. That means that other sounds that are the same volume or quieter than the white noise can't get through. As a result, you don't hear anything surprising, your novelty detector neurons stay quiet, and you can focus on what you're doing. And don't take my word for it: this study shows that students who listened to a recording of office noises masked with white noise performed much better on tasks than those who listened to the office noises unmasked.Now, keep in mind, just because a noise is white doesn't mean it's good for you. Volume, for one thing, is very important. Exposing rats to 100-dB white noise for 45 minutes was enough for them to undergo measurable stress-induced neurological changes. To be fair, that's about as loud as a power mower, but it does take you out of the relaxed concentration range. So grab your headphones and favorite white noise source (if you've no other options, a radio set to static will work just fine) but remember to keep the volume down!]]>
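If you'd like to see what "random noise that covers a wide frequency spectrum" looks like in practice, here's a minimal sketch in Python using only the standard library. It isn't taken from any of the studies above; the function names, the 44,100 Hz sample rate and the default volume are all my own assumptions. The idea is just that independent random samples have no predictable pattern, and at this sample rate they spread their energy across roughly the whole audible range.

```python
import random
import struct
import wave


def white_noise(num_samples, volume=0.1, seed=None):
    """Return num_samples independent random values in [-volume, volume].

    volume is a fraction of full scale (an assumed convention, not a dB level);
    keeping it small is the code equivalent of keeping the volume down.
    """
    rng = random.Random(seed)
    return [rng.uniform(-volume, volume) for _ in range(num_samples)]


def write_wav(path, samples, sample_rate=44100):
    """Write float samples in [-1, 1] to a mono, 16-bit WAV file."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


# Two seconds of quiet noise at an assumed sample rate of 44,100 Hz.
samples = white_noise(44100 * 2, volume=0.1, seed=0)

# To actually listen to it, you could save it with something like:
# write_wav("white_noise.wav", samples)
```

The volume argument is the thing to be careful with: it scales every sample, so a small value gives you quiet masking noise rather than something closer to that power mower.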
I'll let you go with a warning this time, but if I catch you using less for fewer again, I'll have to give you a ticket.[/caption]But what do I mean when I say I was hurting people? Well, like some other types of policing, the grammar police don't target everyone equally. For example, there has been a lot of criticism of Rihanna's language use in her new single Work being thrown around recently. But the fact is that her language is perfectly fine. She's just using Jamaican Patois, which most American English speakers aren't familiar with. People claiming that the language use in Work is wrong is sort of similar to American English speakers complaining that Nederhop group ChildsPlay's language use is wrong. It's not wrong at all, it's just different.And there's the problem. The fact is that grammar policing isn't targeting speech errors, it's targeting differences that are, for many people, perfectly fine. And, overwhelmingly, the people who make errors are marginalized in other ways. Here are some examples to show you what I mean: Misusing ironic: A lot of the lists of common grammar errors you see will include a lot of words where the correct use is actually less common than other ways the word is used. Take ironic. In general use it can mean surprising or remarkable. If you're a literary theorist, however, irony has a specific technical meaning--and if you're not a literary theorist you're going to need to take a course on it to really get what irony's about. The only people, then, who are going to use this word correctly will be those who are highly educated. And, let's be real, you know what someone means when they say ironic and isn't that the point? Overusing words like just: This error is apparently so egregious that there's an e-mail plug-in, targeted mainly at women, to help avoid it. 
However, as other linguists have pointed out, not only is there limited evidence that women say just more than men, but even if there were a difference why would the assumption be that women were overusing just? Couldn't it be that men aren't using it enough? Double negatives: Also called negative concord, this error happens when multiple negatives are used in a sentence, as in, There isn't nothing wrong with my language. This particular construction is perfectly natural and correct in a lot of dialects of American English, including African American English and Southern English, not to mention the standard in some other languages, including French.In each of these cases, the error in question is one that's produced more by certain groups of people. And those groups of people--less educated individuals, women, African Americans--face disadvantages in other aspects of their life too. This isn't a mistake or coincidence. When we talk about certain ways of talking, we're talking about certain types of people. And almost always we're talking about people who already have the deck stacked against them.Think about this: why don't American English speakers point out whenever the Queen of England says things differently? For instance, she often fails to produce the r sound in words like father, which is definitely not standardized American English. But we don't talk about how the Queen is talking lazy or dropping letters like we do about, for instance, th being produced as d in African American English. They're both perfectly regular, logical language varieties that differ from standardized American English...but only one group gets flak for it.Now I'm not arguing that language errors don't exist, since they clearly do. If you've ever accidentally said a spoonerism or suffered from a tip of the tongue moment then you know what it feels like when your language system breaks down for a second. 
But here's a fundamental truth of linguistics: barring a condition like aphasia, a native speaker of a language uses their language correctly. And I think it's important for us all to examine exactly why it is that we've been led to believe otherwise...and who it is that we're being told is wrong. ]]>
Disclaimer: this mostly applies to experimental or quantitative articles, since those are what are common in my field. Your mileage, especially in more formal fields like syntax or semantics, may vary dramatically. Ok, so you're not a professional linguist or anything, but you've come across an article in a linguistics journal and it sounds interesting. Or maybe you've just taken your first linguistics class and you heard about something really cool you want to learn more about. But when you start reading you're quickly swamped by terms you don't understand, IPA symbols you've never seen before and all sorts of statistics. You're tempted to just throw in the towel.Don't panic! I'm here to help you out with Rachael's patented* guide to reading linguistics articles.The first thing to do is take a deep breath and accept that you may not understand everything right away. That's ok! If you could easily read scientific literature in a field it would mean you were already an expert. Academic writing is designed to be read by other academics, and so it's full of terms that have very specific meanings in the field. It's a sort of time-saving code and it takes time to learn. Don't beat yourself up for being at the beginning of your journey!With that in mind, here are the steps I like to follow when I'm starting a new article, especially if it's in a field I'm less familiar with. Read the abstract. This will give you a broad outline of what the paper will be about and help you know if the whole article would be interesting or relevant for you. I like to call this the sandwich step. I read the introduction and then the conclusion. Why? Again, this gives me an idea about what will be in the article. Sure, there may be spoilers, but knowing the answer will make it easier to understand how questions were asked. Notice any new terms that are both in the introduction and the abstract but don't get explained? 
This might be a good time to look them up, since the author might be assuming you already know about them. Some places to look up terms: The SIL linguistics glossary can be a good place to start. Linguistics topics on Wikipedia are also a good choice. Linguists even get together at professional events to edit and add to linguistics-related pages. For a bit more in-depth introduction, Language and Linguistics Compass publishes short articles written by experts that are designed to be introductions to whatever topic they're on. Flip through and look for any charts or figures and read their captions. These will be where the author(s) highlight their results. Now that you have a general idea about what's going on you'll have a better chance of interpreting these. Next, read the background section. This is where the author will talk about things that other people have done and how their work fits into the big picture of the field. This is the second place you're likely to find new terms you're unfamiliar with. If they're only used once or twice, don't worry about looking them up. Your aim is to understand the general thrust of the article, not every little detail! (Now, if you're a grad student, on the other hand... ;) ) Now read the methods section. You can probably skim this; unless you're interested in replicating the study or reviewing its merit you're not going to have to have a full grasp of all the nitty-gritty nuances of item design and participant recruitment. Finally read the results. Unless you have some stats background, you're probably safe in skipping over the statistical analyses. Again, you just want to understand the general point. Extra credit: Go back and read the abstract again. This is a very condensed version of what was in the article and is a good way to review/check your understanding. Sit back and enjoy having read a linguistics article!Grats on making it through! Now that you've caught the bug, what are some ways to find more stuff to read? 
Go find one of the articles referenced in the one you just read. Since you're already familiar with similar work, you'll probably have an easier time understanding the new article. Or read something more recent that cites the article you've read. You can look up articles that cite the one you've read on Google Scholar, as this video explains. Look up other issues of the journal your paper was in. Most journals publish in a pretty narrow range of topics so you'll have a leg up on understanding the new articles. Ask a linguist! We're a friendly bunch and pretty responsive to e-mail. You might even see if you can find the contact info of the author(s) of the article you read to ask them for suggestions for other stuff to read.I hope this has been helpful and piqued your interest about diving into linguistics research. Now get out there and get reading!*Not actually patented.]]>
I say good morning to nearly everyone I see while I'm out running. But I don't actually say good, do I? It's more like g' morning or uh morning. Never just morning by itself, and never a fully articulated good. Is there a name for this grunt that replaces a word? Is this behavior common among English speakers, only southeastern speakers, or only pre-coffee speakers?This sort of thing is actually very common in speech, especially in conversation. (Or in the wild as us laboratory types like to call it.) The fancy-pants name for it is hypoarticulation. That's less (hypo) speech-producing movements of the mouth and throat (articulation). On the other end of the spectrum you have hyperarticulation where you very. carefully. produce. each. individual. sound.Ok, so you can change how much effort you put into producing speech sounds, fair enough. But why? Why don't we just sort of find a happy medium and hang out there? Two reasons: Humans are fundamentally lazy. To clarify: articulation costs energy, and energy is a limited resource. More careful articulation also takes more time, which, again, is a limited resource. So the most efficient speech will be very fast and made with very small articulator movements. Reducing the word good to just g or uh is a great example of this type of reduction. On the other hand, we do want to communicate clearly. As my advisor's fond of saying, we need exactly enough pointers to get people to the same word we have in mind. So if you point behind someone and say er! and it could be either a tiger or a bear, that's not very helpful. And we're very aware of this in production: there's evidence that we're more likely to hyperarticulate words that are harder to understand.So we want to communicate clearly and unambiguously, but with as little effort as possible. But how does that tie in with this example? G could be great or grass or génial, and uh could be any number of things. 
For this we need to look outside the linguistic system.The thing is, language is a social activity and when we're using language we're almost always doing so with other people. And whenever we interact with other people, we're always trying to guess what they know. If we're pretty sure someone can get to the word we mean with less information, for example if we've already said it once in the conversation, then we will expend less effort in producing the word. These contexts where things are really easily guessable are called low entropy. And in a social context like jogging past someone in the morning, phrases like good morning have very low entropy. Much lower than, for example, Could you hand me that pickle?--if you jogged past someone and said that you'd be very likely to hyperarticulate to make sure they understood.]]>
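To put a rough number on what low entropy means in the passage above, here's a small sketch in Python. The word lists and probabilities are completely made up for illustration (they don't come from a corpus or from any study); the only real piece is Shannon's entropy formula, which sums -p * log2(p) over the possible outcomes and tells you how many bits of uncertainty a listener is facing.

```python
import math


def shannon_entropy(probabilities):
    """Return the Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Invented guesses about which word comes before "morning" when a jogger
# passes you at 7am. Illustrative numbers only.
jogging_greeting = {"good": 0.97, "great": 0.02, "grim": 0.01}

# Invented guesses about the last word of "Could you hand me that ...?",
# with eight objects treated as equally likely. Also illustrative only.
objects = ["pickle", "pencil", "paper", "plate", "pen", "phone", "pillow", "pan"]
handing_request = {word: 1 / len(objects) for word in objects}

low = shannon_entropy(jogging_greeting.values())
high = shannon_entropy(handing_request.values())
print(f"greeting: {low:.2f} bits, request: {high:.2f} bits")
```

With these toy numbers the greeting comes out at well under one bit, while the request comes out at three bits. That gap is the sense in which a listener barely needs to hear good to recover it, but needs a carefully articulated pickle.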
What a devastating gut wrenching loss for Michigan. But dats spawts, dats life. SpawtsChat (@legitsportstalk) October 17, 2015There are three different spellings here, two of which look like th-stopping (where the th sound as in that is produced as a d sound instead) and one that looks like r-lessness (where someone doesn't produce the r sound in some words). But unfortunately I don't have a recording of the person who wrote this tweet; there's no way I can know if they produce these words in the same way in their speech as they do when typing.Fortunately, I was able to find someone who 1) uses variant spellings in their Twitter and 2) I could get a recording of:[youtube https://www.youtube.com/watch?v=ptPBCSsPJR4?rel=0]This let me directly compare how this particular speaker tweets to how they speak. So what did I find? Do they tweet the same way they speak? It turns out that that actually depends. Yes! For some things (like the th-stopping and r-lessness I mentioned above) this person does tweet and speak in pretty much the same way. They won't use an r in spelling where they wouldn't say an r sound and vice versa. No! But for other things (like saying -ing words with -in, or saying words like coffin and coughing with a different vowel in the first syllable) while this person does them a lot in their speech, they aren't using variant spellings at the same level in their tweets. So they'll say runnin 80% of the time, for example, but type it as running 60% of the time (rather than 20%, which is what we'd expect if the Twitter and speech data were showing the same thing).So what's going on? Why are only some things being used in the same way on Twitter and in speech? To answer that we'll need to dig a little deeper into the way these things are used in speech. How are th-stopping and r-lessness being used in speech? 
So when you compare the video above to one of the sports radio announcer that's being parodied (try this one) you'll find that they're actually used more in the video above than they are in the speech that's being parodied. This is pretty common in situations where someone's really laying on a particular accent (even one they speak natively), which sociolinguists call a performance register. What about the other things? The things that aren't being used as often on Twitter as they are in speech, on the other hand, actually show up at the same levels in speech, both for the parody and the original. This speaker isn't overshooting their use of these features; instead they're just using them in the way that another native speaker of a dialect would.So there's a pretty robust pattern showing up here. This person is only tweeting the way they speak for a very small set of things: those things that are really strongly associated with this dialect and that they're really playing up in their speech. In other words, they tend to use the things that they're paying a lot of attention to in the same way both in speech and on Twitter. That makes sense. If you're very careful to do something when you're talking--not splitting an infinitive or ending a sentence with a preposition, maybe--you're probably not going to do it when you're writing. But if there's something that you do all the time when you're talking and aren't really aware of then it probably won't show up in your writing. For example, there are lots of little phrases I'll use in my speech (like no worries, for example) that I don't think I've ever written down, even in really informal contexts. (Except for here, obviously.)So the answer to whether tweets and speech act the same way is... it depends. Which is actually really useful! 
Since it looks like it's only the things that people are paying a lot of attention to that get overshot in speech and Twitter, this can help us figure out what things people think are really important by looking at how they use them on Twitter. And that can help us understand what it is that makes a dialect sound different, which is useful for things like dialect coaching, language teaching and even helping computers understand multiple dialects well.(BTW, if you're interested in more details on this project, you can see my poster, which I'll be presenting at NWAV44 this weekend, here.)]]>
Do you have an ear for your dialect?[/caption]One of the earliest studies to find differences between production and perception of dialect forms was carried out in the early seventies by William Labov, Malcah Yaeger, and Richard Steiner (you can read a discussion here on page 266). They found that speakers from Essex, in England, produced words like line and loin slightly differently. However, when they played recordings and asked speakers which word they heard, the participants weren't able to reliably hear a difference. And it wasn't just those two words or even that one dialect that they found this happening in: people reported hearing lots of mergers that they weren't, in fact, producing as mergers. They found the same effect for source and sauce in New York City, hock and hawk in Pennsylvania, full and fool in Albuquerque and too and toe in Norwich. And this pattern keeps cropping up in continuing work. Alan Yu, for example, found evidence of a near-merger between two tones in Cantonese in 2007.So you have pretty strong evidence of a split between dialectal perception and production here. This is pretty weird, since we tend to think of both production and perception as facets of one thing: capital-L Language.But there's a second side to the story as well. On the one side you have people that have a difference in production but no difference in perception. But on the other side you have people who can perceive and remember dialectal features effortlessly, but who don't produce them at all. Sumner and Samuel called these people fluent listeners. (You can read the whole paper here--experiment three has some interesting investigation of how fluent listeners store things in short and long term memory.) 
We've all probably run across someone who was a fluent listener: they're usually surprised that you can't understand their friend whose accent is impenetrable to you, and who sounds nothing like the person who can understand them so easily.So if perception and production can have such marked mismatches, does this mean that we have to entirely abandon the idea that they're related? Not necessarily. Even though they may not perfectly mirror each other, dialect differences in perception and production do seem to be linked. Tyler Kendall and Valerie Fridland, for instance, looked at perception and production in the Southern Vowel Shift (a type of ongoing sound change in the Southern United States). They found that, while individuals differed in how they heard and said these vowels, there was also a general trend: the more someone produced shifted vowels, the more likely they were to hear vowels as shifted. So there's no guarantee that someone will hear and produce things in the same way... but there is a relationship between them.It's not a solved problem by any means. There's a lot that we don't understand about the way that people perceive speech sounds, and a lot of work to be done. We can, however, make one robust observation: someone's dialect is likely to be related to the way they hear things.]]>
actually focus on the eyes of the person they're signing with -- not the hands at all. That makes it easier to see things like grammatical facial expressions. But the use of other body parts doesn't stop there. In fact, I was recently surprised to learn that several sign languages around the world actually make use of the feet during signing! (If you'd asked me even a couple of months ago, I'd have guessed there weren't any, and I was super wrong.)Signs Produced on the FeetSo one way in which the feet are used during signing is that some signs are produced with the hands, but on top of or in contact with the feet. Signers aren't usually bending down to touch their toes in the middle of signing, though. Usually these are languages that are mainly used while sitting cross-legged on the ground. As a result, the feet are easily within the signing space. Some sign languages that produce signs on the feet: Yolngu Sign Language (Australia) Al-Sayyid Bedouin Sign Language (Israel)Signs Produced With the Feet!Now these are even more exciting for me. Some languages actually use the feet as active articulators. This was very surprising to me. Why? Well, like I said before, most signers tend to look at other signers' eyes while they're communicating. If you're using your feet during signing, though, your communication partner will need to break eye contact, look down at your feet, and then look all the way back up to your face again. That may not sound like a whole lot of work, but imagine if you were reading this passage and every so often there was a word written on your knee instead of the screen. It would be pretty annoying, and languages tend not to do things that are annoying to their users (because language users stop doing it!). Some sign languages that produce signs with the feet: Warlpiri Sign Language (Australia): Signs like RUN and WALK in this language actually involve moving the feet as if running or walking. 
Central Taurus Sign Language (Turkey): Color signs are produced by using the toe to point to appropriately colored parts of richly colored carpets. (Thanks to Rabia Ergin for the info!) Highland Mayan Sign Language/Meemul Tziij (Guatemala): Signers in this language not only use their feet, but they will actually reach down to the feet while standing. (Which is really interesting--I'd love to see more data on this language.)So, yes, multiple sign languages do make use of the feet as both places of articulation and active articulators. Interestingly, it seems to be predominantly village sign languages--that is, sign languages used by both deaf and hearing members in small communities with a high incidence of deafness. I don't know of any Deaf community sign languages--which are used primarily by culturally Deaf individuals who are part of a larger, non-signing society--that make use of the feet. I'd be very interested to hear if anyone knows of any!]]>
You teach/learn in a STEM classroom You'd like to be more inclusiveIf that's not you, you might want to skip this one. Sorry; I'll be back to my usual haunts with the next post.If you're still with me, you may be wondering what triggered this sudden departure from fun facts about linguistics. The answer is that I recently had an upsetting experience, and it's been niggling at me. I'm a member of an online data analysis community that's geared towards people who program professionally. Generally, it's very helpful and a great way to find out about new packages and tricks I can apply in my work. The other day, though, someone posted a link to a project designed to sort women by their physical attractiveness. I commented that it was not really appropriate for a professional environment, and was especially off-putting to the women in the group. I'm not upset that I spoke out, but I'm a little unhappy that I had to. I'm also upset that at least one person thought my criticisms were completely unnecessary. (And, yes, both the person who originally posted the link and the aforementioned commenter are male.)It got me thinking about inclusiveness in professional spaces, though. Am I really doing all I can to ensure that the field of linguistics is being inclusive? While linguistics as a whole is not horribly skewed male, professional linguists are more likely to be male, especially in computational linguistics. And we are definitely lacking in racial diversity; as the Linguistic Society of America (our main professional organization) puts it:The population of ethnic minorities with advanced degrees in linguistics is so low in the U.S. that none of the federal agencies report data for these groups.If you're like me, you see that as a huge problem and you want to know what you can do to help fix it. That's why I've put together this list of concrete strategies you can use in your classroom and interactions with students to be more inclusive, especially towards women. 
(Since I'm not disabled or a member of an ethnic minority group, I can't speak to those experiences, but I invite anyone who can and has additional suggestions to either comment below or contact me anonymously.) The suggestions below are drawn from my experience as both a teacher and a student, as well as input from the participants and other facilitators in last year's Including All Students: Teaching in the Diverse Classroom workshops.For Teachers: If someone calls you on non-inclusive behavior, acknowledge it, apologize and don't do it again. I know this seems like an obvious one, but it can be really, really important. For example, a lot of linguistics teaching materials are really geared towards native English speakers. The first quarter I taught I used a problem set in class that required native knowledge of English. When a student (one of several non-native speakers) mentioned it, I was mortified and tempted to just ignore the problem. If I had, though, that student would have felt even more alienated. If someone has the courage to tell you about a problem with your teaching you should acknowledge that, admit your wrong-doing and then make sure it doesn't happen again. Have space for anonymous feedback. That said, it takes a lot of courage to confront an authority figure--especially if you're already feeling uncomfortable or like you're not wanted or valued. To combat that, I give my students a way to contact me anonymously (usually through a webform of some kind). While it may seem risky, all the anonymous feedback I have ever received has been relevant and useful. Group work. This may seem like an odd thing to have on the list, but I've found that group work in the classroom is really valuable, both as an instructor and as a student. I may not feel comfortable speaking up or asking questions in front of the class as a whole, but small groups are much less scary. 
My favorite strategy for group work is to put up a problem or discussion question and then drift from group to group, asking students for their thoughts and answering questions. Structure interactive portions of the class. Sometimes small group work doesn't work well for your material. It's still really helpful to provide a structure for students to interact and ask questions, because it lets you ensure that all students are included (it has the additional benefit of keeping everyone awake during those drowsy after-lunch classes). Talbot Taylor, for example, would methodically go around in the classroom in order and ask every single student a question during class. Or you could have every student write a question about the course content to give to you at the end of class that you address at the beginning of the next class. Or, if you have readings, you can assign one or two students to lead the discussion for each reading. Don't tokenize. This is something that one of the workshop participants brought up and I realized that it's totally something I've been guilty of doing (especially if I know one of my students speaks a rare language). If there is only one student of a certain group in your class, don't ask them to speak for or represent their group. So if you have one African American student, don't turn to them every time you discuss AAE. If they volunteer to speak about it, great! But it's not fair to expect them to, and it can make students feel uncomfortable. If someone asks you to speak to someone else for them, don't mention the person who asked you. I know this one is oddly specific, but it's another thing that came out of the workshop. One student had asked their advisor to ask another faculty member to stop telling sexist jokes in class. Their advisor did so, but also mentioned that it was the student who'd complained, and the second faculty member then ridiculed the student during the next class. (This wasn't in linguistics, but still--yikes!) 
If someone's asking you to pass something on for them, there's probably a very good reason why they're not confronting that person directly. Don't objectify minority students. This one mainly applies to women. Don't treat women, or women's bodies, like things. That's what was so upsetting for me about the machine learning example I brought up at the beginning of the article: the author was literally treating women like objects. Another example comes from geoscience, where a student tells about their experience at a conference where lecturers... included... photo[s] of a woman in revealing clothing.... I got the feeling that female bodies were shown not only to illustrate a point, but also because they were thought to be pretty to look at (Women in the Geosciences: Practical, Positive Practices Toward Parity, Holes et al., p. 4). For Everybody: Actively advocate for minority students. If you're outside of a minority that you notice is not receiving equal treatment, please speak up about it. For example, if you're a man and you notice that all the example sentences in a class are about John--a common problem--suggest a sentence with Mei-Ling, or another female name, instead. It's not fair to ask students who are being discriminated against to be the sole advocates for themselves. We should all be on the lookout for sneaky prejudices. Don't speak for/over minority students. That said, don't put words in people's mouths. If you're speaking up about something, don't say something like, "I think x is making Sanelle uncomfortable." It may very well be making Sanelle uncomfortable, but that's up to Sanelle to say. Try something like "I'm not sure that's an appropriate example," instead. Those are some of my pointers. What other strategies do you have to help make the classroom more inclusive?]]>
at 60/40, the ratio of men to women in linguistics is better than it is in, for example, every single engineering field. Which is not to say that we don't have our problems. More women than men get undergraduate degrees in linguistics, but, as in other fields, the more senior you go the fewer women there are. In addition, we're overwhelmingly white. As the Linguistic Society of America (our main professional organization) puts it: The population of ethnic minorities with advanced degrees in linguistics is so low in the U.S. that none of the federal agencies report data for these groups. Ouch. I'm going to be focusing on the experience of women here because I am a woman, so I can tell you explicitly what that's like and what would make me personally feel more welcome. And while I think that many of these things would also be applicable to other minority populations in academia, I can't know that. I'd encourage you to seek out I'm going to be making three assumptions here: You're currently in a STEM field. You see the gender imbalance/underrepresentation of minorities as a problem. You'd like to know what you specifically can do to fix it. Ok, so, what can we do about it?]]>
haven't changed all that much since 1500 B.C.E. (Seriously, phonology is real old school.) And that's not even touching on the many rich traditions of qualitative analysis. And yet, somehow, we all end up in the same departments and conferences. Why? Simple: because we all look at language. And there's something very special about language.]]>
My neighbor talks loudly on the phone and I can't sleep. What is the best method to block his voice noise? Great question, Atif! There are few things more distracting than hearing someone else's conversation, and only hearing one side of a phone conversation is even worse. Even if you don't want it to, your brain is trying to fill in the gaps and that can definitely keep you awake. So what's the best way to avoid hearing your neighbor? Well, probably the very best way is to try talking to them. Failing that, though, you have three main options: isolation, damping and masking. So what's the difference between them and what's the best option for you? Before we get down to the nitty gritty I think it's worth a quick reminder of what sound actually is: sound waves are just that--waves. Just like waves in a lake or ocean. Imagine you and a neighbor share a small pond and you like to go swimming every morning. Your neighbor, on the other hand, has a motorboat that they drive around on their side. The waves the motorboat makes keep hitting you as you try to swim and you want to avoid them. This is very similar to your situation: your neighbor's voice is making waves and you want to avoid being hit by them. Isolation: So one way to avoid feeling the effects of waves in a pond, to use our example, is to build a wall down the center of the pond. As long as there are no holes in the wall for the waves to diffract through, you should be able to avoid feeling the effects of the waves. Noise isolation works much the same way. You can use earplugs that are firmly mounted in your ears to form a seal and that should prevent any sound waves from reaching your eardrums, right? Well, not quite. The wrinkle is that sound can travel through solids as well. It's like we built our wall in our pond out of something flexible, like rubber, instead of something solid, like brick. As waves hit the wall the wall itself will move with the wave and then transmit it to your side. 
So you may still end up hearing some noises, even with well-fitted headphones. Techniques: earplugs/earbuds, noise-isolating headphones or earbuds, noise-isolating architecture. Damping: So in our pond example we might imagine doing something that makes it harder for waves to move through the water. If you replaced all the water with molasses or honey, for example, it would take a lot more energy for the waves to move through it and they'd dissipate more quickly. Techniques: acoustic tiles, covering the intervening wall (with a fabric wall-hanging, foam, empty egg cartons, etc.), covering vents, placing a rolled-up towel under any doors, hanging heavy curtains over windows, putting down carpeting. Masking: Another way to avoid noticing our neighbor's waves is to start making our own waves. We can either make waves that are exactly the same size as our neighbor's but out of phase (so when theirs are at their highest peak, ours are at their lowest), so that the two end up cancelling each other out. That's basically what noise-cancelling headphones do. Or we can make a lot of our own waves that all feel enough like our neighbor's that when their wave arrives we don't even notice it. Of course, if the point is to hear no sound that won't work quite as well. But if the point is to avoid abrupt, distracting changes in sound then this can work quite nicely. Techniques: listening to white noise or music, using noise-cancelling headphones or earbuds. So what would I do? Well, first I'd take as many steps as I could to sound-proof my environment. Try to cover as many of the surfaces in your bedroom as you can with absorbent, ideally fluffy, materials. (If it can absorb water it will probably help absorb sound.) Wall hangings, curtains and a throw rug can all help a great deal. Then you have a couple options for masking. A fan can help to provide both a bit of acoustic masking and a nice breeze. 
Personally, though, I like a white noise machine that gives you some control over the frequency (how high or low the pitch is) and intensity (loudness) of the sounds it makes. That lets you tailor it so that it best masks the sounds that are bothering you. I also prefer the ones with the fans rather than those that loop recorded sounds, since I often find the loop jarring. If you don't want to or can't buy one, though, myNoise has a number of free generators that let you tailor the frequency and intensity of a variety of sounds and don't have annoying loops. (There are a bunch of additional features available that you can access for a small donation as well.) If you can wear earbuds in bed, try playing a non-distracting noise at around 200-1000 Hertz, which will cover a lot of the speech sounds you can't easily dampen. Make sure your earbuds are well-fitted in the ear canal so that as much noise is isolated as possible. In addition, limiting the amount of exposed hard surface on them will also increase noise isolation. You can knit little cozies, try to find earbuds with a nice thick silicone/rubber coating or even try coating your own. By using many different strategies together you can really reduce unwanted noises. I hope this helps and good luck!]]>
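To make the "out of phase" idea from the masking section a bit more concrete, here is a minimal Python sketch. It assumes the neighbor's voice is a single pure 300 Hz tone (real speech is far messier), and the sample rate and function name are made up for illustration. It is a toy demonstration of two waves cancelling, not how any particular pair of noise-cancelling headphones works.

```python
import math

def sine_wave(freq_hz, n_samples, sample_rate=8000, phase=0.0):
    """Return n_samples of a unit-amplitude sine wave as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate + phase)
            for t in range(n_samples)]

# Hypothetical "neighbor" tone at 300 Hz (an assumption for this sketch).
neighbor = sine_wave(300, 800)

# The same wave shifted by half a cycle (pi radians): its peaks line up
# with the neighbor's troughs.
anti_noise = sine_wave(300, 800, phase=math.pi)

# Adding the two waves sample by sample leaves almost nothing behind.
combined = [a + b for a, b in zip(neighbor, anti_noise)]

peak_before = max(abs(x) for x in neighbor)
peak_after = max(abs(x) for x in combined)
print(f"peak before: {peak_before:.3f}, peak after: {peak_after:.3e}")
```

The cancellation here is nearly perfect only because the two waves match exactly in frequency and size. With a real, constantly changing voice the match is never that clean, which is one reason masking with a steady sound is still worth trying.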
Twitter research I discussed a couple weeks ago, but it's still kind of a cool project. (If I do say so myself.)[caption id= align=alignnone width=1200] Shhhh. I'm listening to linguistic data. Plush bunny with headphones. Licensed under Public Domain via Wikimedia Commons.[/caption] Basically, I wanted to know whether there are carryover effects for some of the most commonly used linguistics tasks. A carryover effect is when you do something and whatever it was you were doing continues to affect you after you're done. This comes up a lot when you want to test multiple things on the same person. An example might help here. So let's say you're testing two new malaria treatments to see which one works best. You find some malaria patients, they agree to be in your study, and you give them treatment A and record their results. Afterwards, you give them treatment B and again record their results. But if it turns out that treatment A cures malaria (yay!) it's going to look like treatment B isn't doing anything, even if it is helpful, because everyone's been cured of malaria. So their behavior in the second condition (treatment B) is affected by their participation in the first condition (treatment A): the effects of treatment A have carried over. There are a couple of ways around this. The easiest one is to split your group of participants in half and give half of them A first and half of them B first. However, a lot of times when people are using multiple linguistic tasks in the same experiment, they won't do that. Why? Because one of the things that linguists--especially sociolinguists--want to control for is speech style. And there's a popular idea in sociolinguistics that you can make someone talk more formally, but it's really hard to make them talk less formally. 
So you tend to end up with a fixed task order going from informal tasks to more formal tasks. So, we have two separate ideas here: the idea that one task can affect the next, and so we need to change task order to control for that; and the idea that you can only go from less formal speech to more formal speech, so you need to not change task order to control for that. So what's a poor linguist to do? Balance task order to prevent carryover effects but risk not getting the informal speech they're interested in? Or keep task order fixed to get informal and formal speech but at the risk of carryover effects? Part of the problem is that, even though they're really well-studied in other fields like psychology, sociology or medicine, carryover effects haven't really been studied in linguistics before. As a result, we don't know how bad they are--or aren't! Which is where my research comes in. I wanted to see if there were carryover effects and what they might look like. To do this, I had people come into the lab and do a memory game that involved saying the names of weird-looking things called Fribbles aloud. No, not the milkshakes, one of the little purple guys below (although I could definitely go for a milkshake right now). Then I had them do one of four tasks: a linguistic elicitation task (reading a passage, doing an interview or reading a list of words) or, to control for the effects of just sitting there for a bit, an arithmetic task. Then I had them repeat the Fribble game. Finally, I compared a bunch of measures from speech I recorded during the two Fribble games to see if there were any differences.[caption id=attachment_1189 align=alignnone width=267] Greeble designed by Scott Yu and hosted by the Tarr Lab wiki (click for link).[/caption] What did I find? Well, first, I found the same thing a lot of other people have found: people tend to talk differently while doing different things. 
(If I hadn't found that, then it would be pretty good evidence that I'd done something wrong when designing my experiment.) But the really exciting thing is that I found, for some specific measures, there weren't any carryover effects. I didn't find any carryover effects for speech speed, loudness or any changes in pitch. So if you're looking at those things you can safely reorder your experiments to help avoid other effects, like fatigue. But I did find that something a little more interesting was happening with the way people were saying their vowels. I'm not 100% sure what's going on with that yet. The Fribble names were funny made-up words (like Kack and Dut) and I'm a little worried that what I'm seeing may be a result of that weirdness... I need to do some more experiments to be sure. Still, it's pretty exciting to find that there are some things it looks like you don't need to worry about carryover effects for. That means that, for those things, you can have a static order to maintain the style continuum and it doesn't matter. Or, if you're worried that people might change what they're doing as they get bored or tired, you can switch the order around to avoid having that affect your data.]]>
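The "give half of them A first and half of them B first" fix from the malaria example can be sketched in a few lines of Python. This is only an illustration of counterbalancing in general, not the procedure used in the study described above; the participant IDs and the function name are hypothetical.

```python
from collections import Counter

def counterbalance(participants, orders):
    """Assign task orders round-robin so each order is used (nearly) equally often."""
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

# Hypothetical participant IDs, for illustration only.
participants = [f"P{n:02d}" for n in range(1, 9)]

# The two possible orders of treatments (or tasks) A and B.
orders = [("A", "B"), ("B", "A")]

assignment = counterbalance(participants, orders)
counts = Counter(assignment.values())

for participant, order in assignment.items():
    print(participant, "->", " then ".join(order))
```

In a real study you would probably shuffle the participant list before assigning orders, so that who ends up in which half doesn't depend on sign-up order. With more than two tasks the list of orders grows quickly, which is part of why a fixed order is tempting.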
Picture of a bird saying Let's Tawk. Taken from the website of the Center for the Psychology of Women in Seattle. Click for link.[/caption] So if you've been following the Great Ideas in Linguistics series, you'll remember that I wrote about sociolinguistic variables a while ago. If you didn't, sociolinguistic variables are sounds, words or grammatical structures that are used by specific social groups. So, for example, in Southern American English (representing!) the sound in I is produced with only one sound, so it's more like ah. Now, in speech these sociolinguistic variables are very well studied. In fact, the Dictionary of American Regional English was just finished in 2013 after over fifty years of work. But in computer mediated communication--which is the fancy term for internet language--they haven't been really well studied. In fact, some scholars suggested that it might not be possible to study speech sounds using written data. And on the surface of it, that does make sense. Why would you expect to be able to get information about speech sounds from a written medium? I mean, look at my attempt to explain an accent feature in the last paragraph. It would be far easier to get my point across using a sound file. That said, I'd noticed in my own internet usage that people were using variant spellings, like tawk for talk, and I had a hunch that they were using variant spellings in the same way they use different dialect sounds in speech. While hunches have their place in science, they do need to be verified empirically before they can be taken seriously. And so before I submitted my abstract, let alone gave my talk, I needed to see if I was right. Were Twitter users using variant spellings in the same way that speakers use different sound patterns? 
And if they are, does that mean that we can investigate sound patterns using Twitter data? Since I'm going to present my findings at a conference and am writing this blog post, you can probably deduce that I was right, and that this is indeed the case. How did I show this? Well, first I picked a really well-studied sociolinguistic variable called the low back merger. If you don't have the merger (most African American speakers and speakers in the South don't) then you'll hear a strong difference between the words cot and caught or god and gaud. Or, to use the example above, you might have a difference between the words talk and tock. Talk is a little more backed and rounded, so it sounds a little more like tawk, which is why it's sometimes spelled that way. I used the Twitter public API and found a bunch of tweets that used the aw spelling of common words and then looked to see if there were other variant spellings in those tweets. And there were. Furthermore, the other variant spellings used in tweets also showed features of Southern American English or African American English. Just to make sure, I then looked to see if people were doing the same thing with variant spellings of sociolinguistic variables associated with Scottish English, and they were. (If you're interested in the nitty-gritty details, my slides are here.) Ok, so people will sometimes spell things differently on Twitter based on their spoken language dialect. What's the big deal? Well, for linguists this is pretty exciting. There's a lot of language data available on Twitter and my research suggests that we can use it to look at variation in sound patterns. If you're a researcher looking at sound patterns, that's pretty sweet: you can stay home in your jammies and use Twitter data to verify findings from your field work. But what if you're not a language researcher? 
Well, if we can identify someone's dialect features from their tweets then we can also use those features to make a pretty good guess about their demographic information, which isn't always available (another problem for sociolinguists working with internet data). And if, say, you're trying to sell someone hunting rifles, then it's pretty helpful to know that they live in a place where they aren't illegal. It's early days yet, and I'm nowhere near that stage, but it's pretty exciting to think that it could happen at some point down the line. So the big takeaway is that, yes, people can tweet with an accent, and yes, linguists can use Twitter data to investigate speech sounds. Not all of them--a lot of people aren't aware of many of their dialect features and thus won't spell them any differently--but it's certainly an interesting area for further research.]]>
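The general shape of the search described above (find tweets with an aw spelling, then look for other variant spellings in those same tweets) might look something like the Python sketch below. To be clear, this is not the code or the word lists from the study: the spelling dictionaries are hypothetical placeholders, the example strings are invented rather than real tweets, and collecting tweets from the Twitter API is left out entirely.

```python
import re
from collections import Counter

# Hypothetical variant -> standard spelling pairs; NOT the lists used in the study.
AW_VARIANTS = {"tawk": "talk", "wawk": "walk", "dawg": "dog"}
OTHER_VARIANTS = {"dat": "that", "dis": "this", "wit": "with"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def co_occurring_variants(tweets):
    """Count OTHER_VARIANTS that appear in tweets containing an 'aw' variant."""
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        if any(tok in AW_VARIANTS for tok in tokens):
            counts.update(tok for tok in tokens if tok in OTHER_VARIANTS)
    return counts

# Invented example strings, for illustration only (not real tweets or results).
sample = [
    "we need to tawk about dat",
    "let's talk about that",
    "my dawg is wit me",
]

result = co_occurring_variants(sample)
print(result)
```

A word-list lookup like this is about the bluntest tool available. It can't tell a variant spelling from an ordinary word that happens to look the same (wit, for instance), so anything it turns up would still need checking by a person.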
pictures of people writing, and I noticed that there were two gendered sub-categories, one for men and one for women. Leaving aside the question of having only two genders, what really stuck out to me were the names. The category with pictures of men was called Men Writing and the category with pictures of women was called Females Writing.[caption id= align=alignnone width=512] According to this sign, the third most common gender is child.[/caption] So why did that bother me? It is true that male humans are men and that women are female humans. Sure, a writing professor might nag about how the two terms lack parallelism, but does it really matter? The thing is, it wouldn't matter if this was just a one-off thing. But it's not. Let's look at the Category: Males and Category: Females*. At the top of the category page for men, it states This category is about males in general. For human males, see Category:Male humans. And the male humans category is, conveniently, the first subcategory. Which is fine, no problem there. BUT. There is no equivalent disclaimer at the top of Category: Females, and the first subcategory is not female humans but female animals. So even though Females is used to refer specifically to female humans when talking about writing, when talking about females in general it looks as if at least one editor has decided that it's more relevant for referring to female animals. And that also gels with my own intuitions. I'm more likely to ask How many females? when looking at a bunch of baby chickens than I am when looking at a bunch of baby humans. Assuming the editors responsible for these distinctions are also native English speakers, their intuitions are probably very similar. So what? Well, it makes me uncomfortable to be referred to with a term that is primarily used for non-human animals while men are referred to with a term that I associate with humans. 
(Or, perhaps, women are being referred to as female men, but that's equally odd and exclusionary.) It took me a while to come to that conclusion. I felt that there was something off about the terminology, but I had to turn and talk it over with my officemate for a couple minutes before finally getting at the kernel of the problem. And I don't think it's a conscious choice on the part of the editors--it's probably something they don't even realize they're doing. But I definitely do think that it's related to the gender imbalance of the editors of Wikimedia. According to recent statistics, over ninety percent (!) of Wikipedia editors are male. And this type of sexist language use probably perpetuates that imbalance. If I feel, even if it's for reasons that I have a hard time articulating, that I'm not welcome in a community then I'm less likely to join it. And that's not just me. Students who are presented with job descriptions in language that doesn't match their gender are less likely to be interested in those jobs. Women are less likely to respond to job postings if "he" is used to refer to both men and women. I could go on citing other studies, but we could end up being here all day. My point is this: sexist language affects the behaviour and choices of those who hear it. And in this case, it makes me less likely to participate in this on-line community because I don't feel as if I would be welcomed and respected there. It's not only Wikipedia/Wikimedia, either. This particular usage pattern is also something I associate with Reddit (a good discussion here). The gender breakdown of Reddit? About 70% male. For some reason, the idea that we should avoid sexist language usage seems to really bother people. I was once a TA for a large lecture class where, in the middle of discussions of the effects of sexist language, a male student interrupted the professor to say that he didn't think it was a problem. 
I've since thought about it quite a bit (it was pretty jarring) and I've come to the conclusion that the reason the student felt that way is that, for him, it really wasn't a problem. Since sexist language is almost always exclusionary to women, and he was not a woman, he had not felt that moment of discomfort before. Further, I think he may have felt that, because this type of language tends to benefit men, we were blaming him. I want to be clear here: I'm not blaming anyone for their unconscious biases. And I'm not saying that only men use sexist language. The Wikimedia editors who made this choice may very well have been women. What I am saying is that we need to be aware of these biases and strive to correct them. It's hard, and it takes constant vigilance, but it's an important and relatively simple step that we can all take in order to help eliminate sexism. *As they were on Wednesday, April 8 2015. If they've been changed, I'd recommend the Wayback Machine.]]>
lot of variation in the vocal tract (all those parts of your head and neck that you use to produce speech sounds). For example, the epiglottis, that little flap that keeps you from swallowing your food into your lungs, has between five and six completely different shapes. It can be thin and flat, with serrated edges, thick with rounded edges, or a mixture of the two. If looking at it didn't involve sticking cameras down the throat via the mouth or nose, it would actually be pretty useful for biometrics. The tongue doesn't have quite as much variation in shape as the epiglottis, but there is one bit of variation that seems to get quite a bit of interest: tongue length.[caption id= align=alignnone width=600] Now hold that while I get a measuring tape.[/caption] So what can affect tongue length? Well, the biggest factor is probably how you measure it. The Guinness Book of World Records, for example, measures the length of the tongue from the tip of the extended tongue to the middle of the top lip. (The current record holder, Nick Stoeberl, can extend his tongue almost four inches past his top lip.) But, as you'll notice looking at the diagram above, the amount of the tongue that can stick out past your lips is actually pretty limited. The tongue itself goes all the way down to the hyoid bone, in your throat. So if you want to measure the entire tongue, probably the most accurate way is to measure from the tongue tip to the epiglottis (down in the throat) while the tongue is at rest. The downside to this, of course, is that it will trigger gagging and it's hard to see what you're doing at the back of someone's throat. Plus it has the definite potential to block the airways. As a result, tongue measurements of this type tend to be done on cadavers. There are also some imaging techniques like x-ray, ultrasound or MRI. 
But let's assume that you don't have a couple hundred thousand dollars' worth of equipment or a medical cadaver just lying around and just focus on that first measurement--although be warned that it doesn't have very strong inter-rater reliability. Now that that's out of the way, we can get down to business: what affects how far you can stick your tongue out? There are actually a lot of factors at work here: Frenulum: The lingual frenulum, that is. This is the little bit of tissue that connects the bottom of your tongue to the floor of your mouth. For most people this actually won't affect tongue extension, but for some people it's a big problem. Have you ever heard the expression tongue tied? This actually refers to a lingual frenulum that's too short and extends too far towards the tip of the tongue. This condition, which is called ankyloglossia, is especially problematic when trying to produce speech sounds or for babies who are trying to nurse. In some cases, doctors may actually cut the lingual frenulum in order to free the tongue. For most people, though, cutting the frenulum would not increase freedom of movement or length of extension in the tongue. Plus, the risks associated with oral surgery are substantial. Bone structure and tooth placement: Bone structure and tooth placement can also affect how far the tongue can be extended. People with short face syndrome--yes, that's a real medical diagnosis--and overjet tend to have smaller tongues. Other factors such as incisor position and whether a line drawn between the upper and lower sets of teeth tilts or not also co-vary with tongue length. Age: One obvious factor that affects tongue size is age. Adults' tongues are approximately twice the size of infants'. This is surprising, given that the infant's skull makes up 1/4 of its height whereas for adults that figure is only 1/7. As a result, an adult skull is only roughly 1.75 times as large as an infant skull. Biological sex. 
Finally, there is a slight effect of biological sex. During puberty, high levels of testosterone and human growth hormone trigger growth, especially in the jaw and chin, and this effect is more pronounced in individuals with testes. As a result, their tongues tend to be longer. Too much human growth hormone--acromegaly--can cause growth to continue well past the point of comfort. It also causes the tongue to enlarge and shift forwards in the mouth. You may notice all these factors have one thing in common: they're not something you can change. Like your height or body-shape, tongue length isn't really something you can change about yourself. The good news, though, is that you can produce speech perfectly well with pretty much any length of tongue.]]>
Sociolinguistics is in an earlier GIiL post. But what I didn't really talk about are sociolinguistic variables, the specific things in that language that co-vary with some sociological factor.[caption id= align=alignnone width=256] Man, these sociolinguistic variables are really hard to isolate. Maybe combinatorics isn't the right approach here... Photo: Konrad Jacobs[/caption] So that's the dictionary definition, but what makes something a sociolinguistic variable? Let's start off with some examples. Sociolinguistic variables exist at all levels of the grammar. Here are some examples from African American English, the systematic, rule-governed variety of English used predominantly by African Americans: Phonetics: The vowels in trap and bath have not undergone merger: there is a difference between ant and aunt. Phonology: Word-initial unstressed syllables can undergo deletion: bout instead of about. Morphology: Be is used as a free morpheme to express habitual, repeated action: My lip gloss be poppin. Syntax: Under certain rule-governed circumstances, the copula can be deleted: That dog cute. Semantics: Negative inversion for semantic intensification: Don't nobody like raw kale. Pragmatics: The use of signifying as a linguistic act. Ok, so that means that pretty much anything can be a sociolinguistic variable, right? Not exactly. So these are all variables that are associated with African American English (AAE), but there are some things that almost all speakers of African American English do that aren't dialect markers. For example, almost all speakers of AAE will flap. But the same thing is true of pretty much every other speaker of English in America. So if you were looking at speakers of AAE you wouldn't find that their use of flapping was different from the surrounding linguistic communities. To be a sociolinguistic variable, something has to vary along with social categories. 
So something linguistic that men do more than women--such as interrupting--would be a gendered sociolinguistic variable, but something that men and women do equally wouldn't be. How do you find a sociolinguistic variable? Well, like most science, it starts with a general observation. After that, you need to carefully collect linguistic samples containing places where you think the variable should show up from people who are part of the group you're interested in. If it's well-studied, you can then use other people's data for comparison with different populations. If it's something new, though, you'll need to collect your own comparison data. Then, a careful analysis will show you whether or not the thing you noticed is something that varies systematically along with your social variable of interest. If it does, congratulations: you've found a sociolinguistic variable!]]>
If tests aren't your style and you just want to play the odds, though, guess their, there and they're in that order. According to Google's n-gram viewer (click the chart to go play around with it) their is the most common [ðɛr] in writing, followed by there and then they're.[/caption] There. So the confusing thing here is that there are really *two* there's in English and they play really different roles. Pleonastic there. So in English we really need subjects, even when we don't. Some sentences like It's raining and There's no more ice-cream don't actually need a subject to convey what we're getting at. There's no thing, it, up in the sky that is doing the raining like there's a person throwing a ball in They threw the ball. We just stick it up in there to fill out our sentence. Test: Can you replace [ðɛr] with it? If so, it's probably there. Test: If the sentence has [ðɛr] was/were/is/are/will it will almost always be there. Locative there. So locative is just a fancy word for relating to a place. Are you talking about a place? If so, then you probably need there. Test: Is [ðɛr] referring to a place? If so, it's probably there. Their. So people tend to use a semantic definition for this one: does it belong to someone? It's way easier to figure it out with part of speech, though. Their is part of a pretty small class of words called determiners--you may also have heard of articles. One good way to test if a word belongs to the same part of speech as another is to replace it in the sentence. You know snake and pudding are both nouns because you can say either My snake fell off the shelf or My pudding fell off the shelf. So all you have to do is swap it out with one of the other English determiners and see if it works. Test: Can you replace [ðɛr] with words like my, our, the or some? If so, it's their. They're. This is probably the easiest one. They're is a contraction of they and are. 
If you can uncontract them and the sentence still works, you're golden. Test: Can you replace [ð<U+025B>r] with they are? If so, it's probably they're.Try out these tests next time you're not sure which [ð<U+025B>r] is the right one and you should figure it out pretty quickly. Of course, there are some marginal cases (like when you're talking about the words themselves) that may throw you off, but these guidelines should pull you through 99% of the time.* Not actually my middle name.]]>
Easy! It's A, E, I, O, U and sometimes Y.
A speech sound produced without constriction of the vocal tract above the glottis.

Everyone got the second one, right? No? Huh, maybe we're not on the same page after all.

There are two problems with the and-sometimes-Y definition of vowels. The first is that it's based on the alphabet and, as I've discussed before, English has a serious problem when it comes to mapping sounds onto letters in a predictable way. (It gives you the very false impression that English has six-ish vowels when it really has twice that many.) The second is that it isn't really a good way of modelling what a vowel actually is. If we got a new letter in the alphabet tomorrow, zborp, we'd have no principled way of determining whether it was a vowel or not.

[caption id= align=alignnone width=600] Ah, a new letter is it? Time to get out the old vowelizing dice and re-roll. Letter dice d6. Licensed under CC BY-SA 3.0 via Wikimedia Commons.[/caption]

But the linguistic definition captures some other useful qualities of vowels as well. Since vowels don't have a sharp constriction, you get acoustic energy pretty much throughout the entire spectrum. Not all frequencies are created equal, however. In vowels, the shape of the vocal tract creates pockets of more concentrated acoustic energy. We call these formants and they're so stable between repetitions of vowels that they can be used to identify which vowel it is. In fact, that's what you're using to distinguish beat from bet from bit when you hear them aloud. They're also easy to measure, which means that speech technologies rely really heavily on them.

Another quality of vowels is that, since the whole vocal tract has to unkink itself (more or less), they tend to take a while to produce. And that same openness means that not much of the energy produced at the vocal folds is absorbed. In simple terms, this means that vowels tend to be longer and louder than other sounds, i.e. consonants. This creates a neat little one-two where vowels are both easier to produce and easier to hear. As a result, languages tend to prefer to have quite a lot of vowels, and to tack consonants on to them. This tendency shakes out to create a robust pattern across languages where you'll get one or two consonants, then a vowel, then a couple of consonants, then a vowel, etc. You've probably run across the term linguists use for those little vowel-nuggets: we call them syllables.

If you stick with the and-sometimes-Y definition, though, you lose out on including those useful qualities. It may be easier to teach to five-year-olds, but it doesn't really capture the essential vowelyness of vowels. Fortunately, the linguistics definition does.]]>
phrase structure rules) that will let you produce all the grammatical sentences in a language and none of the ungrammatical ones. So, if you're proposing a new rule you need to show that the sentences it outputs are grammatical... but how do you do that?

[caption id= align=alignnone width=262] I sentence you to ten hours of community service for ungrammatical utterances![/caption]

One way to test whether something is grammatical is to see whether someone's said it before. Back in the day, before you had things like large searchable corpora--or, heck, even the internet--this was difficult, to say the least. Especially since the really interesting syntactic phenomena tend to be pretty rare. Lots of sentences have a subject and an object, but a lot fewer have things like wh-islands.

Another way is to see if someone will say it. This is a methodology that is often used in sociolinguistics research. The linguist interviews someone using questions that are specifically designed to elicit certain linguistic forms, like certain words or sounds. However, this methodology is chancy at best. Oftentimes the person won't produce whatever it is you're looking for. Also, it can be very hard to make questions or prompts to access very rare forms.

Another way to see whether something is grammatical is to see whether someone would say it. This is the type of evidence that has, historically, been used most often in syntax research. The concept is straightforward. You present a speaker of a language with a possible sentence and they use their intuition as a native speaker to determine whether it's good (grammatical) or not (ungrammatical). These sentences are often outputs of a proposed structure and used to argue either for or against it.

However, in practice grammaticality judgements can occasionally be a bit more difficult. Think about the following sentences:

I ate the carrot yesterday.
This sounds pretty good to me. I'd say it's grammatical.

*I did ate the carrot yesterday.
I put a star (*) in front of this sentence because it sounds bad to me, and I don't think anyone would say it. I'd say it's ungrammatical.

? I done ate the carrot yesterday.
This one is a little more borderline. It's actually something I might say, but only in a very informal context, and I realize that not everyone would say it.

So if you were a syntactician working on these sentences, you'd have to decide whether your model should account for the last sentence or not. One way to get around this is by building probability into the syntactic structure. So I'm more likely to use a structure that produces the first example, but there's a small probability I might use the structure in the third example. To know what those probabilities are, however, you need to figure out how likely people are to use each of the competing structures (and whether there are other factors at play, like dialect), and for that you need lots and lots of grammaticality judgements. It's a new use of a traditional tool that's helping to expand our understanding of language.]]>
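
To make the "building probability into the structure" idea a little more concrete, here is a minimal Python sketch. It assumes you have already tallied which of two competing structures speakers chose, and it simply turns those tallies into relative frequencies. The function name and the counts are made up for illustration (they are not real judgement data), and a real probabilistic grammar would be estimated from far more data and would condition on things like dialect and context.

```python
from collections import Counter


def structure_probabilities(observations):
    """Estimate the probability of each competing structure as its
    relative frequency among the observed choices."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {structure: n / total for structure, n in counts.items()}


# Hypothetical tallies, invented for this sketch: 18 choices of the
# plain past tense and 2 of the informal "done" construction.
observations = ["I ate the carrot"] * 18 + ["I done ate the carrot"] * 2

probs = structure_probabilities(observations)
for structure, p in probs.items():
    print(f"{structure}: {p:.2f}")
```

With these invented counts, the common structure gets most of the probability mass and the borderline one gets a small share, which is exactly the kind of graded answer a yes/no star can't give you.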
maybe DDT wasn't such a panacea after all, all important scientific discoveries have sprung from that same general process of recognizing patterns.

And that process is at work in language as well. Let's take a look at the following way of conjugating English verbs.

I walk             We walk
You walk           You walk
He/she/it walks    They walk

Now, if you're the noticing type of person you might find that there's a glaring problem that's messing up an otherwise nice, predictable pattern: that odd out-of-place s in She walks. Why, it's downright irksome. Wouldn't it make a lot more sense just to get rid of it entirely and have a nice, lovely, completely predictable conjugation like this one:

I walk             We walk
You walk           You walk
He/she/it walk     They walk

Of course it would. And in fact, there are some speakers of English who do just that. Dropping the third person singular s, as it turns out, is a common feature of African American English. And if similar processes in other languages, such as Latin, are any guide, we may all one day adopt this entirely sensible practice, which is commonly referred to as paradigm levelling.

In fact, English has already undergone a massive process of morphological simplification, including a lot of paradigm levelling, once before. During the transition from Old English to Middle English, we lost a whole bucketful of cases and person markings. This was partly due to language contact in the Danelaw, where Viking settlers interacted and intermarried with the local English-speaking population. Being no-nonsense second language learners, they did away with a lot of the odder patterns and left us with something that much more closely resembled the comparatively morphologically streamlined English of today.

And the same process has occurred over and over again in the world's languages. People notice that something isn't what you'd expect, given the pattern in place, and choose to follow the pattern rather than historical precedent, tidying away some of the messiness that inevitably creeps into languages over time. Paradigm levelling is a powerful force for linguistic change and a useful theoretical tool in historical linguistics.]]>
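
If it helps to see the walk/walks example as data, here is a toy Python sketch. It stores the present-tense paradigm as a small table and "levels" it by replacing every cell with the most common form. Both the data layout and the majority-form rule are assumptions made for illustration: real levelling is driven by speakers, not by a vote count, and doesn't always settle on the most frequent form.

```python
from collections import Counter

# Present-tense paradigm of "walk", keyed by (person, number).
paradigm = {
    ("1st", "singular"): "walk",
    ("2nd", "singular"): "walk",
    ("3rd", "singular"): "walks",
    ("1st", "plural"): "walk",
    ("2nd", "plural"): "walk",
    ("3rd", "plural"): "walk",
}


def level(paradigm):
    """Return a levelled copy of the paradigm in which every cell uses
    the single most common form (a simplifying assumption)."""
    majority_form, _ = Counter(paradigm.values()).most_common(1)[0]
    return {cell: majority_form for cell in paradigm}


levelled = level(paradigm)
for (person, number), form in levelled.items():
    print(person, number, form)
```

After levelling, the third person singular cell reads walk like all the others, which is the tidy second table from the post.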
Expect rigour! Just to clarify here, by rigour I don't mean difficulty. Rather, I mean rigour in the mathematical sense. Linguistics is a very exact discipline and part of learning how to be a linguist is learning how to carefully, precisely solve problems. There will be right and wrong answers. You may be expected to explain how you solved a problem. If you come from a background with a lot of mathematics or formal logic, linguistics problems will probably feel very familiar to you. (I have a friend, now a math PhD candidate, who really enjoyed phonology because, in his words, It's applied set theory!) A lot of students who have an interest in language from literary or foreign-language studies are often surprised by this aspect of linguistics courses, however.

Be prepared for a little bit of memorization. Every introductory linguistics course I'm familiar with covers the International Phonetic Alphabet pretty early on in the class and students are expected to memorize at least part of it. I'm a fan of this, since knowing IPA is a pretty handy life skill and it allows you to solve phonology problems much more quickly. But it can be a nasty surprise if you're not ready for it and don't set aside enough time for studying.

Get ready to unlearn. You speak at least one language. You're in college. You know a fair amount about how language works... right? Well, yes, but not in the way you think. You're going to have to unlearn a lot of things you've been taught about language, especially about what you should do/write/say and a lot of the grammar rules you've been taught. Again, this can be frustrating for a lot of students. You've spent a long time laboriously learning about language, you've obviously developed enough of an interest in language to take a linguistics course, and in the first week of class we basically tell you you've been lied to! This can actually be a blessing in disguise, though. It lets the whole class start out at a similar place and you'll be learning the basics of morphology and syntax right along with your classmates. Study group, anyone?

Be patient with yourself. Introductory linguistics classes are always a bit of a whirlwind. You're swept from subdiscipline to subdiscipline and just as soon as you're feeling comfortable with morphology suddenly it's on to syntax with no chance to catch your breath. It's just the nature of an introductory survey course, though; it's a tasting menu, not à la carte. Remember what catches your interest and pursue it in more coursework or readings later; don't try to do it all just as you're encountering ideas and methods for the first time.

Ask for help. Don't be afraid of asking for extra help! Go to office hours if you don't understand something. Form a study group. (It's even better if you can get people from different academic backgrounds.) There are also lots of great resources online. This blog post has a lot of great resources and this post gives a lot of great, really concrete advice about doing assignments in intro linguistics courses.

But it's also really important just to relax and have fun. You'll cover a lot of material, granted, but that also means you'll learn a lot! And introductory courses tend to be a great place to learn lots of fun facts and find the answers to language mysteries that have been niggling at you. Welcome to linguistics; I think you're going to like it.

*Don't worry, we'll be getting back to the Great Ideas in Linguistics series after these short messages.]]>
Babies don't benefit from direct language instruction and it may actually hurt them.

In other words:

Language acquisition is a process unique to humans that allows us to learn our first language without directly being taught it.

Which doesn't sound so ground-breaking... until you realize that that means that language use is utterly unique among human behaviours. Oh sure, we learn other things without being directly taught them, even relatively complex behaviours like swallowing and balancing. But unlike speaking, these aren't usually under conscious control and when they are it's usually because something's gone wrong. Plus, as I've discussed before, we have the ability to be infinitely creative with language. You can learn to make a soufflé without knowing what happens when you combine the ingredients in every possible combination, but knowing a language means that you know rules that allow you to produce all possible utterances in that language.

So how does it work? Obviously, we don't have all the answers yet, and there's a lot of research going on into how children actually learn language. But we do know what it generally tends to look like, barring things like language impairment or isolation.

Vocal play. The kid's figured out that they have a mouth capable of making noise (or hands capable of making shapes and movements) and is practising with it. Back in the day, people used to say that infants would make all the sounds of all the world's languages during this stage. Subsequent research, however, suggests that even this early children are beginning to reflect the speech patterns of people around them.

Babbling. Kids will start out with very small segments of language, then repeat the same little chunk over and over again (canonical babbling), and then they'll start to combine them in new ways (variegated babbling). In hearing babies, this tends to be syllables, hence the stereotypical mamamama. In Deaf babies it tends to be repeated hand motions.

One word stage. By about 13 months, most children will have begun to produce isolated words. The intended content is often more than just the word itself, however. A child shouting Dog! at this point could mean Give me my stuffed dog or I want to go see the neighbour's terrier or I want a lion-shaped animal cracker (since at this point kids are still figuring out just how many four-legged animals actually are dogs). These types of sentences-in-a-word are known as holophrases.

Two word stage. By two years, most kids will have moved on to two-word phrases, combining words in a way that shows that they're already starting to get the hang of their language's syntax. Morphology is still pretty shaky, however: you're not going to see a lot of tense markers or verbal agreement.

Sentences. At this point, usually around age four, people outside the family can generally understand the child. They're producing complex sentences and have gotten down most, if not all, of the sounds in their language.

These general stages of acquisition are very robust. Regardless of the language, modality or even age of acquisition, we still see these general stages. (Although older learners may never completely acquire a language due to, among other things, reduced neuroplasticity.) And the fact they do seem to be universal is yet more evidence that language acquisition is a unique process that deserves its own field of study.]]>
discourse analysis, which is certainly a fascinating area of study, but I wasn't sure it was enough to serve as the basis for a major discipline of linguistics. Fortunately, I've learned a great deal about sociolinguistics since that time.

Sociolinguistics is the sub-field of linguistics that studies language in its social context and derives explanatory principles from it. By knowing about the language, we can learn something about a social reality and vice versa.

Now, at first glance this may seem so intuitive that it's odd someone would go to the trouble of stating it directly. As social beings, we know that the behaviour of people around us is informed by their identities and affiliations. At the extreme end, it can be things like having a cultural rule that literally forbids speaking to your mother-in-law, or requires replacing the letters ck with cc in all written communication. But there are more subtle rules in place as well, rules which are just as categorical and predictable and important. And if you don't look at what's happening with the social situation surrounding those linguistic rules, you're going to miss out on a lot.

Case in point: Occasionally you'll hear phonologists talk about sound changes being in free variation, or rules that are randomly applied. BUT if you look at the social facts of the community, you'll often find that there is no randomness at all. Instead, there are underlying social factors that control which choice a person makes as they're speaking. For example, if you were looking at whether people in Montreal were making r-sounds with the front or back of the tongue and you just sampled a bunch of them, you might find that some people made it one way most of the time and others made it the other way most of the time. Which is interesting, sure, but doesn't have a lot of explanatory power.

However, if you also looked at the social factors associated with it, and the characteristics of the individuals who used each r-sound, you might notice something interesting, as Clermont and Cedergren did (see the illustration). They found that younger speakers preferred the back-of-the-mouth r-sound, while older people tended to use the tip of the tongue instead. And that has a lot more explanatory power. Now we can start asking questions to get at the forces underlying that pattern: Is this the way the younger people have always talked, i.e. some sort of established youthful style, or is there a language change going on and the newer form is going to slowly take over? What causes younger speakers to use the form they do? Is there also an effect of gender, or who you hang out with?

[caption id=attachment_1101 align=aligncenter width=523] Figure one from Sankoff and Blondeau. 2007. (Click picture to look at the whole study.) As you can see, younger speakers are using [R] more than older speakers, and the younger a speaker is the more likely they are to use [R].[/caption]

And that's why sociolinguistics is all kinds of awesome. It lets us peel away and reveal some of the complexity surrounding language. By adding sociological data to our studies, we can help to reduce statistical noise and reveal new and interesting things about how language works, what it means to be a language-user, and why we do what we do.]]>
loser. Mea culpa.) I'll occasionally find that my students aren't familiar with something I'd assumed they'd covered at some point already. I've also found that there are relatively few resources for looking up linguistic ideas that don't require a good deal of specialized knowledge going in. SIL's glossary of linguistic terms is good but pretty jargon-y, and the various handbooks tend not to have on-line versions. And even with a concerted effort by linguists to make Wikipedia a good resource, I'm still not 100% comfortable with recommending that my students use it.Therefore! I've decided to make my own list of Things That Linguistic-Type People Should Know and then slowly work on expounding on them. I have something to point my students to and it's a nice bite-sized way to talk about things; perfect for a blog.Here, in no particular order, are 50ish Great Ideas of Linguistics sorted by sub-discipline. (You may notice a slightly sub-disciplinary bias.) I might change my mind on some of these--and feel free to jump in with suggestions--but it's a start. Look out for more posts on them. Sociolinguistics Sociolinguistic variables Social class and language Social networks Accommodation Style Language change Linguistic security Linguistic awareness Covert and overt prestige Phonetics Places of articulation Manners of articulation Voicing Vowels and consonants Categorical perception Ease Modality Phonology Rules Assimilation and dissimilation Splits and mergers Phonological change Morphology Paradigm levelling Case Tense and aspect Affixes Syntax Hierarchical structure Competence vs. Performance Movement Grammaticality judgements Semantics Pragmatics Truth values Scope Lexical semantics Compositional semantics Computational linguistics Classifiers Natural Language Processing Speech recognition Speech synthesis Automata Documentation/Revitalization Language death Self-determination Psycholinguistics Language acquisition Reading]]>
this page or this page.]]>
right up there with brains so big they make birth potentially fatal. (Hardly a triumph of effective design.) You see, the human larynx, though in many ways very similar to that of other mammals, has a few key differences. It's much further down in the neck and it's pulled the tongue down with it. As a result, we have the unique and rather stupid ability to choke to death on our own food. Of course, an anatomical handicap of that magnitude must have been compensated for by something else, otherwise we wouldn't be here. And the pay-off in this case was pretty awesome: speech.

[caption id= align=alignnone width=326] All the comforts of agriculture, air conditioning and medicine and he can still breathe and swallow at the same time, the smug pup.[/caption]

But what does all this have to do with dogs? Well, dogs do have a larynx that looks very like humans'. In fact, they make sound in very similar ways: by forcing air through adducted vocal folds. But dogs have very short vocal folds and they're scrunched up right above the root of the tongue. This has two main effects:

There's a very limited number of possible tongue positions available to dogs during phonation. This means that dogs aren't able to modulate air with the same degree of fine control that we humans are. (Which is why Scooby-Doo sounds like he really needs some elocution lessons.) We do have this control, and that's what gives us the capacity to make so many different speech sounds.

Dogs have their soft palate touching their epiglottis when they're at rest. The soft palate is the spongy bit of tissue at the back of your mouth that separates your nasal and oral cavities. The epiglottis is a little piece of tongue-shaped or leaf-shaped cartilage in your throat that flips down to neatly cover your windpipe when you swallow. If they touch, then you've got a complete separation between your food-tube and your air-tube and choking becomes a non-issue.

Some humans have that same whole palate-epiglottis-kissing thing going on: very young babies. You can see what I'm talking about here. That, and how proportionally huge the tongue is, is probably why babies can acquire sign quite a bit before they can start speaking; their vocal instruments just aren't fully developed yet. The upside of this is that babies also don't have to worry about choking to death.

But once the larynx drops, breathing and swallowing require a bit more coordination. For one thing, while you're swallowing, breathing is suppressed in the brain-stem, so that even if you're unconscious you don't try to breathe in your own saliva. We also have a very specific pattern of breath while we're eating. Try paying attention next time you sit down to a meal: while you're eating or drinking you tend to stick to a pattern of exhale -- swallow -- exhale. That way you avoid incoming air trying to carry little bits of food or water into your lungs. (Aspiration pneumonia ain't no joke.)

So your dog doesn't choke for the same reason it can't strike up a conversation with you: its larynx is too high. Who knows? Maybe in a few hundred years, and with a bit of clever genetic engineering, dogs will be talking, and choking, along with us. That doesn't mean that something large or oddly-shaped can't get stuck in the esophagus, though.]]>
radio dramas and use a headset to chat with my guildies while I'm gaming. As a result, I spend a lot of time with things on/in my ears. And, because of my background, I'm also fairly well informed about the acoustic properties of earphones and headphones and how they interact with anatomy. All of which helps me answer the question: which is better? Or, more accurately, what are some of the pros and cons of each? There are a number of factors to consider, including frequency response, noise isolation, noise cancellation and comfort/fit. Before I get into specifics, however, I want to make sure we're on the same page when we talk about headphones and earphones.

Earphones: For the purposes of this article, I'm going to use the term earphone to refer to devices that are meant to be worn inside the pinna (that's the fancy term for the part of the ear you can actually see). These are also referred to as earbuds, buds, in-ears, canalphones, in-ear monitors, IEMs and in-ear headphones. You can see an example of what I'm calling earphones below.

[caption id=false align=alignnone width=512] Ooo, so white and shiny and painful.[/caption]

Headphones: I'm using this term to refer to devices that are not meant to rest in the pinna, whether they go around or on top of the ear. These are also called earphones, (apparently) earspeakers or, my favorite, cans. You can see somewhat antiquated examples of what I'm calling headphones below.

[caption id=false align=alignnone width=512] I mean, sure, it's a wonder of modern technology and all, but the fidelity is just so low.[/caption]

Alright, now that we've cleared that up, let's get down to brass tacks. (Or, you might say.... bass tacks.)

Frequency response curve: How much distortion do they introduce? In an ideal world, 'phones should respond equally well to all frequencies (or pitches), without transmitting one frequency range more loudly than another. This desirable feature is commonly referred to as a flat frequency response. That means that the signal you're getting out is pretty much the same one that was fed in, at all frequency ranges.
Earphones: In general, earphones tend to have a worse frequency response.
Headphones: In general, headphones tend to have better frequency response.
Winner: Headphones are probably the better choice if you're really worried about distortion. You should read the specifications of the device you're interested in, however, since there's a large amount of variability.

Frequency response: What is their pitch range? This term is sometimes used to refer to the frequency response curve I talked about above and sometimes used to refer to pitch range. I know, I know, it's confusing. Pitch range is usually expressed as the lowest sound the 'phones can transmit followed by the highest. Most devices on the market today can pretty much play anything between 20 and 20k Hz. (You can see what that sounds like here. Notice how it sounds loudest around 300Hz? That's an artifact of your hearing, not the video. Humans are really good at hearing sounds around 300Hz which [not coincidentally] is about where the human voice hangs out.)
Earphones: Earphones tend to have a smaller pitch range than headphones. Of course, there are always exceptions.
Headphones: Headphones tend to have a better frequency range than earphones.
Winner: In general, headphones have a better frequency range. That said, it's not really that big of a deal. You can't really hear very high or very low sounds that well because of the way your hearing system works, regardless of how well your 'phones are delivering the signal. Anything that plays sounds between 20 Hz and 20,000 Hz should do you just fine.

Noise isolation: How well do they isolate you from sounds other than the ones you're trying to listen to? More noise isolation is generally better, unless there's some reason you need to be able to hear environmental sounds as well as whatever you're listening to. Better isolation also means you're less likely to bother other people with your music.
Earphones: A properly fitted pair of in-ear earphones will give you the best noise isolation. It makes sense; if you're wearing them properly they should actually form a complete seal with your ear canal. No sound in, no sound out, excellent isolation.
Headphones: Even really good over-ear headphones won't form a complete seal around your ear. (Well, ok, maybe if you're completely bald and you make some creative use of adhesives, but you know what I mean.) As a result, you're going to get some noise leakage.
Winner: You'll get the best noise isolation from well-fitting earphones that sit in the ear canal.

Noise cancellation: How well can they correct for atmospheric sounds? So noise cancellation is actually completely different from noise isolation. Noise isolation is something that all 'phones have. Noise-cancelling 'phones, on the other hand, actually do some additional signal processing before you get the sound. They listen for atmospheric sounds, like an air-conditioner or a car engine. Then they take that waveform, reproduce it and invert it. When they play the inverted waveform along with your music, it cancels out the sound. Which is awesome and space-agey, but isn't perfect. They only really work with steady background noises. If someone drops a book, they won't be able to cancel that sudden, sharp noise. They also tend not to work as well with really high-pitched noises.
Earphones: Noise-cancelling earphones tend not to be as effective as noise-cancelling headphones until you get to the high end of the market (think $200 plus).
Headphones: Headphones tend to be slightly better at noise-cancellation than earphones of a similar quality, in my experience. This is partly due to the fact that there's just more room for electronics in headphones.
Winner: Headphones usually have a slight edge here. Of course, really expensive noise-cancelling devices, whether headphones or earphones, usually perform better than their bargain cousins.

Comfort/fit: Is they comfy?
Earphones: So this is where earphones tend to suffer. There is quite a bit of variation in the shape of the cavum conchæ, which is the little bowl shape just outside your ear canal. Earphone manufacturers have to have somewhere to put their magnets and drivers and driver support equipment and it usually ends up in the head of the earphone, nestled right in your cavum conchæ. Which is awesome if it's a shape that fits your ear. If it's not, though, it can quickly start to become irritating and eventually downright painful. Personally, this is the main reason I prefer over-ear headphones.
Headphones: A nicely fitted pair of over-ear headphones that covers your whole ear is just incredibly comfortable. Plus, they keep your ears warm! I find on-ear headphones less comfortable in general, but a nice cushy pair can still feel awesome. There are other factors to take into account, though; wearing headphones and glasses with a thick frame can get really uncomfortable really fast.
Winner: While this is clearly a matter of personal preference, I have a strong preference for headphones on this count.

So, for me at least, headphones are the clear winner overall. I find them more comfortable, and they tend to reproduce sound better than earphones. There are instances where I find earphones preferable, though. They're great for travelling or if I really need an isolated signal. When I'm just sitting at my desk working, though, I reach for headphones 99% of the time.

One final caveat: the sound quality you get out of your 'phones depends most on what files you're playing. The best headphones in the world can't do anything about quantization noise (that's the noise introduced when you convert analog sound-waves to digital ones) or a background hum in the recording.]]>
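
For the curious, the invert-and-add trick behind noise cancellation in the headphones comparison above can be sketched in a few lines of Python. This is a toy model under big assumptions: the "hum" is a pure 100 Hz sine wave that we already know exactly, and the sample rate, frequencies and names (SAMPLE_RATE, tone, and so on) are invented for the example. Real devices have to estimate the noise from a microphone in real time, which is part of why sudden or very high-pitched sounds slip through.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for this sketch


def tone(freq_hz, n_samples, amplitude=1.0):
    """Generate a sine wave as a list of float samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


n = 800  # a tenth of a second of audio
music = tone(440, n, amplitude=0.5)  # the signal you want to hear
hum = tone(100, n, amplitude=0.3)    # steady background noise

# What reaches your ear without cancellation: music plus hum.
heard = [m + h for m, h in zip(music, hum)]

# The cancelling signal is the hum flipped upside down.
anti_noise = [-h for h in hum]

# Playing the inverted hum along with everything else removes the hum.
cancelled = [x + a for x, a in zip(heard, anti_noise)]

# Largest remaining difference between the result and the clean music.
residual = max(abs(c - m) for c, m in zip(cancelled, music))
print(f"largest leftover error: {residual:.2e}")
```

In this idealised setup the leftover error is only floating-point rounding. If the anti-noise were even slightly late or slightly off in amplitude, some of the hum would survive, and that is the hard part in real hardware.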
And this is where I accidentally wrote four instead of six because someone started shouting out random numbers.[/caption] ]]>
They all look good to me.[/caption]First off, a couple caveats. Dyslexia is an educational diagnosis. There's a pretty extensive battery of tests, any of which may be used to diagnose dyslexia. The International Dyslexia Association defines dyslexia thusly:It is characterized by difficulties with accurate and / or fluent word recognition and by poor spelling and decoding abilities. These difficulties typically result from a deficit in the phonological component of language that is often unexpected in relation to other cognitive abilities and the provision of effective classroom instruction. Secondary consequences may include problems in reading comprehension and reduced reading experience that can impede growth of vocabulary and background knowledge.Which sounds pretty standard, right? But! There are a number of underlying causes that might lead to this. One obvious one is an undiagnosed hearing problem. Someone who only has access to part of the speech signal will probably display all of these symptoms. Or someone in an English-only environment who speaks, say, Kola, as their home language. It's hard to learn that 'p' means /p/ if your language doesn't have that sound. Of course, educators know that these things affect reading ability. But there are also a number of underlying mental processes that might lead to a diagnosis of dyslexia, which may or may not be related to each other but are all almost certainly genetic. Let's look at a couple of them. Phonological processing. I've talked a little bit about phonology before. Basically, someone with phonological disorder has a hard time with language sounds specifically. For example, they may have difficulty with rhyming tasks, or figuring out how many sounds are in a word. And this does seem to have a neurological compontent. One study shows that, when children with dyslexia were asked to come up with letters that rhymed, they did not show activity in the Temporoparietal junction, unlike their non-dyslexic peers. 
Among other things, the temporoparietal junction plays a role in interpreting sequences of events.

Auditory processing. Auditory processing difficulties aren't necessarily linguistic in nature. Someone who has difficulty processing sounds may be tone deaf, for example, unable to tell whether two notes are the same or different. For dyslexics, this tends to surface as difficulty with sounds that occur very quickly. And there are pretty much no sounds that humans need to process more quickly than speech sounds. A flap, for example, lasts around 20 milliseconds. To put that in perspective, that's about 15 times faster than a fast blink. And it looks like there's a genetic cause for these auditory processing problems: dyslexic brains have a localized asymmetry in their neurons. They also have more, and smaller, neurons.

Sequential processing. For me, this is probably the most interesting. Sequential processing isn't limited to language. It has to do with doing or perceiving things in the correct order. So, for example, if I gave you all the steps for baking a cake in the wrong order, you'd need to use sequential processing to put them in the correct order. And there's been some really interesting work done, mainly by Beate Peter at the University of Washington (represent!), that suggests that there is a single genetic cause responsible for a number of rapid sequential processing tasks, and one of the effects of an abnormality in this gene is dyslexia. But people with this mutation also tend to be bad at, for example, touching each of their fingers to their thumb in order.

Being a dude. Ok, this one is a little shakier, but depending on who you listen to, dyslexia is either equally common in men and women, 4 to 5 times more common in men, or 2 to 3 times more common in men. This may be due to structural differences, since it seems that male dyslexics have less gray matter in language processing centers, whereas females have less gray matter in sensing and motor processing areas. 
Or the difference could be due to the fact that estrogen does very good things for your brain, especially after traumatic injury. I include it here because sex is genetic (duh) and seems to (maybe, kinda, sorta) have an effect on dyslexia.

Long story short, there's been quite a bit of work done on the genetics of dyslexia and the evidence points to a common cause that is probably genetic. Which in some ways is really exciting! That means that we can better predict who will have learning difficulties and work to provide them with additional tutoring and help. And it also means that some reading difficulties are due to anatomy and genes. If you're dyslexic, it's because you're wired that way, and not because your parents did or didn't do something (well, other than contribute your genetic material, obvi) or because you didn't try hard enough. I really wish I could go back in time and tell that to my younger self after I completely failed yet another spelling test, even though I'd copied the words a hundred times each.

But the genetic underpinning of dyslexia might also seem like a bad thing. After all, if dyslexia is genetic, does that mean that children with reading difficulties will just never get over them? Not at all! I don't have space here to talk about the sort of interventions and treatments that can help dyslexics. (Perhaps I'll write a future post on the subject.) Suffice it to say, the dyslexic brain can learn to compensate and adapt over time. Like I said above, I'm currently a very fluent reader.

And dyslexia can be a good thing. The same skills that can make learning to read hard can make you very, very good at picking out one odd thing in a large group, or at surveying a large quantity of visual information quickly -- even if you only see it out of the corner of your eye. For example, I am freakishly good at finding four-leaf clovers. In high school, I collected thousands of them just while doing chores around the farm. And that's not the only advantage. 
I'd recommend The Dyslexic Advantage (it's written for a non-scientific audience) if you're interested in learning more about the benefits of dyslexia. The authors point out that dyslexics are very good at making connections between things, and suggest that they enjoy an advantage when reasoning spatially, narratively, about related but unconnected things (like metaphors) or with incomplete or dynamic information.

The current research suggests pretty strongly that dyslexia is something you're born with. And even though it might make some parts of your school career very difficult, it won't stop you from thriving. It might even end up helping you later on in life.]]>
published an article discussing the rise of English as a business lingua franca. This is an issue that I've come across quite a bit in my own life as someone who's lived and traveled quite a bit overseas. And not just in professional settings: as an English speaker I've received an education in English in countries where it's not even an official language and I have quite a few friends, mainly from Brazil and the Nordic countries, who I only ever talk to in English despite the fact that it's not their native language. And I'm certainly not alone in this. Ethnologue estimates that there are approximately 335 million native English speakers, but over 430 million non-native speakers. English is emerging as the predominant global language, and many people see an English education as an investment.

[caption id= align=alignnone width=512] I don't care that we all speak the same language natively. We're holding this meeting in English and that's final![/caption]

But for me, at least, the more interesting question is why? There are a lot of languages in the world, and, in theory at least, any of them could be enjoying the preference currently shown for English. I'm hardly alone in asking these questions. The Economist article I mentioned above suggests a few answers:

There are some obvious reasons why multinational companies want a lingua franca. Adopting English makes it easier to recruit global stars (including board members), reach global markets, assemble global production teams and integrate foreign acquisitions. Such steps are especially important to companies in Japan, where the population is shrinking. There are less obvious reasons too. Rakuten's boss, Hiroshi Mikitani, argues that English promotes free thinking because it is free from the status distinctions which characterise Japanese and other Asian languages. Antonella Mei-Pochtler of the Boston Consulting Group notes that German firms get through their business much faster in English than in laborious German. 
English can provide a neutral language in a merger: when Germany's Hoechst and France's Rhône-Poulenc combined in 1999 to create Aventis, they decided it would be run in English, in part to avoid choosing between their respective languages.

Let's break this down a little bit. There seem to be two main arguments for using English. One is social. Using English makes it easier to collaborate with other companies or company offices in other countries and, if no one is a native English speaker, helps avoid conflict by choosing a working language that doesn't unfairly benefit one group. The other argument is linguistic: there is some special quality to English that makes it more suited for business. Let's look at each of them in turn.

The social arguments seem perfectly valid to me. English education is widely available and, in many countries, a required part of primary and secondary education. There is a staggering quantity of resources available to help master English. Lots of people speak English, with varying degrees of frequency. As a result, there's a pretty high likelihood that, given a randomly-selected group of people, you'll be able to get your point across in English. While it might be more fair to use a language that no-one speaks natively, like Latin or Esperanto, English has high saturation and an instructional infrastructure already in place. Further, the writing system is significantly more accessible and computer-friendly than Mandarin's, which actually has more speakers than English. (Around 847 million, in case you were wondering.) These are all practical arguments for using English as an international business language.

Now let's turn to the linguistic arguments. These are, sadly, much less reasonable. As I've mentioned before, people have a distressing tendency to make testable claims about language without actually testing them. And both of the claims above--that honorifics confine thinking and that English is faster than German--have already been investigated. 
Honorifics do appear to have an effect on cognition, but it seems to be limited to a spatial domain, i.e. higher-status honorifics are associated with up and lower ones with down. Beyond subtle priming, I find it incredibly unlikely that a rich honorific system has any effect on individual cognition. A social structure which is reflected in language use, however, might make people less willing to propose new things or offer criticism. But that's hardly language-dependent. Which sounds more likely: "Your idea is horrible, sir," or "Your idea is horrible, you ninny"? TL;DR: It's not the language, it's the social structure the language operates in.

While it is true that different languages have different informational density, the rate of informational transmission is actually quite steady. It appears that, as informational density increases, speech rate decreases. As a result, it looks like, regardless of the language, humans tend to convey information at a pretty stable rate. This finding is cross-modal, too. Even though signs take longer to produce than spoken words, they are more dense and so the rate of information flow in signed and spoken languages seems to be about the same. Which makes sense: there's a limit to how quickly the human brain can process new information, so it stands to reason that we'd produce information at about that rate. TL;DR: All languages convey information at pretty much the same rate. If there's any difference in the amount of time meetings take, it's more likely because people are using a language they're less comfortable in (i.e. English).

In conclusion, it very well may be the case that English is currently the best language to conduct business in. But that's because of language-external social factors, not anything inherent about the language itself.]]>
multiple definitions of 'linguist'. As a result, people tend to equate mastery of a language with explicit knowledge of its workings. Which, on the one hand, is reasonable. If you know French, the idea is that you know how to speak French, but also how it works. And, in general, that isn't the case. Partly because most language instruction is light on discussions of grammatical structures--reasonably so; I personally find inductive grammar instruction significantly more helpful, though the research is mixed--and partly because, frankly, there's a lot that even linguists don't know about how grammar works. Language is incredibly complex, and we've only begun to explore and map out that complexity. But there are a few things we are reasonably certain we know. And one of those is that your media consumption does not erase your regional dialect [pdf]. The premise is flawed enough that it begins to collapse under its own weight almost immediately. Even the most dedicated American fans of Doctor Who or Downton Abbey or Sherlock don't slowly develop British accents.

[caption id= align=alignnone width=256] Lots of planets have a North with a distinct accent that is not being destroyed by mass media.[/caption]

So why is this myth so persistent? I think that the most likely answer is that it is easy to mischaracterize what we see on television and to misinterpret what it means. Standard American English (SAE), what newscasters tend to use, is a dialect. It's not just a certain set of vowels but an entire, internally consistent grammatical system. (Failing to recognize that dialects are more than just adding a couple of really noticeable sounds or grammatical structures is why some actors fail so badly at trying to portray a dialect they don't use regularly.) And not only is it a dialect, it's a very prestigious dialect. Not only newscasters make use of it, but so do political figures, celebrities, and pretty much anyone who has a lot of social status. 
From a linguistic perspective, SAE is no better or worse than any other dialect. From a social perspective, however, SAE has more social capital than most other dialects. That means that being able to speak it, and speak it well, can give you opportunities that you might not otherwise have had access to. For example, speakers of Southern American English are often characterized as less intelligent and educated. And those speakers are very aware of that fact, as illustrated in this excerpt from the truly excellent PBS series Do You Speak American:

ROBERT: Do you think northern people think southerners are stupid because of the way they talk?

JEFF FOXWORTHY: Yes I think so and I think Southerners really don't care that Northern people think that eh. You know I mean some of the, the most intelligent people I've ever known talk like I do. In fact I used to do a joke about that, about you know the Southern accent, I said nobody wants to hear their brain surgeon say, 'Alight now what we're gonna do is, saw the top of your head off, root around in there with a stick and see if we can't find that dad burn clot.'

So we have pressure from both sides: there are intrinsic social rewards for speaking SAE, and also social consequences for speaking other dialects. There are also plenty of linguistic role-models available through the media, from many different backgrounds, all using SAE. If you consider these facts alone it seems pretty easy to draw the conclusion that regional dialects in America are slowly being replaced by a prestigious, homogeneous dialect.

Except that's not what's happening at all. Some regional dialects of American English are actually becoming more, rather than less, prominent. On the surface, this seems completely contradictory. So what's driving this process, since it seems to be contradicting general societal pressure? The answer is that there are two sorts of pressure. One, the pressure from media, is to adopt the formal, standard style. 
The other, the pressure from family, friends and peers, is to retain and use features that mark you as part of your social network. Giles, Taylor and Bourhis showed that identification with a certain social group--in their case Welsh identity--encourages and exaggerates Welsh features. And being exposed to a standard dialect that is presented as being in opposition to a local dialect will actually increase that effect. Social identity is constructed through opposition to other social groups. To draw an example from American politics, many Democrats define themselves as not Republicans and as in opposition to various facets of Republican-ness. And vice versa.

Now, the really interesting thing is this: television can have an effect on speakers' dialectal features. But that effect tends to be away from, rather than towards, the standard. For example, some Glaswegian English speakers have begun to adopt features of Cockney English based on their personal affiliation with the show EastEnders. In light of what I discussed above, this makes sense. Those speakers who had adopted the features are of a similar social and socio-economic status as the characters in EastEnders. Furthermore, their social networks value the characters who are shown using those features, even though they are not standard. (British English places a much higher value on certain sounds and sound systems as standard. In America, even speakers with very different sound systems, e.g. Bill Clinton and George W. Bush, can still be considered standard.) Again, we see retention and re-invigoration of features that are not standard through a construction of opposition. In other words, people choose how they want to sound based on who they want to be seen as. 
And while, for some people, this means moving towards using more SAE, for others it means moving away from the standard.

One final note: Another factor which I think contributes to the idea that television is destroying accents is the odd idea that we all only have one dialect, and that it's possible to lose it. This is patently untrue. Many people (myself included) have command of more than one dialect and can switch between them when it's socially appropriate, or blend features from them for a particular rhetorical effect. And that includes people who generally use SAE. Oprah, for example, will often incorporate more features of African American English when speaking to an African American guest. The bottom line is that television and mass media can be a force for linguistic change, but they're hardly the great homogenizer that it is often claimed they are.

For other things I've written about accents and dialects, I'd recommend:

Why do people have accents?
Ask vs. Aks
Coke vs. Soda vs. Pop]]>
an excellent article by the BBC about the results of a survey put out by the non-profit organization Action on Hearing Loss. The survey showed that most British adults are taking dangerous risks with their hearing: almost a third played music above the recommended volume and a full two-thirds left noisy venues with ringing in their ears. It may seem harmless to enjoy noisy concerts, but it can and does irrecoverably damage your hearing. But how, exactly, does it do that? And how loud can you listen to music without being at risk?

[caption id= align=alignnone width=512] Turn it up! Just... not too loud, ok?[/caption]

Let's start with the second question. You're at risk of hearing loss if you subject yourself to sounds above 85 decibels. For reference, that's about as loud as a food processor or blender, and most music players will warn you if you try to play music much louder than that. They will, however, sometimes play music up to 110 dB, which is roughly the equivalent of someone starting a chainsaw in your ear and verges on painful. And that is absolutely loud enough to damage your hearing.

Hearing damage is permanent and progressive. Inside your inner ear are tiny, hair-shaped cells. These sway back and forth as the fluid of your inner ear is compressed by sound-waves. It's a bit like seaweed being pulled back and forth by waves, but on a smaller scale and much, much faster. As these hair cells brush back and forth they create and transmit electrical impulses that are sent to the brain and interpreted as sound. The most delicate part of this process is those tiny hair cells. They're very sensitive. Which is good, because it means that we're able to detect noise well, but also bad, because they're very easy to damage. In fish and birds, that damage can heal over time. In mammals, it cannot. Once your hair cells are damaged, they can no longer transmit sound and you lose that part of the signal permanently. 
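As a rough aside on the decibel figures above: decibels are logarithmic, so the jump from 85 dB to 110 dB is much bigger than it looks. Here's a minimal sketch of the arithmetic. The function names are made up for illustration, and the "8 hours at 85 dB, halving for every extra 3 dB" rule is an assumption borrowed from common occupational noise guidance rather than anything in the BBC article, so treat the numbers as ballpark figures and not as medical advice.

```python
def intensity_ratio(level_db: float, reference_db: float) -> float:
    """How many times more sound intensity level_db carries than reference_db."""
    # Every 10 dB step is a tenfold increase in intensity.
    return 10 ** ((level_db - reference_db) / 10)


def allowed_seconds(level_db: float,
                    reference_db: float = 85.0,   # assumed daily limit
                    reference_hours: float = 8.0,  # assumed duration at that limit
                    exchange_db: float = 3.0) -> float:
    """Rough daily exposure budget, halving for every exchange_db above reference_db."""
    halvings = (level_db - reference_db) / exchange_db
    return reference_hours * 3600 / (2 ** halvings)


if __name__ == "__main__":
    print(f"110 dB vs 85 dB intensity: about {intensity_ratio(110, 85):.0f}x")
    print(f"Budget at 110 dB: about {allowed_seconds(110):.0f} seconds")
```

Under those assumptions, music at 110 dB carries a few hundred times the intensity of music at 85 dB, and the daily budget drops from a full working day to roughly a minute and a half.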
There's a certain amount of unavoidable wear and tear on the system. Even if you do avoid loud noises, you're still slowly losing parts of your hearing as you and your hair cells age, especially in the upper frequencies. But loud music will accelerate that process drastically. Listen to loud enough music long enough and it will slowly chip away at your ability to hear it.

But that doesn't necessarily mean you should avoid loud environments altogether. As with all things, moderation is key. One noisy concert won't leave you hard of hearing. (In fact, your body has limited defence mechanisms for sustained loud noises, including temporarily stiffening the chain of tiny bones in your middle ear so that less acoustic energy is transmitted.) The best things you can do for your ears are to avoid exposure to very loud sounds and, if you have to be in a noisy environment, wear protection. It's also possible that magnesium supplements might help to reduce the damage to the auditory system, but when it comes to protecting your hearing, the best treatment is prevention. ]]>
including a minor, and I've been learning to sign and reading about linguistic research in ASL concurrently. I have to say, it's probably my favourite language that I've studied so far. However, I've encountered two very prevalent misconceptions about signed languages when I chat with people about my studies, which are these:

Sign languages are basically iconic; you can figure out what's being said without learning the language.
All sign languages are pretty much the same.

On the one hand, I can understand where these misconceptions spring from. On the other, they are absolutely false and I think it's important that they're cleared up.

[caption id= align=alignnone width=512] You don't want to end up like this guy. (No, no, not President Obama, fraudulent sign language interpreter Thamsanqa Jantjie.)[/caption]

First of all, it's important to distinguish between a visual language and gesture. A language, regardless of its modality (i.e. how it's transmitted, whether it's through compressed and rarefied particles or by light bouncing off of things), is arbitrary, abstract and has an internal grammar. Gesture, on the other hand, can be thought of as the visual equivalent of non-linguistic vocalizations. Gestures and pantomime are more like squeals, screams, shrieks and raspberries than they are like telling a joke or giving someone directions. They don't have a grammatical structure, they're not arbitrary and you can, in fact, figure out what they mean without a whole lot of background information. That's kind of the point, after all. And, yes, Deaf individuals will often use gesture, especially when communicating with non-signers. But this is distinct from signed language. Signed languages have all the same complexities and nuance of spoken languages, and you can no more understand them without training than you could wake up one morning suddenly speaking Kapampangan. Try to see how much of this ASL vlog entry you understand! 
(Subtitles not included.)

[youtube=http://www.youtube.com/watch?v=j-QJaCkqgX4&w=560&h=315]

But that just shows that signed languages are real languages that you really need to learn. (I'm looking at you, Mr. Jantjie.) What about the mutual intelligibility thing? Well, since we've already seen that signed languages are not iconic, this myth seems somewhat silly now. We might expect there to be some sort of pan-Deaf signing if there were constant and routine contact between Deaf communities in different countries. And, in fact, there is! Events such as meetings of the World Federation of the Deaf have fostered the creation of International Sign Language. It is, however, a constructed language that is rarely used outside of international gatherings. Instead, most signers tend to use the signed language that is most popular in their home country, and these are vastly different.

For an example of the differences between sign languages, let's look at the alphabet in two signed languages: American Sign Language and British Sign Language. These are both relatively mature sign languages which both exist as substrate languages in predominately English-speaking communities. (Substrate just means that it's not the most socially-valued and widely-used language in a given community. In America, any language that's not English is pretty much a substrate language, despite the fact that we don't have a government-mandated official language.) So, if sign languages really are universal, we'd expect that these two communities would use basically the same signs. Instead, the two languages are mutually unintelligible, as you can see below. (I picked these videos because for the first ten or so signs their pacing is close enough that you can play them simultaneously. You're welcome. 
:) )

The alphabet in British Sign Language:

[youtube=http://www.youtube.com/watch?v=1pwRDT71YCA&w=420&h=315]

The alphabet in American Sign Language:

[youtube=http://www.youtube.com/watch?v=tkMg8g8vVUo&w=560&h=315]

As you can see, they're completely different. Signed languages are a fascinating area of study, and a source of great pride and cultural richness in their respective Deaf communities. I highly recommend that you learn more about visual languages. Here are a couple of resources to get you started.]]>
'L' is one of those sounds in English that ]]>
More like speaking around tongues, in this guy's case.[/caption]

People don't tend to use sounds that aren't in their native language. (citation) So if you're an English speaker, you're not going to bust out some Norwegian vowels. This rather lets the air out of the theory that individuals engaged in glossolalia are actually speaking another language. It is more like playing alphabet soup with the sounds you already know. (Although not always all the sounds you know. My instinct is that glossolalia is made up predominately of the sounds that are the most common in the person's language.)

It lacks the structure of language. (citation) So one of the core ideas of linguistics, which has been supported again and again by hundreds of years of inquiry, is that there are systems and patterns underlying language use: sentences are usually constructed of some sort of verb-like thing and some sort of noun-like thing or things, and it's usually something on the verb that tells you when and it's usually something on the noun that tells you things like who possessed what. But these patterns don't appear in glossolalia. Plus, of course, there's not really any meaningful content being transmitted. (In fact, the language being unintelligible to others present is one of the markers that's often used to identify glossolalia.) It may sort of smell like a duck, but it doesn't have any feathers, won't quack and when we tried to put it in water it just sort of dissolved, so we've come to the conclusion that it is not, in fact, a duck.

It's associated with a dissociative psychological state. (citation) Basically, this means that speakers are aware of what they're doing, but don't really feel like they're the ones doing it. In glossolalia, the state seems to come and then pass on, leaving speakers relatively psychologically unaffected. Dissociation can be problematic, though; if it's particularly extreme and long-term it can be characterized as multiple personality disorder. 
It's a learned behaviour. (citation) Basically, you only see glossolalia in cultures where it's culturally expected and only in situations where it's culturally appropriate. In fact, during her fieldwork, Dr. Goodman (see the citation) actually observed new initiates into a religious group being explicitly instructed in how to enter a dissociative state and engage in glossolalia.

So glossolalia may seem language-like, but from a linguistic standpoint it doesn't actually seem to be language. (Which is probably why there hasn't been that much research done on it.) It's vocalization that arises as the result of a learned psychological state and that lacks linguistic systematicity.]]>
What was that? I couldn't hear you, you were touching too gently.[/caption]

I've already talked about how we can see sounds, and the role that sound plays in speech perception before. But just how much overlap is there between our sense of touch and hearing? There is actually pretty strong evidence that what we feel can override what we're hearing. Yau et al. (2009), for example, found that tactile expressions of frequency could override auditory cues. In other words, you might hear two identical tones as different if you're holding something that is vibrating faster or slower. If our vision system had a similar interplay, we might think that a person was heavier if we looked at them while holding a bowling ball, and lighter if we looked at them while holding a volleyball.

And your sense of touch can override your ears (not that they were that reliable to begin with...) when it comes to speech as well. Gick and Derrick (2013) have found that tactile information can override auditory input for speech sounds. You can be tricked into thinking that you heard "peach" rather than "beach", for example, if you're played the word "beach" and a puff of air is blown over your skin just as you hear the 'b' sound. This is because when an English speaker says "peach", they aspirate the 'p', or say it with a little puff of air. That puff isn't there when they say the 'b' in "beach", so when you feel one anyway, you hear the wrong word.

Which is all very cool, but why might this be useful to us as language-users? Well, it suggests that we use a variety of cues when we're listening to speech. Cues act as little road-signs that point us towards the right interpretation. By having access to a lot of different cues, we ensure that our perception is more robust. Even when you lose some cues--say, a bear is roaring in the distance and masking some of the auditory information--you can use the others to figure out that your friend is telling you that there's a bear. 
In other words, even if some of the road-signs are removed, you can still get where you're going. Language is about communication, after all, and it really shouldn't be surprising that we use every means at our disposal to make sure that communication happens.]]>
such as this, which has the supposedly untranslatable word alongside, ironically, its translation. But I think that if we ask why certain words can't be translated, we're actually asking the wrong question. The right question is: why do we think anything at all can be translated?

Why is it that we shy away from trying to translate dépaysement but feel quite strongly that a pomme is the same thing as an apple? While a French speaker and an English speaker would probably use those respective words to ask for the same piece of fruit from a fruit bowl, the phrase "the apple of my eye" is better translated into French as "prunelle de mes yeux". And if you asked for the prunelle from a fruit bowl, you'd be given something an English speaker would call a plum. So while we think of these two words as the same, on some level, it cannot be denied that they play different roles in their respective languages. No one claims either apple or pomme are untranslatable, though.

Well, let's talk a little about what translation is. In linguistics, the standard when discussing languages that the reader is not familiar with (and, since descriptive linguists often work with languages that have a few dozen speakers, this is not uncommon) is to use three lines. The first is in the original language (usually in the International Phonetic Alphabet), the second line is a morpheme-by-morpheme translation and the third is a 'sense translation', which is how an English speaker might have expressed the same thought. (Morphemes, you may be aware, are the smallest unit of language to contain meaning. So the single word dogs has two morphemes. Dog, which has the meaning of Canis familiaris, and -s, which tells us that there's more than one.)

We tend to idealize translation as the second of these, a word-to-word correspondence. 
But even that's a bit of a simplification, for you'll often hear people referring to the literal translation of something like an idiom, while the actual translation is something that maintains the sense but not the wording. The idea that it's the sense that translations should capture and not the exact wording can sometimes be taken to extremes. Consider FitzGerald's translation of The Rubáiyát of Omar Khayyám, which in places diverges wildly from the source material. Is it a true translation, in a morpheme-by-morpheme sense? No. But we still accept it as essentially the same material in two different languages.

And if we accept that on the level of the poem, then I feel like we also have to accept it on the level of the word. It may take more or fewer words to express the same idea in different languages, but if we believe that we're capable of sharing thoughts between people (Richard Wright once called language a very inefficient means of telepathy) then shuttling them between languages, no matter how difficult the transition, should also be possible. If anything is translatable, then everything has to be.

Of course, accounting for and explaining the cultural baggage associated with a certain term or replicating levels of meaning below the morpheme may pose a greater challenge. But that's a post for another day.]]>
Since I'm teaching Language and Society this quarter, this is a question that I anticipate coming up early and often. Accents--or dialects, though the terms do differ slightly--are one of those things in linguistics that are effortlessly fascinating. We all have experience with people who speak our language differently than we do. You can probably even come up with descriptors for some of these differences. Maybe you feel that New Yorkers speak nasally, or that Southerners have a drawl, or that there's a certain Western twang. But how did these differences come about and how are they perpetuated?

[caption id= align=alignnone width=512] Clearly people have Accents because they're looking for a nice little sub-compact commuter car.[/caption]

First, two myths I'd like to dispel.

Only some people have an accent or speak a dialect. This is completely false with a side of flat-out wrong. Every single person who speaks or signs a language does so with an accent. We sometimes think of newscasters, for example, as accent-less. They do have certain systematic variation in their speech, however, that they share with other speakers who share their social grouping... and that's an accent. The difference is that it's one that tends to be seen as proper or correct, which leads nicely into myth number two:

Some accents are better than others. This one is a little trickier. As someone who has a Southern-influenced accent, I'm well aware that linguistic prejudice exists. Some accents (such as British Received Pronunciation) are certainly more prestigious than others (oh, say, the American South). However, this has absolutely no basis in the language variation itself. No dialect is more or less logical than any other, and geographical variation of factors such as speech rate has no correlation with intelligence. 
Bottom line: the differing perception of various accents is due to social, and not linguistic, factors.

Now that that's done with, let's turn to how we get accents in the first place. To begin with, we can think of an accent as a collection of linguistic features that a group of people share. By themselves, these features aren't necessarily immediately noticeable, but when you treat them as a group of factors that co-vary it suddenly becomes clearer that you're dealing with separate varieties. Which is great and all, but let's pull out an example to make it a little clearer what I mean.

Imagine that you have two villages. They're relatively close and share a lot of commerce and have a high degree of intermarriage. This means that they talk to each other a lot. As a new linguistic change begins to surface (which, as languages are constantly in flux, is inevitable) it spreads through both villages. Let's say that they slowly lose the 'r' sound. If you asked a person from the first village whether a person from the second village had an accent, they'd probably say no at that point, since they have all of the same linguistic features.

But what if, just before they lost the 'r' sound, an impassable chasm split the two villages? Now, the change that starts in the first village has no way to spread to the second village since they no longer speak to each other. And, since new linguistic forms pretty much come into being randomly (which is why it's really hard to predict what a language will sound like in three hundred years) it's very unlikely that the same variant will come into being in the second village. Repeat that with a whole bunch of new linguistic forms and if, after a bridge is finally built across the chasm, you ask a person from the first village whether a person from the second village has an accent, they'll probably say yes. They might even come up with a list of things they say differently: we say this and they say that. 
If they were very perceptive, they might even give you a list with two columns: one column the way something's said in their village and the other the way it's said in the second village.

But now that they've been reunited, won't the accents just disappear as they talk to each other again? Well, it depends, but probably not. Since they were separated, the villages would have started to develop their own independent identities. Maybe the first village begins to breed exceptionally good pigs while squash farming is all the rage in the second village. And language becomes tied to that identity. "Oh, I wouldn't say it that way," people from the first village might say, "people will think I raise squash." And since the differences in language are tied to social identity, they'll probably persist.

Obviously this is a pretty simplified example, but the same processes are constantly at work around us, at both a large and small scale. If you keep an eye out for them, you might even notice them in action.]]>
The Sleep Talkin Man. Sleep talking can range from grunts or moans to relatively clear speech. While most people know what sleep talking is (there was even a hit song about it that's older than I am) fewer people know what causes it.

[caption id= align=alignnone width=512] Sure, she looks all peaceful, but you should hear her go on.[/caption]

To explain what happens when someone's talking in their sleep, we first need to talk about 1) what happens during sleep and 2) what happens when we talk normally.

Sleeping normally: One of the weirder things about sleep talking is that it happens at all. When you're asleep normally, your muscles undergo atony during the stage of sleep called Rapid Eye Movement, or REM sleep. Basically, your muscles release and go into a state of relaxation or paralysis. If you've ever woken suddenly and been unable to move, it's because your body is still in that state. This serves an important purpose: when we dream we can rehearse movements without actually moving around and hurting ourselves. Of course, the system isn't perfect. When your muscles fail to turn off while you dream, you'll end up acting out your dream and sleep walking. This is particularly problematic for people with narcolepsy.

Speaking while awake: Speech is an incredibly complex process. Between a tenth and a third of a second before you begin to speak, activity starts up in the insula. This is where you plan the movements you'll need to successfully speak. These come in three main stages, which I like to call breathing, vibrating and tonguing. All speech comes from breath, so you need to inhale in preparation for speaking. Normal exhalation won't work for speaking, though--it's too fast--so you switch on your intercostal muscles, in the walls of your ribcage, to help your lungs empty more slowly. Next, you need to tighten your vocal folds as you force air through them. This makes them vibrate and gives you the actual sound of your voice. 
By putting different amounts of pressure on your vocal folds you can change your pitch or the quality of your voice. Finally, your mouth needs to manipulate the buzzing sound your vocal folds make to make the specific speech sounds you need. You might flick your tongue, bring your teeth to your lips, or open your soft palate so that air goes through your nose instead of your mouth. And voila! You're speaking.

Ok, so, it seems like sleep talking shouldn't really happen, then. When you're asleep your muscles are all turned off and they certainly don't seem up to the multi-stage process that is speech production. Besides, there's no need for us to be making speech movements anyway, right? Wrong. You actually use your speech planning processes even if you're not planning to speak aloud. I've already talked about the motor theory of speech perception, which suggests that we use our speech planning mechanisms to understand speech. And it's not just speech perception. When reading silently, we still plan out the speech movements we'd make if we were to read out loud (though the effect is smaller with more fluent readers). So you sometimes do all the planning work even if you're not going to say anything... and one of the times you do that is when you're asleep. Usually, your muscles are all turned off when you're asleep. But sometimes, especially in young children or people with PTSD, the system stops working as well. And if it happens to stop working when you're dreaming that you're talking and therefore planning out your speech movements? You start sleep talking.

Of course, all of this means that some of the things that we've all heard about sleep talking are actually myths. Admissions of guilt while asleep, for example, aren't reliable and aren't admissible in court. (Unless, of course, you really did put that purple beaver in the banana pudding.) It's also very common; about 50% of children talk in their sleep. 
Unless it's causing problems--like waking people you're sleeping with--sleep talking isn't generally problematic. But you can help reduce the severity by getting enough sleep (which is probably a good goal anyway), and avoiding alcohol and drugs.]]>
]]>
understanding speech is hard to model and the first model we discussed, motor theory, while it does address some problems, leaves something to be desired. The big one is that it doesn't suggest that the main fodder for perception is the acoustic speech signal. And that strikes me as odd. I mean, we're really used to thinking about hearing speech as an audio-only thing. Telephones and radios work perfectly well, after all, and the information you're getting there is completely audio. That's not to say that we don't use visual, or, heck, even tactile data in speech perception. The McGurk effect, where a voice saying ba dubbed over someone saying ga will be perceived as da or tha, is strong evidence that we can and do use our eyes during speech perception. And there's even evidence that a puff of air on the skin will change our perception of speech sounds. But we seem to be able to get along perfectly well without these extra sensory inputs, relying on acoustic data alone.

[caption id= align=alignnone width=512] This theory sounds good to me. Sorry, I'll stop.[/caption]

Ok, so... how do we extract information from acoustic data? Well, like I've said a couple of times before, it's actually a pretty complex problem. There's no such thing as invariance in the speech signal and that makes speech recognition monumentally hard. We tend not to think about it because humans are really, really good at figuring out what people are saying, but it's really very, very complex.

You can think about it like this: imagine that you're looking for information online about platypuses. Except, for some reason, there is no standard spelling of platypus. People spell it platipus, pladdypuss, plaidypus, plaeddypus or any of thirty or forty other variations. Even worse, one person will use many different spellings and may never spell it precisely the same way twice. 
Now, a search engine that worked like our speech recognition works would not only find every instance of the word platypus--regardless of how it was spelled--but would also recognize that every spelling referred to the same animal. Pretty impressive, huh? Now imagine that every word has a very variable spelling, oh, and there are no spaces between words--everythingisjustruntogetherlikethisinonelongspeechstream. Still not difficult enough for you? Well, there is also the fact that there are ambiguities. The search algorithm would need to treat pladypuss (in the sense of a plaid-patterned cat) and palattypus (in the sense of the venomous monotreme) as separate things. Ok, ok, you're right, it still seems pretty solvable. So let's add the stipulation that the program needs to be self-training and have an accuracy rate that's incredibly close to 100%. If you can build a program to these specifications, congratulations: you've just revolutionized speech recognition technology. But we already have a working example of a system that looks a heck of a lot like this: the human brain.

So how does the brain deal with the different spellings when we say words? Well, it turns out that there are certain parts of a word that are pretty static, even if a lot of other things move around. It's like a superhero reboot: Spider-Man is still going to be Peter Parker and get bitten by a spider at some point and then get all moody and whine for a while. A lot of other things might change, but if you're only looking for those criteria to figure out whether or not you're reading a Spider-Man comic you have a pretty good chance of getting it right. Those parts that are relatively stable and easy to look for we call cues. 
Since they're cues in the acoustic signal, we can be even more specific and call them acoustic cues.

If you think of words (or maybe sounds, it's a point of some contention) as being made up of certain cues, then it's basically like a list of things a house-buyer is looking for in a house. If a house has all, or at least most, of the things they're looking for, then it's probably the right house and they'll select that one. In the same way, having a lot of cues pointing towards a specific word makes it really likely that that word is going to be selected. When I say selected, I mean that the brain will connect the acoustic signal it just heard to the knowledge you have about a specific thing or concept in your head. We can think of a word as both this knowledge and the acoustic representation. So in the platypus example above, all the spellings started with p and had an l no more than one letter away. That looks like a pretty robust cue. And all of the words had a second p in them and ended with one or two tokens of s. So that also looks like a pretty robust cue. Add to that the fact that all the spellings had at least one of either a d or t in between the first and second p and you have a pretty strong template that would help you to correctly identify all those spellings as being the same word.

Which all seems to be well and good and fits pretty well with our intuitions (or mine at any rate). But that leaves us with a bit of a problem: those pesky parts of Motor Theory that are really strongly experimentally supported. And this model works just as well for motor theory, too: just treat the letters as specific gestures rather than acoustic cues. There seems to be more to the story than either the acoustic model or the motor theory model can offer us, though both have led to useful insights.]]>
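For readers who like to see things run: the platypus cue template from the acoustic-cue post above can be sketched in a few lines of Python. This is only a toy illustration of cue matching on spellings, not a model of real speech perception; the function names `cue_report` and `cue_score` and the way cues are counted are assumptions made up for this sketch.

```python
def cue_report(spelling: str) -> dict[str, bool]:
    """Check one spelling against the four cues from the platypus example."""
    word = spelling.lower()
    first_p = word.find("p")
    second_p = word.find("p", first_p + 1) if first_p != -1 else -1
    # Letters strictly between the first and second p (empty if no second p).
    between = word[first_p + 1:second_p] if second_p != -1 else ""
    return {
        "starts with p, l at most one letter later":
            word.startswith("p") and "l" in word[1:3],
        "has a second p": second_p != -1,
        "ends in s": word.endswith("s"),
        "d or t between the two p's": any(ch in "dt" for ch in between),
    }


def cue_score(spelling: str) -> int:
    """Count how many of the four cues a spelling satisfies."""
    return sum(cue_report(spelling).values())


# Spellings taken from the post, plus two non-platypus words for contrast.
SPELLINGS = ["platypus", "platipus", "pladdypuss", "plaidypus", "plaeddypus"]

for candidate in SPELLINGS + ["octopus", "plus"]:
    print(f"{candidate:12} {cue_score(candidate)}/4 cues")
```

Like the house-buyer's checklist, the more cues a candidate satisfies, the better a match it is. A score rather than a strict yes/no leaves room for a word to be selected even when one cue goes missing.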
previous two posts, modelling speech perception is an ongoing problem with a lot of hurdles left to jump. But there are potential candidate theories out there, all of which offer good insight into the problem. The first one I'm going to talk about is motor theory.

[caption id= align=alignnone width=256] So your tongue is like the motor body and the other person's ear is like the load cell...[/caption]

So motor theory has one basic premise and three major claims. The basic premise is a keen observation: we don't just perceive speech sounds, we also make them. Whoa, stop the presses. Ok, so maybe it seems really obvious, but motor theory was really the first major attempt to model speech perception that took this into account. Up until it was first posited in the 1960s, people had pretty much been ignoring that and treating speech perception like the only information listeners had access to was what was in the acoustic speech signal. We'll discuss that in greater detail later, but it's still pretty much the way a lot of people approach the problem. I don't know of a piece of voice recognition software, for example, that includes an anatomical model.

So what does the fact that listeners are also speakers get you? Well, remember how there aren't really invariant units in the speech signal? If you decide that what people are actually perceiving isn't a collection of acoustic markers that point to one particular language sound but instead the gestures needed to make up that sound, then suddenly that's much less of a problem. To put it another way, we're used to thinking of speech as being made up of a bunch of sounds, and that when we're listening to speech we're deciding what the right sounds are and from there picking the right words. But from a motor theory standpoint, what you're actually doing when you're listening to speech is deciding what the speaker's doing with their mouth and using that information to figure out what words they're saying. 
So in the dictionary in your head, you don't store words as strings of sounds but rather as strings of gestures. If you're like me when I first encountered this theory, it's about this time that you're starting to get pretty skeptical. I mean, I basically just said that what you're hearing is the actual movement of someone else's tongue and figuring out what they're saying by reverse engineering it based on what you know your tongue is doing when you say the same word. (Just FYI, when I say tongue here, I'm referring to the entire vocal tract in its multifaceted glory, but that's a bit of a mouthful. Pun intended. ;) ) I mean, yeah, if we accept this it gives us a big advantage when we're talking about language acquisition--since if you're listening to gestures, you can learn them just by listening--but still. It's weird. I'm going to need some convincing.

Well, let's get back to those three claims I mentioned earlier, which are taken from Galantucci, Fowler and Turvey's excellent review of motor theory.

1. Speech is a weird thing to perceive and pretty much does its own thing. I've talked about this at length, so let's just take that as a given for now.
2. When we're listening to speech, we're actually listening to gestures. We talked about that above.
3. We use our motor system to help us perceive speech.

Ok, so point three should jump out at you a bit. Why? Of these three points, it's the easiest one to test empirically. And since I'm a huge fan of empirically testing things (Science! Data! Statistics!) we can look into the literature and see if there's anything that supports this. Like, for example, a study that shows that when listening to speech, our motor cortex gets all involved. Well, it turns out that there are lots of studies that show this. You know that term active listening? 
There's pretty strong evidence that it's more than just a metaphor; listening to speech involves our motor system in ways that not all acoustic inputs do.

So point three is pretty well supported. What does that mean for point two? It really depends on who you're talking to. (Science is all about arguing about things, after all.) Personally, I think motor theory is really interesting and addresses a lot of the problems we face in trying to model speech perception. But I'm not ready to swallow it hook, line and sinker. I think Robert Remez put it best in the proceedings of Modularity and The Motor Theory of Speech Perception:

I think it is clear that Motor Theory is false. For the other, I think the evidence indicates no less that Motor Theory is essentially, fundamentally, primarily and basically true. (p. 179)

On the one hand, it's clear that our motor system is involved in speech perception. On the other, I really do think that we use parts of the acoustic signal in and of themselves. But we'll get into that in more depth next week.]]>
Ok, so, a couple weeks ago I talked about why speech perception was hard to model. Really, though, what I talked about was why building linguistic models is a hard task. There are a couple of other thorny problems that plague people who work with speech perception, and they have to do with the weirdness of the speech signal itself. It's important to talk about because it's on account of dealing with these weirdnesses that some theories of speech perception themselves can start to look pretty strange. (Motor theory, in particular, tends to sound pretty messed-up the first time you encounter it.)

The speech signal and the way we deal with it is really strange in two main ways: the speech signal doesn't contain invariant units, and we both perceive and produce speech in ways that are surprisingly non-linear.

So what are invariant units and why should we expect to have them? Well, pretty much everyone agrees that we store words as larger chunks made up of smaller chunks. Like, you know that the word beet is going to be made with the lips together at the beginning for the b and your tongue behind your teeth at the end for the t. And you also know that it will have certain acoustic properties; a short break in the signal followed by a small burst of white noise in a certain frequency range (that's the b again) and then a long steady state for the vowel and then another sudden break in the signal for the t. So people make those gestures and you listen for those sounds and everything's pretty straightforward, right? Weeellllll... not really.

It turns out that you can't really be grabbing onto certain types of acoustic cues because they're not always reliably there. There are a bunch of different ways to produce t, for example, that run the gamut from the way you'd say it by itself to something that sounds more like a w crossed with an r. When you're speaking quickly in an informal setting, there's no telling where on that continuum you're going to fall. 
Even with this huge array of possible ways to produce a sound, however, you still somehow hear it as t.

And even those cues that are almost always reliably there vary drastically from person to person. Just think about it: about half the population has a fundamental frequency, or pitch, that's pretty radically different from the other half. The old interplay of biological sex and voice quality thing. But you can easily, effortlessly even, correct for the speaker's gender and understand the speech produced by men and women equally well. And if a man and woman both say beet, you have no trouble telling that they're saying the same word, even though the signal is quite different in both situations. And that's not a trivial task. Voice recognition technology, for example, which is overwhelmingly trained on male voices, often has a hard time understanding women's voices. (Not to mention different accents. What that says about regional and sex-based discrimination is a topic for another time.)

And yet. And yet humans are very, very good at recognizing speech. How? Well, linguists have made some striking progress in answering that question, though we haven't yet arrived at an answer that makes everyone happy. And the variance in the signal isn't the only hurdle facing humans as they recognize the vocal signal: there's also the fact that being human limits what we can hear.

[caption id= align=alignnone width=512] Ooo, pretty rainbow. Thorny problem, though: this shows how we hear various frequencies better or worse. The sweet spot is right around 3 kHz or so. Which, coincidentally, just so happens to be where we produce most of the noise in the speech signal. But we do still produce information at other frequencies and we do use that in speech perception: particularly for sounds like s and f.[/caption]

We can think of the information available in the world as a sheet of cookie dough. 
This includes things like UV light and sounds below 0 dB in intensity. Now imagine a cookie-cutter. Heck, make it a gingerbread man. The cookie-cutter represents the ways in which the human body limits our access to this information. There are just certain things that even a normal, healthy human isn't capable of perceiving. We can only hear the information that falls inside the cookie cutter. And the older we get, the smaller the cookie-cutter becomes, as we slowly lose sensitivity in our auditory and visual systems. This makes it even more difficult to perceive speech. Even though it seems likely that we've evolved our vocal system to take advantage of the way our perceptual system works, it still makes the task of modelling speech perception even more complex.]]>
Userdesign asked me to review their newest volume, Punctuation..?, and I was happy to oblige. Linguists rarely study punctuation (it falls under the sub-field orthography, or the study of writing systems) but what we do study is the way that language attitudes and punctuation come together. I've written before about language attitudes when it comes to grammar instruction and the strong prescriptive attitudes of most grammar instruction books. What makes this book so interesting is that it is partly prescriptive and partly descriptive. Since a descriptive bent in a grammar instruction manual is rare, I thought I'd delve into that a bit.

[caption id=attachment_889 align=aligncenter width=523] Image copyright Userdesign, used with permission. (Click for link to site.)[/caption]

So, first of all, how about a quick review of the difference between a descriptive and prescriptive approach to language?

Descriptive: This is what linguists do. We don't make value or moral judgments about languages or language use, we just say what's going on as best we can. You can think of it like an anthropological ethnography: we just describe what's going on.

Prescriptive: This is what people who write letters to the Times do. They have a very clear idea of what's right and wrong with regards to language use and are all too happy to tell you about it. You can think of this like a book of manners: it tells you what the author thinks you should be doing.

As a linguist, my relationship with language is mainly scientific, so I have a clear preference for a descriptive stance. An ichthyologist doesn't tell octopi, "No, no, no, you're doing it all wrong!" after all. At the same time, I live in a culture which has very rigid expectations for how an educated individual should write and sound, and if I want to be seen as an educated individual (and be considered for the types of jobs only open to educated individuals) you better believe I'm going to adhere to those societal standards. 
The problem comes when people have a purely prescriptive idea of what grammar is and what it should be. That can lead to nasty things like linguistic discrimination. I.e., language B (and thus all those individuals who speak language B) is clearly inferior to language A because they don't do things properly. Since I think we can all agree that unfounded discrimination of this type is bad, you can see why linguists try their hardest to avoid value judgments of languages.

As I mentioned before, this book is a fascinating mix of prescriptive and descriptive snippets. For example, the author says this about exclamation points: In everyday writing, the exclamation mark is often overused in the belief that it adds drama and excitement. It is, perhaps the punctuation mark that should be used with the most restraint (p. 19). Did you notice that "should"? Classic marker of a prescriptivist claiming their territory. But then you have this about guillemets: Guillemets are used in several languages to indicate passages of speech in the same way that single and double quotation marks ('' ) are used in the English language (p. 22). (Guillemets look like this, since I know you were wondering: « and ».) See, that's a classic description of what a language does, along with parallels drawn to other, related languages. It may not seem like much, but try to find a comparably descriptive stance in pretty much any widely-distributed grammar manual. And if you do, let me know so that I can go buy a copy of it. It's change, and it's positive change, and I'm a fan of it. Is this an indication of a sea-change in grammar manuals? I don't know, but I certainly hope so.

Overall, I found this book fascinating (though not, perhaps, for the reasons the author intended!). Particularly because it seems to stand in contrast to the division that I just spent this whole post building up. 
It's always interesting to see the ways that stances towards language can bleed and melt together, for all that linguists (and I include myself here) try to show that there's a nice, neat dividing line between the evil, scheming prescriptivists and the descriptivists in their shining armor here to bring a veneer of scientific detachment to our relationship with language. Those attitudes can and do co-exist. Data is messy. Language is complex. Simple stories (no matter how pretty we might think them) are suspicious. But these distinctions can be useful, and I'm willing to stand by the descriptivist/prescriptivist distinction, even if it's harder than you might think to put people in one camp or the other.

But beyond being an interesting study in language attitudes, it was a fun read. I learned lots of neat little factoids, which is always a source of pure joy for me. (Did you know that this symbol: ¶ is called a Pilcrow? I know, right? I had no idea either; I always just called it the paragraph mark.)]]>
Not only can she discriminate velar, uvular and pharyngeal fricatives with 100% accuracy, but she can also do it in heels.[/caption]

No, not really. (I wish that was a job...) I'm talking about a scientific model of how humans perceive speech sounds. If you've ever taken an introductory science class, you already have some experience with scientific models. All of Newton's equations are just a way of generalizing general principles generally across many observed cases. A good model has both explanatory and predictive power. So if I say, for example, that force equals mass times acceleration, then that should fit with any data I've already observed as well as accurately describe new observations. "Yeah, yeah," you're saying to yourself, "I learned all this in elementary school. Why are you still going on about it?" Because I really want you to appreciate how complex this problem is.

Let's take an example from an easier field, say, classical mechanics. (No offense physicists, but y'all know it's true.) Imagine we want to model something relatively simple. Perhaps we want to know whether a squirrel who's jumping from one tree to another is going to make it. What do we need to know? And none of that "assume the squirrel is a sphere and there's no air resistance" stuff, let's get down to the nitty-gritty. We need to know the force and direction of the jump, the locations of the trees, how close the squirrel needs to get to be able to hold on, what the wind's doing, air resistance and how that will interplay with the shape of the squirrel, the effects of gravity... am I missing anything? I feel like I might be, but that's most of it.

So, do you notice something that all of these things we need to know the values of have in common? Yeah, that's right, they're easy to measure directly. Need to know what the wind's doing? Grab your anemometer. Gravity? To the accelerometer closet! How far apart the trees are? It's yardstick time. 
We need a value, we measure a value, we develop a model with good predictive and explanatory power (You'll need to wait for your simulations to run on your department's cluster. But here's one I made earlier so you can see what it looks like. Mmmm, delicious!) and you clean up playing the numbers on the professional squirrel-jumping circuit.

Let's take a similarly simple problem from the field of linguistics. You take a person, sit them down in a nice anechoic chamber*, plop some high quality earphones on them and play a word that could be bite and could be bike and ask them to tell you what they heard. What do you need to know to decide which way they'll go? Well, assuming that your stimulus is actually 100% ambiguous (which is a little unlikely) there are a ton of factors you'll need to take into account. Like, how recently and often has the subject heard each of the words before? (Priming and frequency effects.) Are there any social factors which might affect their choice? (Maybe one of the participant's friends has a severe overbite, so they just avoid the word bite altogether.) Are they hungry? (If so, they'll probably go for bite over bike.) And all of that assumes that they're a native English speaker with no hearing loss or speech pathologies and that the speaker's voice is the same as theirs in terms of dialect, because all of that'll bias the listener as well.

The best part? All of this is incredibly hard to measure. In a lot of ways, human language processing is a black box. We can't mess with the system too much and taking it apart to see how it works, in addition to being deeply unethical, breaks the system. The best we can do is tap a hammer lightly against the side and use the sounds of the echoes to guess what's inside. And, no, brain imaging is not a magic bullet for this. 
It's certainly a valuable tool that has led to a lot of insights, but it's incredibly expensive (MRI is easily more than a grand per participant and no one has ever accused linguistics of being a field that rolls around in money like a dog in fresh-cut grass), and we really need to resist the urge to rely too heavily on brain imaging studies, as a certain dead salmon taught us.

But! Even though it is deeply difficult to model, there has been a lot of really good work done towards a theory of speech perception. I'm going to introduce you to some of the main players, including:

Motor theory
Acoustic/auditory theory
Double-weak theory
Episodic theories (including Exemplar theory!)

Don't worry if those all look like menu options in an Ethiopian restaurant (and you with your Amharic phrasebook at home, drat it all); we'll work through them together. Get ready for some mind-bending, cutting-edge stuff in the coming weeks. It's going to be [fʌn] and [fʌnetɪk]. :D

*Anechoic chambers are the real chambers of secrets.]]>
an entire genre that fuses the two. Since this is a linguistics blog, however, I'm going to be discussing the vocal similarities and differences between the two. The most obvious difference between a male metal vocalist growling and a female operatic vocalist singing is pitch. Pitch, an acoustic measure of frequency (how often a wave oscillates between its peak and trough), is perceived as the

[caption id= align=alignnone width=256] Vyacheslav Mayer as Geralt of Rivia in the rock opera A Road Without Return based on The Witcher video games and novels.[/caption]]]>
The night before last I had the good fortune to see Geoff Pullum, noted linguist and linguistics blogger, give a talk entitled: The scandal of English grammar teaching: Ignorance of grammar, damage to writing skills, and what we can do about it. It was an engaging talk and clearly showed that many of the grammar rules taught in English language and composition courses have little to no bearing on how the English language is actually used. Some of the bogeyman rules (his term) that he lambasted included the interdiction against ending a sentence in a preposition, the notion that since can only refer to the passage of time and not causality and the claim that only which can begin a restrictive clause. Counterexamples for all of these grammar rules are easy to find, both in written and spoken language. (If you're interested in learning more, check out Geoff Pullum on Language Log.)

[caption id= align=alignnone width=512] And then the python ate little Johnny because he had the gall to cheekily split his infinitives.[/caption]

So there's a clear problem here. Rules that have no bearing on linguistic reality are being used as the backbone of grammar instruction, just as they have been for over two hundred years. Meanwhile, the investigation of human language has advanced considerably. We know much more about the structure of language now than we did when E. B. White was writing his grammar guide. It's linguistic inquiry that has led to better speech therapy, speech recognition and synthesis programs and better foreign language teaching. Grammar, on the other hand, has led to little more than frustration and an unsettling elitism. (We all know at least one person who uses their knowledge of correct usage as a weapon.) So what can be done about it? Well, I propose that instead of traditional grammar, we teach grammar as linguists understand it.
What's the difference?

Traditional grammar: A variety of usage and style rules that are based on social norms and a series of historic accidents.

Linguistic grammar: The set of rules which can accurately describe a native speaker's knowledge of their language.

I'm not the first person to suggest a linguistics education as a valuable addition to the pre-higher educational experience. You can read proposals and arguments from others here, here, and here, and an argument for more linguistics in higher education here.

So, why would you want to teach linguistic grammar? After all, by the time you're five or six, you already have a pretty good grasp of your language. (Not a perfect one, as it turns out; things like the role of stress in determining the relationship between words in a phrase tend to come in pretty late in life.) Well, there are lots of reasons.

Linguistic grammar is the result of scientific inquiry and is empirically verifiable. This means that lessons on linguistic grammar can take the form of experiments and labs rather than memorizing random rules.
Linguistic grammar is systematic. This can appeal to students who are gifted at math and science but find studying language more difficult.
Linguistic grammar is a good way to gently introduce higher level mathematics. Semantics, for example, is a good way to introduce set theory or lambda calculus.
Linguistic grammar is immediately applicable for students. While it's difficult to find applications for oceanology for students who live in Kansas, everyone uses language every day, giving students a multitude of opportunities to apply and observe what they've learned.
Linguistic grammar shows that variation between different languages and dialects is systematic, logical and natural. This can help reduce the linguistic prejudice that speakers of certain languages or dialects face.
Linguistic grammar helps students in learning foreign languages.
For example, by increasing students' phonetic awareness (that's their awareness of language sounds) and teaching them how to accurately describe and produce sounds, we can avoid the frustration of not knowing what sound they're attempting to produce and its relation to sounds they already know.
Knowledge of linguistic grammar, unlike traditional grammar, is relatively simple to evaluate. Since much of introductory linguistics consists of looking at data sets and constructing rules that would generate that data set, and these rules are either correct or not, it is easier to determine whether or not the student has mastered the concepts.

I could go on, but I think I'll leave it here for now. The main point is this: teaching linguistics is a viable and valuable way to replace traditional grammar education. What needs to happen for linguistic grammar to supplant traditional grammar? That's a little thornier. At the very least, teachers need to receive linguistic training and course materials appropriate for various ages need to be developed. A bigger problem, though, is a general lack of public knowledge about linguistics. That's part of why I write this blog: to let you know about what's going on in a small but very productive field. Linguistics has a lot to offer, and I hope that in the future more and more people will take us up on it. ]]>
should be used, are called prescriptive. I'm not going to talk that much more about it here; if you're interested, Language Log and Language Hippie both discuss the issue at length. The reason that I bring this up is that prescriptive rules tend to favor older forms. (And occasionally forms from other languages. That whole don't split an infinitive thing? Based on Latin. English speakers have been happily splitting infinitives since the 13th century, and I imagine we'll continue to boldly split them for centuries to come.) There is, however, one glaring exception: the whole [ask] vs. [aks] debate.

[caption id= align=alignnone width=512] In a way, it's kinda like Theseus' paradox or Abe Lincoln's axe. If you replace all the sounds in a word one by one, is it the same word at the end of the process as it was in the beginning?[/caption]

Historically, it's [aks], the homophone of the chopping tool pictured above, that has precedence. Let's take a look at the Oxford English Dictionary's take on the history of the word, shall we?

The original long á gave regularly the Middle English (Kentish) oxi ; but elsewhere was shortened before the two consonants, giving Middle English a , and, in some dialects, e . The result of these vowel changes, and of the Old English metathesis asc- , acs- , was that Middle English had the types ox , ax , ex , ask , esk , ash , esh , ass , ess . The true representative of the orig. áscian was the s.w. and w.midl. ash , esh , also written esse (compare æsce ash n.1, wæsc(e)an wash n.), now quite lost. Acsian, axian, survived in ax, down to nearly 1600 the regular literary form, and still used everywhere in midl. and southern dialects, though supplanted in standard English by ask, originally the northern form. Already in 15th cent. the latter was reduced dialectally to asse, past tense ast, still current dialectally.*

So, [aks] was the regular literary form (i.e.
the one you would have been taught to say in school if you were lucky enough to have gone to school) until 1600 or so? Ok, so, if older forms are better, then that should be the right one. Right? Well, let's see what Urban Dictionary has to say on the matter, since that tends to be a pretty good litmus test of language attitudes.

What retards say when they don't know how to pronounce the word ask. -- User marcotte on Urban Dictionary, top definition

Oh. Sorry, Chaucer, but I'm going to have to inform you that you were a retard who didn't know how to pronounce the word ask. Let's unpack what's going on here a little bit, shall we? There's clearly a disconnect between the linguistic facts and language attitudes.

Facts: these two forms have both existed for centuries, and [aks] was considered the correct form for much of that time.

Language attitude: [aks] is not only wrong, it reflects negatively on those people who use it, making them sound less intelligent and less educated.

This is probably (at least in America) tangled in with the fact that [aks] is a marker of African American English. Even within the African American community, the form is stigmatized. Oprah, for example, who often uses markers of African American English (especially when speaking with other African Americans) almost never uses [aks] for [ask]. So the idea that [aks] is the wrong form and that [ask] is correct is based on a social construction of how an intelligent, educated individual should speak. It has nothing to do with the linguistic qualities of the word itself.
(For a really interesting discussion of how knowledge of linguistic forms is acquired by children and the relationship between that and animated films, see Lippi-Green's chapter Teaching children to discriminate from English with an Accent: Language ideology and discrimination in the United States here.)

Now, the interesting thing about these forms is that they both have phonological pressures pushing English speakers towards using them. That's because [s] has a special place in English phonotactics. In general, you want the sounds that are the most sonorant nearer the center of a syllable. And [s] is more sonorant than [k], so it seems like [ask] should be the favored form. But, like I said, [s] is special. In special, for example, it comes at the very beginning of the word, before the less-sonorant [p]. And all the really long syllables in English, like strengths, have [s] on the end. So the special status of [s] seems to favor [aks]. The fact that each form can be modeled perfectly well based on our knowledge of the way English words are formed helps to explain why both forms continue to be actively used, even centuries after they emerged. And, who knows? We might decide that [aks] is the correct form again in another hundred years or so. Try and keep that in mind the next time you talk about the right and wrong ways to say something.

* ask, v.. OED Online. December 2012. Oxford University Press. 12 February 2013 <http://www.oed.com.offcampus.lib.washington.edu/view/Entry/11507>.]]>
diachronic semantics. That's the study of how word meanings change over time. Some of these changes are relatively easy to track. A mouse to a farmer in 1900 was a small rodent with unfortunate grain-pilfering proclivities. To a farmer today, it's also one of the tools she uses to interact with her computer. The word has gained a new semantic sense without losing its original meaning. Sometimes, however, you have a weird little dance where a couple of words are negotiating over the same semantic space--that's another way of saying a related group of concepts that a language groups together--and that's where things get interesting. Cup, mug and glass are engaged in that little dance-off right now (at least in American English). Let's see how they're doing, shall we?

[caption id= align=alignnone width=512] Cup? Glass? Jug? Mug? Why don't we just call them all drinking vessels and be done with it?[/caption]

Cup: Ok, quick question for you: does a cup have to have a handle? The Oxford dictionaries say yes, but I really think that's out of date at this point. Dr. Reed pointed out that this was part of her criteria for whether something could be called a cup or not, but that a lot of younger speakers no longer make that distinction. In fact, recently I noticed that someone of my acquaintance uses cup to refer only to disposable cups. Cup also has the distinct advantage of being part of a lot of phrases: World cup, Stanley cup, cup of coffee, teacup, cuppa, cup of sugar, in your cups, and others that I can't think of right now.

So cup is doing really well, and gaining semantic ground.

Glass: Glass, on the other hand, isn't doing as well. I haven't yet talked to someone who can use glass to refer to drinking vessels that aren't actually made of glass including, perhaps a little oddly, clear disposable cups. On the other hand, there are some types of drinking vessels that I can only refer to as glasses.
Mainly those for specific types of alcohol: wine glass, shot glass, martini glass, highball glass (though I've heard people referring to the glass itself just as a highball, so this might be on the way out). There are alcohol-specific pieces of glassware that don't count as glasses though--e.g. champagne flute, brandy snifter--so it's not a categorical distinction by any means.

Glass seems to be pretty stable, but if cup continues to become broader and broader it might find itself on the outs.

Mug: I don't have as much observational data on this one, but there seems to be another shift going on here. Mug originally referred only to drinking vessels that were larger than cups (see below), and still had handles.

[caption id= align=alignnone width=512] Note that the smaller ones on top are cups and the larger ones on the bottom are labelled as mugs.[/caption]

Most people call those insulated drinking vessels with the attached lids travel mugs rather than travel cups (640,000 Google hits vs. 22,400) but I find myself calling them cups instead. I think it's because 1) I pattern it with disposable coffee cups and 2) I find handledness is a necessary quality for mugs. I can call all of the drinking vessels in the picture above mugs and prefer mug to cup.

So, at least for me, mug is beginning to take over the semantic space allotted to cup by older speakers.

Of course, this is a very cursory, impressionistic snapshot of the current state of the semantic space. Without more robust data I'm hesitant to make concrete predictions about the ways in which these terms are negotiating their semantic space, but there's definitely some sort of drift going on.]]>
not scientific. Rigorous, yes. Scientific, no.

[caption id= align=alignnone width=512] Hmm, I dunno. Looks science-y, but I don't see any lab coats. Or goggles. There should definitely be more goggles.[/caption]

This subject is particularly near and dear to me because my own research looks into, among other things, how the ways in which linguists gather data affect the data they gather and the potential for systematic bias that introduces. In order to look at how we do things, I also need to know why. And that's where this discussion of science comes in. This can be a hard discussion to have, however, since conversations about what science is, or should be, tend to get muddied by the popular conception of science. I'm not saying people don't know what science is, 'cause I think most people do, just that we (and I include myself in that) also have a whole bucketful of other socially-motivated ideas that we tend to lump in with science.

I'm going to call the social stuff that we've learned to associate with science The Science Mystique. I'm not the first person to call it that, but I think it's fitting. (Note that if you're looking for the science of Mystique, you'll need to look elsewhere.) To begin our exploration of the Science Mystique, let's start with a quote from another popular science writer, Phil Plait.

They [the scientists who made the discoveries discussed earlier in the speech] used physics. They used math. They used chemistry, biology, astronomy, engineering.

They used science.

These are all the things you discovered doing your projects. All the things that brought you here today.

Computers? Cell phones? Rockets to Saturn, probes to the ocean floor, PSP, gamecubes, gameboys, X-boxes? All by scientists.

Those places I talked about before? You can get to know them too.
You can experience the wonder of seeing them for the first time, the thrill of discovery, the incredible, visceral feeling of doing something no one has ever done before, seen things no one has seen before, know something no one else has ever known.

No crystal balls, no tarot cards, no horoscopes. Just you, your brain, and your ability to think.

Welcome to science. You're gonna like it here.

Inspirational! Science-y! Misleading! Wait, what?

So there are a couple things here that I find really troubling, and I'm just going to break them down and go through them one by one. These are things that are part of the science mystique, that permeate our cultural conception of what science is, and I've encountered them over and over and over again. I'm just picking on this particular speech because it's been slathered all over the internet lately and I've encountered a lot of people who really resonated with its message.

Science and engineering and math are treated as basically the same thing. This. This is one of my biggest pet peeves when it comes to talking about science. Yes, I know that STEM fields (that's Science, Technology, Engineering and Mathematics) are often lumped together. Yes, I know that there's a lot of cross-pollination. But one, and only one, of these fields has as its goal the creation of testable models. And that's science. The goal of engineering is to make stuff. And I know just enough math to know that there's no way I know what the goal of mathematics is. The takeaway here is that, no matter how science-y they may seem, how enfolded they are into the science mystique, neither math nor engineering is a science.

There's an insinuation that science = thinking and non-science = NOT thinking. This is really closely tied in with the idea that you have to be smart to be a scientist. False. Absolutely false. In fact, raw intelligence isn't even on my list of the top five qualities you need to be a scientist: Passion.
You need to love what you do, because otherwise being in grad school for five to ten years while living under the poverty line and working sixty-hour weeks just isn't worth it. Dedication. See above. Creativity. Good scientists ask good questions, and coming up with a good but answerable question that no one has asked before and that will help shed new light on whatever it is you're studying takes lateral thinking. Excellent time management skills. Particularly if you're working in a university setting. You need to be able to balance research, teaching and service, all while still maintaining a healthy life. It's hard. Intelligibility. A huge part of science is taking very complex concepts and explaining them clearly. To your students. To other scientists. To people on the bus. To people on the internet (Hi guys!). You can have everything else on this list in spades, but if you can't express your ideas you're going to sink like a lead duck.

Science is progress! Right? Right? Yes. Absolutely. There is no way in which science has harmed the human race and no way in which things other than science have aided it. It sounds really silly when you just come out and say it, doesn't it? I mean, we have the knowledge to eradicate polio, but because of social and political factors it hasn't happened yet. And you can't solve social problems by just throwing science at them. And then there's the fact that, while the models themselves may be morally neutral, the uses to which they are put are not always so. See Einstein and the bomb. See chemical and biological warfare. And, frankly, I think the greatest advances of the 20th century weren't in science or engineering or technology. They were deep-seated changes in how we, particularly Americans, treated people. My great-grandmother couldn't go to high school because she was a woman. My mother couldn't take college-level courses because she was a woman, though she's currently working on her degree.
Now, I'm a graduate student and my gender is almost completely irrelevant. Segregation is over. Same sex relationships are legally acknowledged by nine states and DC. That's the progress I would miss most if a weeping angel got me.

Go quantitative or go home. I've noticed a strong bias towards quantitative data, to the point that a lot of people argue that it's better than qualitative data. I take umbrage at this. Quantitative data is easier, not necessarily better. Easier? Absolutely. It's easier to get ten people to agree that a banana is ten inches than it is to get them to agree that it's tasty. And yet, from a practical standpoint, banana growers want to grow tastier bananas, ones that will ship well and sell well, not longer bananas. But it can be hard to plug banana tastiness into your mathematical models and measuring tastiness leaves you open to criticism that your data collection is biased. (That's not to say that qualitative data can't be biased.) This idea that quantitative data is better leads to an overemphasis on the type of questions that can best be answered quantitatively and that's a problem. This also leads some people to dismiss the squishy sciences that use mainly qualitative data and that's also a problem. All branches of science help us to shed new light on the world and universe around us and to ignore work because it doesn't fit the science mystique is a grave mistake.

So what can we do to help lessen the effects of these biases? To disentangle the science mystique from the actual science? Well, the best thing we can do is be aware of it. Critically examine the ways people talk about science. Closely examine your own biases. I, for example, find it far too easy to slip into the quantitative is better trap. Notice systematic similarities and question them. Science is, after all, about asking questions.]]>
The Selfish Gene. He used the term to describe cultural ideas that are transmitted from individual to individual much like a virus or bacteria. The science mystique I've written about is a great example of a meme of this type. If you have fifteen minutes, I suggest Dan Dennett's TED talk on the subject of memes as a much more thorough introduction.

[youtube http://www.youtube.com/watch?v=KzGjEkp772s?rel=0]

So what about the internet part? Well, internet memes tend to be a bit narrower in their scope. Viral videos, for example, seem to be a separate category from internet memes even though they clearly fit into Dawkins's idea of what a meme is. Generally, internet meme refers to a specific image and text that is associated with that image. These are generally called image macros. (For a thorough analysis of emerging and successful internet memes, as well as an excellent object lesson in why you shouldn't scroll down to read the comments, I suggest Know Your Meme.) It's the text that I'm particularly interested in here.

Memes which involve language require that it be used in a very specific way, and failure to obey these rules results in social consequences. In order to keep this post a manageable size, I'm just going to look at the use of language in the two most popular image memes, as ranked by memegenerator.net, though there is a lot more to study here. (I think a study of the differing uses of the initialisms MRW [my reaction when] and MFW [my face when] on imgur and 4chan would show some very interesting patterns in the construction of identity in the two communities. Particularly since the 4chan community is made up of anonymous individuals and the imgur community is made up of named individuals who are attempting to gain status through points. But that's a discussion for another day...)

[caption id=attachment_791 align=aligncenter width=523] The God tier (i.e. most popular) characters on the website Meme Generator as of February 23rd, 2013. Click for link to site.
If you don't recognize all of these characters, congratulations on not spending all your free time on the internet.[/caption]

Without further ado, let's get to the grammar. (I know y'all are excited.)

Y U No

This meme is particularly interesting because its page on Meme Generator already has a grammatical description.

The Y U No meme actually began as Y U No Guy but eventually evolved into simply Y U No, the phrase being generally followed by some often ridiculous suggestion. Originally, the face of Y U No guy was taken from Japanese cartoon Gantz Chapter 55: Naked King, edited, and placed on a pink wallpaper. The text for the item reads I TXT U
Y U NO TXTBAK?! It appeared as a Tumblr file, garnering over 10,000 likes and reblogs.

It went totally viral, and has morphed into hundreds of different forms with a similar theme. When it was uploaded to MemeGenerator in a format that was editable, it really took off. The formula used was: (X, subject noun), [WH]Y [YO]U NO (Y, verb)?. [Bold mine.]

A pretty good try, but it can definitely be improved upon. There are always two distinct groupings of text in this meme, always in Impact font, white with a black border and in all caps. This is pretty consistent across all image macros. In order to indicate the break between the two text chunks, I will use -- throughout this post. The chunk of text that appears above the image is a noun phrase that directly addresses someone or something, often a famous individual or corporation. The bottom text starts with Y U NO and finishes with a verb phrase. The verb phrase is an activity or action that the addressee from the first block of text could or should have done, and that the meme creator considers positive. It is also inflected as if Y U NO were structurally equivalent to Why didn't you. So, since you would ask Steve Jobs Why didn't you donate more money to charity?, a grammatical meme to that effect would be STEVE JOBS -- Y U NO DONATE MORE MONEY TO CHARITY. In effect, this meme asks someone or something that had the agency to do something positive why they chose not to do that thing. While this certainly has the potential to be a vehicle for social commentary, like most memes it's mostly used for comedic effect. Finally, there is some variation in the punctuation of this meme. While no punctuation is the most common, an exclamation point, a question mark or both are all used. I would hypothesize that the use of punctuation varies between internet communities...
but I don't really have the time or space to get into that here.

[caption id= align=alignnone width=400] A meme (created by me using Meme Generator) following the guidelines outlined above.[/caption]

Futurama Fry

This meme also has a brief grammatical analysis:

The text surrounding the meme picture, as with other memes, follows a set formula. This phrasal template goes as follows: Not sure if (insert thing), with the bottom line then reading or just (other thing). It was first utilized in another meme entitled I see what you did there, where Fry is shown in two panels, with the first one with him in a wide-eyed expression of surprise, and the second one with the familiar half-lidded expression.

As an example of the phrasal template, Futurama Fry can be seen saying: Not sure if just smart
. Or British. Another example would be Not sure if highbeams
or just bright headlights. The main form of the meme seems to be with the text Not sure if trolling or just stupid.

This meme is particularly interesting because there seems to be an extremely rigid syntactic structure. The phrase follows the form NOT SURE IF _____ -- OR _____. The first blank can either be filled by a complete sentence or a subject complement while the second blank must be filled by a subject complement. Subject complements, also called predicates (But only by linguists; if you learned about predicates in school it's probably something different. A subject complement is more like a predicate adjective or predicate noun.), are everything that can come after a form of the verb to be in a sentence. So, in a sentence like It is raining, raining is the subject complement. So, for the Futurama Fry meme, if you wanted to indicate that you were uncertain whether it was raining or sleeting, both of these forms would be correct:

NOT SURE IF IT'S RAINING -- OR SLEETING
NOT SURE IF RAINING -- OR SLEETING

Note that, if a complete sentence is used and abbreviation is possible, it must be abbreviated. Thus the following sentence is not a good Futurama Fry sentence:

*NOT SURE IF IT IS RAINING -- OR SLEETING

This is particularly interesting because the phrasal template description does not include this distinction, but it is quite robust. This is a great example of how humans notice and perpetuate linguistic patterns that they aren't necessarily aware of.

[caption id= align=alignnone width=400] A meme (created by me using Meme Generator) following the guidelines outlined above. If you're not sure whether it's phonetics or phonology, may I recommend this post as a quick refresher?[/caption]

So this is obviously very interesting to a linguist, since we're really interested in extracting and distilling those patterns. But why is this useful/interesting to those of you who aren't linguists? A couple of reasons.
I hope you find it at least a little interesting and that it helps to enrich your knowledge of your experience as a human. Our capacity for patterning is so robust that it affects almost every aspect of our existence and yet it's easy to forget that, to let our awareness of that slip out of our conscious minds. Some patterns deserve to be examined and criticized, though, and linguistics provides an excellent low-risk training ground for that kind of analysis.
If you are involved in internet communities I hope you can use this new knowledge to avoid the social consequences of violating meme grammars. These consequences can range from a gentle reprimand to mockery and scorn. The gatekeepers of internet culture are many, vigilant and vicious.
As with much linguistic inquiry, accurately noting and describing these patterns is the first step towards being able to use them in a useful way. I can think of many uses, for example, of a program that did large-scale sentiment analyses of image macros but was able to determine which were grammatical (and therefore more likely to be accepted and propagated by internet communities) and which were not.]]>
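For the programmers in the audience, here is a very rough sketch of what the grammaticality half of such a program might look like for the two memes described above. It only checks surface shape (all caps, the -- break between top and bottom text, the fixed opening words, and the must-abbreviate rule for Futurama Fry); it makes no attempt to confirm that a chunk really is a noun phrase, verb phrase or subject complement, which would take an actual parser. The function names and the short list of uncontracted forms are my own assumptions for illustration, not part of any existing tool.

```python
import re

SEPARATOR = " -- "  # this post's notation for the break between top and bottom text

# Uncontracted subject + "to be" sequences that should have been abbreviated.
# This short list is an illustrative assumption, not an exhaustive inventory.
UNCONTRACTED = re.compile(
    r"\b(IT IS|HE IS|SHE IS|THAT IS|I AM|YOU ARE|WE ARE|THEY ARE)\b"
)


def split_macro(text):
    """Return (top, bottom) if text is all caps with exactly one break, else None."""
    if text != text.upper() or text.count(SEPARATOR) != 1:
        return None
    top, bottom = (part.strip() for part in text.split(SEPARATOR))
    return (top, bottom) if top and bottom else None


def is_y_u_no(text):
    """Top text: some addressee. Bottom text: 'Y U NO' followed by more words.

    Punctuation is not checked, since it varies from one macro to the next.
    """
    parts = split_macro(text)
    if parts is None:
        return False
    _, bottom = parts
    return bottom.startswith("Y U NO ")


def is_futurama_fry(text):
    """Top text: 'NOT SURE IF ...'. Bottom text: 'OR ...'. No uncontracted forms on top."""
    parts = split_macro(text)
    if parts is None:
        return False
    top, bottom = parts
    if not top.startswith("NOT SURE IF ") or not bottom.startswith("OR "):
        return False
    return UNCONTRACTED.search(top) is None


if __name__ == "__main__":
    for macro in [
        "STEVE JOBS -- Y U NO DONATE MORE MONEY TO CHARITY",
        "NOT SURE IF IT'S RAINING -- OR SLEETING",
        "NOT SURE IF RAINING -- OR SLEETING",
        "NOT SURE IF IT IS RAINING -- OR SLEETING",
    ]:
        print(macro, "|", is_y_u_no(macro), is_futurama_fry(macro))
```

A surface check like this would wave through plenty of nonsense (anything after Y U NO counts, verb phrase or not), so treat it as a first filter at best. The interesting part is the last rule: the ban on IT IS is exactly the kind of pattern speakers follow without ever being told about it.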
Short answer: they're all correct (at least in the United States) but some are more common in certain dialectal areas. Here's a handy-dandy map, in case you were wondering:

[caption id= align=alignnone width=757] Maps! Language! Still one of my favorite combinations. This particular map, and the data collection it's based on is courtesy of popvssoda.com. Click picture for link and all the lovely statistics. (You do like statistics, right?)[/caption]

Long answer: I'm going to sort this into reactions I tend to get after answering questions like this one.

What do you mean they're all correct? Coke/Soda/Pop is clearly wrong.

Ok, I'll admit, there are certain situations when you might need to choose to use one over the other. Say, if you're writing for a newspaper with a very strict style guide. But otherwise, I'm sticking by my guns here: they're all correct. How do I know? Because each of them is in current usage, and there is a dialectal group where it is the preferred term. Linguistics (at least the type of linguistics that studies dialectal variation) is all about describing what people actually say and people actually say all three.

But why doesn't everyone just say the same thing? Wouldn't that be easier?

Easier to understand? Probably, yes. But people use different words for the same thing for the same reasons that they speak different languages. In a very, very simplified way, it kinda works like this:

You tend to speak like the people that you spend time with. That makes it easier for you to understand each other and lets other people in your social group know that you're all members of the same group. Like team jerseys.
Over time, your group will introduce or adopt new linguistic markers that aren't necessarily used by the whole population. Maybe a person you know refers to sodas as phosphates because his grandfather was a soda jerk and that form really catches on among your friends.
As your group keeps using and adopting new words (or sounds, or grammatical markers or any other facet of language) that are different from other groups their language slowly begins to drift away from the language used by other groups. Eventually, in extreme cases, you end up with separate languages. (Like what happened with Latin: different speech communities ended up speaking French, Italian, Spanish, Portuguese, and the other Romance languages rather than the Latin they'd shared under Roman rule.)This is the process by which languages or dialectal communities tend to diverge. Divergence isn't the only pressure on speakers, however. Particularly since we can now talk to and listen to people from basically anywhere (Yay internet! Yay TV! Yay radio!) your speech community could look like mine does: split between people from the Pacific Northwest and the South. My personal language use is slowly drifting from mostly Southern to a mix of Southern and Pacific Northwestern. This is called dialect leveling and it's part of the reason why American dialectal regions tend include hundreds or thousands of miles instead of two or three.Dialect leveling: Where two or more groups of people start out talking differently and end up talking alike. Schools tend to be a huge factor in this.So, on the one hand, there is pressure to start all talking alike. On the other hand, however, I still want to sound like I belong with my Southern friends and have them understand me easily (and not be made fun of for sounding strange, let's be honest) so when I'm talking to them I don't retain very many markers of the Pacific Northwest. That's pressure that's keeping the dialect areas separate and the reason why I still say soda, even though I live in a pop region.Huh. That's pretty cool. Yep. Yep, it sure is.]]>
this guy. Or this guy. Or this person, whose patience in detailing errors borders on obsession. Or, heck, this person, who isn't so sure that voice recognition is even a thing we need.

[caption id= align=alignnone width=512] You mean you wouldn't want to be able to have pleasant little chats with your computer? I mean, how could that possibly go wrong?[/caption]

Now, to be fair to linguists, we've kinda been out of the loop for a while. Fred Jelinek, a very famous researcher in speech recognition, once said, "Every time we fire a phonetician/linguist, the performance of our system goes up." Oof, right in the career prospects. There was, however, a very good reason for that, and it had to do with the pressures on computer scientists and linguists respectively. (Also a bunch of historical stuff that we're not going to get into.)

Basically, in the past (and currently to a certain extent) there was this divide in linguistics. Linguists wanted to model speakers' competence, not their performance. The idea is that there is some sort of place in your brain where you know all the rules of language and have them all perfectly mapped out and described. Not in a conscious way, but there nonetheless. But somewhere between the magical garden of language and your mouth and/or ears you trip up and mistakes happen. You say a word wrong or mishear it or switch bits around... all sorts of things can go wrong. Plus, of course, even if we don't make a recognizable mistake, there's an incredible amount of variation that we can decipher without a problem. That got pushed over to the performance side, though, and wasn't looked at as much. Linguistics was all about what was happening in the language mind-garden (the competence) and not the messy sorts of things you say in everyday life (the performance). You can also think of it like what celebrities actually say in an interview vs. what gets into the newspaper; all the ums and uhs are taken out, little stutters or repetitions are erased and if the sentence structure came out a little wonky the reporter pats it back into shape. It was pretty clear what they meant to say, after all.

So you've got linguists with their competence models explaining them to the computer folks, and computer folks being all clever and mathy and coming up with algorithms that seem to accurately model our knowledge of human linguistic competence... and getting terrible results. Everyone's working hard and doing their best and it's just not working.

I think you can probably figure out why: if you're a computer just sitting there with very little knowledge of language (consider that this was before any of the big corpora were published, so there wasn't a whole lot of raw data) and someone hands you a model that's only supposed to handle perfect data, along with actual speech data, which even under ideal conditions is far from perfect, you're going to spit out spaghetti and call it a day. It's a bit like telling someone to make you a peanut butter and jelly sandwich and just expecting them to do it. Which is fine if they already know what peanut butter and jelly are, and where you keep the bread, and how to open jars, and that food is something humans eat, so you shouldn't rub it on anything too covered with bacteria or they'll get sick and die. Probably not the best way to go about it.

So the linguists got the boot and they and the computational people pretty much did their own things for a bit. The model that most speech recognition programs use today is mostly statistical, based on things like how often a word shows up in whichever corpus they're using currently. Which works pretty well. In a quiet room. When you speak clearly. And slowly. And don't use any super-exotic words. And aren't having a conversation. And have trained the system on your voice. And have enough processing power in whatever device you're using. And don't get all wild and crazy with your intonation. See the problem?

Language is incredibly complex and speech recognition technology, particularly when it's based on a purely statistical model, is not terrific at dealing with all that complexity. Which is not to say that I'm knocking statistical models! Statistical phonology is mind-blowing and I think we in linguistics will get a lot of mileage from it. But there's a difference. We're not looking to conserve processing power: we're looking to model what humans are actually doing. There's been a shift away from the competence/performance divide (though it does still exist) and more interest in modelling the messy stuff that we actually see: conversational speech, connected speech, variation within speakers. And the models that we come up with are complex. Really complex. People working in Exemplar Theory, for example, have found quite a bit of evidence that you remember everything you've ever heard and use all of it to help parse incoming signals. Yeah, it's crazy. And it's not something that our current computers can do. Which is fine; it gives linguists time to further refine our models. When computers are ready, we will be too, and in the meantime computer people and linguistic people are showing more and more overlap again, and using each other's work more and more. And, you know, singing Kumbaya and roasting marshmallows together. It's pretty friendly.

So what's the take-away? Well, at least for the moment, in order to get speech recognition to a better place than it is now, we need to build models that work for a system that is less complex than the human brain. Linguistics research, particularly into statistical models, is helping with this. For the future? We need to build systems that are as complex as the human brain. (Bonus: we'll finally be able to test models of child language acquisition without doing deeply unethical things! Not that we would do deeply unethical things.) Overall, I'm very optimistic that computers will eventually be able to recognize speech as well as humans can.

TL;DR version:

Speech recognition has been light on linguists because they weren't modeling what was useful for computational tasks. Now linguists are building and testing useful models. Yay!

Language is super complex and treating it like it's not will get you hit in the face with an error-ridden fish. Linguists know language is complex and are working diligently at accurately describing how and why. Yay!

In order to get perfect speech recognition down, we're going to need to have computers that are similar to our brains. I'm pretty optimistic that this will happen. ]]>
a post recently where I suggested that a trained phonetician can help you learn to pronounce things, and I thought I'd put my money where my mouth is and run you through how to pronounce Gangnam, phonetics style. (Note: I'm assuming you're a native English speaker here.)

First, let's see how a non-phonetician does it. Here's a brief guide to the correct pronunciation offered on Reddit by ThatWonAsianGuy, who I can only assume is a native Korean speaker.

The first G apparently sounds like a K consonant to non-Korean speakers, but it's somewhere between a G and a K, but more towards the G. (There are three letters similar, ㅋ, ㄱ, and ㄲ. The first is a normal k, the second the one used in Gangnam, and the third being a clicky, harsh g/k noise.)

The ang part is a very wide ahh (like when a doctor tells you to open your mouth) followed by an ng (like the end of ending). The ahh part, however, is not a long vowel, so it's pronounced quickly.

Nam also has the ahh for the a. The other letters are normal.

So it sounds like (G/K)ahng-nahm.

Let's see how he did. Judges?

Full marks for accuracy, Rachael. Nothing he said is incorrect. On the other hand, I give it a usability score of just 2 out of 10. While the descriptions of the vowels and nasal sounds are intelligible and usable to most English speakers, even I was stumped by his description of a sound between a g and a k. A strong effort, though; with some training this kid could make it to the big leagues of phonetics.

Thank you Rachael, and good luck to ThatWonAsianGuy in his future phonetics career.

Ok, so what is going on here in terms of the k/g/apparently clicky harsh sound? Funny you should ask, because I'm about to tell you in gruesome detail.

First things first: you need to know what voicing is. Put your hand over your throat and say k. Now say g. Can you feel how, when you say g, there's sort of a buzzing feeling? That's what linguists call voicing. What's actually happening is that you're pulling your vocal folds together and then forcing air through them. This makes them vibrate, which in turn makes a sound. Like so:

[youtube http://www.youtube.com/watch?v=-XGds2GAvGQ]

(If you're wondering what that little cat-tongue looking thing is, that's the epiglottis. It keeps you from choking to death by trying to breathe food and is way up there on my list of favorite body parts.)

But wait! That's not all! What we think of as regular voicing (ok, maybe you don't think of it all that often, but I'm just going to assume that you do) is just one of the things you can do with your voicing. What other types of voicing are there? It's the type of thing that's really best described vocally, so here goes:

[soundcloud url=http://api.soundcloud.com/tracks/67168203 params=show_comments=false&auto_play=false&color=08ff00 width=100% height=81 iframe=false /]

Ok, so, that's what's going on in your larynx. Why is this important? Well, it turns out that only one of the three sounds is actually voiced, and it's voiced using a different type of voicing. Any guesses as to which one?

Yep, it's the harsh, clicky one and it's got glottal voicing (that really low, creaky sort of voice)*. The difference between the regular k and the k/g sound has nothing to do with voicing type. Which is crazy talk, because almost every learn-Korean textbook or online course I've come across has described them as k and g respectively and, as we already established, the difference between k and g is that the g is voiced and the k isn't.

Ok, I simplified things a bit. When you say k and g at the beginning of a word in English (and only at the beginning of a word), there's actually one additional difference between them. Try this. Put your hand in front of your mouth and say cab. Then say gab. Do you notice a difference?

You should have felt a puff of air when you said the k but not when you said the g. Want proof that it only happens at the beginning of words? Try saying back and bag in the same way, with your hand in front of your mouth. At the end of words they feel about the same. What's going on?

Well, in English we always say an unvoiced k with a little puff of air at the beginning of the word. In fact, we tend to listen for that puff more than we listen for voicing. So if you say kat without voicing the sound, but also without the little puff of air, it sounds more like gat. (Which is why language teachers tell you to say it as g instead of k. It's not, strictly speaking, right, but it is a little easier to hear. The same thing happens in Mandarin, BTW.) And that's the sound that's at the beginning of Gangnam.

You'll probably need to practice a bit before you get it right, but if you can make a sound at the beginning of a word where your vocal cords aren't vibrating and without that little puff of air, you're doing it right. You can already make the sound; it's just moving it to the beginning of the word that's throwing a monkey wrench in the works.

So it's the unvoiced k without the little puff of air. Then an aahhh sound, just as described above. Then the ng sound, which you tend to see at the end of words in English. It can happen in the middle of words as well, though, like in finger. And then nam, pronounced in the same way as the last syllable of Vietnam.

In the special super-secret International Phonetic (Cabal's) Alphabet, that's [kaŋnam]. Now go out there and impress a Korean speaker by not butchering the phonetics of their language!

*Ok, ok, that's a bit of an oversimplification. You can find the whole story here.]]>
University of Washington Scholar's Studio. In it, I covered a couple things that I've already talked about here on my blog: the fact that, acoustically speaking, there's no such thing as a word and that our ears can trick us. My general point was that our intuitions about speech, a lot of the things we think seem completely obvious, actually aren't true at all from an acoustic perspective.

What really got to me, though, was that after I'd finished my talk (and it was super fast, too, only five minutes) someone asked why it mattered. Why should we care that our intuitions don't match reality? We can still communicate perfectly well. How is linguistics useful, they asked. Why should they care?

[caption id=attachment_380 align=aligncenter width=523] I'm sorry, what was it you plan to spend your life studying again? I know you told me last week, but for some reason all I remember you saying is Blah, blah, giant waste of time.[/caption]

It was a good question, and I'm really bummed I didn't have time to answer it. I sometimes forget, as I'm wading through hip-deep piles of readings that I need to get to, that it's not immediately obvious to other people why what I do is important. And it is! If I didn't believe that, I wouldn't be in grad school. (It's certainly not the glamorous easy living and fat salary that keep me here.) It's important in two main ways. Way one is the way in which it enhances our knowledge and way two is the way that it helps people.

Increasing our knowledge.

Ok, so, a lot of our intuitions are wrong. So what? So a lot of things! If we're perceiving things that aren't really there, or not perceiving things that are really there, something weird and interesting is going on. We're really used to thinking of ourselves as pretty unbiased in our observations. Sure, we can't hear all the sounds that are made, but we've built sensors for that, right? But it's even more pervasive than that. We only perceive the things that our bodies and sensory organs and brains can perceive, and we really don't know how all these biological filters work. Well, okay, we do know some things (lots and lots of things about ears, in particular) but there's a whole lot that we still have left to learn. The list of unanswered questions in linguistics is a little daunting, even just in the sub-sub-field of perceptual phonetics.

Every single one of us uses language every single day. And we know embarrassingly little about how it works. And what we do know is often hard to share with people who have little background in linguistics. Even here, in my blog, without time constraints and with an audience that's already pretty interested (You guys are awesome!) I often have to gloss over interesting things. Not because I don't think you'll understand them, but because I'd metaphorically have to grow a tree, chop it down and spend hours carving it just to make a little step stool so you can get the high-level concept off the shelf and, seriously, who has time for that? Sometimes I really envy scientists in the major disciplines because everyone already knows the basics of what they study. Imagine that you're a geneticist, but before you can tell people you look at DNA, you have to convince them that sexual reproduction exists. I dream of the day when every graduating high school senior will know IPA. (That's the International Phonetic Alphabet, not the beer.)

Okay, off the soapbox.

Helping people.

Linguistics has lots and lots and lots of applications. (I'm just going to talk about my little sub-field here, so know that there's a lot of stuff being left unsaid.) The biggest problem is that so few people know that linguistics is a thing. We can and want to help!

Foreign language teaching. (AKA applied linguistics) This one is a particular pet peeve of mine. How many of you have taken a foreign language class and had the instructor tell you something about a sound in the language, like: "It's between a k and a g but more like the k except different." That crap is not helpful. Particularly if the instructor is a native speaker of the language, they'll often just keep telling you that you're doing it wrong without offering a concrete way to make it correctly. Fun fact: There is an entire field dedicated to accurately describing the sounds of the world's languages. One good class on phonetics and suddenly you have a concrete description of what you're supposed to be doing with your mouth and the tools to tell when you're doing it wrong. On the plus side, a lot of language teachers are starting to incorporate linguistics into their curriculum with good results.

Speech recognition and speech synthesis. So this is an area that's a little more difficult. Most people working on these sorts of projects right now are computational people and not linguists. There is a growing community of people who do both (UW offers a master's degree in computational linguistics that feeds lots of smart people into Seattle companies like Microsoft and Amazon, for example) but there's definite room for improvement. The main tension is the fact that using linguistic models instead of statistical ones (though some linguistic models are statistical) hugely increases the need for processing power. The benefit is that accuracy tends to increase. I hope that, as processing power continues to be easier and cheaper to access, more linguistics research will be incorporated into these applications. Fun fact: In computer speech recognition, an 80% comprehension accuracy rate in conversational speech is considered acceptable. In humans, that's grounds to test for hearing or brain damage.

Speech pathology. This is a great field and has made and continues to make extensive use of linguistic research. Speech pathologists help people with speech disorders overcome them, and the majority of speech pathologists have an undergraduate degree in linguistics and a master's in speech pathology. Plus, it's a fast-growing career field with a good outlook. Seriously, speech pathology is awesome. Fun fact: Almost half of all speech pathologists work in school environments, helping kids with speech disorders. That's like the antithesis of a mad scientist, right there.

And that's why you should care. Linguistics helps us learn about ourselves and helps people, and what else could you ask for in a scientific discipline? (Okay, maybe explosions and mutant sharks, but do those things really help humanity?)]]>
So, in the field of semantics, sitting around thinking about your intuitions about words is actually pretty solid methodology, so I'm going to do that. (I know, right? Not a single ultrasound or tracheal puncture? What do they do on Saturday nights?) Let's compare the following sentences:

1. That dress is bespoke as shit.
2. His wardrobe is bespoke as shit.
3. That dress is pink as shit.
4. His wardrobe is pink as shit.

My intuition is that two and three are fine, four is... okay but a little weird and that one is downright wrong. And I also feel very strongly that the goodness of a given sentence where some quality of an object is modified by "as shit" is closely tied to whether or not that quality is a continuous scale. (And, no, I'm not going to say adjective here. Mainly because you can also say "Her wardrobe is completely made out of sharks as shit." And, in my universe, at least, "completely made out of sharks" doesn't really count as an adjective.) Things that are on a continuous scale are like darkness. It can be a little dark or really dark or completely dark; there's not really any point where you switch from being dark to light, right? And something that's dark for me, like a starry night, might be light for a bat. Pink, and all colors, are continuous scales. (FUN FACT: how many color terms various languages have and why is a really big debate.) But things like free (as in costing zero dollars) are more discrete. Something's either free or it's not and there's not really any middle ground.

The other thing you need to take into account is whether or not the thing being described is plural and whether it's a mass or count noun. Mass nouns are things like water, sand or bubblegum. You can have less or more or some of these things, but you can't count them. "I'll have three water" just sounds really odd. Count nouns are things like buckets of water, grains of sand or pieces of bubblegum. These are things that have discrete, countable units instead of just a lump of mass. It's a really useful distinction.

Ok, so how does this gel with my intuitions? And, more importantly, can I describe qualities in such a way that my description has predictive power? (Remember, linguistics is all about building testable models of language use!) I think I can. Let's roll up our sleeves and get to the nitty-gritty. I've got two separate parts of the sentence that go into whether or not I can use "as shit": the thing(s) being described, and the quality it has. The thing being described can be either singular or plural, and either mass or count. The quality it has can either be continuous or discrete. Let's put this in outline form to make the possible different conditions a bit easier to see:

Thing being described
    Is it singular? If yes, is it:
        A mass noun? If so, assign condition 1.
        A count noun? If so, assign condition 2.
    Is it plural? If yes, is it:
        A mass noun? If so, assign condition TRICK QUESTION, because that's not possible. :P
        A count noun? If so, assign condition 3.
Qualities: continuous or discrete
    Is it continuous? If so, assign condition A.
    Is it discrete? If so, assign condition B.

[What's that, pseudocode? I thought you didn't do computer-y code-y math-y things, Rachael.]

Ok, so now we've got six possible conditions for a given sentence (1A, 2A, 3A, 1B, 2B and 3B). Which conditions can take "as shit" and why? (Keep in mind, this is just my intuition.)

1A: Water is big as shit. = acceptable
2A: The dog is big as shit. = acceptable
3A: The dogs are big as shit. = acceptable
1B: Water is still as shit. = unacceptable
2B: The dog is still as shit. = unacceptable
3B: The dogs are still as shit. = acceptable

Okay, so a little of my reasoning. I feel very strongly that "as shit" serves to intensify the adjective and you can't intensify something that's binary. The light switch is either on or off; it can't be extremely on or extremely off. So all of the B conditions are bad... except for 3B. Why is 3B acceptable? Well, for me, I get the sense that you're not intensifying the qualities of each individual but talking about the group as a whole. And if you add up a bunch of binaries (three still dogs and one moving dog) you can get a value somewhere in the middle.

But that's just a really informal little model based on my intuitions and I feel like they're getting screwed up because I've spent way too much time thinking about this. And now the tea that I was making is getting cold as shit, so I might as well go drink it.]]>
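Since the outline in the "as shit" post above is basically pseudocode already, here's a minimal Python sketch of it. Everything named here (Number, NounType, Quality, condition, takes_as_shit) is made up for illustration, and the acceptability rule just hard-codes the intuitions listed in the post (every A condition is fine; of the B conditions, only 3B is). Treat it as a toy version of an informal model, not a tested one.

```python
from enum import Enum


class Number(Enum):
    SINGULAR = "singular"
    PLURAL = "plural"


class NounType(Enum):
    MASS = "mass"
    COUNT = "count"


class Quality(Enum):
    CONTINUOUS = "A"  # e.g. big, dark, pink
    DISCRETE = "B"    # e.g. still, free (costing zero dollars)


def condition(number, noun_type, quality):
    """Return the condition label from the outline, e.g. '2A' or '3B'."""
    if number is Number.PLURAL and noun_type is NounType.MASS:
        # The outline's "trick question": plural mass nouns are ruled out.
        raise ValueError("plural mass nouns are not a possible condition")
    if number is Number.SINGULAR:
        thing = "1" if noun_type is NounType.MASS else "2"
    else:
        thing = "3"
    return thing + quality.value


def takes_as_shit(number, noun_type, quality):
    """Encode the post's intuitions: all A conditions are acceptable,
    and 3B is the only acceptable B condition."""
    label = condition(number, noun_type, quality)
    return label.endswith("A") or label == "3B"


if __name__ == "__main__":
    # Print the judgment for each of the six possible conditions.
    things = [
        (Number.SINGULAR, NounType.MASS),
        (Number.SINGULAR, NounType.COUNT),
        (Number.PLURAL, NounType.COUNT),
    ]
    for quality in Quality:
        for number, noun_type in things:
            label = condition(number, noun_type, quality)
            verdict = "acceptable" if takes_as_shit(number, noun_type, quality) else "unacceptable"
            print(label, verdict)
```

Writing it out this way makes the odd one out obvious: 3B needs its own special case, which is exactly the part of the model that the group-as-a-whole reasoning is trying to explain.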
Blofeld, but I really do think that looking at the way that linguists do linguistics is incredibly important. (Warning: the next paragraph will be kinda preachy, feel free to skip it.)

It's something the field, to paint with an incredibly broad brush, tends to skimp on. After all, we're asking all these really interesting questions that have the potential to change people's lives. How is hearing speech different from hearing other things? What causes language pathologies and how can we help correct them? Can we use the voice signal to reliably detect Parkinson's over the phone? That's what linguistics is. Who has time to look at whether asking people to list the date on a survey form affects their responses? If linguists don't use good, controlled methods to attempt to look at these questions, though, we'll either find the wrong answers or miss the right ones completely because of some confounding variable we didn't think about. Believe me, I know firsthand how heart-wrenching it is to design an experiment, run subjects, do your stats and end up with a big pile of useless goo because your methodology wasn't well thought out. It sucks. And it happens way more than it needs to, mainly because a lot of linguistics programs don't stress rigorous scientific training.

OK, sermon over. Maps! I think using maps to look at language data is a great methodology! Why?

[caption id= align=aligncenter width=333] Hmm... needs more data about language. Also the rest of the continents, but who am I to judge? [/caption]

You get an end product that's tangible and easy to read and use. People know what maps are and how to use them. Presenting linguistic data as a map rather than, say, a terabyte of detailed surveys or a thousand hours of recordings is a great way to make that same data accessible. Accessible data gets used. And isn't that kind of the whole point?

Maps are so. accurate. right now. This means that maps of data aren't just rough approximations, they're the best, most accurate way to display this information. Seriously, the stuff you can do with GIS is just mind-blowing. (Check out this dialect map of the US. If you click on the region you're most interested in, you get additional data like field recordings, along with the precise place they were made. Super useful.)

Maps are fun. Oh, come on, who doesn't like looking at maps? Particularly if you're looking at a region you're familiar with. See, here's my high school, and the hay field we rented three years ago. Oh, and there's my friend's house! I didn't realize they were so close to the highway. Add a second layer of information and BOOM, instant learning.

The studies

Two of the studies I came across were actually based on Twitter data. Twitter's an amazing resource for studying linguistics because you have this enormous data set you can just use without having to get consent forms from every single person. So nice. Plus, because all tweets are archived, in the Library of Congress if nowhere else, other researchers can go back and verify things really easily.

This study looks at how novel slang expressions spread across the US. It hasn't actually been published yet, so I don't have the map itself, but they do talk about some interesting tidbits. For example: the places most likely to spawn new successful slang are urban centers with a high African American population.

The second Twitter study is based in London and looked at the different languages Londoners tweet in and did have a map:

[caption id= align=alignnone width=1000] Click for link to author's blog post.[/caption]

Interesting, huh? You can really get a good idea of the linguistic landscape of London. Although there were some potential methodological problems with this study, I still think it's a great way to present this data.

The third study I came across is one that's actually here at the University of Washington. This one is interesting because it kind of goes the other way. Basically, the researchers had respondents indicate areas on a map of Washington where they thought language communities existed and then had them describe them. So what you end up with is sort of a representation of the social ideas of what language is like in various parts of Washington state. Like so:

[caption id=attachment_375 align=aligncenter width=523] Click for link to study site.[/caption]

There are lots more interesting maps on the study site, each of which shows some different perception of language use in Washington State. (My favorite is the one that suggests that people think other people who live right next to the Canadian border sound Canadian.)

So these are just a couple of the ways in which people are using maps to look at language data. I hope it's a trend that continues.]]>
Oh, this? It's just the pocket edition. The full one is three hundred volumes and comes with an elephant named George to carry it around your house. And it's covered in gold. This edition is only bound in unicorn skin but it's fine for a quick desk reference.[/caption]The underlying assumption behind the search to see if someone else uses the word is that, if they don't, you can't either. It's not a real word. Which begs the question: what makes a word real? Is there a moment of Pinocchio-like transformation where the hollow wooden word someone created suddenly takes on life and joins the ranks of the English language to much back-slapping and cigar-handing from the other vetted words? Is there a little graduation party where the word gets a diploma from the OED and suddenly it's okay to use it whenever you want? Or does it get hired by the spelling board and get to work right away?OK, so that was getting a bit silly, but my point is that most people have the vague notion that there's a distinction between real words and fake words that's pretty hard and fast. Like most slang words and brand names are fake words. I like to call this the Scrabble distinction. If you can play it in Scrabble, it counts and you can put it in a paper or e-mail and no one will call you on it. If you can't, it's a fake word and you use it at your own risk. Dictionaries play a large part in determining which is which, right? The official Scrabble dictionary is pretty conservative: it doesn't have d'oh in it for example. But it's also not without controversy. The first official Scrabble dictionary, for example, didn't have granola in it, which the Oxford English Dictionary (the great grand-daddy of English dictionaries and probably the most complete record ever complied of the lexicon of any language ever) notes was first used in 1886 and I think most of us would agree is a real word.The line is even blurrier than that, though. English is a language with a long and rich written tradition. 
In some ways, that's great. We've got a lot more information on how words used to be pronounced than we would have otherwise and a lot of diachronic information. (That's information about how the language has changed over time. :P ) But if you've been exposed mainly to the English tradition, as I have, you tend to forget that writing isn't inseparable from spoken language. They're two different things and there are a lot of traditions that aren't writing-based. Consider, for example, the Odù Ifá, an entirely oral divination text from Nigeria that sometimes gets compared to the bible or the Qur'an. In the cultures I was raised in, the thought of a sacred text that you can't read is strange, but that's just part of the cultural lens that I see the world through; I shouldn't project that bias onto other cultures.So non-literary cultures still need to add words to their lexicons, right? But how do they know which words are real without dictionaries? It depends. Sometimes it just sort of happens organically. We see this in English too. Think about words associated with texting or IMing like lol or brb (that's laughing out loud and be right back for those of you who are still living under rocks). I've noticed people saying these in oral conversations more and more and I wouldn't be surprised if in fifty years burb started showing up in dictionaries. But even cultures which have only had writing systems for a very short amounts of time have gatekeepers. Navajo, which has only been written since around 1940, is a great example. Peter Ladefoged shares the following story in Phonetic Data Analysis:One of our former UCLA linguistics students who is a Navajo tells how she was once giving a talk in a Navajo community. She was showing how words could be put together to create new words (such as sweet + heart creates a word with an entirely new meaning). When she was explaining this an elder called out: 'Stop this blasphemy! Only the gods can create words.' 
The Navajo language is holy in a way that is very foreign to most of us (p. 13). So in Navajo you have elders and religious leaders who are the guardians of the language and serve as the final authorities. (FUN FACT: authority comes from the same root as author. See how writing-dependent English is?) There are always gray areas though. Language is, after all, incredibly complex. I'll leave you with one case to think about. Rammaflagit. That's ɹæm.ə.flæʒ.ɪt in the International Phonetic Alphabet. (I remember how thrilled my dad was when I told him I was studying IPA in college.) I hear it all the time and it means something like gosh darn it, sort of a bowdlerized curse word. Real word or not? The dictionaries say no, but the people who I've heard using it would clearly say yes. What do you think?]]>
always equals four. Not sometimes. Not only when it felt like it. All the time. Nice and simple.Reading, and particularly phonics, on the other hand, was a minefield of dirty tricks. Oh, sure, they told us that each letter represented a single sound, but even a kid knows that's hooey. Cough? Bough? Come on, that was like throwing sand in a fight; completely unfair. And what about those vowels? What and cut rhyme with each other, not cut and put. Even as phonics training was increasing my phonemic awareness, pushing me to pay more attention to the speech sounds I made, English orthography (that's our spelling system) was dragging me behind the ball-shed and pulling out my hair in clumps. Metaphorically.[caption id= align=alignnone width=512] Oh man, they're trying to tell us that A makes the 'Aaahhh' sound. What do they take us for, complete idiots? Or is that 'whaahhht' to they take us for?I know, right? One-to-one correspondence? Complete rubbish![/caption]Of course, I did eventually pass third grade and gain mastery of the written English language. But it was an uphill battle all the way. Why? Because English orthography is retarded. Wait. I'm sorry. That's completely unfair to individuals suffering from retardation. English orthography is spiteful, contradictory and completely unsuited to representing the second most widely-spoken second language. This poem really highlights the problem:Recovering Sounds from OrthographyBrush up Your EnglishI take it you already knowOf tough and bough and cough and dough?Others may stumble but not youOn hiccough, thorough, slough and through.Well done! 
And now you wish perhaps,To learn of less familiar traps?Beware of heard, a dreadful wordThat looks like beard and sounds like bird.And dead, it's said like bed, not bead-for goodness' sake don't call it 'deed'!Watch out for meat and great and threat(they rhyme with suite and straight and debt).A moth is not a moth in mother,Nor both in bother, broth, or brother,And here is not a match for there,Nor dear and fear for bear and pear,And then there's doze and rose and lose-Just look them up- and goose and choose,And cork and work and card and wardAnd font and front and word and sword,And do and go and thwart and cart-Come, I've hardly made a start!A dreadful language? Man alive!I'd learned to speak it when I was five!And yet to write it, the more I sigh,I'll not learn how 'til the day I die.A dreadful language? Man alive! I mastered it when I was five.-- T.S. Watt (1954)So why don't we get our acts together and fix this mess? Well... trying to fix it is kind of the reason we're in this mess in the first place. Basically, in renaissance England we started out with a basically phonetic spelling system. You actually sounded out words and wrote them as they sounded. Aks instead of ask, for example. (For what it's worth, aks is the original pronunciation.) And you would be writing by hand. On very expensive parchment with very expensive quills and ink for very rich people.Enter the printing press. Suddenly we can not only produce massive amounts of literature, but everyone can access them. Spelling goes from being something that only really rich people and scribes care about to a popular phenomena. And printing press owners were quick to capitalize on that phenomena by printing spelling lists that showed the correct way to write words. Except there wasn't a whole lot of agreement between the different printing houses and they were already so heavily invested in their own systems that they weren't really willing to all switch over to a centralized system. 
By the time Samuel Johnson comes around to pin down every word of English like an entomologist in a field of butterflies, we have standardized spellings for most words... that all come from different systems developed by different people. And it's just gotten more complex from there. One of the main reasons is that we keep shoving new words into the language without regard for how they're spelled. The problem with defending the purity of the English language is that the English language is as pure as a crib-house whore. It not only borrows words from other languages; it has on occasion chased other languages down dark alley-ways, clubbed them unconscious and rifled their pockets for new vocabulary. -- James Nicoll There's actually a sound in English, the zh sort of sound in leisure, that only exists in words we've borrowed from other languages and, of course, there's no letter for it. Of course not; that would be too simple. And English detests simple. If you're really interested in more of the gory details, there's a great lecture you can listen to/watch here by Edwin Duncan which goes into way more detail on the historical background. Or you can just scroll through the Oxford English Dictionary and wince constantly. ]]>
I've already dealt with that, but there's a corollary to that I'm getting asked more and more. What do linguists do? It's kind of a tricky question, because what a linguist sits down (or stands up) and does really depends on their discipline. But, hey, I've never really gone over the sub-disciplines of linguistics in any great detail, so now's as good a time as any. :D[caption id= align=alignnone width=512] Linguist action figure, now with tape-recording, listening intently and thinking hard action! Ok, more like lack of action, but there's still a lot going on there.[/caption] Syntacticians: Semanticists: Phoneticians: If you thought phoneticians were all prim and proper Henry Higgins types, think again. Phoneticians are the punks of linguistics and routinely do things like sticking infant feeding tubes down their nostrils to measure airflow without impeding articulators. Phonologists: Psycholinguists: That's folks who study psycholinguistics, not psycho linguists. Computational linguists: Spend all day rolling around in piles of money and expensive computer equipment. Nah, not really, but if you're looking for steady employment in linguistics, I suggest you take a serious look at this specialization. Anthropological linguists:]]>
wonderful free app that lets you learn Yoruba, or at least Yoruba words, and posted about it on Google plus. Someone asked a very good question: why am I interested in Yoruba? Well, I'm not interested just in Yoruba. In fact, I would love to learn pretty much any western African language or, to be a little more precise, any Niger-Congo language.[caption id= align=alignnone width=256] This map's color choices make it look like a chocolate-covered ice cream cone.[/caption]Why? Well, not to put too fine a point on it, I've got a huge language crush on them. Whoa there, you might be thinking, you're a linguist. You're not supposed to make value judgments on languages. Isn't there like a linguist code of ethics or something? Well, not really, but you are right. Linguists don't usually make value judgments on languages. That doesn't mean we can't play favorites! And West African languages are my favorites. Why? Because they're really phonologically and phonetically interesting. I find the sounds and sound systems of these languages rich and full of fascinating effects and processes. Since that's what I study within linguistics, it makes sense that that's a quality I really admire in a language.What are a few examples of Niger-Congo sound systems that are just mind blowing? I'm glad you asked. Yoruba: Yoruba has twelve vowels. Seven of them are pretty common (we have all but one in American English) but if you say four of them nasally, they're different vowels. And if you say a nasal vowel when you're not supposed to, it'll change the entire meaning of a word. Plus? They don't have a 'p' or an 'n' sound. That is crazy sauce! Those are some of the most widely-used sounds in human language. And Yoruba has a complex tone system as well. You probably have some idea of the level of complexity that can add to a sound system if you've ever studied Mandarin, or another East Asian language. Seriously, their sound system makes English look childishly simplistic. 
Akan: There are several different dialects of Akan, so I'll just stick to talking about Asante, which is the one used in universities and for official business. It's got a crazy consonant system. Remember how Yoruba didn't have an n sound? Yeah, in Akan they have nine. To an English speaker they all pretty much sound the same, but if you grew up speaking Akan you'd be able to tell the difference easily. Plus, most sounds other than p, b, f or m can be made while rounding the lips (linguists call this labialization), and the rounded versions are completely different sounds. They've also got a vowel harmony system, which means you can't have vowels later in a word that are completely different from vowels earlier in the word. Oh, yeah, and tones and a vowel nasalization distinction and some really cool tone terracing. I know, right? It's like being a kid in a candy store. But how did these languages get so cool? Well, there's some evidence that these languages have really robust and complex sound systems because the people speaking them never underwent large-scale migration to another continent. (Obviously, I can't ignore the effects of colonialism or the slave trade, but it's still pretty robust.) Which is not to say that, say, Native American languages don't have awesome sound systems; they just tend to be slightly smaller on average. Now that you know how kick-ass these languages are, I'm sure you're chomping at the bit to hear some of them. Your wish is my command; here's a song in Twi (a dialect of Akan) from one of my all-time-favorite musicians: Sarkodie. (He's making fun of Ghanaian emigrants who forget their roots. Does it get any better than biting social commentary set to a sick beat?)[youtube http://www.youtube.com/watch?v=phSOWr8kOzU?rel=0]]]>
Something's alarming right enough... but I think it's actually my linguistics sense.[/caption] Now, as both a linguist and native speaker of American English, I find this command troubling. Not because I have a problem with civic-minded individuals alerting the power company to potentially dangerous problems, but because it's ambiguous. I've written about ambiguity in language before, but it's something that I revisit often and it's a complex enough subject that you can easily spend an entire lifetime studying it, let alone more than one blog post. Let's examine why this sign is ambiguous a little more closely. First, there's (what I would consider) a non-standard usage of the word alarming. I tend to imagine something that is alarming to be capable of putting me in a state of alarm, rather than currently expressing alarm. Or, as the OED puts it: Disturbing or exciting with the apprehension of danger. Yeah, that's right, alarming is one of the few words that the OED only has one definition for. Let's put that aside for the moment, though, and assume that there's a linguistically-creative sign maker working for Seattle City Light who has coined a neologism based on parallels with words like understanding or revolving. The real crux of the matter is that the command is not a sentence, and has just too many gaps where the reader has to fill in information. These are just a couple of the possible interpretations I came up with for the sign: If [the alarm is] alarming (in the sense of performing the action which alarms traditionally do, such as whooping and revolving) [then] call. If [you are] alarming [other people, then] call. If [the alarm is] alarming [you, regardless of whether or not it's currently flashing or making noise then] call. Now, English syntax is a pretty resilient beast and can put up with a certain amount of words left out. The fancy linguistics term for this is ellipsis, just like the punctuation mark. (This one: ...) 
Words have to be left out of certain places in certain ways, though. Like you don't have to say you every time you tell someone to do something. Don't sit there! is perfectly acceptable as a sentence, and if someone told you that you'd have no problem figuring out that they were telling you not to sit on their cat. Like everything else in language, though, there are rules and by breaking them you run the risk of failing to communicate what you're trying to... just like this sign.]]>
language games and in-group/out-group language. But I've recently noticed another interesting linguistic phenomenon in rap that you don't really see in English very often: reduplication.[caption id=attachment_327 align=aligncenter width=640] Whose work displays metrical complexity, rich cultural/literary/historical allusions and a healthy lashing of dirty jokes? Trick question! It's both. Man, I hope nobody in the future tries to claim that Jay-Z was actually Dick Cheney in disguise...[/caption] Reduplication is one of my favorite linguistic phenomena and a great example of an autological word. Basically, reduplication is a linguistic phenomenon where you say the same thing twice. It's also one of those rare phonological phenomena that are semantically meaningful. There are lots of ways to interpret what saying something twice means, but there are a couple of pretty popular choices: Probably the best English example is Like like, as in I like him, but I don't like like him. It seems to serve as some sort of deintensifier (Yeah I just made that word up. Deal with it.) or to disambiguate between two possible meanings of the same word. It seems to serve to narrow the scope of the base word. So, like-like is a type of like and holiday holiday is a type of holiday. Apparently there's a similar relationship in Italian and French (see comments). In Koasati, (and Cree as well apparently) it's used to indicate a repeated action. So it would be like if I said cut-cut in English to mean that I chopped something finely instead of cutting a piece off of something. In Mandarin it's an almost juvenile marking, used to indicate cuteness or smallness. (You can see this in Hebrew as well.) You'll sometimes see this in English, too, particularly from children. If you hang out with young kids, keep your ears peeled for things like bunbun for bunny. On the other hand, Mandarin also uses reduplication to indicate plurality. 
Khmer is another language that does this, and I think Japanese does as well. So that's things like bird for one bird and birdbird or bir-bird for a flock of birds. Finally, and this is what I think's going on in rap, you'll see reduplication to intensify things. Like I'd say a red red is a really intense red, or that someone who's short short is really tiny. I've been noticing this particularly with true true. You can hear it in Chamillionaire's I'm true, both as I'm true, I'm true and true true in verse two. And Lil Wayne's My Homies Still is absolutely rife with reduplication. You've got click click in the first line, and in verse four (which is Big Sean's) you've got these lines: Whoa, okay, boi this heres what I do do / Got your sister dancing, not the kind that's in a tutu / Got me in control, no strings attached, that's that voodoo / She said cant nobody do it better, I tell her, true true yep ***** true true / True true, my my bro bro say... Of course, a grouping this concentrated speaks more towards an artistic choice than pervasive linguistic change... but it is something I've been noticing more and more. The earliest example I could find is GZA's True Fresh MC from 1991, but I'm hesitant to call it reduplication, since there's a definite pause between the first and second true. Feel free to weigh in in the comments. Is this a legitimate trend or have I fallen prey to a recency illusion? Are there other examples that I'm missing? Is this something you say in everyday speech?]]>
Color recognition. Peekaboo? Object permanence. But what about language games? In English, you've got games like pig Latin, which has several versions. Most involve moving syllables or consonants from the front of a word to the end, and then adding -ay. It's such a prevalent phenomenon that there's even a Google search in pig Latin. And English isn't alone in having language games like this. In fact, every language I've studied, including Nepali and Esperanto, has had some form of similar language game.[caption id= align=alignnone width=256] Ekchay Atemay! Roland, please stop being so infantile. This is backgammon and I know perfectly well you're fluent in Liturgical Latin.[/caption] The weird thing, though, is that it kinda looks like the only people that language games are really useful for are linguists. Let's look at syllables. If you're a normal person, you only think about them when you're forced to write a haiku for some reason. (Pro tip: In Japanese, it's not the syllables that you count but the moras.) If you're a linguist, though, you think about them all the time, and spend time arguing about whether or not they actually exist. One of the best arguments for syllables existing is that people can move them around relatively intuitively without even having a university degree in linguistics when language games require it. (I know, shocking, isn't it?) And you can use the existence of language games to argue that there's a viable speaker community of any given language, a sort of measure of language health, like mayflies in streams; that's a valuable indicator, since language death is a serious problem. Or you can even use them to argue that a language is alive in the first place. The main use of language games for language users, however, seems to be the creation of smaller speech communities within larger communities. But then, as a linguist, you probably already knew that. 
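For the curious, the most common pig Latin rule mentioned earlier can be spelled out as a rough Python sketch. This is only one version of the game (there are several), and it rests on a few assumptions: the whole consonant cluster before the first vowel moves to the end, only a, e, i, o and u count as vowels, and each word is handled in lowercase. The name pig_latin is made up for illustration.

```python
VOWELS = "aeiou"  # assumption: y is treated as a consonant in this version


def pig_latin(word: str) -> str:
    """Move the leading consonant cluster to the end of the word and add 'ay'."""
    lower = word.lower()
    for i, ch in enumerate(lower):
        if ch in VOWELS:
            # Everything before the first vowel goes to the back.
            # For vowel-initial words i is 0, so nothing moves.
            return lower[i:] + lower[:i] + "ay"
    # No vowel at all (e.g. "hmm"): keep the word and just add the suffix.
    return lower + "ay"


print(" ".join(pig_latin(w) for w in "check mate".split()))  # eckchay atemay
```

Other versions of the game add -way or -yay to vowel-initial words instead, which would be a one-line change to the branch where i is 0.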
Keep an ear out for them in everyday life, however, and you might be surprised how often they tend to crop up--like the use of -izz in early hip hop parlance.*If you thought I was going to bring up Wittgenstein in a blog post meant for people with little to no background in linguistics you are a very silly person. Oh, alright, here. I hope you're proud of yourself.]]>
Eth and Thorn (which are old letters that can be difficult to typeset), because back in the day, English had the two distinct th sounds represented differently in its writing system. There was one where you vibrated your vocal folds (that's called 'voiced') which was written as ð and one where you didn't (unvoiced) which was written as þ. It's a bit like the difference between s and z in English today. Try it: you can say both s and z without moving your tongue a millimeter. Unfortunately, while the voiced and voiceless th sounds remain distinct, they're now represented by the same th sequence. The difference between thy and thigh, for example, is the first sound, but the spelling doesn't reflect that. (Yet another example of why English orthography is horrible.)[caption id= align=alignnone width=600] Used with permission from the How To Be British Collection copyright LGP, click picture for website.[/caption] The fact that they're written with the same letters even though they're different sounds is only part of why they're so hard to master. (That goes for native English speakers as well as those who are learning it as their second language: it's one of the last sounds children learn.) The other part is that they're relatively rare across languages. Standard Arabic, Greek, some varieties of Spanish, Welsh and a smattering of other languages have them. If you happen to have a native language that doesn't have them, though, they're tough to hear and harder to say. Don't worry, though, linguistics can help! I'm afraid the cartoon above may accurately express the difficulty of producing the th for non-native speakers of English, but the technique is somewhat questionable. So, the fancy technical term for the th sounds is the interdental fricatives. Why? Because there are two parts to making them. The first is the place of articulation, which means where you put your tongue. 
In this case, as you can probably guess (inter- between and -dental teeth), it goes in between your teeth. Gently! The important thing about your tongue placement is that your tongue tip needs to be pressed lightly against the bottom of your top teeth. You need to create a small space to push air through, small enough that it makes a hissing sound as it escapes. That's the fricative part. Fricatives are sounds where you force air through a small space and the air molecules start jostling each other and make a high-frequency hissing noise. Now, it won't be as loud when you're forcing air between your upper teeth and tongue as it is, for example, when you're making an s, but it should still be noticeable. So, to review, put the tip of your tongue against the bottom of your top teeth. Blow air through the thin space between your tongue and your teeth so that it creates a (not very loud) hissing sound. Now try voicing the sound (vibrating your vocal folds) as you do so. That's it! You've got both of the English th sounds down. If you'd like some more help, I really like this video, and it has some super-cool slow-motion videos. The lady who made it has a website focusing on English pronunciation which has some great resources. Good luck![youtube http://www.youtube.com/watch?v=VUAYmTnWaCY]]]>
most, in the grand scheme of things. Like how we can only see a narrow band of all wavelengths--hence visible light--we can also only hear some of the possible wavelengths. And wave heights. You might remember this from physics, but there are two measurements that are really important on a diagram of a wave: wavelength and amplitude. Like so:[caption id= align=alignnone width=512] This should be bringing back flashbacks of asking if you were going to be able to use a formula sheet and a calculator.[/caption] So you've got the wavelength, which is the distance between two peaks or two troughs, and the amplitude, which is the distance between the mid-point of the wave and the tip of a peak. Which is all very well, but it doesn't tell you much in the grand scheme of things, since most waves aren't kind enough to present themselves to you as labelled diagrams. You actually have a pretty good intuitive grasp of the wavelength and amplitude of sound waves, though. The first is pitch and the second is what I like to refer to as loudness. (Technically, loudness is a perceptual measurement, not a... you know, this is starting to be boring.) So there's a limit to how loud and how soft a sound can be and a limit to how high and low a sound can be. I'll deal with loudness first, because it's less fun.
Loudness
So we measure loudness using the decibel scale, which is based on human perception. Since 0 decibels is, by definition, the lower perceptual limit of sound for humans, the quietest sound humans can hear is just above that, which is around 20 micro-pascals of pressure. Of course, that's healthy young humans. The older you get, the more your hearing range decreases, which is why your grandmother asks you to repeat yourself a lot. The loudest is just under 160 decibels, since exposure to a sound at 160 decibels will literally rip your eardrum. 
That's things like being right under a cannon when it fires, standing next to a rocket when it launches or standing right next to a jet engine during takeoff, all of which tend to have other problems associated with them. So... avoid that.
Pitch
Pitch is a bit more interesting. Normal human hearing is generally between 20 hertz and 20 kilohertz--compare that to 15 to 200 kilohertz for dolphins and bats! (Because they both rely on sonar and echolocation for hunting.) Just like hearing range for loudness, though, this gets narrower as you get older, particularly at the higher end of the range. Here's a video that runs the gamut of the human hearing range (warning: you might want to turn your speakers down).[youtube http://www.youtube.com/watch?v=lNUJ7Ug8LaU] If you're older than 25 (which is when hearing loss usually starts in the upper ranges) you probably couldn't hear the whole thing. If you did, congratulations! You've got the hearing of a normal, young human.]]>
about the sound systems of different languages you can emulate them. In other words, you can have a pretty convincing fake accent. In fact, accent coaches, who work with actors to create accents and others to reduce them, tend to have linguistic backgrounds with a focus on studying the sounds of language. So I thought with this post I'd go over how to imitate a French accent by looking at the individual sounds that are different between the two languages. Just to be clear: I'm using English as a target language here because English is my native language and everyone who's asked me about it has spoken English natively. I'm in no way implying that English is the best language, or that English speakers don't have accents. (You should hear how I butcher Mandarin. It's pretty atrocious.) If you have any other languages you'd like me to write posts for, let me know in the comments. :)[caption id= align=alignnone width=512] Marcel Marceau can't help you on this one, sorry. Mostly because you'll have a hard time finding examples of authentic French in his performances for some reason... [/caption] I'm going to assume that you want to sound like you're from Paris and not Quebec (Not that Quebec isn't great! Man, now I'm jonesing for some President's Choice snacks.) There are a couple sounds you're going to have to learn: Instead of the English r, as in rat, you're going to have to use what's called the guttural r. (Okay, it's actually called the voiced uvular fricative, but that's a little bit harder to say.) Basically, when you say the sound, you want to vibrate your uvula, that little punching-bag-looking thing at the back of your throat. Try doing it in front of a well-lit mirror with your mouth open until you can figure out what it feels like. Instead of the English ng, as in cling, you can use a ny, as in nyan cat. No, seriously. 
This will be a little difficult, since we only really use that sound at the end of words, but practice a bit and you should be able to pick it up. Or you can just go with a regular n sound. Now the good news! There's also a couple of sounds we have in English that don't exist in French, and they're the ones that are slightly harder to say, so you can save yourself some time and trouble by switching them out. The th sound, like at the beginning of thin or the, is actually really rare in world languages. French speakers tend to replace it with z. The sounds at the beginning of church and judge are also not a thing in French. You can use the sound at the beginning of sheep for the sound at the beginning of church and the s in vision for the j in judge. So that's the consonants. The vowels are significantly different than they are in English. You've got all sorts of things like nasalization and rounding in places where you, as an English speaker, are just not expecting it. And, frankly, unless you've got a really good ear, you're going to have a hard time picking up on the differences. Long story short: I'm weaseling out of explaining the vowels entirely and using a YouTube video. (I'm also doing it so you can get some native speaker data, which I think you'll find helpful.)[youtube http://www.youtube.com/watch?v=RJVxe4inqyg] That does give me space to discuss intonation, however. Intonation is probably the single biggest difference in the way English and French sound. In fact, intonation is one of the very first things that babies pick up, before they even start experimenting with individual sounds. Unfortunately, it's also one of the most difficult things to learn. Here's a few pointers, though: French intonation isn't as concerned with individual syllables. Rather, you tend to get whole phrases (rather than individual words) in the same intonation pattern. This is what gives French its sort of smooth, musical quality. 
Instead of a slow rise and slow fall, like we get in English, pitch in French tends to rise slowly until the very final syllable of a sentence, where it drops suddenly. It looks more like the graph of an absolute value than a polynomial, in other words. There's a ton more to be said about French phonology, and a lot of it has already been said, but this should be enough to get you started on approximating a French accent. Good luck!]]>
different definition of linguist. His job is to use his specialist knowledge of a language (specifically Mandarin Chinese, Mongolian or one of the handful of other languages he speaks relatively well) to solve a problem. And one problem that he's worked on a lot is language learning. There's no doubt that knowing more than one language is very, very useful. It opens up job opportunities, makes it easier to travel and can even improve brain function. But unless you were lucky enough to be raised bilingual you're going to have to do it the hard way. And, if you live in America, like I do, you're not very likely to do that: Only about 26% of the American population speaks another language well enough to hold a basic conversation in it, and only 9% are fluent in another language. Compare that to Europe, where around 50% of the population is bilingual.[caption id= align=alignnone width=512] Now that you've learned these characters, you only need to learn and retain one a day for the next five years to be considered literate.[/caption] Which makes the lure of easily learning a language on your own all the more compelling. I recently saw an ad that I found particularly enticing: learn a language in just ten days. Why, that's less time than it takes to hand knit a pair of socks. The product in this case was the oh-so-famous (at least in linguistic circles) Pimsleur Method (or approach, or any of a number of other flavors of delivery). I've heard some very good things about the program, and thought I'd dig a little deeper into the method itself and evaluate its claims from a scientific linguistics perspective. I should mention that Dr. Pimsleur was an academic working in second language acquisition from an applied linguistics standpoint. That is, his work (published mainly in the 1960's) tended to look at how older people learn a second language in an educational setting. 
I'm not saying this makes him unimpeachable--if a scientific argument can't stand up to scrutiny it shouldn't stand at all--but it does tend to lend a certain patina of credibility to his work. Is it justified? Let's find out. First things first: it is not possible to become fluent in a language in just ten days. There are lots of reasons why this is true. The most obvious is that being a fluent speaker is more than just knowing the grammar and vocabulary; you have to understand the cultural background of the language you're studying. Even if your accent is flawless (unlikely, but I'll deal with that later), if you unwittingly talk to your mother-in-law and become a social pariah that's just not going to do you much good. Then there are just lots of little linguistic things that it's so very easy to get wrong. Idioms, for example, particularly choosing which preposition to use. Do you get in the bus or on the bus? And then there's even more subtle things like producing a list of adjectives in the right order. Big red apple sounds fine, but red big apple? Not so much. A fluent speaker knows all this, and it's just too much information to acquire in ten days. That said, if you were plopped down in a new country without any prior knowledge of the language, I'd bet within ten days you'd be carrying on at least basic conversations. And that's pretty much what the Pimsleur method is promising. I'm not really concerned with whether it works or not... I'm more concerned with how it works (or doesn't). There are four basic principles that the Pimsleur technique is based on. Anticipation. Basically, this boils down to posing questions that the learner is expected to answer. These can be recall tasks, asking you to remember something you heard before, or tasks where the learner needs to extrapolate based on the knowledge they currently have of the language. Graduated-interval recall. 
Instead of repeating a word or word list three or four times right after each other, they're repeated at specific intervals. This is based on the phonological loop part of a model of working memory that was really popular when Pimsleur was doing his academic work. Core Vocabulary. The learner is just exposed to basic vocabulary, so the total number of words learned is less. They're chosen (as far as I can tell, it seems to vary based on method) based on frequency. Organic learning. Basically, you learn by listening and there's a paucity of reading and writing. (Sorry about that; paucity was my word of the day today :P ).

So let's evaluate these claims. Anticipation. So the main benefit of knowing that you'll be tested on something is that you actually pay attention. In fact, if you ask someone to listen to pure tones, their brain consumes more oxygen (which you can tell because circulation to that area increases) if you tell them they'll be tested. Does this help with language learning? Well. Maybe. I don't really have as much of a background in psycholinguistics, but I do know that language learning tends to entail the creation of new neural networks and connections, which requires oxygen. On the other hand, a classroom experience uses the same technique. Assessment: Reasonable, but occurs in pretty much every language-learning method. Graduated-interval recall: So this is based on the model I mentioned above. You've got short term and long term memory, and the Pimsleur technique is designed to pretty much seed your short term memory, then wait for a bit, then grab at the thing you heard and pull it to the forefront again, ideally transferring it to long-term memory. Which is peachy-keen... if the model's right. And there's been quite a bit of change and development in our understanding of how memory works since the 1970's. 
Within linguistics, there's been the rise of Exemplar Theory, which posits that it's the number of times you hear things, and the similarity of the sound tokens, that make them easier to remember. (Kinda. It's complicated.) So... it could be helpful, assuming the theory's right. Assessment: Theoretical underpinnings outdated, but still potentially helpful. Core Vocabulary. So this one is pretty much just cheating. Yes, it's true, you only need about 2000 words to get around most days, and, yes, those are probably the words you should be learning first in a language course. But at some point, to achieve full fluency, you'll have to learn more words, and that just takes time. Nothing you can do about it. Assessment: Legitimate, but cheating. Organic learning: So this is in quotation marks mainly because it sounds like it's opposed to inorganic learning, and no one learns language from rocks. Basically, there are two claims here. One is that auditory learning is preferable, and the other is that it's preferable because it's how children learn. I have fundamental problems with claims that adults and children can learn using the same processes. That said, if your main goal is to learn how to speak and hear a given language, learning writing will absolutely slow you down. I can tell you from experience: once you learn the tones, speaking Mandarin is pretty straightforward. Writing Mandarin remains one of the most frustrating things I've ever attempted to do. Assessment: Reasonable, but claims that you can learn like a baby should be examined closely. Bonus: I do agree that using native speakers of the target language as models is preferable. They can make all the sounds correctly, something that even trained linguists can sometimes have problems with--and if you never hear the sounds produced correctly, you'll never be able to produce them correctly.

So, it does look pretty legitimate. 
My biggest concern is actually not with the technique itself, but with the delivery method. Language is inherently about communicating, and speaking to yourself in isolation is a great way to get stuck with some very bad habits. Being able to interact with a native speaker, getting guidance and correction, is something that I'd feel very uncomfortable recommending you do without.]]>
Ethnologue. It was a great gift for two reasons. The first is that Ethnologue is probably the single best resource for looking up basic information about the languages of the world, and the second is that I had a door that just wouldn't stay open. (Seriously, the hard copy is about the size of a layer cake and weighs three times as much).

But the point is that you would think that if a body of linguists is going around printing big annotated lists of all the world's languages, they'd have a hard-and-fast number ready to throw at you when you ask them how many there are. Instead you get answers like, Around 6,500 or Somewhere between 6000 and 7000 or It really depends what you mean by language or Fewer all the time or 6500! Who told you that? Their figure is clearly wrong! Why all the hedging?

[caption id=attachment_258 align=alignnone width=640] Language families! Well, one person's idea of language families, 'cause there's disagreement. And the consensus is constantly changing, so it's one person's idea of language families at one point in time... Long story short, linguists love to argue.[/caption]

Well, there are a couple of reasons, and they all seem to boil down to the same thing: language is complicated. Seriously, I spend most of my spare time thinking about language and have for the past several years... and I keep coming back to language is complicated.

But let's say you're not satisfied with our sort of fluffyish numbers of how many languages there are. So you put on your linguist outfit (which, honestly, is probably just jeans and a t-shirt) and pack your rucksack and head out to count all the world's languages.

But right away you start to run into problems. Let's say you're in India. You've been counting the Dravidian languages and getting some pretty good data, and you decide to really quickly check off Hindi. TotalNumberOfLanguages = TotalNumberOfLanguages + 1. So next you look at Urdu. Well, huh. 
The people that you're interviewing say that they're clearly different languages... but they're mutually intelligible, so if one person is speaking Hindi and one person is speaking Urdu they can carry on a normal conversation. And you notice that they sound alike. A lot alike. In fact, their rules for how sounds change seem to be the same. And they have overlapping speaker populations. The best way you can find to tell them apart is that one (Hindi) is written using Devanagari script, and the other (Urdu) is written using Arabic script. Soo.... are they one language or two? Two different linguists might give you two different answers and then, BAM, you've got a different count.

And you'll encounter similar problems with other languages and language families all across the globe. Do it three or four hundred times and you've got a radically different number from the next linguist counting.

But, despite these problems, you finally manage to count all the languages. You stumble home, travel-worn and weary, and settle down to check a truly massive backlog of e-mails. And it looks like one of them is very sad news. One of the friends you met along the way, the last living speaker of an Amazonian language, has died. And with her death, her language is gone. It's a double tragedy, to be sure, and when you think back on it, a lot of languages had only one or two very old speakers. Who knows how many of them have died in the years you've been travelling? So it looks like, due to language death, your count is even more off.

So you write up your findings and try to get them published, but people keep pointing out that the hard-and-fast number that you were searching for and collecting data for all along isn't really reliable. 
Even if you end up giving a range of numbers, languages are dying so quickly that they won't be accurate for long.Man, I'm glad that this situation is a hypothetical one, and you didn't actually spend years chasing after a mythic number that it turns out is unobtainable. Aren't you?]]>
Plato chimed in on this one) debate about what language is. Now, as a linguist, you'd think I'd have the inside scoop on the subject. I mean, I pretty much sit around and think about language all day, so I should have this one down cold, right?

Well, as much as I hate to admit it, not really. And it's not just me. Ask three different linguists what language is and you'll probably get six to twelve answers. There is one point that I'm firm on, though: language is a human phenomenon. Outside of the internet and fantasy worlds (which tend to overlap a lot, now that I think about it), animals don't talk.

[caption id= align=alignnone width=512] ...and then Mildred told Randolf that she thought his new haircut made him look like a basilisk. Well, you can imagine how he took *that*.[/caption]

I'm not claiming that animals don't make communicative noises. Far from it! As someone who has bottle-fed more than one lamb, I can tell you that there's a definite difference between cries that indicate genuine hunger and cries that are transparent ploys to get cookies. But there are a lot of differences between communicative behavior and language. Lying is part of language. By lying here, I mean a wide variety of linguistic expressions that express information that is counter to the truth, including joking. Language is separate from the things it describes (there's nothing inherently tree-ish about the word tree, for example, ditto arbor, boom and träd, though there are inherent respect points in correctly identifying all three languages) and because of this can communicate abstract thought. Abstract thought, as evinced through lying, is an inherent part of language. There's some evidence that Koko the gorilla is capable of lying, but one isolated incident really isn't a sound basis for scientific argument. Language is generative. I've written an entire post about generativity, but it's worth repeating. 
Language has to have underlying structures that can be used to produce new and novel utterances. Otherwise, you're just saying random words. Language is communicative. This is part of the reason why music isn't language, though it's completely abstract and (at least in the Western tradition) generative. Abstraction is required, but so is a connection to thoughts and ideas. Tied to this is the fact that you have to have a community to speak in, even if it's a community of two. Language can communicate events at a temporal distance. This is a biggie, and one of the main reasons that I really don't think that Koko and other talking animals are really using language. (Quick aside: Did you know that the Nazis attempted to train talking dogs as part of the war effort? True story.) It's pretty easy to teach a dog to bark for a treat, but try teaching it to bark because you gave it a treat two days ago. You may think that a specific bark means treat, but without temporal distance and repeatability, it's pretty much just pigeons in boxes.

Now, other linguists will take other positions (or, you know, the same position ;), but this is how I see it. So what do you think: can animals talk?]]>
http://www.msnbc.msn.com/id/32545640

Except, no, not really.]]>
in my last post I talked about the tongue part of tongue twisters. But, as I mentioned, the really interesting part of tongue twisters comes from the brain, not the tongue. It all has to do with lexical access.

Lexical access: The process through which a speaker or listener accesses their mental lexicon (i.e. the not-so-tiny brain dictionaries we all have and are all constantly changing).

If you were a computer, your mental entries on various words would be like files you needed to access. Like, if I write kumquat, you probably have some sort of mental entry for it. Even if you've never eaten a kumquat, you've probably seen them, so you have that mental image associated with the word--like a .txt file with a picture in it. So, once you hear or read kumquat, you need to rummage around until you find that file, then open it and access the information inside. And you do! In fact, you do it very, very quickly. You do it for every single word you ever read or hear, and you do it in reverse for every word you ever write or say.

[caption id= align=alignnone width=512]This is your brain on language. Well, some of the bits that deal with language, at any rate.[/caption]

So lexical access is a very important process. You need it for every single aspect of language use. The good news is, there's a lot we know about lexical access! Remember when I talked about priming? That's an aspect of lexical access. The bad news is, there's a whole bunch we don't know about lexical access.

(Is lexical access starting to sound like a fake word yet? That process, by the way, is known as semantic satiation.)

This is where tongue twisters come into play. (No, I hadn't forgotten them.) Why? Well, it turns out that tongue twisters are a really good way to get at what happens during the lexical access process. Like many things that happen in the human brain, it can be difficult to study lexical access. 
Unlike physicists, linguists can't break apart the mind to see what happens and figure out what's going on. First, it would be deeply unethical. Second, when you break a brain open, it stops working. Sometimes, however, the brain does something weird. Like with tongue twisters. If you read my previous post, you know that the tongue itself can cause problems... but not enough to explain the most common errors, like saying How can a clam cram in a clean cream can? as How clan a cam cram in a clean cleam can?

In a 1999 study, Carolyn E. Wilshire found that there were two main contributors to making tongue twisters tricky. The first factor that made it easier to confuse sounds was whether the confusable sounds were similar. There are lots of technical reasons for this, but basically sounds can be grouped together and some sounds are more like other sounds. t and d are really similar, for example, whereas k and m are not really that alike. Unsurprisingly, sounds that are alike are easier to confuse. Basically, you reach for something that sounds similar, then realize that you made a mistake and try to correct your error.

The second factor was that it was easier to confuse sounds that were repeated. This is because you're more likely to reach for something you've already gotten out once, even if it's the wrong thing. Together, these factors make for some really awesome tongue twisters. Awesome for two reasons: the first is that they're really, really hard to say. (Try moss knife noose muff!). The second is that we can use tongue twisters like these to help increase our understanding of the human mind. And that's what it's really all about.]]>
2009 Star Trek movie was the scene where Kirk tries to pick up Uhura in a bar. After she says he probably doesn't know what Xenolinguistics is, he replies:

The study of alien languages. Morphology, phonology, syntax. Means you've got a talented tongue.

Well. Four out of five isn't bad. Linguists know about language, not the languages themselves--so tongue talents are a skill that linguists only develop tangentially. (Although, of course, a lot of linguists do end up learning the languages they work on.) But knowing about tongues is still pretty useful. For example, it helps explain why tongue twisters are so hard.

[caption id= align=alignnone width=512] Tongue rolling is actually probably not controlled genetically. That's right, your introductory biology textbook lied to you.[/caption]

Basically, your tongue is a muscle like any other muscle, and it has certain limits. For example, just as there's a certain upper limit to how fast you can type, knit or eat, you can only produce recognizable words so fast. In addition to speed, however, there are certain motions that are difficult to make. Linguists often refer to correctly producing a given sound as hitting an articulatory target, and that's a useful metaphor. For each sound in your repertoire, your tongue (and other parts of your articulatory system) has to be in a certain position.

Exercise time! Try saying s sh s sh s sh and t k t k t k. In the first, the tip of your tongue should move from that little ridge in your mouth (just behind your front teeth) to behind that ridge. In the second, the t sound should be made with the tip of your tongue against the roof of your mouth, and the k with the very back of your tongue. (I'm assuming that you have the sh sound in your native language, otherwise this exercise might have been a little fruitless for you. Sorry.)

You might have noticed that it was a little easier to make the t k sound than it was the s sh sound. 
That's because you're using two different parts of your tongue to make the t k pair, so while one is making a sound, the other is preparing to and vice versa. On the other hand, if you're only using the very tip of your tongue, you have to finish one task before you can move on to the next, so your rate of making sounds is much lower. It's the same reason that assembly lines are so much faster--you don't have the additional time it takes to switch tasks. (BTW, that's why, if you play a wind instrument, you can tongue faster with t k t k than t t t t.)

Ok, so we have two pressures working against your ability to produce tongue twisters. The first is that your tongue can only move so fast. You can train it to move faster, but eventually you will (barring cybernetic implants) reach the limits of the human body. Secondly, you have limited resources to make sounds, and when sounds that draw on the same resources are produced too close together, they both become more difficult to produce, and the speed at which you can produce them is reduced even further.

And that wraps it up for tongues, because the dirty secret of tongue twisters is that they're mainly actually brain twisters. But I'll cover that in part two. That's right: brraaaaaaains.]]>
joined the Illuminati, I've been thinking a lot about subliminal messages. A subliminal stimulus is literally something that's below--sub--your perceptual limit--liminal. So, linguistically speaking, it would be a word that was presented to you to be read but removed before you could actually read it. Or a sound that's too soft for you to hear.

[caption id= align=alignnone width=512] Holy crap, look at that subliminal owl! Oh wait, you see it? Well shoot. Guess it's just a regular old liminal Illuminati owl.[/caption]

So if you can't really perceive subliminal messages, why are they even a thing?

Well, if there's one thing I've learned by studying linguistics, it's that language is complex and that there are huge gaps between what we know and what we think we know about language--at least on an individual level. (I've also learned that I can't count, because that's definitely two things.) We all do things that we have no idea we're doing, and so quickly and easily that they just slip below our notice. Linguistics is all about figuring out what those things are.

One of those things is priming. Basically, when you hear or read a word it gets warm, like how you leave a heat signature on your sheets after you get out of bed. And if, later, you're looking for a similar word, you're more likely to go back to your warm bed than another word you haven't used lately. Of course, the effect fades over time, but it does fade very slowly. And priming effects are where you really see an effect of subliminal messages. (And, just to be clear, not really anywhere else... at least linguistically speaking. :)

For example, this study by Abrams, Klinger and Greenwald found that, if participants had been exposed to a word earlier in their study, they were able to recognize it later when it was presented subliminally--but it only works really well when participants had not only read the word before, but had had to think about it some by assigning it to a category. 
So the effect isn't really strong enough to, say, help you lose weight or stop smoking.The fact that it exists at all, though, does tell us something interesting about the human brain and how it uses language. For example, is our ability to interpret stimuli that are degraded tied to the pressure to understand conversational speech, even in noisy environments? What does it mean that the effect is also present with visual stimuli? Like all sciences, linguistics is all about asking the right questions, and research on subliminal stimuli opens up a whole barrel-full of questions.]]>
synesthesia, a neurological condition where you perceive sensory input from one sense as if it were another sense--with synesthesia the color yellow might taste like root beer, or the sound of a bassoon may feel like bread dough. Even without synesthesia, however, linguists (particularly phoneticians and phonologists) see sound all the time. What does it look like? Something like this:

[caption id=attachment_202 align=alignnone width=640] Auuuugh what is this? It looks so boring and spiky! My eyes![/caption]

These, boys and girls and others, are what your speech sounds look like. Spectrograms are one of the most useful tools in the speech scientist's tool shed. Heck, they're pretty much a Swiss army shovel. You can spend your entire career basically only looking at data in this one form.

Why? Well, there's a lot of data in a spectrogram. Big things, like whether a sound's a 'b' or a 'p' (there's a big black bar on the bottom if it's a 'b', but not if it's a 'p'), but also really small things that we as humans have a really hard time hearing. Like, remember what I said earlier about your ears lying to you? Turns out it's a lot easier to sort out the truth if you can see what you're hearing. Plus, by looking at a spectrogram we can quantify things like average vowel frequencies really quickly and easily. (Turns out, by the way, that you can [maybe, kinda, if you squint just right and have just the right voice sample] judge how tall someone is based on their vowel frequencies.)

But spectrograms aren't just a serious scientific tool; they're also pretty fun. Aphex Twin, an ambient musician (I mean, he makes music in the ambient genre, not that he provides background music at canape parties. Sheesh.) uses spectrograms as an art form. This song, for example, has a picture of his face encoded in its spectrogram. 
Give it a listen and see if you can find it![youtube http://www.youtube.com/watch?v=i49ODCnEAZI?rel=0]On a more general note, the study of images made with sound is known as cymatics. I'm just going to leave this video here for the more physics-minded among you:[youtube http://www.youtube.com/watch?v=WaYvYysQvBU?rel=0&w=560&h=315] ]]>
good, possible English word that didn't have a meaning associated with it. The long answer is that a wug's one of the ways that we know phonology is real.[caption id=attachment_196 align=alignnone width=295 caption=These are wugs, from Jean Berko Gleason's work on child morphemic acquisition "The acquisition and dissolution of the English inflectional system", published in 1978. Sorry, nothing really funny to say about them. They are pretty cute, though.][/caption]Ok, so answer the question in the picture above. If you're a native speaker of English, you probably said something like There are two wugz. Of course, you would write it wugs, but you'd say it with a final 'z'. I've talked about this before, but it's worth repeating:In English, there are two ways to make a word plural. You can add -z to the end, and you can add -s to the end. They're actually very similar sounds, but with a slight difference. When you're making a -s sound, you don't vibrate your vocal folds, so there's no sort of louder buzzing noise (linguists call that voicing), but when you make a -z sound, you do voice it. When that happens is determined by the sound in front of the plural marking. If it's voiced, the voicing is sort of smeared over into the -s on the end, mainly because it's easier to say.Now, this is a rule that you know and can apply without even thinking about it. But children have to learn it somehow, and we didn't really know when this happened developmentally. Which is what the wug test was designed to find out. If children have learned the rule, then they'll say wug-z instead of wug-s. It turns out that four- and five-year-olds have usually got this rule down cold. Which tells us something useful about how we acquire language. And, you know, watching four-year-olds trying to stay on task is adorable.And, as a special bonus, here's a video interview with Jean Berko Gleason. She's super awesome and a real live linguist. :)[youtube http://www.youtube.com/watch?v=fx8F8lV8_2Y]]]>
[/caption] Well, it's more accurate to say that your brain lies to you. I mean, your ears are simply there to receive the speech signal, like the antennas on an old TV. You still need a tuner to translate those signals into something meaningful, and in this really over-extended metaphor, the tuner is your brain.

And, sometimes, your brain will lie to you. There's this thing called Phonemic Restoration that's studied extensively by Makio Kashino, among other people. Basically, what happens is that even when a speech sound is missing you'll think you heard it. Here, try this:

[youtube http://www.youtube.com/watch?v=k74KCfSDCn8]

Isn't that just the freakiest thing? And it gets even better. Not only can you gain sounds that were never there to begin with, you can also lose sounds that should have been perfectly intelligible. I was at a conference this weekend and one of the presentations, by Chris Heffner, was on how you adapt to changes in speaking rate. Basically, if you're listening to a bit of slow speech and then encounter a segment or set of words that's produced much faster, your brain can't handle it very well, so it'll just skip right over parts of it, even if it leaves you with something that's less than grammatical.

So why does this matter? Well, first off, it's super cool. Secondly, knowing when and how your brain lies to you can tell us more about how your brain processes language. And, really, that's not something we know a whole lot about. Linguistics as a field is littered with unsolved problems, like rocks waiting to destroy a perfectly good tiller. By learning more about what goes on between the antenna and the television screen, though, we can keep working to solve those problems.]]>
three times. Where was it? Adam, the ladder, pick it up! Try saying it aloud. If you're a native speaker of American English, you'll say all three of the underlined sounds the same way.

[caption id= align=alignnone width=512 caption=Come on, Adam, Lulu's having to pick up your slack!][/caption]

Unless you're already pretty familiar with linguistics, you've probably never heard of the flap (or tap, as some linguists call it), but that doesn't mean that you're not already acquainted. In fact, the flap is one of the most common sounds of the English language, especially American English. It's produced by a very quick movement of the tongue against the little ridge of bone just behind your teeth. This video will give you an idea of just how quick:

[youtube http://www.youtube.com/watch?v=abakkV2pl34]

It's a little difficult to see, but did you notice that bit in the middle where the tongue suddenly jumped? That was the flap. It's so fast that it makes the production of most other sounds seem like the proverbial tortoise. A flap takes an average of 20 milliseconds to produce; by contrast, the schwa vowel (it's an uh sound, the most common in the English Language) lasts an average of 64 milliseconds. You can see why the flap is such a favorite; it's a huge time saver.

It's a little difficult to spot a flap without specialized training because it doesn't have its own letter, or make any minimal pairs. (A minimal pair is a pair of words that differ by only one sound, like cat and cap. Because you need to be able to tell the sounds apart in order to tell the words apart, you're really good at distinguishing the sounds that make minimal pairs, at least in your native language[s]). Usually, it replaces the 't' or 'd' sound in the middle of a word, but when you start speaking more quickly, more and more of your 't's and 'd's end up coming out as flaps. And that makes sense. When you're speaking more quickly, you want to be understood, but you just don't have as much time to articulate carefully. 
Since most people will hear the flap as a 't' or a 'd', switching one for the other is just easier for everyone.So that's the flap, a shy, unassuming sound that you often mistake for one of its more glamorous siblings. Now that you've been introduced, though, try to keep an eye out for the little guy. You just might be surprised how often it pops up!]]>
[/caption]Phonotactics is like your great-aunt who always arranges the seating at family reunions because she remembers who fought with whom twenty years ago and knows not to sit them together. Basically, some sounds really like to be next to others. Like vowels. Vowels like to be next to everyone. In Japanese, for example, with a couple of exceptions, most syllables have to be made of a consonant plus a vowel. (In ling speak, this is known as CV. C for consonant, V for vowel. Yeah, unlike physicists, we like to keep things simple.) What's even more amazing is that within six months of birth, Japanese infants prefer sounds that are CVCV to those that are CVCCV or CVCVC.

Polish, on the other hand, notoriously plays fast and loose with syllable structure. You can have consonant clusters up to five sounds long in Polish that, most weirdly, don't follow the same sorts of rules that other languages do. Like English. English can have pretty big consonant clusters... but they'll only get really big if the first or last sound in the word is 's'. (Protip: That's why 's' is such a great letter in Scrabble; there's a bunch of things you can slap it on to piggyback off someone else's word, even outside of its morpheme status.) If you've ever stumbled over a Polish last name, there's a sound linguistic reason you found it hard.

Why is this useful? Well, besides its obvious use in language teaching and being great cocktail party conversation material, if you want to make a plausibly difficult-to-pronounce alien language, screw up your phonotactics and you'll leave audio book readers in tears.]]>
[/caption]Let's take an example. How about the glottal stop in English? Here, read this and then come back. I'll wait.

We splash glottal stops around in our speech because they're easy and quick to say. So that's laziness; it doesn't hurt anyone, it just makes the speaker's life a little easier. But wait! Let's say that you move to Egypt and start using Egyptian Arabic. In fact, let's say that a whole bunch of English speakers move to Egypt, so many that there starts to be a really large native English speaker population in Cairo... but a population that still has to learn and use Egyptian Arabic just to get around during the day.

Now, in Egyptian Arabic, if you slosh glottal stops around like mop water on a dirty floor, you're going to run into problems. Why? Because the glottal stop is a separate sound. It would be like if I used b and p interchangeably. There's a big difference between Hand me the robe and Hand me the rope (particularly if you're a cultist). It's confusing. And confusing people isn't nice.

So, if you're nice, you'll use glottal stops only when you're supposed to in English and Arabic, and use the other sounds where they belong. The downside? It's more work for a speaker to make a full k-sound than just a glottal stop.

So you've got this tension between laziness and niceness, and in different languages and different situations, a different pressure will win out. Or, you know, at least be something that you worry about more.]]>
[/caption]Now, because I'm not a normal person, I jotted down a note of this interesting ambiguity. You've probably noticed lots of instances like this, where a word can be interpreted in more than one way. But did you ever wonder about ambiguity in language? (A little note here: There is ambiguity on the word level and ambiguity on the sentence level. I'm talking about ambiguous words here, though I might come back and do phrases later on.)

Think about it this way: language's primary purpose is to assist in communication. You would think that anything that got in the way of that purpose would be weeded out. I mean, yeah, languages evolve, but they evolve with conscious input from humans, so you'd think that we'd try to cut down on things that make communication harder. I mean, if you were designing a human, would you include the appendix? Ok, maybe you would. But my point is, ambiguity isn't really helpful in communication. So why do we continue to use it?

Funnily enough, I'm not the first person to ask this question; it's one that's troubled linguists for a while. And there was a theory proposed in a recent article that I find particularly interesting. The authors argued that words that have more than one meaning (like how chips can be delicious and ruin your computer, or taste terrible and make your computer run) are generally words that are really easy to say.

You can think of different words as having different shapes, and that you have to trace these shapes to say the word. A word that's really easy to say, like mom, would be a circle. A word that's harder to say, like Cryptonomicon, is going to be more like a five-pointed star. (A word that's impossible to say, like lpdkn, would be like trying to draw a scale model of Mount Fuji in two dimensions: you can kind of get the general idea across, but you can't produce it fully because it violates the rules of physics. Metaphorically.) When you're just talking to friends, you want to use as many circles as possible.
Because of that pressure, you're going to use circles to represent tires and oranges and the sun, and trust that your friends can use context clues to figure out that you didn't have tire juice for breakfast.

I tend to like this argument, because I'm of the opinion that laziness is one of the driving factors in language--I'm not so sure of another argument that they make, which is that the primary purpose of language is not communication, but basically to organize our thoughts, but more on that later. The main point is that ambiguity is an essential part of language and will remain so for the foreseeable future.]]>
Sleeping Beauty, Snow White, Cinderella, Rapunzel or Rumpelstiltskin, you've got them to thank.

[caption id= align=alignnone width=256 caption=Oh, you're a prince? Sorry, I'm holding out for a linguist.][/caption]

But this is a linguistics blog, not a folklore blog, so why am I going on and on about these guys? Because they were also pretty awesome linguists. They were like the Galileo of linguistics, way ahead of their time and brilliant. They were so brilliant, they discovered something called Grimm's law. Well, really it was Jacob who discovered it (hence the apostrophe placement) and it wasn't called Grimm's law at the time. It was just something that no one had ever thought to look for.

What was it?

Grimm's law is the very first time we see a set of rules governing linguistic change. And that may sound kind of boring, but it was just as monumental as the discovery of calculus. (Was calculus more of a discovery or a development? Mhh, whatever.) It fundamentally changed the way that linguistics was done.

Basically, Jacob determined that, historically, certain sounds in Germanic languages (including German and English) had changed. And they hadn't changed randomly. A had changed to B had changed to C across a set of languages, and all across the language. It would be like if three or four different countries, without talking about it, decided that purple was a better color for stop signs than red or bright green, and changed out all their stop signs. And then, when they were done, they decided that they really liked pink better and all changed to that.

Why was this exciting? Well, unlike theories like "This word is fun to say because I think it is," Grimm's law is testable. You can go out and take a picture of some non-pink stop signs and use that evidence to argue against a law that ends with all stop signs now being pink. We have a theory (and a phonological theory!) that we can use empirical data to prove or disprove.
It obviously took some time to be accepted as the standard practice, and for a long time, all anybody wanted to talk about was historical sound change and written texts. But, hey, once phonology was born, it was only a matter of time before it started saving the world.]]>
is a science; but there are some parts of linguistics that don't really act like people expect sciences to act, and that tends to confuse people.

[caption id= align=alignnone width=512 caption=Not necessary equipment for linguistics.][/caption]

Before I go over why linguistics is a science, I think it's worth saying that I'm not arguing (and I am arguing; there are linguists who I know personally and by reputation who argue passionately linguistics is not a science) that linguistics is a science because sciences are better. I'm arguing because there is an inherent difference between how you do science and how you study the humanities. Your aims are different and what you need to do to accomplish those aims is different. I'm arguing that the ultimate aims of linguistics are science-type and not humanities-type or plant-type, and therefore our methodology should match those aims.

Alright, off the soapbox. First of all, linguistics is a huge field. Really, really, really big. Linguists study every aspect of human language. This ranges from translating cuneiform tablets to tracking tweets to figure out how people feel about political candidates or products to attempting to isolate and analyze the faces that children make when they're lying. Obviously, when you're looking at a variety of subjects that mind-blowingly diverse--because, let's be honest, is there any aspect of the human experience that doesn't interact with language?--you're not going to use the same methodologies all the time.

But, and this is what makes it science, the overarching goals are the same. Let's look at field linguistics, which to the untrained eye seems to be completely about describing languages and suspiciously like anthropology. Don't get me wrong, anthropology is a great field that has contributed a lot to our body of knowledge as a species, but it's not essentially scientific in nature. It's inherently descriptive, not experimental.
Ethnographic data doesn't generally include what happens when the anthropologist intentionally violates a taboo specifically to see what would happen ("Hey, Cuchulainn, wanna try this dog kabob?") because, 1) that's unbelievably unethical and 2) experimental data just wouldn't be as useful for them. Descriptive linguists, on the other hand, intentionally put together sentences that could be but aren't grammatical to see what native speakers think of them. They have a series of hypotheses that they gain support for or disprove by gathering and analyzing data, and their data gathering is guided by their theoretical framework.

And these theoretical frameworks aren't just limited to one sub-field. Descriptive linguists' data is used by syntax people whose data is used by the folk doing phonology whose data is used by sociolinguists. Why? Because the arguments linguists make about how humans acquire, use and think about language are field-wide too. Whether language is innate to humans, for example, is a biggie, and every sub-field contributes fuel for the debate.

So. We've got overarching theories that are supported or disproved using empirical data and evolve over time, all for the greater purpose of accurately describing a phenomenon and using these theories to make accurate predictions. Sounds like science to me, even if most linguists don't wear a lab coat to work.]]>
metalinguistic and recursive. They're not that closely related, but they tend to get asked if they're sisters a lot. Why?

Well, metalinguistic knowledge is knowing about language, and the fact that you can read this shows that you must have some metalinguistic knowledge. But this blog (and the field of linguistics as a whole) is concerned with knowing about what you know about language, i.e. meta-metalinguistic knowledge. And just by talking about that, I'm adding another level. My discussion of what we know about linguistics gets us all the way to meta-meta-metalinguistic knowledge. And by talking about that... You get the picture.

[caption id= align=alignnone width=294 caption=The picture looks like this.][/caption]

The picture is also recursive. One of my favorite examples of recursivity is PHP. Originally, the acronym stood for Personal Home Page, but it now stands for PHP: Hypertext Preprocessor. What does the PHP in that stand for? Why, for PHP: Hypertext Preprocessor, of course. (Repeat ad nauseam, or at least ad getting-punched-in-the-arm.) Or, wait, maybe it's cats looking at cats looking at cats looking at cats looking at cats...

So you can see how they're related, right? They're both all about making you feel dizzy and then fall down, or maybe puke if you get motion sickness.

But what you may not know about recursivity is that it's a very important process in linguistics as well. How so, you might ask? Well, remember in the days of yore (yesterday was totally a day of yore) when I told you all about generativity? Recursivity is a great example of one of those generative processes. You can have a recursive sentence that just goes on forever.
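If you like seeing this sort of thing spelled out, here's a minimal sketch in Python of a rule that calls itself, using the gossip chain that comes up next. The function names `heard_it_from` and `sentence` are made up for illustration, and the sketch assumes you hand it at least one name. Each call tacks on one more "who heard it from" clause and then passes the rest of the list back to itself.

```python
def heard_it_from(sources):
    """Build a chain of 'who heard it from' clauses, one per name."""
    if not sources:
        raise ValueError("need at least one source")
    first, rest = sources[0], sources[1:]
    if not rest:
        # Base case: the last person in the chain, so stop embedding.
        return first
    # Recursive case: embed the rest of the chain inside this clause.
    return f"{first}, who heard it from {heard_it_from(rest)}"


def sentence(sources):
    """Wrap the chain in a complete sentence."""
    return f"I heard it from {heard_it_from(sources)}."


print(sentence(["Jen"]))
print(sentence(["Jen", "Ian", "Zach"]))
```

The only thing that stops the sentence is running out of names; the rule itself never forces you to stop, which is the whole point.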
How about when you're describing where you learned something?

I heard it from Jen.

Well, what if Jen heard it from someone else?

I heard it from Jen who heard it from Ian.

And then you find out that Ian wasn't the originator either.

I heard it from Jen, who heard it from Ian, who heard it from Zach, who heard it from Nick, who heard it from Clarice... And so on and so forth.

You can pretty much keep going on infinitely. You can do it with other types of phrases too.

Get the butter from the fridge by the stove behind the water buffalo next to the peat coal kiln...

Chomsky argued that recursion is the fundamental characteristic of human language, and this has been the cause of some debate. (Pirahã may be the most argued-about non-Indo-European language ever.) So recursion has two main uses in linguistics. The first is as a generative process that allows speakers to form infinitely long sentences, and the other is to use language about using language about using language about using language about using language about using language about using language...]]>
Used under the Creative Commons Attribution 2.0 Generic license, click for link to source.][/caption]

If you answered "Einstein's less famous brother, Einbert?" you wouldn't actually be too far from the truth. It's Noam Chomsky. He's so famous his name comes pre-installed in Microsoft Word's spell checker. (Did you mean chomp sky?)

If you've got a good history or government background, you may be thinking, "Oh yeah, the anarchy guy." He may be, but his greatest intellectual achievement has nothing to do with anarchy and everything to do with linguistics. That achievement would be generativity.

Gen-er-a-tiv-i-ty. Write it down, it will be on the test.

Generativity was a game-changer for linguistics. Before that point, linguistics was basically philology, which I've mentioned before. Philology is to modern linguistics what naturalism is to modern biology. Philologists collected knowledge about languages haphazardly, without a whole lot of underlying theoretical structure. I mean, there was some (I'll talk about what the brothers Grimm did on their weekends off later), but it was pretty confined. And a lot of it, let's be honest, was about proving that Europe was best. The monumental Oxford English Dictionary is a good example of that mindset. They wanted to collect every single word in the English language and pin it neatly to the page with a little series of notes about it and a list of sightings in the wild. It was, and remains, a grand undertaking and a staggering achievement,
but modern linguists aren't collectors anymore.

That's because the end goal of modern linguistics is to solve language. The field is working to put together a series of rules that will actually describe and predict all human language. Not in the mind reader, fortune teller sense of predict. I mean that, with the right rules, we should be able to generate all possible sentences. In a generative way. By using generativity.

So why is this important?

Lots of reasons! Here, let me list them, because lists are fun to read.

This turned linguistics from an interesting hobby for rich people into a science. If you have rules, you can make predictions about what those rules will produce and then test those predictions. Testing predictions is also known as science. It's also something that linguistics as a whole has been a little... hesitant to adopt, but that's another story.

Suddenly computers! Computer programming is, at its most basic level, a series of rules. Linguistics is now dedicated to producing a series of rules. Bada-bing, bada-boom, universal translator. (It doesn't work that way, but, in theory, it eventually can.)

Now we have a framework that we can use to figure out how to ask questions. We have a goal. Things are organized.

Now for the promised test.

What term is used to describe the current goal of linguistics; i.e. to generate a set of rules that can accurately describe and predict language usage? (Seriously, I'm not going to give you the answer. Just scroll up.)]]>
such a degree, get asked a lot. Fortunately, there are a lot of possible answers! I'm going to start with the obvious ones and then start surprising you.

[caption id= align=alignnone width=640 caption=From Linguist Llama (click for original post).][/caption]

Obvious answer #1: Get another degree in linguistics!

If you're really in love with the subject, getting a doctorate and competing for the tiny number of teaching positions in the field is certainly an option. Imma be straight with you, though: it's very, very hard work; very, very competitive and very, very low paying for the amount of specialized training you need. (A PhD usually takes between four and six years... if you manage to finish at all.) Oh, and did I mention that you'll be expected to do original, groundbreaking research and consistently get it published in addition to your teaching load? Yeah... unless you're 100% sure that's what you want to do, you should probably keep reading.

Obvious answer #2: Teach computers how to language!

Do you like computers? Do you like linguistics? Do you like the thought of eventually having a job and making money? Holy balls of yarn, do I have a career for you. Super-high employment rates, cutting edge research, making all the best and newest toys... yeah. Plus, if you have a good background in both computer science and linguistics (a surprisingly large number of people only have a computer background) you'll be a very competitive candidate.

Obvious answer #3: Help children and adults overcome speech problems!

If you've always wanted a career where you help people, you should look into Speech-Language Pathology. Sometimes, someone doesn't acquire language correctly, or they develop a problem with language. Speech pathologists work with patients to help them acquire language or to relearn language.
You'll need at least a master's, but most people find it to be a very rewarding career.

Obvious answer #4: Work as a translator!

So I wrote earlier about the difference between a linguist and a translator, but being a linguist can really help you with translation as well, particularly if you're interested in working on bilingual dictionaries. Of course, demand for translators varies from language to language, and you do have to be fluent in at least two languages.

Obvious answer #5: Teach languages!

If you're interested in teaching anyone to acquire a second language, whether it's English or something else, having a linguistics background can be very, very helpful. Think back to any foreign language classes you might have taken. Wouldn't it have been better if your teacher had been able to tell you exactly what you were supposed to be doing with your mouth, instead of vaguely telling you what letters it was like and then that "You're doing it wrong"? With a background in linguistics, you can really explain how things work in the second language, and that will really help your students.

So those are the biggies. You'll need other skills for most of them, but linguistics will help you a lot. And, hey, linguistics classes are fun! But what other careers can linguistics help you with? Well...

Be a lawyer! A background in linguistics is actually a really strong choice for someone heading to law school. Why? Well, law is all about using language really, really carefully and communicating effectively. An academic background in linguistics will help you do that.

Make up languages! Now, this is a bit of a niche, but there is more than one person who has been paid for designing alien languages for films. You've heard of Na'vi and Klingon, I presume? They're actually legit artificial languages with grammars and everything.

Write standardized tests! If you're American, you've probably taken or will take the SATs at some point.
Fun fact: most of those language-based questions were written by linguists, who know how to ask questions designed to get at very specific pieces of linguistic knowledge.

Do anything you like! Really, linguistics training gives you a great set of skills. You can analyze large sets of data, deduce the rules that would generate them and then write about them in a clear way. That's a really useful thing to be able to do.]]>
the speech stream is far from discrete, I talked about how difficult it is to pick apart words. But I didn't really talk that much about phonemes, and since I promised you phonetics and phonology and phun, I thought I should cover that. Besides, it's super interesting.

It's not just that language is continuous, it's that language that's discrete is actually impossible to understand. I ran across this YouTube video a while back that's a great example of this phenomenon.

[youtube http://www.youtube.com/watch?v=zlK5bfuv6Oo]

What the balls of yarn is he saying? It's actually the preamble to the constitution, but it took me well over half the video to pick up on it, and I spend a dumb amount of time listening to phonemes in isolation.

You probably find this troubling on some level. After all, you're a literate person, and as a literate person you're really, really used to thinking about words as being easy to break down into letter sounds. If you've ever tried to fiddle around with learning Mandarin or Cantonese, you know just how table-flippingly frustrating it is to memorize a writing system where the graphemes (smallest unit of writing, just as morpheme is the smallest unit of meaning, phoneme is the smallest unit of sound and dormeme is the smallest amount of space you can legally house a person in) have no relation to the series of sounds they represent.

Fun fact: It's actually pretty easy to learn to speak Mandarin or Cantonese once you get past the tones. They're syntactically a lot like English, don't have a lot of fussy agreement markers or grammatical gender and have a pretty small core vocabulary. It's the characters that will make you tear your hair out.

[caption id=attachment_85 align=alignnone width=172 caption=Hm. Well, it kinda looks like me sitting on a chair hunched over my laptop while wearing a little hat and ARGH WHAT AM I DOING THAT LOOKS NOTHING LIKE A BIRD.][/caption]

But. Um. Sorry, got a little off track there.
Point was, you're really used to thinking about words as being further segmented. Like oranges. Each orange is an individual, and then there are neat little segments inside the orange so you don't get your hands sticky. And, because you're already familiar with the spelling system of your language (which is, let's face it, probably English), you probably have a fond idea that it's pretty easy to divide words that way. But it's not. If it were, things like instantaneous computational voice to voice translation would be common.

It's hard because the edges of our sounds blur together like your aunt's watercolor painting that you accidentally spilled lemonade on. So let's say you're saying round. Well, for the n you're going to let air flow out through your nose and put your tongue against the little ridge right behind your teeth. But wait! That's where your tongue needs to be to make the d sound! To make it super clear, you should close off your nasal passages before you flick your tongue down and release that little packet of air that you were storing behind it. You're totally not going to, though. I mean, your tongue's already where you need it to be; why would you take the extra time to make sure your nasal passages are fully closed off before releasing the d? That's just a waste of time. And if you did it, you'd sound weird. So the d gets some of that nasally goodness and neither you nor your listener gives a flying Fluco.

But, if you're a computer who's been told, "If it's got this nasal sound, it's an 'n'," then you're going to be super confused. Maybe you'll be all like, "Um, ok. It kinda sounds like an 'n', but then it's got that little pop of air coming out that I've been told to look for with the 'p', 'b', 't', 'd', 'k', 'g' set, so let's go with 'rounp'. That's a word, right?" Obviously, this is a vast over-simplification, but you get my point; computers are easily confused by the smearing around of sounds in words. They're getting better, but humans are still the best.

So just remember: when you're around the robot overlords, be sure to run your phonemes together as much as possible. It might confuse them enough for you to have time to run away.]]>
indiscreet, but continuous, and continuous is what language, especially speech, is. By continuous, I mean that it doesn't come out in separable chunks; it's more like a stream of water than a stream of ice cubes. In fact, English itself discriminates between things that are discrete and continuous; discrete things are called count nouns because (gasp!) you can count them, and continuous things are called mass nouns. You can count ice cubes and words, but you can't count water or language unless you assign them units.

"But wait," I can hear you protest. "Language is discrete. I'm speaking in sentences, that are made up of words that are made up of letters." And you're right. For you, your language is made up of units that are psychologically real to you. Somewhere between the speaker vocalizing the words and you parsing them, you segment them using the rules that you've mastered. It's a deeply complex process and one that we still don't completely understand. If we did, we'd be able to write speech recognition programs that wouldn't give us errors like "the wells were gathered and planning" for "the walls were dark and clammy". (True life. I got that very error not that long ago.)

Here, let's look at some data. Here's the waveform that shows the wave intensity, or loudness, of a native speaker of English saying "I am an elephant."

Can you pick out the part of the speech signal for each of the words? Here, let me help you.

So if speech really is discrete, wouldn't you expect four separate bumps in loudness for the words, with silence in between? (Maybe with a couple extra bumps on the end for the laughter.)

Instead, what we get is pretty much a constant rush of noise that you rely on the vast amount of knowledge you have about your language to decode accurately. Take out that knowledge and you get something completely incomprehensible. And there's a really easy way to show this: just listen to someone speaking a language you aren't familiar with.

[youtube http://www.youtube.com/watch?v=eFPStHF2CzE&w=420&h=315]

That's Finnish and if you speak it well enough to understand everything he just said, I'd like to extend some mad props unto you; Finno-Ugric languages are as hard as ice-cream from a deep freezer. But to get back to the point, what observations can you make about what you just heard?

The speaker was speaking super-quickly.

There didn't seem to be any pauses between words.

Basically, it was like standing in front of a language fire hose.

For people who don't speak your native language, you sound very similar. They're not speaking any more quickly in Hindi or Mandarin or Swahili or German than you are in English, you just don't have a metalinguistic framework to help you cut the sound-stream into words, slap it up on a syntactic framework and yank meaning out of it.]]>
Uploaded by Samir at en.wikipedia and used here under the GFDL.][/caption]

That's them. But it's actually a two-step process.

Step one: Tighten the vocal folds. This is like tuning a guitar; you can change the pitch of your voice based on how taut your vocal cords are. If you put your hand on your throat and sing a low note and a high one in quick succession, you can actually feel your muscle rotating as it adjusts the length of your vocal cords.

Step two: Vibrate those vocal folds. Now, you might think, based on step one, that you use your muscles to wiggle them back and forth really fast. Nope. You vibrate your vocal cords by blowing air through them. The more air, the louder the sound, the sooner you have to take a new breath.

So based on this, there are two possible ways to lose your voice. You can run out of air--which, unless you've had the breath knocked out of you, is a pretty straightforward problem to fix--or your muscles can crap out. And that's generally why you lose your voice. The muscles in your larynx are just like any other muscles. If you use them hard enough, long enough, they'll strain and, bam, you'll lose your voice. Of course, this is just for run-of-the-mill I've-been-screaming-at-a-football-match type voice loss. Anything that messes with those muscles will cause you to lose your voice, and that can include things like aging, smoking (seriously, don't smoke), damage to the larynx during surgery or even a tumor.

But unless you're at risk for one of those things, your voice will come back once the strained muscles have had time to heal. In the meantime, I recommend carrying around a small whiteboard and whiteboard marker (It's got good visibility, you can write easily and quickly, and you can write large enough that people not directly next to you can read it.) and learning how to finger spell.]]>
a lot of information.

Let me break it down for you. I recently did a large chunk of transcription, looking at speech data from four different people. I took a random two minute sample from each of those transcriptions, and they spoke 282, 257, 386 and 357 words in that time, for an average of around 160 words per minute. None of the people were talking faster than what I consider a normal rate, and I live in the South, where speaking rates are lower than they are in, say, California. But let's pretend that this is your normal speaking rate.

Let's put this in perspective.

Say you're one of those brave souls who does NaNoWriMo, and you try to write a 50,000 word novel in a month. If you were writing your novel as fast as you speak, you'd finish in a little over five hours. That's right. Every five hours you speak, you produce enough words to fill a book. Of course, you don't spend five hours a day talking at full tilt, but even so, most people speak around 16,000 words a day. (The link is a Scientific American summing up of the paper in question.)

If you're a hacker, you might be a little confused at the words per minute figure. (In other languages and for other purposes linguists tend to use morphemes, syllables, or phonemes, and measure them by the minute, second, or even hour.) The unit milliLampson sometimes pops up:

milliLampson /mil'*-lamp`sn/ /n./ A unit of talking speed, abbreviated mL. Most people run about 200 milliLampsons. The eponymous Butler Lampson [link mine] (a CS theorist and systems implementor highly regarded among hackers) goes at 1000. A few people speak faster. This unit is sometimes used to compare the (sometimes widely disparate) rates at which people can generate ideas and actually emit them in speech. For example, noted computer architect C.
Gordon Bell (designer of the PDP-11) is said, with some awe, to think at about 1200 mL but only talk at about 300; he is frequently reduced to fragments of sentences as his mouth tries to keep up with his speeding brain.

Yeah... it's cute, but you're not really going to see it cropping up in linguistics literature. My guess would be, based on the speaking rate, that a milliLampson is loosely based on words per minute, probably based on Californian speakers (maybe even from, gasp! UC Berkeley), and then inflated by folkloric proportions. But that's a great example of the type of misinformation that's out there. Take this for example:

[caption id=attachment_57 align=alignnone width=640 caption=I love infographics. I do not love misinformation. Image taken from infographic by Medical Billing and Coding (which can be found at http://www.medicalbillingandcoding.org/life-summed-up/) for educational purposes. Don't you feel educated? Always hunt down the citations for these random numbers that people vomit at you. ][/caption]

The fine folks at Medical Billing and Coding may have listed their sources, but I'm afraid one of their sources was wrong. Let's take a look at this 73 million figure. I will even do all the arithmetic for you. I know, I know, I'm a peach.

Ok, so, let's assume that their 18,140 figure is right, and that our 160 words/minute figure is right. In that case, we've got 18,140 hours per life x 60 minutes per hour x 160 words per minute. Do the multiplication, cancel out all the units super nicely, and we come up with 174,144,000 words per life. That's almost 2.5 times as many as they predicted. Or, hey, since a little more math can't hurt, let's assume 75 speaking years per life x 365 days per year x 16,000 words per day and we come up with 438,000,000 words per life.
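If you'd rather check my arithmetic than take my word for it, here's a minimal sketch in Python that redoes the sums. All the numbers come straight from this post; the variable names are just ones I picked for illustration, and the sketch rounds the speaking rate to 160 the same way the post does.

```python
# Word counts from the four two-minute transcription samples.
sample_word_counts = [282, 257, 386, 357]
sample_minutes = 2

# Average speaking rate across the four speakers, in words per minute.
words_per_minute = sum(sample_word_counts) / len(sample_word_counts) / sample_minutes

# The post rounds the rate to 160 words per minute, so do the same here.
rounded_wpm = 160

# Hours needed to "speak" a 50,000-word NaNoWriMo novel.
nanowrimo_hours = 50_000 / rounded_wpm / 60

# Estimate 1: the infographic's 18,140 speaking hours at 160 words per minute.
words_from_hours = 18_140 * 60 * rounded_wpm

# Estimate 2: 75 speaking years at 16,000 words per day.
words_from_days = 75 * 365 * 16_000

# The infographic's claimed lifetime word count.
infographic_claim = 73_000_000

print(f"average rate: {words_per_minute} words per minute")
print(f"novel spoken in: {nanowrimo_hours:.1f} hours")
print(f"estimate from hours: {words_from_hours:,} words")
print(f"estimate from days: {words_from_days:,} words")
print(f"ratio to infographic: {words_from_hours / infographic_claim:.2f} "
      f"and {words_from_days / infographic_claim:.2f}")
```

Swap in your own speaking rate or life expectancy and the two estimates move, but neither one lands anywhere near 73 million without some fairly gloomy assumptions.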
And since I'm far more likely to trust the data from the article published in Science than my own little two-bit estimation, it looks like this infographic is wrong by a factor of 6.

What's even more amazing, though, is that if you wrote down every single one of those words, it would be as long as 402.5 editions of Proust's In Search of Lost Time, the longest novel ever written. Like I said, you talk a lot.]]>
[/caption]Linguistics is a huge field. It includes everything from the algorithms behind Siri to preserving endangered languages like flies in amber to reconstructing dead languages. (Unlike biologists, we don't have to worry about an undead T-Rex wandering around if things go terribly, terribly wrong.) Since it's just one intrepid girl linguist here at Making Noise and Hearing Things, I'm going to have to restrict myself to just a single set of sub-disciplines. These are:

Psycholinguistics: Like a zombie valiantly trying to overcome his crippling aphasia, psycholinguistics is all about language and brains. Since I'm all about sound, you'll probably be getting a lot of stuff about brains and sound.

Phonology: Often confused with phrenology (no, seriously, this happens to me all the time), it's the study of the systems of rules languages apply to their sounds. Here's a quick example: say dogs and cats. Is the s on the end of both of those words the same? Try saying it again with your hand right above your Adam's apple or where your Adam's apple would be. When you say the s on dogs you should feel a slight buzzing, like you've swallowed a bee. The s on cats, though, doesn't have it. Whether or not the final s has buzzing in it (linguists call it voicing) is determined by a simple rule in English: you get vibration on the final s if the sound before it had it. The g sound in dog has vibration; the t sound in cat doesn't.

Phonetics: This is the study of sounds themselves. Phonology is all like, "Oh, yeah, that was voicing." Phonetics is all like, "Sure, but how much voicing? How long did it last? How much air came out?" Phonetics wants to know all the dirty details. Phonetics takes videos like this one, where you can see the vocal folds vibrating in slow motion. [[WARNING: If you are prone to nightmares of terrors from beyond space, you might want to skip this one. Just saying.]]

[youtube http://www.youtube.com/watch?v=Drns_eV9wWg]

But, yeah, those are the biggies.
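Since phonology is all about rules, the dogs-and-cats rule from the phonology blurb can be sketched in a few lines of Python. This is a toy under loud assumptions: the two sets of sounds below are a small hand-picked sample rather than a full inventory of English, the function name `final_s_is_voiced` is made up, and the sketch ignores words like buses where something extra happens.

```python
# Hand-picked sample of sounds, not a complete inventory of English.
VOICED_SOUNDS = {"b", "d", "g", "v", "m", "n", "l", "r"}
VOICELESS_SOUNDS = {"p", "t", "k", "f"}


def final_s_is_voiced(sound_before):
    """Return True if a final s copies voicing from the sound before it."""
    if sound_before in VOICED_SOUNDS:
        return True   # buzzing carries over, as in "dogs"
    if sound_before in VOICELESS_SOUNDS:
        return False  # no buzzing to carry over, as in "cats"
    raise ValueError(f"sound not covered by this sketch: {sound_before!r}")


# The function takes sounds, not spellings: g for dog, t for cat.
print("dogs:", final_s_is_voiced("g"))
print("cats:", final_s_is_voiced("t"))
```

Notice that the rule only ever looks at one thing, whether the neighboring sound buzzes, which is what makes it a nice tidy example of a phonological rule.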
I can't promise I won't be branching out from these sub-disciplines, but I can promise an extremely low frequency of syntax posts. (Low frequency! Get it? Because... sound... um. Never mind.)]]>
not the same thing, but I'll get to that later) you probably read the title of this post and immediately thought No, I'm not. Trust me, you are. How do I know? Well, a linguist is someone who does two things:
Makes claims about language
Attempts to either verify or disprove these claims (whether they made them or someone else did).
That's it. There's no secret cabal of linguists you have to join, you don't have to speak thirty languages, and you certainly don't have to have a PhD. I think if you start paying attention, you'll notice that you do this all the time. Have you ever had a conversation like this?
Lulu: She talks slow.
Max: Really? I've never noticed it.
Or maybe one like this:
Lulu: We go to the zoo.
Max: Don't you mean we're going to the zoo?
Lulu: No, we go to the zoo all the time.
Max: But we're going today, so you could have said that we're going.
Lulu: Yeah, but that's not what I meant.
Bam. You're a linguist; go you! But wait a minute, you say, I know for a fact that translators are called linguists. Are you saying that just speaking another language doesn't make you a linguist? Because that's what I've always heard.
Man, you've got the linguistics bug bad. Look at you, bringing up fine semantic distinctions! (Semantics is the study of how words map onto meaning, BTW.) And you're absolutely right, a linguist can also be someone who speaks more than one language. The Oxford English Dictionary, the most complete record of the English language, defines a linguist as, first:
One who is skilled in the use of languages; one who is master of other tongues besides his own. (Often with adj. indicating the degree or extent of the person's skill.)
And only later as:
A student of language; a philologist.
Philology is what the very beginnings of the modern study of language were called. These days, most people prefer the term linguistics, and only use philology for a certain field of study within linguistics. 
For the purposes of this blog and most academic settings, a linguist is not someone who knows languages, but someone who knows about languages. And since knowing a language also automatically means you know about a language--if you're a native English speaker, you can easily identify where people are from based on their accent, for example--you, sir or madam or other, are a linguist.]]>