
Notes from 20 June 2011!

morning notes

The proceedings are on a flash drive, which also contains some software and datasets from the papers! It's a new tradition!

The proceedings are dedicated to Fred Jelinek.

first morning keynote: David Ferrucci from IBM Watson

Deep QA/IBM Research.

"We didn't publish for four years..." kind of risky! (they just clammed up for some time)

"What would have happened if this went really bad?"

"Capture the broader imagination"; reference Deep Blue. "Most people can communicate really well in human language."

Tests/evaluation sparring match against Jeopardy! players.

Jeopardy! things: open domain, complex language, high precision, accurate confidence estimates, high speed.

IBM has a srs bsns parser; they've been developing it for 20 years. But the ultimate metric is: win on Jeopardy!

Almost no time was spent curating data. Used some existing databases: WordNet and something that sounds like "yago". YAGO? (eta: probably this one: http://en.wikipedia.org/wiki/YAGO_(ontology) )

Build up a zillion syntactic frames. "And this helped us do things!"

Interesting pronunciation slip: "nerdnet".

Build up a list of candidate answers, come up with evidence for them. Ensemble methods: each component gets different weights.
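
Roughly the shape of that weighted combination, as I picture it -- a made-up sketch, not Watson's actual components, features, or weights:

```python
# Minimal sketch of weighted evidence combination over candidate answers.
# Feature names, weights, and scores are illustrative placeholders.

def combine(evidence, weights):
    """Weighted sum of per-component evidence scores for one candidate."""
    return sum(weights[name] * score for name, score in evidence.items())

weights = {"passage_support": 0.6, "type_match": 0.3, "popularity": 0.1}

candidates = {
    "candidate A": {"passage_support": 0.9, "type_match": 0.8, "popularity": 0.2},
    "candidate B": {"passage_support": 0.4, "type_match": 0.7, "popularity": 0.9},
}

best = max(candidates, key=lambda c: combine(candidates[c], weights))
print(best, combine(candidates[best], weights))  # candidate A 0.8
```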

"A long tiresome speech delivered by a frothy pie topping" in the "edible rhyme time" category: meringue harangue.

Good Jeopardy! players have really high precision, and get access to the question first, like 60% of the time. (they had this awesome plot of speed vs. precision for different Jeopardy! players; eg, Ken Jennings got to the question first like up to 80% of the time and almost always got it right -- to win, they'd have to be fast enough to win on the buzzer the great majority of the time, and then manage to get the answer right. Daaang.)

Watson fit on 10 racks.

"Enabled us to live under this fantasy that we didn't have to worry about speed..."

UIMA: now an open-source Apache project. http://uima.apache.org/

They also used Lucene!

Put all the researchers in a single room: not very IBM.

Pun detector!!! "St. Paul is holier than South Bend."

"Agentinia. Argentna. Argentina. Wow."

All the content was preloaded into RAM. HFS.

MT track

Zollmann and Vogel

"A Word-Class Approach to Labeling PSCFG Rules for Machine Translation"

http://aclweb.org/anthology-new/P/P11/P11-1001.pdf

Ravi and Knight

"Deciphering Foreign Language"

http://aclweb.org/anthology-new/P/P11/P11-1002.pdf

Most language pairs don't have bilingual input. Not so scalable to pay people to make bilingual corpora? What if you didn't need bitext? Just use monolingual text, holy cow.

99% of the work in MT requires parallel text.

"decipherment approach".

"Bayesian model 3"

Treat translation as a cryptography task. In this setting, you have foreign sentences, but alignments *and* English sentences are hidden. Learn substitution mappings between English words and the cipher codes ... so adjust the mapping until you get sensible-looking English.

Iterative EM. This is cool!!
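
The flavor of it, as a toy EM sketch -- not Ravi and Knight's actual model: this pretends each cipher token comes from an English unigram LM plus a substitution channel, whereas the real thing scores whole sequences with an n-gram LM:

```python
# Toy EM for learning a word-substitution channel P(f | e) from cipher text
# plus an English unigram LM. Illustrative only; real decipherment would use
# sentence-level n-gram LM scoring (forward-backward), not token-by-token.
from collections import defaultdict

def decipher_em(cipher_tokens, english_lm, iterations=20):
    """cipher_tokens: list of cipher symbols; english_lm: dict word -> P(word)."""
    english = list(english_lm)
    cipher_vocab = set(cipher_tokens)
    # Channel model P(f | e), initialized uniformly.
    channel = {e: {f: 1.0 / len(cipher_vocab) for f in cipher_vocab}
               for e in english}
    for _ in range(iterations):
        counts = defaultdict(lambda: defaultdict(float))
        for f in cipher_tokens:
            # E-step: posterior over which English word e produced this token.
            post = {e: english_lm[e] * channel[e][f] for e in english}
            z = sum(post.values())
            for e in english:
                counts[e][f] += post[e] / z
        # M-step: re-estimate the channel from the expected counts.
        for e in english:
            total = sum(counts[e].values())
            if total > 0:
                channel[e] = {f: counts[e][f] / total for f in cipher_vocab}
    return channel
```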

Then: Bayesian model of word substitution. ChineseRestaurantProcess, GibbsSampling.
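
The CRP bit is just a rich-get-richer cache over previous draws; minimal sketch, with alpha and the base distribution left to whatever the model supplies:

```python
import random

def crp_draw(counts, alpha, base_sample):
    """One Chinese Restaurant Process draw: reuse a previously drawn value with
    probability proportional to its count, or draw fresh from the base
    distribution with probability alpha / (n + alpha)."""
    total = sum(counts.values())
    r = random.uniform(0, total + alpha)
    for value, count in counts.items():
        r -= count
        if r <= 0:
            return value
    return base_sample()
```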

OK, now how do we do MT with this? Without parallel data, training is intractable. Replace parts of Model 3 with a ChineseRestaurantProcess. Question: what text were the final BLEU scores over?

Xianchao Wu, Matsuzaki, Tsujii

"Effective Use of Function Words for Rule Generalization in Forest-Based Translation"

http://aclweb.org/anthology-new/P/P11/P11-1003.pdf

"Horse-driven train?" Giza++, tree/forest based systems.

"Solution: Let the horse fly!"

Forest-to-string vs. tree-to-string?

Forest-to-string realignment. Realign function words to source forests. How to align the Japanese function words?

pumpkin = sentence. piece = chunk. seeds = words. Use the Enju parser for English, Cabocha for Japanese... attach the unaligned function words to higher nodes in the parse forest. Open source the software!!
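
My rough picture of the function-word trick, sketched on a plain tree instead of a packed forest -- the node structure and names here are my own simplification, not the paper's:

```python
# Sketch: re-attach an unaligned function word to a higher node so that
# extracted translation rules don't drop it. Tree-based simplification.

class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        self.attached_function_words = []
        if parent is not None:
            parent.children.append(self)

def covers_aligned(node, aligned_leaves):
    """True if some leaf under this node is word-aligned to the source side."""
    if not node.children:
        return node in aligned_leaves
    return any(covers_aligned(c, aligned_leaves) for c in node.children)

def attach_unaligned(function_leaf, aligned_leaves):
    """Walk up from an unaligned function word to the lowest ancestor that
    also covers aligned content (or the root), and attach it there."""
    ancestor = function_leaf.parent
    while ancestor.parent is not None and not covers_aligned(ancestor, aligned_leaves):
        ancestor = ancestor.parent
    ancestor.attached_function_words.append(function_leaf.label)
    return ancestor
```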

Clifton and Sarkar

"Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction"

http://aclweb.org/anthology-new/P/P11/P11-1004.pdf

Three big Finnish words become, in English, "Should I sit down for a while in the big house?" -- 11 English words.

Data sparsity for MRLs (morphologically rich languages). Also BLEU makes it hard to work on MRLs because missing a word is really bad. It's too coarse-grained.
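
Tiny illustration of how coarse that is (simplified clipped n-gram precision, no brevity penalty; the Finnish pair is just a made-up wrong-case-ending example):

```python
# One wrong surface form wipes out every n-gram that touches it.
from collections import Counter

def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_ngrams = Counter(zip(*[hyp[i:] for i in range(n)]))
    ref_ngrams = Counter(zip(*[ref[i:] for i in range(n)]))
    overlap = sum(min(count, ref_ngrams[g]) for g, count in hyp_ngrams.items())
    return overlap / max(sum(hyp_ngrams.values()), 1)

ref = "istuisin hetken isossa talossa".split()
hyp = "istuisin hetken isossa talolla".split()   # one wrong case ending
print(ngram_precision(hyp, ref, 1))  # 0.75: one word of four is off
print(ngram_precision(hyp, ref, 4))  # 0.0: but the only 4-gram is already gone
```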

Unsupervised morphological induction. Our Finnish MT is pretty bad, especially English->Finnish.

Morfessor segmentation system.

EUROPARL, Moses, unsupervised segmentation... the unsupervised segmentation didn't work well. Maybe long-distance dependencies were the issue? Segment the Finnish first, treat the morphs as words.

Next step: simplify the morphology first, train the system -- then predict the morphology again in a post-processing step, based on the English input.
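
The stitch-it-back-together step, sketched with a made-up "+" boundary-marker convention (not necessarily the paper's exact scheme):

```python
# Sketch: translate into a morph-segmented target side (boundary-marked morphs
# treated as "words"), then rejoin the morphs into surface words afterwards.

def join_morphs(tokens, marker="+"):
    """Rejoin boundary-marked morph tokens into surface words."""
    words, current = [], ""
    for tok in tokens:
        if tok.endswith(marker):            # morph continues into the next token
            current += tok[:-len(marker)]
        else:
            words.append(current + tok)
            current = ""
    if current:
        words.append(current)
    return " ".join(words)

# e.g. MT output over morphs -> surface Finnish
print(join_morphs(["talo+", "ssa", "istu+", "isi+", "n"]))  # "talossa istuisin"
```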

Finnish vowel harmony: language-specific part of this technique. Need better evaluation for MRL generation.
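
The harmony rule itself is the textbook back/front split; a minimal sketch, not the paper's actual generation component:

```python
# Finnish suffixes come in back/front pairs (e.g. -ssa / -ssä); the stem's
# last harmonic vowel decides which variant surfaces. Simplified rule.

BACK, FRONT = set("aou"), set("äöy")

def harmonize(stem, back_suffix, front_suffix):
    """Pick the suffix variant agreeing with the last harmonic vowel of the stem."""
    for ch in reversed(stem.lower()):
        if ch in BACK:
            return stem + back_suffix
        if ch in FRONT:
            return stem + front_suffix
    return stem + front_suffix   # only neutral vowels (e, i): default to front

print(harmonize("talo", "ssa", "ssä"))   # talossa  (back harmony)
print(harmonize("metsä", "ssa", "ssä"))  # metsässä (front harmony)
```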

Linguistic Creativity Track


CategoryAclHlt2011
