# Translation

Natural language parsing has many applications in other natural language processing tasks, such as machine translation. For instance, when two languages have very different sentence structures, finding the structure of the input sentence often helps in predicting the correct word order for the output.

We'll work through an example of translating between languages with different word orders. This is called syntactic reordering.

## Syntactic Reordering

In this example, we'll translate English to Yoda-English.

<img src = 'yoda.jpeg' width = 400/>

Yoda, from the Star Wars series, is a legendary Jedi master, famous for being Luke Skywalker's instructor and for speaking English in a peculiar manner. 

<img src = 'peculiar.png' width = 300/>

Here is the syntactic tree of the sentence "I can help you".

<img src = 'tree.png' width = 300/>

...where `MD` is the tag for a modal verb (here, "can").

We can rearrange the tree so that the clause modified by the modal verb moves to the front:

<img src = 'tree2.png' width = 300/>

Once we have a tree that corresponds to the original sentence, the transformation is quite simple. 
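
To make the tree structure concrete, here is a minimal sketch of how the sentence could be represented in code. The `Leaf` and `Tree` classes below are hypothetical stand-ins written for illustration (their constructor shapes are inferred from how the demo code calls them), `bracketed` is a helper introduced here, and every tag other than `MD` is an assumed Penn Treebank-style label that may differ from the figure.

```python
class Leaf:
    """A single word with its part-of-speech tag (hypothetical minimal class)."""
    def __init__(self, tag, word):
        self.tag = tag
        self.word = word


class Tree:
    """A tagged phrase with a list of branches (hypothetical minimal class)."""
    def __init__(self, tag, branches):
        self.tag = tag
        self.branches = list(branches)


def bracketed(t):
    """Render t in bracketed notation, e.g. (MD can)."""
    if isinstance(t, Leaf):
        return '({} {})'.format(t.tag, t.word)
    inner = ' '.join(bracketed(b) for b in t.branches)
    return '({} {})'.format(t.tag, inner)


# "I can help you": the modal verb and the clause it modifies are siblings.
original = Tree('S', [
    Tree('NP', [Leaf('PRP', 'I')]),
    Tree('VP', [
        Leaf('MD', 'can'),
        Tree('VP', [Leaf('VB', 'help'), Tree('NP', [Leaf('PRP', 'you')])]),
    ]),
])

print(bracketed(original))
# (S (NP (PRP I)) (VP (MD can) (VP (VB help) (NP (PRP you)))))
```

The branch worth noticing is the outer `VP`: it has exactly two branches, and the left one is tagged `MD`. That is the pattern the reordering looks for.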

## Demo

We start with a partial implementation. First, we need to grab the words from the tree `t`:

In [None]:
def words(t):
    """ Yield the words in t"""

We also want to be able to extract the argument of the modal verb from a tree `t` that contains one. Note that the function modifies `t` in place: the extracted clause is removed from the tree.

In [None]:
def extract_modal(t):
    """Delete and return the argument of a modal verb phrase, or None if t has none."""
    if isinstance(t, Leaf):  # A leaf has no branches, so it cannot contain an `MD` phrase
        return None
    # Base case: t has 2 branches and the left branch is tagged 'MD'
    if len(t.branches) == 2 and t.branches[0].tag == 'MD':
        modal, clause = t.branches  # Grab the modal verb and the clause it modifies
        t.branches = [modal]  # Keep only the modal verb; branches must stay a list
        return clause  # The clause will be moved to the front of the tree
    # If t is neither a leaf nor a modal verb phrase,
    # then recurse through all the branches in t
    for b in t.branches:
        clause = extract_modal(b)
        # If a branch contained the argument to a modal verb, return it
        if clause is not None:
            return clause
    return None  # Otherwise, there is no modal verb phrase in t

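The driver below reads sentences from standard input, parses each one, applies the Yoda transform, and prints the result.

In [None]:
import sys

# `main`, `parse`, and `print_tree` are not defined in this notebook;
# they are assumed to come from the accompanying code in `ex.py`.
@main
def run():
    for line in sys.stdin:
        tree = parse(line, 1, list)[0]  # Parse the tree from the line
        print_tree(tree)  # Print the original tree
        yoda = yoda_transform(tree)  # Perform the Yoda transform
        print_tree(yoda)  # Print the transformed tree
        transformed = ' '.join(words(yoda))  # Join the words of that tree together
        print('Yoda would say, "{}"'.format(transformed))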

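Extracting words from a tree `t` involves checking whether `t` is a `Leaf`. A leaf yields its own word; any other tree yields the words of each of its branches, from left to right.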

In [None]:
def words(t):
    """Yield the words in t, from left to right."""
    if isinstance(t, Leaf):  # If t is a leaf,
        yield t.word  # then the only word in that tree is the word at the leaf
    else:  # Otherwise
        for b in t.branches:  # Go through all the branches
            for w in words(b):  # For every word found in that branch
                yield w  # Yield that word
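
Because `words` is a generator, nothing is computed until a caller iterates over it or passes it to something like `list` or `' '.join`. As a side note, the nested loop can also be written with `yield from` (Python 3.3 and later), which delegates to the recursive generator. The sketch below shows that equivalent form; `Leaf` and `Tree` are hypothetical minimal classes defined only so that the example runs on its own, and the tags `VB`, `NP`, and `PRP` are assumed labels.

```python
class Leaf:
    """A single word with its part-of-speech tag (hypothetical minimal class)."""
    def __init__(self, tag, word):
        self.tag = tag
        self.word = word


class Tree:
    """A tagged phrase with a list of branches (hypothetical minimal class)."""
    def __init__(self, tag, branches):
        self.tag = tag
        self.branches = list(branches)


def words(t):
    """Yield the words in t, from left to right."""
    if isinstance(t, Leaf):
        yield t.word
    else:
        for b in t.branches:
            # Delegate to the generator for the branch; same effect as
            # looping over words(b) and yielding each word.
            yield from words(b)


tree = Tree('VP', [Leaf('VB', 'help'), Tree('NP', [Leaf('PRP', 'you')])])
print(list(words(tree)))    # ['help', 'you']
print(' '.join(words(tree)))  # help you
```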

And now let's define the `yoda_transform` function.

In [None]:
def yoda_transform(t):
    """Place the clause of a modal verb at the front."""
    # First get the clause, the argument of the modal verb phrase,
    # by calling extract_modal on the original tree (this removes it from t)
    clause = extract_modal(t)
    if clause is not None:  # If clause is not None, there's a modal verb present!
        # Build a new root: the clause, a comma, then what remains of the original tree
        return Tree(t.tag, [clause, Leaf(',', ',')] + t.branches)
    else:  # If there's no modal verb
        return t  # Then we don't do anything
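
To see the pieces work together without the parser, the sketch below builds the tree for "I can help you" by hand and runs the transform on it. It is an illustration under assumptions rather than the contents of `ex.py`: `Leaf` and `Tree` are hypothetical minimal classes, the tags other than `MD` are assumed, and this version of `extract_modal` stores the remaining modal verb as a one-element list so that the list concatenation in `yoda_transform` works.

```python
class Leaf:
    """A single word with its part-of-speech tag (hypothetical minimal class)."""
    def __init__(self, tag, word):
        self.tag = tag
        self.word = word


class Tree:
    """A tagged phrase with a list of branches (hypothetical minimal class)."""
    def __init__(self, tag, branches):
        self.tag = tag
        self.branches = list(branches)


def words(t):
    """Yield the words in t, from left to right."""
    if isinstance(t, Leaf):
        yield t.word
    else:
        for b in t.branches:
            yield from words(b)


def extract_modal(t):
    """Delete and return the argument of a modal verb phrase, or None."""
    if isinstance(t, Leaf):
        return None
    if len(t.branches) == 2 and t.branches[0].tag == 'MD':
        modal, clause = t.branches
        t.branches = [modal]  # keep only the modal verb, as a list
        return clause
    for b in t.branches:
        clause = extract_modal(b)
        if clause is not None:
            return clause
    return None


def yoda_transform(t):
    """Place the clause of a modal verb at the front."""
    clause = extract_modal(t)
    if clause is not None:
        return Tree(t.tag, [clause, Leaf(',', ',')] + t.branches)
    return t


tree = Tree('S', [
    Tree('NP', [Leaf('PRP', 'I')]),
    Tree('VP', [
        Leaf('MD', 'can'),
        Tree('VP', [Leaf('VB', 'help'), Tree('NP', [Leaf('PRP', 'you')])]),
    ]),
])

yoda = yoda_transform(tree)
sentence = ' '.join(words(yoda))
print('Yoda would say, "{}"'.format(sentence))
# Yoda would say, "help you , I can"
```

One design point to be aware of: `extract_modal` mutates its argument, so after the call the original `tree` no longer contains the clause "help you". If the original tree is needed afterwards, copy it before transforming.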

The file `ex.py` contains the code for this `yoda_transform` program. 