# SCAN Dataset

Each example in the SCAN dataset maps a natural language command to the sequence of actions that carries it out.

$$\text{InputCommand} \longrightarrow \text{OutputSequence}$$

Example: 

$$\text{jump thrice} \longrightarrow \text{JUMP JUMP JUMP}$$

## Phrase Structure Grammar 

The input commands can be generated with a basic phrase-structure (PS) grammar that starts from the symbol C and expands down to the verbs U:

1. C $\longrightarrow$ S and S
2. C $\longrightarrow$ S after S
3. C $\longrightarrow$ S
4. S $\longrightarrow$ V twice
5. S $\longrightarrow$ V thrice
6. S $\longrightarrow$ V
7. V $\longrightarrow$ D[1] opposite D[2]
8. V $\longrightarrow$ D[1] around D[2]
9. V $\longrightarrow$ D
10. V $\longrightarrow$ U
11. D $\longrightarrow$ U left
12. D $\longrightarrow$ U right
13. D $\longrightarrow$ turn left
14. D $\longrightarrow$ turn right
15. U $\longrightarrow$ walk
16. U $\longrightarrow$ run
17. U $\longrightarrow$ jump
18. U $\longrightarrow$ look

where C = full command, S = sentence phrase, V = verb phrase, D = direction phrase, and U = verb.
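As a rough illustration of how the rules above produce commands, the sketch below expands the start symbol C by choosing productions at random. It makes one assumption that the rule list does not spell out: `D[1]` is read as "a verb or `turn`" and `D[2]` as "`left` or `right`", encoded with the hypothetical helper symbols `D1` and `D2`. The names `GRAMMAR` and `generate` are also illustrative only.

```python
import random

# Production rules copied from the numbered list above.
# Assumption of this sketch: D[1] is "a verb or 'turn'" and D[2] is
# "left or right"; D1 and D2 are hypothetical helper symbols for that reading.
GRAMMAR = {
    "C": [["S", "and", "S"], ["S", "after", "S"], ["S"]],
    "S": [["V", "twice"], ["V", "thrice"], ["V"]],
    "V": [["D1", "opposite", "D2"], ["D1", "around", "D2"], ["D"], ["U"]],
    "D": [["U", "left"], ["U", "right"], ["turn", "left"], ["turn", "right"]],
    "D1": [["U"], ["turn"]],
    "D2": [["left"], ["right"]],
    "U": [["walk"], ["run"], ["jump"], ["look"]],
}


def generate(symbol="C", rng=random):
    """Expand `symbol` by picking one of its productions uniformly at random."""
    if symbol not in GRAMMAR:  # terminal word such as "and" or "left"
        return [symbol]
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(part, rng))
    return words


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
for _ in range(3):
    print(" ".join(generate("C", rng)))
```

Because the grammar has no recursive rule, every expansion terminates, and the longest command it can produce has nine words.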

## Interpretation Function

[[walk]] = WALK 

[[jump]] = JUMP

[[run]] = RUN

[[look]] = LOOK

[[turn left]] = LTURN

[[turn right]] = RTURN

[[u left]] = LTURN [[u]]

[[u right]] = RTURN [[u]]

[[turn opposite left]] = LTURN LTURN

[[turn opposite right]] = RTURN RTURN

[[u opposite left]] = [[turn opposite left]] [[u]]

[[u opposite right]] = [[turn opposite right]] [[u]]

[[turn around left]] = LTURN LTURN LTURN LTURN

[[turn around right]] = RTURN RTURN RTURN RTURN

[[u around left]] = LTURN [[u]] LTURN [[u]] LTURN [[u]] LTURN [[u]]

[[u around right]] = RTURN [[u]] RTURN [[u]] RTURN [[u]] RTURN [[u]]

[[x twice]] = [[x]] [[x]]

[[x thrice]] = [[x]] [[x]] [[x]]

[[x1 and x2]] = [[x1]] [[x2]]

[[x1 after x2]] = [[x2]] [[x1]] 
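The equations above can be read as a small recursive procedure. The following is a minimal sketch of one way to write it, not a reference implementation; the function names (`interpret_verb_phrase`, `interpret_sentence`, `interpret`) and the two lookup tables are assumptions introduced for illustration, and the code expects a command that the grammar can generate.

```python
ACTIONS = {"walk": "WALK", "run": "RUN", "jump": "JUMP", "look": "LOOK"}
TURNS = {"left": "LTURN", "right": "RTURN"}


def interpret_verb_phrase(words):
    """Interpret a V phrase, e.g. ['jump'], ['turn', 'left'], ['walk', 'around', 'right']."""
    head = words[0]
    # "turn" contributes no action of its own; the other verbs contribute one.
    act = [] if head == "turn" else [ACTIONS[head]]
    if len(words) == 1:
        return act
    turn = TURNS[words[-1]]
    if len(words) == 2:          # [[u left]] = LTURN [[u]]
        return [turn] + act
    if words[1] == "opposite":   # [[u opposite left]] = LTURN LTURN [[u]]
        return [turn, turn] + act
    if words[1] == "around":     # [[u around left]] = (LTURN [[u]]) four times
        return ([turn] + act) * 4
    raise ValueError("not a verb phrase: " + " ".join(words))


def interpret_sentence(words):
    """Interpret an S phrase: a V phrase optionally followed by twice/thrice."""
    if words[-1] == "twice":     # [[x twice]] = [[x]] [[x]]
        return interpret_verb_phrase(words[:-1]) * 2
    if words[-1] == "thrice":    # [[x thrice]] = [[x]] [[x]] [[x]]
        return interpret_verb_phrase(words[:-1]) * 3
    return interpret_verb_phrase(words)


def interpret(command):
    """Interpret a full command C and return its list of actions."""
    words = command.split()
    for conj in ("and", "after"):
        if conj in words:
            i = words.index(conj)
            first = interpret_sentence(words[:i])
            second = interpret_sentence(words[i + 1:])
            # [[x1 and x2]] = [[x1]] [[x2]];  [[x1 after x2]] = [[x2]] [[x1]]
            return first + second if conj == "and" else second + first
    return interpret_sentence(words)


print(interpret("jump thrice and turn left"))
# ['JUMP', 'JUMP', 'JUMP', 'LTURN']
```

Returning a list of action tokens rather than a string keeps each repetition rule a plain list multiplication.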

## Building Causal Model

Ideal Causal Algorithm: 

1. Layer 0: For a given command such as "jump thrice and turn left", get the list of leaf nodes (the individual words).
2. Layer 1: Split the command into phrases at "and" or "after".
3. Layer 2: Use the interpretation function to get an interpretation for each node.
4. Layer 3: Use variable binding to replace every variable with the action it stands for.
5. Layer 4: Get the final action sequence.

In [1]:
# layer 0: get list of leaf nodes
command= "jump thrice and turn left"
l0=command.split()
print(l0)

['jump', 'thrice', 'and', 'turn', 'left']


In [2]:
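# layer 1: get phrase divisions
# l11 holds the phrase that is executed first, l12 the phrase executed second
if 'and' in l0:
    i=l0.index("and")
    l11=l0[:i]
    l12=l0[i+1:]
elif 'after' in l0:
    # "x1 after x2" executes x2 first, so the two halves are swapped
    i=l0.index("after")
    l11=l0[i+1:]
    l12=l0[:i]
else:
    l11=l0
    l12=[]
l1=[l11,l12]
print(l1)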

[['jump', 'thrice'], ['turn', 'left']]
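The example command only exercises the "and" branch. To show how the "after" branch reorders the phrases, here is the same split wrapped in a function; the name `split_phrases` is a hypothetical helper, not part of the notebook.

```python
def split_phrases(words):
    """Return [phrase executed first, phrase executed second] for a tokenised command."""
    if "and" in words:
        i = words.index("and")
        return [words[:i], words[i + 1:]]
    if "after" in words:
        # "x1 after x2" executes x2 first, so the halves are swapped.
        i = words.index("after")
        return [words[i + 1:], words[:i]]
    return [words, []]


print(split_phrases("jump after walk twice".split()))
# [['walk', 'twice'], ['jump']]
```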


In [3]:
# layer 2: apply interpretation function depending on word class
actions=["walk","run","jump", "look"]
turns=["around","opposite"]
directions=["right", "left"]
nums=["twice","thrice"]

def interpret_words(words):
    # map each word to an action, a turn, or a template with unbound variables;
    # "turn" matches no word class and is skipped, so "turn left" yields only LTURN
    out=[]
    for word in words:
        if word in actions:
            out.append(word.upper())
        elif word in nums:
            if word=="twice":
                out.append(["[[x]][[x]]"])
            elif word=="thrice":
                out.append(["[[x]][[x]][[x]]"])
        elif word in turns:
            if word=="around":
                out.append(["[[y]][[y]][[y]][[y]]"])
            elif word=="opposite":
                out.append(["[[y]][[y]]"])
        elif word in directions:
            if word=="right":
                out.append("RTURN")
            elif word=="left":
                out.append("LTURN")
    return out

# the same interpretation is applied to both phrases from layer 1
l21=interpret_words(l11)
l22=interpret_words(l12)

l2=[l21,l22]
print(l2)

[['JUMP', ['[[x]][[x]][[x]]']], ['LTURN']]


In [4]:
# layer 3: variable binding
def bind(items):
    # replace each [[x]] template with copies of the item that precedes it;
    # items without a variable pass through unchanged
    # ([[y]] templates from "around"/"opposite" are not bound here)
    bound=[]
    for item in items:
        if item==['[[x]][[x]][[x]]']:
            bound[-1]=bound[-1]*3
        elif item==['[[x]][[x]]']:
            bound[-1]=bound[-1]*2
        else:
            bound.append(item)
    return bound

l31=bind(l21)
l32=bind(l22)

l3=[l31,l32]
print(l3)

[['JUMPJUMPJUMP'], ['LTURN']]


In [5]:
# layer 4: action sequence output
sequence = ''.join([item for sublist in l3 for item in sublist])
print(sequence)

JUMPJUMPJUMPLTURN
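The string above runs the actions together without separators. If a space-separated sequence is wanted, one option is to keep each repeated action as a list of tokens and join at the end. This is a small variation sketched under that assumption; `repeat` and `l3_tokens` are hypothetical names and the values are written out by hand for the example command.

```python
def repeat(action, times):
    """Return `times` copies of `action` as separate tokens."""
    return [action] * times


# Token-level counterpart of l3 for "jump thrice and turn left".
l3_tokens = [repeat("JUMP", 3), ["LTURN"]]

# Flatten the phrases in execution order and separate the actions with spaces.
sequence = " ".join(tok for phrase in l3_tokens for tok in phrase)
print(sequence)
# JUMP JUMP JUMP LTURN
```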
