# Dependencies

- 📺 **Video:** [https://youtu.be/dbDjKCc4R3E](https://youtu.be/dbDjKCc4R3E)

## Overview
- Represent syntax as directed arcs between head and dependent words.
- Examine projective trees and dependency labels.

## Key ideas
- **Head-dependent relations:** every token has exactly one head; the root word attaches to an artificial ROOT node.
- **Labels:** encode the grammatical function of each arc (nsubj, dobj, etc.; Universal Dependencies v2 writes dobj as obj).
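- **Projectivity:** a parse is projective when no two arcs cross if drawn above the sentence; equivalently, each head and its descendants cover a contiguous span of tokens.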
- **Conversion:** algorithms map between constituency and dependency representations.
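One common way to go from constituency to dependency is head percolation: pick a head child for every constituent, pass its lexical head upward, and attach the lexical heads of the other children to it. The sketch below is a minimal illustration under stated assumptions. The nested-tuple tree encoding, the `HEAD_RULES` table, and the function names are all hypothetical and are not taken from the lecture or from any treebank's official head rules.

```python
# Toy head rules (an assumption for illustration): for each phrase label,
# the child categories to try, in priority order, when picking the head child.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VBZ", "VP"],
    "NP": ["NN", "NP"],
    "PP": ["NP", "IN"],  # content-head style: the noun phrase heads the PP
}


def find_head_child(label, children):
    """Return the index of the head child according to HEAD_RULES."""
    for category in HEAD_RULES.get(label, []):
        for i, child in enumerate(children):
            if child[0] == category:
                return i
    return len(children) - 1  # fallback: rightmost child


def to_dependencies(tree):
    """Convert a nested-tuple constituency tree into (tokens, heads).

    A leaf is (tag, word); an internal node is (label, child, child, ...).
    Token positions are 1-based and head 0 denotes ROOT.
    """
    tokens, heads = ["ROOT"], [0]

    def visit(node):
        # Leaf: register the token and return its position.
        if len(node) == 2 and isinstance(node[1], str):
            tokens.append(node[1])
            heads.append(0)  # placeholder, overwritten by the parent if needed
            return len(tokens) - 1
        label, children = node[0], node[1:]
        lexical_heads = [visit(child) for child in children]
        head_pos = lexical_heads[find_head_child(label, children)]
        for pos in lexical_heads:
            if pos != head_pos:
                heads[pos] = head_pos  # non-head children attach to the head
        return head_pos

    visit(tree)  # the lexical head of the whole tree keeps head 0 (ROOT)
    return tokens, heads


tree = ("S",
        ("NP", ("DT", "The"), ("NN", "fox")),
        ("VP", ("VBZ", "jumps"),
               ("PP", ("IN", "over"),
                      ("NP", ("DT", "the"), ("NN", "dog")))))

tokens, heads = to_dependencies(tree)
for i in range(1, len(tokens)):
    print(f"{tokens[i]:<6} <- {tokens[heads[i]]}")
```

The result depends entirely on the head rules. Swapping the order in the `PP` entry, for example, would make the preposition rather than the noun the head of the phrase.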

## Demo
Build a dependency tree for a sentence and convert it to adjacency lists, as demonstrated in the lecture (https://youtu.be/IRvjQRFaz9k).

```python
# Position 0 is an artificial ROOT node; heads[0] and labels[0] are placeholders.
sentence = ['ROOT', 'The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
heads = [0, 4, 4, 4, 5, 0, 9, 9, 9, 5]  # head index per token (0 = ROOT)
labels = ['root', 'det', 'amod', 'amod', 'nsubj', 'root', 'case', 'det', 'amod', 'obl']

# Print each token with its label and head word.
for idx in range(1, len(sentence)):
    head_word = sentence[heads[idx]] if heads[idx] > 0 else 'ROOT'
    print(f"{sentence[idx]:<6} <-({labels[idx]})- {head_word}")

# Group dependents under their heads.
adjacency = {}
for idx in range(1, len(sentence)):
    adjacency.setdefault(heads[idx], []).append(idx)

print()
print('Adjacency list (head -> dependents):')
for head, children in adjacency.items():
    print(sentence[head], '->', [sentence[c] for c in children])
```

Output:

```
The    <-(det)- fox
quick  <-(amod)- fox
brown  <-(amod)- fox
fox    <-(nsubj)- jumps
jumps  <-(root)- ROOT
over   <-(case)- dog
the    <-(det)- dog
lazy   <-(amod)- dog
dog    <-(obl)- jumps

Adjacency list (head -> dependents):
fox -> ['The', 'quick', 'brown']
jumps -> ['fox', 'dog']
ROOT -> ['jumps']
dog -> ['over', 'the', 'lazy']
```
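A hand-written head array is easy to get wrong, so it can help to check that it really encodes a tree before building anything on top of it. The sketch below is one possible sanity check, assuming the same encoding as the demo (position 0 is ROOT and `heads[0]` is a placeholder). The function name `is_valid_tree` and the example arrays are illustrative, and the single-root requirement is an assumption that not every dependency formalism shares.

```python
def is_valid_tree(heads):
    """Check that a head array encodes a single-rooted dependency tree.

    heads[i] is the head position of token i; position 0 is the artificial
    ROOT, and heads[0] is an unused placeholder.
    """
    n = len(heads)
    # Every head must be a valid position, and no token may head itself.
    for i in range(1, n):
        if not 0 <= heads[i] < n or heads[i] == i:
            return False
    # Exactly one token attaches directly to ROOT (assumed convention).
    if sum(1 for i in range(1, n) if heads[i] == 0) != 1:
        return False
    # Following head links from any token must reach ROOT without looping.
    for i in range(1, n):
        seen = set()
        node = i
        while node != 0:
            if node in seen:
                return False  # cycle detected
            seen.add(node)
            node = heads[node]
    return True


# Hypothetical examples over a three-token sentence such as "She reads books".
print(is_valid_tree([0, 2, 0, 2]))  # token 2 is the root, tokens 1 and 3 depend on it
print(is_valid_tree([0, 2, 0, 3]))  # token 3 is its own head
print(is_valid_tree([0, 0, 0, 2]))  # two tokens attach to ROOT
```

The cycle check walks up from every token, which is quadratic in the worst case; that is fine for single sentences and keeps the logic easy to follow.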


## Try it
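- Modify the demo: change one entry in heads and check how the printed arcs and the adjacency list change.
- Add a tiny dataset or counter-example, such as a sentence whose arcs cross (a non-projective parse).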
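One possible counter-example for the second exercise is a non-projective sentence. The sketch below lists the pairs of crossing arcs in a head array, assuming the demo's encoding (position 0 is ROOT). The function name `crossing_arcs` is illustrative, and the head indices for "A hearing is scheduled on the issue today" are a hand annotation made for this example, not output from a parser or the lecture.

```python
def crossing_arcs(heads):
    """Return pairs of arcs that cross; an empty list means projective.

    heads[i] is the head position of token i, with 0 for ROOT.
    Each arc is stored as (left, right) by token position.
    """
    arcs = [tuple(sorted((heads[i], i))) for i in range(1, len(heads))]
    crossings = []
    for a in range(len(arcs)):
        for b in range(a + 1, len(arcs)):
            (l1, r1), (l2, r2) = arcs[a], arcs[b]
            # Two arcs cross when exactly one endpoint of one arc lies
            # strictly inside the span of the other.
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                crossings.append((arcs[a], arcs[b]))
    return crossings


# Hand-annotated heads (an assumption): "issue" modifies "hearing",
# while "today" modifies "scheduled".
sentence = ['ROOT', 'A', 'hearing', 'is', 'scheduled', 'on', 'the', 'issue', 'today']
heads = [0, 2, 4, 4, 0, 7, 7, 2, 4]

for (l1, r1), (l2, r2) in crossing_arcs(heads):
    print(f"{sentence[l1]}-{sentence[r1]} crosses {sentence[l2]}-{sentence[r2]}")
```

Here the arc between "hearing" and "issue" crosses the arc between "scheduled" and "today", so no projective tree can express this analysis.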


## References
- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)
- [To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks](https://www.aclweb.org/anthology/W19-4302/)
- [GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding](https://arxiv.org/pdf/1804.07461.pdf)
- [What Does BERT Look At? An Analysis of BERT's Attention](https://arxiv.org/abs/1906.04341)
- [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/pdf/1907.11692.pdf)
- [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461)
- [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)
- [UnifiedQA: Crossing Format Boundaries With a Single QA System](https://arxiv.org/abs/2005.00700)
- [Neural Machine Translation of Rare Words with Subword Units](https://arxiv.org/pdf/1508.07909.pdf)
- [Byte Pair Encoding is Suboptimal for Language Model Pretraining](https://arxiv.org/pdf/2004.03720.pdf)
- [Eisenstein 8.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 7.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 7.4](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 7.4.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 7.3](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [TnT - A Statistical Part-of-Speech Tagger](https://arxiv.org/abs/cs/0003055)
- [Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger](https://www.aclweb.org/anthology/W00-1308/)
- [Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?](https://link.springer.com/chapter/10.1007/978-3-642-19400-9_14)
- [Natural Language Processing with Small Feed-Forward Networks](https://www.aclweb.org/anthology/D17-1309.pdf)
- [Eisenstein 10.1-10.2](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 10.3-10.4](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 10.3.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Accurate Unlexicalized Parsing](https://www.aclweb.org/anthology/P03-1054/)
- [Eisenstein 10.5](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Eisenstein 11.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Finding Optimal 1-Endpoint-Crossing Trees](https://www.aclweb.org/anthology/Q13-1002/)
- [Eisenstein 11.3](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)


*Links only; we do not redistribute slides or papers.*