# Applying Embeddings, Deep Averaging Networks

- 📺 **Video:** [https://youtu.be/3pwwdHuH0I4](https://youtu.be/3pwwdHuH0I4)

## Overview
This segment demonstrates how learned word embeddings can be applied in an NLP model, using the example of a Deep Averaging Network (DAN). A DAN (Iyyer et al., 2015) is a simple neural architecture for text classification: it takes the average of all the word embeddings in a piece of text (thus ignoring word order entirely, hence “unordered composition”) and then feeds this average vector through one or more feedforward neural layers to predict a label.
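
The description above can be written compactly. The notation here is an assumption chosen for illustration ($X$ for the set of tokens, $v_w$ for the embedding of word $w$, $f$ for a nonlinearity, $n$ for the number of hidden layers); it is a sketch of the architecture, not a transcription of the lecture's slides:

$$
z_0 = \frac{1}{|X|} \sum_{w \in X} v_w, \qquad
z_i = f(W_i z_{i-1} + b_i) \quad (i = 1, \dots, n), \qquad
\hat{y} = \operatorname{softmax}(W_s z_n + b_s)
$$

Because $z_0$ is a plain average, any permutation of the words yields the same $z_0$ and therefore the same prediction $\hat{y}$.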

In [None]:
import os, random
random.seed(0)
CI = os.environ.get('CI') == 'true'

## Key ideas
- The video explains that despite its simplicity (it essentially treats the input as a bag of embedded words), a DAN can perform surprisingly well on tasks like sentiment analysis or topic classification.
- Iyyer et al. (2015) found that this simple approach could rival more complex models that incorporate syntax, hence the paper title “Deep Unordered Composition Rivals Syntactic Methods for Text Classification”.
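- A useful way to see how a DAN works is to trace a sample sentence, say “The movie was absolutely fantastic”.
- Each word (“The”, “movie”, “was”, “absolutely”, “fantastic”) is replaced by its embedding vector; the vectors are then averaged, and the average is passed through the feedforward layers to predict a label.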
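
The walk-through above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the embeddings and weights are random and untrained, and the dimensions, the single ReLU hidden layer, and all names (`embeddings`, `dan_forward`, `W1`, `W2`) are assumptions made for the example. It only shows the data flow (look up, average, feed forward, softmax) and that word order does not affect the output.

```python
import math
import random

rng = random.Random(0)
DIM, HIDDEN, CLASSES = 4, 3, 2  # toy sizes, chosen arbitrarily

# Hypothetical toy vocabulary with random (untrained) embedding vectors.
vocab = ["the", "movie", "was", "absolutely", "fantastic"]
embeddings = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in vocab}

def rand_matrix(rows, cols):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

W1, b1 = rand_matrix(HIDDEN, DIM), [0.0] * HIDDEN
W2, b2 = rand_matrix(CLASSES, HIDDEN), [0.0] * CLASSES

def affine(W, b, x):
    """Compute W @ x + b for list-based matrices and vectors."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dan_forward(tokens):
    """Average the word embeddings, apply one ReLU layer, then softmax."""
    vectors = [embeddings[t.lower()] for t in tokens]
    avg = [sum(col) / len(vectors) for col in zip(*vectors)]
    hidden = [max(0.0, h) for h in affine(W1, b1, avg)]
    return softmax(affine(W2, b2, hidden))

sentence = "The movie was absolutely fantastic".split()
probs = dan_forward(sentence)
probs_reversed = dan_forward(sentence[::-1])  # same words, reversed order

print("class probabilities:", probs)
print("reversed word order:", probs_reversed)
```

The two printed distributions agree up to floating-point rounding, which is the "unordered composition" property: reversing or shuffling the words changes nothing the model can see. With untrained weights the probabilities themselves carry no meaning.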

## Demo

In [None]:
print('Try the exercises below and follow the linked materials.')

## Try it
- Modify the demo
- Add a tiny dataset or counter-example


## References
- [Eisenstein 14.5](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Distributed Representations of Words and Phrases and their Compositionality](https://papers.nips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf)
- [A Scalable Hierarchical Distributed Language Model](https://papers.nips.cc/paper/2008/hash/1e056d2b0ebd5c878c550da6ac5d3724-Abstract.html)
- [Neural Word Embedding as Implicit Matrix Factorization](https://papers.nips.cc/paper/2014/file/feab05aa91085b7a8012516bc3533958-Paper.pdf)
- [GloVe: Global Vectors for Word Representation](https://www.aclweb.org/anthology/D14-1162/)
- [Enriching Word Vectors with Subword Information](https://arxiv.org/abs/1607.04606)
- [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://papers.nips.cc/paper/2016/file/a486cd07e4ac3d270571622f4f316ec5-Paper.pdf)
- [Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings](https://www.aclweb.org/anthology/N19-1062/)
- [Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them](https://www.aclweb.org/anthology/N19-1061/)
- [Deep Unordered Composition Rivals Syntactic Methods for Text Classification](https://www.aclweb.org/anthology/P15-1162/)


*Links only; we do not redistribute slides or papers.*