# Thinking in tensors in PyTorch

Hands-on training by [Piotr Migdał](https://p.migdal.pl) (2019).

Version for [AI & NLP Workshop Day](https://nlpday.pl/), 31 May 2019, Warsaw, Poland: **Understanding LSTM and GRU networks in PyTorch**.



## NLP & AI: 3. Embedding vs one-hot encoding


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/stared/thinking-in-tensors-writing-in-pytorch/blob/master/extra/3%20Embedding%20vs%20one-hot%20encoding.ipynb)
 

In [1]:
import torch
from torch import nn

In [2]:
# 10 distinct tokens, each mapped to a learnable 3-dimensional vector
emb = nn.Embedding(10, 3)

In [3]:
emb.weight

Parameter containing:
tensor([[-1.3940,  0.2951,  0.7581],
        [-0.0409, -1.1174,  0.6262],
        [-1.1795,  0.7874, -0.9221],
        [ 0.8923, -0.8383, -0.5019],
        [ 0.6201,  0.3561,  0.0570],
        [-0.4379, -0.3483,  1.0572],
        [-1.6454, -0.2331, -0.1572],
        [ 1.2459, -1.0029,  0.8067],
        [-0.2577, -1.2125, -0.9466],
        [ 1.2742,  0.6020, -1.5209]], requires_grad=True)

In [4]:
# integer token indices, shape (1, 5); Python ints give dtype int64
words = torch.tensor([[2, 2, 4, 1, 5]])
words

tensor([[2, 2, 4, 1, 5]])

In [5]:
emb(words)

tensor([[[-1.1795,  0.7874, -0.9221],
         [-1.1795,  0.7874, -0.9221],
         [ 0.6201,  0.3561,  0.0570],
         [-0.0409, -1.1174,  0.6262],
         [-0.4379, -0.3483,  1.0572]]], grad_fn=<EmbeddingBackward>)
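The lookup above returns, for every index, the matching row of `emb.weight`, so an input of shape `(1, 5)` becomes an output of shape `(1, 5, 3)`. A minimal sketch of the same idea, assuming a freshly created embedding (the seed and the names `emb_demo`, `idx`, `looked_up`, `indexed` are illustrative and not part of the notebook, so the values differ from those printed above):

```python
import torch
from torch import nn

torch.manual_seed(0)  # illustrative seed, not used in the notebook
emb_demo = nn.Embedding(10, 3)          # 10 tokens, 3-dimensional vectors
idx = torch.tensor([[2, 2, 4, 1, 5]])   # shape (1, 5): one sequence of 5 tokens

looked_up = emb_demo(idx)       # embedding lookup, shape (1, 5, 3)
indexed = emb_demo.weight[idx]  # plain row indexing into the weight matrix

print(looked_up.shape)
print(torch.equal(looked_up, indexed))
```

Rows 0 and 1 of the result are identical because the index 2 appears twice in the input.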

In [6]:
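# one row per word, one column per token in the vocabulary
words_onehot = torch.zeros((5, 10), dtype=torch.float32)
for i, j in enumerate([2, 2, 4, 1, 5]):
    words_onehot[i, j] = 1.  # mark the column of the i-th word's index
words_onehot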

tensor([[0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]])

In [7]:
# (5, 10) @ (10, 3) -> (5, 3): each one-hot row picks out one row of emb.weight
words_onehot.matmul(emb.weight)

tensor([[-1.1795,  0.7874, -0.9221],
        [-1.1795,  0.7874, -0.9221],
        [ 0.6201,  0.3561,  0.0570],
        [-0.0409, -1.1174,  0.6262],
        [-0.4379, -0.3483,  1.0572]], grad_fn=<MmBackward>)
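The two results match: multiplying a one-hot matrix by the weight matrix selects the same rows as the embedding lookup. The comparison can be sketched without the manual loop by using `torch.nn.functional.one_hot`; this is an illustration under assumed names (`emb_demo`, `idx`, `onehot`, `via_matmul`, `via_lookup`) and an arbitrary seed, not code from the notebook:

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)  # illustrative seed, not used in the notebook
emb_demo = nn.Embedding(10, 3)
idx = torch.tensor([2, 2, 4, 1, 5])

# F.one_hot returns an int64 tensor, so cast to float before the matrix product
onehot = F.one_hot(idx, num_classes=10).float()  # shape (5, 10)

via_matmul = onehot @ emb_demo.weight  # (5, 10) @ (10, 3) -> (5, 3)
via_lookup = emb_demo(idx)             # direct lookup, shape (5, 3)

print(torch.allclose(via_matmul, via_lookup))
```

The lookup avoids building the `(5, 10)` one-hot matrix at all, which matters once the vocabulary has tens of thousands of entries rather than 10.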