
Natural Language Processing (Neural Methods) Tutorials

This repository consists of Python examples for learning fundamental neural methods in Natural Language Processing (NLP). Each notebook describes the fundamental ideas behind its architecture.

In the first part, I'll focus on embeddings (a small code sketch of the sparse-versus-dense contrast follows the list below).
In the second part, I'll focus on language models, and finally discuss how and why the widely used transformer architecture matters.

  1. Primitive Embeddings (Sparse Vector)
  2. Custom Embedding (Dense Vector)
  3. Word2Vec algorithm (Negative Sampling)
  4. N-Gram detection with 1D Convolution
  5. Neural Language Model - Basic FFN
  6. Neural Language Model - RNN (Recurrent Neural Network)
  7. Encoder-Decoder Architecture (Seq2Seq)
  8. Attention
  9. Transformer
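
For a first taste of notebooks 1 and 2, here is a minimal sketch of the sparse-versus-dense contrast. The toy vocabulary is made up purely for illustration; the notebooks build their vocabularies from real corpora.

```python
import torch
import torch.nn as nn

# Toy vocabulary (hypothetical, for illustration only).
vocab = ["i", "like", "natural", "language", "processing"]
word_to_id = {w: i for i, w in enumerate(vocab)}
token_id = torch.tensor([word_to_id["language"]])

# 1. Primitive (sparse) embedding: a one-hot vector per word.
one_hot = nn.functional.one_hot(token_id, num_classes=len(vocab)).float()
print(one_hot)                     # tensor([[0., 0., 0., 1., 0.]])

# 2. Custom (dense) embedding: a small trainable lookup table.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=3)
print(embedding(token_id).shape)   # torch.Size([1, 3]) - a learned dense vector
```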

The examples are built with PyTorch from scratch. I recommend running them on a GPU-enabled machine.
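
Since the notebooks use PyTorch, you can check whether a GPU is available and select the device with the standard pattern below (not specific to this repository):

```python
import torch

# Pick a GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on: {device}")

# Models and tensors are then moved to that device, e.g. model.to(device).
```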

NLP (natural language processing) has a long history in artificial intelligence, and generative models were long built with traditional statistical approaches, such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs).
However, this repository focuses on the recent neural methods used in today's NLP. By running the notebooks step by step, in order, you'll learn how these models were developed, improved, and evolved into today's architectures, such as the widely used transformer.

Tsuyoshi Matsuzaki @ Microsoft
