Overview

PyText is a deep-learning-based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapid experimentation and of serving models at scale. It achieves this by providing simple and extensible interfaces and abstractions for model components, and by using PyTorch’s ability to export models for inference via the optimized Caffe2 execution engine. We use PyText at Facebook to iterate quickly on new modeling ideas and then seamlessly ship them at scale.

Core PyText features:

- Production-ready models for various NLP/NLU tasks:
  - Text classifiers
    - Yoon Kim (2014): Convolutional Neural Networks for Sentence Classification
    - Lin et al. (2017): A Structured Self-attentive Sentence Embedding
  - Sequence taggers
    - Lample et al. (2016): Neural Architectures for Named Entity Recognition
  - Joint intent-slot model
    - Zhang et al. (2016): A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding
  - Contextual intent-slot models
- Distributed-training support built on the new C10d backend in PyTorch 1.0
- Extensible components that allow easy creation of new models and tasks
- Reference implementation and a pretrained model for the paper: Gupta et al. (2018): Semantic Parsing for Task Oriented Dialog using Hierarchical Representations
- Ensemble training support

Installing PyText

To get started on a Cloud VM, check out our guide.

We recommend using a virtualenv:

  $ python3 -m virtualenv venv
  $ source venv/bin/activate
  (venv) $ pip install pytext-nlp
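
If you are unsure whether the environment is actually active, a quick check along these lines can help. This is a minimal sketch that uses only the Python standard library; the helper name `in_virtualenv` is made up for illustration and is not part of PyText.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # venv and recent virtualenv releases expose the base interpreter
    # as sys.base_prefix; older virtualenv releases set sys.real_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    # Hypothetical usage: run this with the `python` found on your PATH.
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python` you intend to use for `pip install`; if it reports `False`, the activate script was probably not sourced in the current shell.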

Detailed instructions can be found in our documentation.

Train your first text classifier

For this first example, we'll train a CNN-based text classifier that classifies text utterances, using the examples in tests/data/train_data_tiny.tsv.

  (venv) $ pytext train < demo/configs/docnn.json

By default, the model is created in /tmp/model.pt.

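Now you can export your model as a Caffe2 net. Here config.json stands for your training config, such as demo/configs/docnn.json from the previous step: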

  (venv) $ pytext export < config.json

You can use the exported Caffe2 model to predict the class of raw utterances like this:

  (venv) $ pytext --config-file config.json predict <<< '{"raw_text": "create an alarm for 1:30 pm"}'
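
Utterances that contain quotes or other special characters are awkward to type into that JSON by hand. The sketch below builds the same command line programmatically. It assumes only what the example above shows: that `predict` reads a JSON object with a `raw_text` key from standard input. The helper name `build_predict_command` and its `config_path` default are illustrative, not part of PyText.

```python
import json
import shlex


def build_predict_command(utterance: str, config_path: str = "config.json") -> str:
    """Return a bash command line that feeds one utterance to `pytext predict`."""
    # json.dumps escapes double quotes and backslashes inside the utterance.
    payload = json.dumps({"raw_text": utterance})
    # shlex.quote keeps the JSON intact when the line is pasted into a shell.
    # The `<<<` here-string requires bash or a compatible shell.
    return "pytext --config-file {} predict <<< {}".format(
        shlex.quote(config_path), shlex.quote(payload)
    )


if __name__ == "__main__":
    print(build_predict_command("create an alarm for 1:30 pm"))
```

For the utterance above, the helper reproduces the command shown in this section; for an utterance such as `it's 5 pm`, it adds the shell quoting needed to keep the apostrophe from ending the string early.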

License

PyText is BSD-licensed, as found in the LICENSE file.
