Thinc: A refreshing functional take on deep learning, compatible with your favorite libraries
Thinc is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow and MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. Previous versions of Thinc have been running quietly in production in thousands of companies, via both spaCy and Prodigy. We wrote the new version to let users compose, configure and deploy custom models built with their favorite framework.
- Type-check your model definitions with custom types and a mypy plugin.
- Wrap PyTorch, TensorFlow and MXNet models for use in your network.
- Concise functional-programming approach to model definition, using composition rather than inheritance.
- Optional custom infix notation via operator overloading.
- Integrated config system to describe trees of objects and hyperparameters.
- Choice of extensible backends, including JAX support (experimental).
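The composition and infix-notation features above can be illustrated with a minimal, standard-library-only sketch. This is not Thinc's implementation or API: the `Layer` class and the `chain`, `scale` and `relu` helpers are hypothetical names chosen for illustration. It only shows how models can be built by composing functions rather than subclassing, and how an operator such as `>>` can be overloaded to stand for composition.

```python
from functools import reduce
from typing import Callable, List

Vector = List[float]


class Layer:
    """Hypothetical stand-in for a model layer: a named wrapper around a function."""

    def __init__(self, name: str, forward: Callable[[Vector], Vector]) -> None:
        self.name = name
        self.forward = forward

    def __call__(self, inputs: Vector) -> Vector:
        return self.forward(inputs)

    def __rshift__(self, other: "Layer") -> "Layer":
        # Infix notation: `a >> b` is sugar for `chain(a, b)`.
        return chain(self, other)


def chain(*layers: Layer) -> Layer:
    """Compose layers left to right into a new layer, with no subclassing."""

    def forward(inputs: Vector) -> Vector:
        # Feed the output of each layer into the next one.
        return reduce(lambda value, layer: layer(value), layers, inputs)

    return Layer(">>".join(layer.name for layer in layers), forward)


def scale(factor: float) -> Layer:
    """Illustrative layer that multiplies every element by `factor`."""
    return Layer(f"scale({factor})", lambda xs: [x * factor for x in xs])


def relu() -> Layer:
    """Illustrative layer that clips negative values to zero."""
    return Layer("relu", lambda xs: [max(0.0, x) for x in xs])


# The same model written with the overloaded operator.
model = scale(2.0) >> relu() >> scale(0.5)
result = model([-1.0, 3.0])
```

Here `model` is just another `Layer`, so it can itself be passed to `chain` or combined with `>>` again, which is the main appeal of composition over inheritance.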
Thinc is compatible with Python 3.6+ and runs on Linux, macOS and Windows. The latest releases with binary wheels are available from pip.
pip install thinc==8.0.0a1
⚠️ Note that Thinc 8.0 is currently in alpha preview and not necessarily ready for production yet.
📓 Selected examples and notebooks
Also see the /examples directory and usage documentation for more examples. Most examples are Jupyter notebooks – to launch them on Google Colab (with GPU support!) click on the button next to the notebook name.
||Everything you need to know to get started. Composing and training a model on the MNIST data, using config files, registering custom functions and wrapping PyTorch, TensorFlow and MXNet models.|
||How to use Thinc,
||Implementing and training a basic CNN part-of-speech tagging model without external dependencies, using different levels of Thinc's config system.|
||How to set up synchronous and asynchronous parameter server training with Thinc and Ray.|
📖 Documentation & usage guides
|Introduction||Everything you need to know.|
|Concept & Design||Thinc's conceptual model and how it works.|
|Defining and using models||How to compose models and update state.|
|Configuration system||Thinc's config system and function registry.|
|Integrating PyTorch, TensorFlow & MXNet||Interoperability with machine learning frameworks.|
|Layers API||Weights layers, transforms, combinators and wrappers.|
|Type Checking||Type-check your model definitions and more.|
🗺 What's where
||User-facing API. All classes and functions should be imported from here.|
||Custom types and dataclasses.|
||The layers. Each layer is implemented in its own module.|
||Interface for external models implemented in PyTorch, TensorFlow etc.|
||Functions to calculate losses.|
||Functions to create optimizers. Currently supports "vanilla" SGD, Adam and RAdam.|
||Generators for different rates, schedules, decays or series.|
||Config parsing and validation, plus the function registry system.|
||Utilities and helper functions.|
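To make the idea of a config system backed by a function registry more concrete, here is a minimal sketch built only on Python's standard library. It is not Thinc's parser or registry: the `REGISTRY` dict, the `register` decorator, the `@factory` key and the `sgd.v1` name are all assumptions made for illustration, and the sketch resolves a single flat section rather than a full tree of objects.

```python
import configparser
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping string names to factory functions.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that stores a factory function under a string name."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[name] = func
        return func

    return decorator


@register("sgd.v1")
def make_sgd(learn_rate: float, momentum: float = 0.0) -> Dict[str, Any]:
    # A plain dict stands in for a real optimizer object.
    return {"kind": "sgd", "learn_rate": learn_rate, "momentum": momentum}


CONFIG_TEXT = """
[optimizer]
@factory = "sgd.v1"
learn_rate = 0.001
momentum = 0.9
"""


def resolve_section(config_text: str, section: str) -> Any:
    """Parse one section and call the registered factory with its settings."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(config_text)
    # Values are read as JSON so numbers and strings keep their types.
    values = {key: json.loads(raw) for key, raw in parser[section].items()}
    factory_name = values.pop("@factory")
    return REGISTRY[factory_name](**values)


optimizer = resolve_section(CONFIG_TEXT, "optimizer")
```

The design point is that the config file only names a function and its hyperparameters, so swapping an implementation means changing one string rather than editing code.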
🐍 Development notes
Thinc uses black for auto-formatting, flake8 for linting and mypy for type checking. All code is written to be compatible with Python 3.6+, with type hints wherever possible. See the type reference for more details on Thinc's custom types.
👷‍♀️ Building Thinc from source
Building Thinc from source requires the full dependencies listed in requirements.txt to be installed. You'll also need a compiler to build the C extensions.
git clone https://github.com/explosion/thinc
cd thinc
python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace
🚦 Running tests
Thinc comes with an extensive test suite. The following should all pass and not report any warnings or errors:
python -m pytest thinc  # test suite
python -m mypy thinc  # type checks
python -m flake8 thinc  # linting
To view test coverage, you can run python -m pytest thinc --cov=thinc. We aim for 100% test coverage. This doesn't mean that we meticulously write tests for every single line – we ignore blocks that are not relevant or difficult to test and make sure that the tests execute all code paths.
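As an illustration of what ignoring a block can look like, the sketch below uses the `# pragma: no cover` comment that coverage.py (the tool behind the `--cov` option) excludes from reports by default. Whether Thinc marks its own exclusions this way is an assumption, and `safe_divide` is a hypothetical function used only for the example.

```python
def safe_divide(numerator: float, denominator: float) -> float:
    """Divide two numbers, returning 0.0 instead of raising on division by zero."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Script entry points are hard to exercise from a test suite, so this block
# is excluded from the coverage report instead of being tested directly.
if __name__ == "__main__":  # pragma: no cover
    print(safe_divide(1, 2))
```

Both branches of `safe_divide` still need tests to reach full coverage; only the marked block is left out of the report.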