Add a WordPiece tokenizer layer #22

@mattdangerw

Description

TensorFlow Text provides a set of efficient, in-graph WordPiece tokenization ops.

We would like to expose these through a Keras layer that is easily configurable, supports both tokenization and detokenization, and integrates properly with Keras functional models.

This can serve as an example for future subword tokenization work in this library.
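To make the tokenize/detokenize requirement concrete, here is a rough pure-Python sketch of the round trip such a layer would need to support. It is only an illustration of greedy longest-match-first WordPiece splitting, not the TensorFlow Text ops and not a proposed API: the class name `WordPieceSketch`, its argument names, and the toy vocabulary are all hypothetical, and plain whitespace splitting stands in for real pre-tokenization.

```python
from typing import Dict, List


class WordPieceSketch:
    """Hypothetical stand-in for the proposed layer (pure Python, no TensorFlow)."""

    def __init__(
        self,
        vocabulary: List[str],
        suffix_indicator: str = "##",   # assumed marker for word-internal pieces
        unknown_token: str = "[UNK]",   # assumed out-of-vocabulary token
    ) -> None:
        if unknown_token not in vocabulary:
            raise ValueError("vocabulary must contain the unknown token")
        self.id_to_token: List[str] = list(vocabulary)
        self.token_to_id: Dict[str, int] = {t: i for i, t in enumerate(vocabulary)}
        self.suffix_indicator = suffix_indicator
        self.unknown_token = unknown_token

    def _split_word(self, word: str) -> List[str]:
        # Greedy longest-match-first: at each position take the longest
        # vocabulary entry, prefixing non-initial pieces with the suffix marker.
        pieces: List[str] = []
        start = 0
        while start < len(word):
            end = len(word)
            match = None
            while end > start:
                candidate = word[start:end]
                if start > 0:
                    candidate = self.suffix_indicator + candidate
                if candidate in self.token_to_id:
                    match = candidate
                    break
                end -= 1
            if match is None:
                # No piece fits, so the whole word maps to the unknown token.
                return [self.unknown_token]
            pieces.append(match)
            start = end
        return pieces

    def tokenize(self, text: str) -> List[int]:
        ids: List[int] = []
        for word in text.split():  # simplification: whitespace pre-tokenization
            ids.extend(self.token_to_id[p] for p in self._split_word(word))
        return ids

    def detokenize(self, ids: List[int]) -> str:
        words: List[str] = []
        for i in ids:
            token = self.id_to_token[i]
            if token.startswith(self.suffix_indicator) and words:
                # Glue a continuation piece onto the previous word.
                words[-1] += token[len(self.suffix_indicator):]
            else:
                words.append(token)
        return " ".join(words)


# Toy vocabulary, invented for this example.
vocab = ["[UNK]", "token", "##ize", "##r", "the", "word", "##piece"]
tokenizer = WordPieceSketch(vocab)
ids = tokenizer.tokenize("the tokenizer")
print(ids)                        # [4, 1, 2, 3]
print(tokenizer.detokenize(ids))  # the tokenizer
```

The real layer would presumably delegate to the in-graph ops rather than loop in Python; the sketch is only meant to pin down the behavior we expect from the two directions, including the fact that out-of-vocabulary words do not survive the round trip.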
