Encode Inputs

These types represent all the different kinds of input that a [`~tokenizers.Tokenizer`] accepts when using [`~tokenizers.Tokenizer.encode_batch`].

TextEncodeInput[[[[tokenizers.TextEncodeInput]]]]

tokenizers.TextEncodeInput

Represents a textual input for encoding. Can be either a single sequence (`str`) or a pair of sequences (`Tuple[str, str]`, or a `List[str]` of size 2).

Alias of `Union[str, Tuple[str, str], List[str]]`.
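As a rough illustration of the shapes this alias covers, the sketch below mirrors it locally with the standard `typing` module and classifies each item as a single sequence or a pair. The helper `is_text_pair` and the local alias definition are hypothetical and exist only for this example; in real code the alias would be imported from `tokenizers`.

```python
from typing import List, Tuple, Union

# Local mirror of the documented alias, defined here only so the sketch
# runs without the library installed (an assumption for illustration).
TextEncodeInput = Union[str, Tuple[str, str], List[str]]


def is_text_pair(item: TextEncodeInput) -> bool:
    """Hypothetical helper: True for a pair of sequences, False for a single one."""
    if isinstance(item, str):
        return False
    if (
        isinstance(item, (tuple, list))
        and len(item) == 2
        and all(isinstance(s, str) for s in item)
    ):
        return True
    raise TypeError(f"not a valid TextEncodeInput: {item!r}")


# A batch mixing the three accepted shapes.
batch: List[TextEncodeInput] = [
    "A single sequence",
    ("First sequence", "Second sequence"),
    ["First sequence", "Second sequence"],
]
pair_flags = [is_text_pair(item) for item in batch]
```

A list like `batch` is the kind of value that would be passed to `encode_batch` for raw-text input.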

PreTokenizedEncodeInput[[[[tokenizers.PreTokenizedEncodeInput]]]]

tokenizers.PreTokenizedEncodeInput

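Represents a pre-tokenized input for encoding. Can be either a single sequence (`List[str]` or `Tuple[str]`) or a pair of such sequences, given as a `Tuple` of two sequences or a `List` of size 2.

Alias of `Union[List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]]`.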
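The nested alias is easier to read once the inner `Union[List[str], Tuple[str]]` is given a name. The sketch below does that with a local `InputSequence` alias and a hypothetical `is_pretokenized_pair` helper; both names are assumptions introduced for this example, and the definitions use only the standard `typing` module.

```python
from typing import List, Tuple, Union

# One pre-tokenized sequence: a list or tuple of words.
# These local aliases mirror the documented one for illustration only.
InputSequence = Union[List[str], Tuple[str]]
PreTokenizedEncodeInput = Union[
    InputSequence,
    Tuple[InputSequence, InputSequence],
    List[InputSequence],
]


def _is_word_sequence(value) -> bool:
    """True if value is a list or tuple containing only strings."""
    return isinstance(value, (list, tuple)) and all(isinstance(w, str) for w in value)


def is_pretokenized_pair(item: PreTokenizedEncodeInput) -> bool:
    """Hypothetical helper: True for a pair of word sequences, False for a single one."""
    if _is_word_sequence(item):
        return False
    if (
        isinstance(item, (list, tuple))
        and len(item) == 2
        and all(_is_word_sequence(s) for s in item)
    ):
        return True
    raise TypeError(f"not a valid PreTokenizedEncodeInput: {item!r}")


single = ["Hello", "world"]
pair_as_tuple = (["Hello", "world"], ["How", "are", "you"])
pair_as_list = [["Hello", "world"], ["How", "are", "you"]]
flags = [is_pretokenized_pair(x) for x in (single, pair_as_tuple, pair_as_list)]
```

Note that `["Hello", "world"]` is a single sequence of two words here, whereas the same value read as a `TextEncodeInput` is a pair of sequences. The value alone is ambiguous, which is why `encode_batch` takes an `is_pretokenized` argument to say which interpretation applies.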

EncodeInput[[[[tokenizers.EncodeInput]]]]

tokenizers.EncodeInput

Represents all the possible types of input for encoding. Can be any `TextEncodeInput` or `PreTokenizedEncodeInput`, that is, a single sequence or a pair of sequences in either raw-text or pre-tokenized form.

Alias of `Union[str, Tuple[str, str], List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]]`.

The Rust API Reference is available directly on the Docs.rs website.

The node API has not been documented yet.