@xenova xenova commented Nov 1, 2025

This PR updates package.json and moves the built types into a separate types/ directory. I tried updating the settings to place everything in dist/index.d.ts, but that didn't seem to be possible. I think this is a good compromise: the dist directory now contains only the implementation (a bit cleaner imo). Also, we can now ship both the minified and unminified versions.

dist/
├── tokenizers.cjs
├── tokenizers.min.cjs
├── tokenizers.min.mjs
└── tokenizers.mjs
types/
├── core
│   ├── AddedToken.d.ts
│   ├── Decoder.d.ts
│   ├── Normalizer.d.ts
│   ├── PostProcessor.d.ts
│   ├── PreTokenizer.d.ts
│   ├── Tokenizer.d.ts
│   ├── TokenizerModel.d.ts
│   ├── decoder
│   │   ├── BPEDecoder.d.ts
│   │   ├── ByteFallback.d.ts
│   │   ├── ByteLevelDecoder.d.ts
│   │   ├── CTCDecoder.d.ts
│   │   ├── DecoderSequence.d.ts
│   │   ├── FuseDecoder.d.ts
│   │   ├── MetaspaceDecoder.d.ts
│   │   ├── ReplaceDecoder.d.ts
│   │   ├── StripDecoder.d.ts
│   │   ├── WordPieceDecoder.d.ts
│   │   └── create_decoder.d.ts
│   ├── normalizer
│   │   ├── BertNormalizer.d.ts
│   │   ├── Lowercase.d.ts
│   │   ├── NFC.d.ts
│   │   ├── NFD.d.ts
│   │   ├── NFKC.d.ts
│   │   ├── NFKD.d.ts
│   │   ├── NormalizerSequence.d.ts
│   │   ├── Precompiled.d.ts
│   │   ├── Prepend.d.ts
│   │   ├── Replace.d.ts
│   │   ├── StripAccents.d.ts
│   │   ├── StripNormalizer.d.ts
│   │   ├── UnicodeNormalizer.d.ts
│   │   └── create_normalizer.d.ts
│   ├── postProcessor
│   │   ├── BertProcessing.d.ts
│   │   ├── ByteLevelPostProcessor.d.ts
│   │   ├── PostProcessorSequence.d.ts
│   │   ├── RobertaProcessing.d.ts
│   │   ├── TemplateProcessing.d.ts
│   │   └── create_post_processor.d.ts
│   ├── preTokenizer
│   │   ├── BertPreTokenizer.d.ts
│   │   ├── ByteLevelPreTokenizer.d.ts
│   │   ├── DigitsPreTokenizer.d.ts
│   │   ├── MetaspacePreTokenizer.d.ts
│   │   ├── PreTokenizerSequence.d.ts
│   │   ├── PunctuationPreTokenizer.d.ts
│   │   ├── ReplacePreTokenizer.d.ts
│   │   ├── SplitPreTokenizer.d.ts
│   │   ├── WhitespacePreTokenizer.d.ts
│   │   ├── WhitespaceSplit.d.ts
│   │   └── create_pre_tokenizer.d.ts
│   └── tokenizerModelImplementations
│       ├── BPE.d.ts
│       ├── LegacyTokenizerModel.d.ts
│       ├── Unigram.d.ts
│       ├── WordPieceTokenizer.d.ts
│       └── create_tokenizer_model.d.ts
├── index.d.ts
├── static
│   ├── constants.d.ts
│   └── types.d.ts
└── utils
    ├── Callable.d.ts
    ├── core.d.ts
    ├── data-structures
    │   ├── CharTrie.d.ts
    │   ├── DictionarySplitter.d.ts
    │   ├── LRUCache.d.ts
    │   ├── PriorityQueue.d.ts
    │   └── TokenLattice.d.ts
    ├── index.d.ts
    └── maths.d.ts

11 directories, 70 files
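
For readers wondering how a package.json might point at this layout, here is a minimal sketch. It is not copied from this PR's diff: the field values are assumptions based only on the file names in the tree above, and the minified builds are left out because how they are exposed is not stated here.

```json
{
  "//": "Illustrative sketch only, assumed from the directory tree; not the actual package.json from this PR.",
  "main": "./dist/tokenizers.cjs",
  "module": "./dist/tokenizers.mjs",
  "types": "./types/index.d.ts",
  "exports": {
    ".": {
      "types": "./types/index.d.ts",
      "import": "./dist/tokenizers.mjs",
      "require": "./dist/tokenizers.cjs"
    }
  },
  "files": ["dist", "types"]
}
```

The "types" condition is listed first inside "exports" because conditions are matched in order, and TypeScript's documentation recommends putting it ahead of "import" and "require".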

@xenova xenova requested a review from nico-martin November 1, 2025 04:46
@xenova xenova changed the title Types improvements Separate types into separate folder Nov 1, 2025
@xenova xenova changed the title Separate types into separate folder package.json updates & move built types to separate folder Nov 1, 2025
@nico-martin nico-martin merged commit bfd24cb into main Nov 2, 2025
@xenova xenova deleted the types branch November 2, 2025 21:55