This commit adds support for loading Kimi-compatible encodings directly
from Hugging Face repositories. A new resolver downloads the required files
and caches them locally.

Key changes:

- Introduced TiktokenEx.HuggingFace for resolving files from Hugging Face
  repositories. It caches files on the local file system under the user
  cache directory, writes them atomically to prevent corruption under
  concurrent access, and accepts injectable fetchers so tests can run
  without network access.
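The caching behaviour described above can be sketched roughly as follows.
This is an illustration only, not the code in TiktokenEx.HuggingFace: the
module name, function names, and the fetcher contract (a one-argument
function returning {:ok, binary}) are all assumptions made for the example.
The atomic write uses the common write-to-temp-file-then-rename approach.

```elixir
defmodule AtomicCacheSketch do
  @moduledoc false

  # Hypothetical helper: write to a uniquely named temp file in the same
  # directory, then rename it into place. On POSIX file systems a rename
  # within one directory is atomic, so readers never see a partial file.
  def write_atomic(path, contents) do
    File.mkdir_p!(Path.dirname(path))
    tmp = path <> ".tmp-" <> Integer.to_string(System.unique_integer([:positive]))
    File.write!(tmp, contents)
    :ok = File.rename(tmp, path)
    path
  end

  # Hypothetical resolver: return the cached path if the file exists,
  # otherwise call the injected fetcher and cache its result.
  def resolve(cache_dir, filename, fetch_fun) when is_function(fetch_fun, 1) do
    path = Path.join(cache_dir, filename)

    if File.exists?(path) do
      {:ok, path}
    else
      with {:ok, body} <- fetch_fun.(filename) do
        {:ok, write_atomic(path, body)}
      end
    end
  end
end
```

In a test, the fetcher can be a stub such as fn _name -> {:ok, "data"} end,
so no HTTP request is made.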
- Added TiktokenEx.Cache, an optional ETS-based cache for built encodings.
  Encodings are keyed by repository name and revision, so an application
  can build each one once and reuse it.
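A minimal sketch of an ETS cache keyed by {repo, revision} is shown below.
It is not the TiktokenEx.Cache implementation; the module, table name, and
fetch_or_build/3 function are hypothetical and exist only to illustrate the
keying scheme.

```elixir
defmodule EncodingCacheSketch do
  @moduledoc false
  @table :encoding_cache_sketch

  # Create the named table on first use. The table is owned by the calling
  # process and disappears when that process exits; a real cache would
  # typically be owned by a long-lived process instead.
  def ensure_table do
    case :ets.whereis(@table) do
      :undefined -> :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
      _tid -> @table
    end
  end

  # Return the cached value for {repo, revision}, building and storing it
  # with build_fun only on a cache miss.
  def fetch_or_build(repo, revision, build_fun) when is_function(build_fun, 0) do
    ensure_table()
    key = {repo, revision}

    case :ets.lookup(@table, key) do
      [{^key, encoding}] ->
        encoding

      [] ->
        encoding = build_fun.()
        :ets.insert(@table, {key, encoding})
        encoding
    end
  end
end
```

Two concurrent callers that miss at the same time may both run build_fun in
this sketch; the later insert simply overwrites the earlier one.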
- Extended TiktokenEx.Kimi with from_hf_repo/2, which simplifies loading
  Kimi-style models by automatically fetching tiktoken.model and
  tokenizer_config.json from the remote repository.
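For orientation, the two files correspond to URLs of the following shape
when using the public Hugging Face "resolve" endpoint. Whether from_hf_repo/2
builds its URLs exactly this way is an assumption; HfUrlSketch and
file_urls/2 are hypothetical names used only for this example.

```elixir
defmodule HfUrlSketch do
  @moduledoc false

  # The two files named in this commit message.
  @files ["tiktoken.model", "tokenizer_config.json"]

  # Build download URLs for a repository id such as "org/repo" at a given
  # revision (branch, tag, or commit). Defaulting to "main" is an assumption.
  def file_urls(repo_id, revision \\ "main") do
    for file <- @files do
      "https://huggingface.co/#{repo_id}/resolve/#{revision}/#{file}"
    end
  end
end
```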
- Updated dependencies and project configuration. The application now
  lists :inets, :ssl, and :public_key as extra applications to support
  HTTPS downloads, Credo has been added for static analysis, and the
  version is bumped to 0.2.0.
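In mix.exs terms, the configuration change looks roughly like the sketch
below. The app name, the presence of :logger, and the Credo version
requirement and options are assumptions; only the three OTP applications,
the Credo dependency, and the 0.2.0 version come from this commit.

```elixir
defmodule TiktokenEx.MixProject do
  use Mix.Project

  def project do
    [
      app: :tiktoken_ex,
      version: "0.2.0",
      deps: deps()
    ]
  end

  def application do
    # :inets provides the :httpc client; :ssl and :public_key are needed
    # for HTTPS and certificate handling.
    [extra_applications: [:logger, :inets, :ssl, :public_key]]
  end

  defp deps do
    # Version requirement and options are illustrative.
    [{:credo, "~> 1.7", only: [:dev, :test], runtime: false}]
  end
end
```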
- Added tests for the cache logic, sanitization during file resolution,
  and the integration between TiktokenEx.Kimi and TiktokenEx.HuggingFace.