Rebase with TGI v2.0 #134

Merged: 154 commits, merged May 6, 2024
ab34c16
Fix AMD documentation (#1307)
fxmarty Dec 4, 2023
a41c1a6
Add a stale bot. (#1313)
Narsil Dec 5, 2023
a7f52f3
Speculative (#1308)
Narsil Dec 11, 2023
9aef902
feat: mixtral (#1328)
OlivierDehaene Dec 11, 2023
79f268f
chore: formatting
OlivierDehaene Dec 11, 2023
db5053f
v1.3.0
OlivierDehaene Dec 11, 2023
09c556d
v1.3.1
OlivierDehaene Dec 11, 2023
f9b58ac
feat: add quant to mixtral (#1337)
OlivierDehaene Dec 12, 2023
05f8c85
v1.3.2
OlivierDehaene Dec 12, 2023
2f88d8d
fix: default max_new_tokens to 100
OlivierDehaene Dec 13, 2023
c974437
fix: fix gpt-q params loading
OlivierDehaene Dec 14, 2023
5c9ef06
feat: add more latency metrics in forward (#1346)
OlivierDehaene Dec 14, 2023
28fcdcc
fix: fix triton OutOfResources import
OlivierDehaene Dec 14, 2023
b3c2d72
fix: fix quant linear autotune
OlivierDehaene Dec 14, 2023
04dbf7a
fix: slice stopping criteria buffer
OlivierDehaene Dec 14, 2023
214ec0e
fix: only keep stop sequence buffer if we have some
OlivierDehaene Dec 14, 2023
bb62005
fix: max_past default value must be -1, not 0 (#1348)
OlivierDehaene Dec 15, 2023
3600fc9
v1.3.3
OlivierDehaene Dec 15, 2023
a95e6d6
feat: relax mistral requirements (#1351)
OlivierDehaene Dec 15, 2023
ecb0db4
fix: fix logic if sliding window key is not present in config (#1352)
OlivierDehaene Dec 15, 2023
5ff9e81
fix: fix offline (#1341) (#1347)
OlivierDehaene Dec 18, 2023
b7299e1
fix: fix gpt-q with groupsize = -1 (#1358)
OlivierDehaene Dec 18, 2023
be05972
Peft safetensors. (#1364)
Narsil Dec 20, 2023
3e22ad9
docs: Change URL for Habana Gaudi support in doc (#1343)
regisss Dec 21, 2023
7eeabb9
feat: update exllamav2 kernels (#1370)
OlivierDehaene Dec 21, 2023
8cc4306
Fix local load for peft (#1373)
Narsil Dec 21, 2023
62646c2
v1.3.4
OlivierDehaene Dec 22, 2023
fc9173a
docs: update required CUDA version to 12.2
OlivierDehaene Jan 9, 2024
118344b
fix: fix local loading for .bin models (#1419)
OlivierDehaene Jan 9, 2024
92ddb41
Fix missing make target platform for local install: 'install-flash-at…
deepily Jan 9, 2024
af63e32
fix: follow base model for tokenizer in router (#1424)
OlivierDehaene Jan 10, 2024
e930ad9
Fix local load for Medusa (#1420)
PYNing Jan 10, 2024
12cfc79
Return prompt vs generated tokens. (#1436)
Narsil Jan 11, 2024
76b226b
feat: supports openai chat completions API (#1427)
drbh Jan 16, 2024
77afb88
feat: support raise_exception, bos and eos tokens (#1450)
drbh Jan 18, 2024
935ee00
chore: bump rust version and annotate/fix all clippy warnings (#1455)
drbh Jan 22, 2024
5836a1c
feat: conditionally toggle chat on invocations route (#1454)
drbh Jan 22, 2024
1b99d4c
Disable `decoder_input_details` on OpenAI-compatible chat streaming, …
EndlessReform Jan 23, 2024
2a3a9c5
Fixing non divisible embeddings. (#1476)
Narsil Jan 24, 2024
ae222cc
Add messages api compatibility docs (#1478)
drbh Jan 24, 2024
be9bfae
Add a new `/tokenize` route to get the tokenized input (#1471)
Narsil Jan 25, 2024
b2fc097
feat: adds phi model (#1442)
drbh Jan 25, 2024
ac0be8a
fix: read stderr in download (#1486)
OlivierDehaene Jan 25, 2024
a1124f7
Update the docs
Narsil Jan 26, 2024
41fbf5c
fix: show warning with tokenizer config parsing error (#1488)
drbh Jan 26, 2024
82f20c4
fix: launcher doc typos (#1473)
Narsil Jan 26, 2024
ea2aa53
Reinstate exl2 with tp (#1490)
Narsil Jan 26, 2024
b064b33
Add sealion mpt support (#1477)
Narsil Jan 26, 2024
9fd5f51
Trying to fix that flaky test. (#1491)
Narsil Jan 26, 2024
5134d9c
fix: launcher doc typos (#1462)
thelinuxkid Jan 26, 2024
5d663fb
Update the docs to include newer models. (#1492)
Narsil Jan 26, 2024
4b376b3
GPTQ support on ROCm (#1489)
fxmarty Jan 26, 2024
ac580f5
feat: add tokenizer-config-path to launcher args (#1495)
drbh Jan 26, 2024
efd4b97
v1.4.0 (#1494)
OlivierDehaene Jan 26, 2024
4339345
Fixing top_n_tokens. (#1497)
Narsil Jan 26, 2024
050c584
Sending compute type from the environment instead of hardcoded string…
Narsil Jan 29, 2024
89fa4fd
Create the compute type at launch time (if not provided in the env). …
Narsil Jan 29, 2024
86796bc
Modify default for max_new_tokens in python client (#1336)
freitng Jan 29, 2024
bf72c03
feat: eetq gemv optimization when batch_size <= 4 (#1502)
dtlzhuangz Jan 31, 2024
11d8e71
fix: improve messages api docs content and formatting (#1506)
drbh Jan 31, 2024
27daa51
GPTNeoX: Use static rotary embedding (#1498)
dwyatte Feb 1, 2024
1a0bfe3
Freshen up the README.
Narsil Feb 1, 2024
2bf3931
Hotfix the / health - route. (#1515)
Narsil Feb 1, 2024
6c0b21b
Revert "Modify default for max_new_tokens in python client (#1336)"
Narsil Feb 1, 2024
14b40bf
fix: tokenizer config should use local model path when possible (#1518)
drbh Feb 1, 2024
369ae2d
Updating tokenizers. (#1517)
Narsil Feb 1, 2024
e39ba49
[docs] Fix link to Install CLI (#1526)
pcuenca Feb 2, 2024
62a40b8
feat: add ie update to message docs (#1523)
drbh Feb 2, 2024
99cb270
feat: use existing add_generation_prompt variable from config in temp…
drbh Feb 7, 2024
51a4e62
Impl simple mamba model (#1480)
drbh Feb 8, 2024
cec954e
Update to peft 0.8.2 (#1537)
Stillerman Feb 8, 2024
f1d8da3
feat(server): add frequency penalty (#1541)
OlivierDehaene Feb 8, 2024
8415d46
chore: bump ci rust version (#1543)
drbh Feb 9, 2024
777e519
ROCm AWQ support (#1514)
IlyasMoutawwakil Feb 9, 2024
518d30d
feat(router): add max_batch_size (#1542)
OlivierDehaene Feb 9, 2024
0c207f7
feat: experimental support for cuda graphs (#1428)
OlivierDehaene Feb 12, 2024
91b56a7
feat: add deserialize_with that handles strings or objects with conte…
drbh Feb 13, 2024
d05d930
Fixing glibc version in the runtime. (#1556)
Narsil Feb 13, 2024
f6500bf
Upgrade intermediary layer for nvidia too. (#1557)
Narsil Feb 13, 2024
e93cc34
Improving mamba runtime by using updates (#1552)
Narsil Feb 14, 2024
686b56a
Small cleanup. (#1560)
Narsil Feb 14, 2024
55acb86
Outlines guided generation (#1539)
drbh Feb 15, 2024
cfccdf3
Added `name` field to OpenAI compatible API Messages (#1563)
amihalik Feb 15, 2024
69a2ead
Bugfix: eos and bos tokens positions are inconsistent (#1567)
amihalik Feb 16, 2024
31b5e37
chore: add pre-commit (#1569)
OlivierDehaene Feb 16, 2024
cf946b3
feat: add chat template struct to avoid tuple ordering errors (#1570)
OlivierDehaene Feb 16, 2024
2ac1b55
v1.4.1 (#1568)
OlivierDehaene Feb 16, 2024
5a54d91
Fix mistral with length > window_size for long prefills (rotary doesn…
Narsil Feb 19, 2024
c3053e8
improve endpoint support (#1577)
drbh Feb 20, 2024
5addb84
fix: refactor syntax to correctly include structs (#1580)
drbh Feb 20, 2024
3c6e6d8
fix(router): fix openapi and add jsonschema validation (#1578)
OlivierDehaene Feb 21, 2024
a461257
feat: add support for Gemma (#1583)
OlivierDehaene Feb 21, 2024
e7183c2
v1.4.2 (#1585)
OlivierDehaene Feb 21, 2024
d94343d
fix: fix openapi schema (#1586)
OlivierDehaene Feb 21, 2024
70ac5c3
fix: avoid default message (#1579)
drbh Feb 22, 2024
21d52c9
Revamp medusa implementation so that every model can benefit. (#1588)
Narsil Feb 26, 2024
f215cc1
Support tools (#1587)
drbh Feb 28, 2024
35f7c3f
Fixing x-compute-time. (#1606)
Narsil Feb 28, 2024
bc6ab91
Fixing guidance docs. (#1607)
Narsil Feb 28, 2024
7c6a47b
feat: starcoder2 (#1605)
OlivierDehaene Feb 28, 2024
666cdaa
feat: Qwen2 (#1608)
OlivierDehaene Feb 28, 2024
e9b2003
v1.4.3 (#1609)
OlivierDehaene Feb 28, 2024
e259625
fix: Handle concurrent grammar requests (#1610)
drbh Feb 29, 2024
0390b28
Fix idefics default. (#1614)
Narsil Feb 29, 2024
0a5755e
Fix async client timeout (#1617)
hugoabonizio Feb 29, 2024
5a2a0ca
feat: accept legacy request format and response (#1527)
drbh Feb 29, 2024
dc7c69e
fix: add missing stop parameter for chat request (#1619)
drbh Mar 1, 2024
d4aebbd
fix: correctly index into mask when applying grammar (#1618)
drbh Mar 1, 2024
cd8163d
Use a better model for the quick tour (#1639)
lewtun Mar 12, 2024
7809825
Upgrade nix version from 0.27.1 to 0.28.0 (#1638)
kdamaszk Apr 25, 2024
86c5ce5
Update peft + transformers + accelerate + bnb + safetensors (#1646)
abhishekkrthakur Mar 15, 2024
50d8f99
Fix index in ChatCompletionChunk (#1648)
Wauplin Mar 16, 2024
925f9c4
Fixing minor typo in documentation: supported hardware section (#1632)
SachinVarghese Mar 18, 2024
08525d9
feat: bump minijina and add test for core templates (#1626)
drbh Mar 20, 2024
d888bc2
feat: support force downcast after FastRMSNorm multiply for Gemma (#1…
drbh Mar 21, 2024
b36c0f8
fix: prefer spaces url over temp url (#1662)
drbh Mar 21, 2024
ab074c8
fix: improve tool type, bump pydantic and outlines (#1650)
drbh Mar 21, 2024
6729783
Remove unecessary cuda graph. (#1664)
Narsil Mar 21, 2024
07f05a8
Repair idefics integration tests. (#1663)
Narsil Mar 21, 2024
c4f92ec
feat: update client to 0.7 (#1667)
OlivierDehaene Mar 22, 2024
097e72a
fix: LlamaTokenizerFast to AutoTokenizer at flash_mistral.py (#1637)
SeongBeomLEE Mar 22, 2024
ecdacbb
Inline images for multimodal models. (#1666)
Narsil Mar 22, 2024
da4199e
feat: cohere (#1660)
OlivierDehaene Mar 22, 2024
6ac93d8
v1.4.4 (#1668)
OlivierDehaene Mar 22, 2024
d5ed4c1
fix: adjust logprob response logic (#1682)
drbh Mar 28, 2024
5667039
fix: handle batches with and without grammars (#1676)
drbh Mar 28, 2024
dc1ab20
feat: Add dbrx support (#1685)
OlivierDehaene Mar 29, 2024
0bf856d
v1.4.5 (#1686)
OlivierDehaene Mar 29, 2024
29c316e
Add cuda graphs sizes and make it default. (#1703)
Narsil Apr 4, 2024
fe063b8
Pickle conversion now requires `--trust-remote-code`. (#1704)
Narsil Apr 5, 2024
62672c6
Push users to streaming in the readme. (#1698)
Narsil Apr 5, 2024
fec3f8f
Fixing cohere tokenizer. (#1697)
Narsil Apr 5, 2024
3417398
Force weights_only (before fully breaking pickle files anyway). (#1710)
Narsil Apr 5, 2024
8d4aec0
Regenerate ld.so.cache (#1708)
oOraph Apr 8, 2024
fb998da
Revert license to Apache 2.0 (#1714)
OlivierDehaene Apr 8, 2024
351bd5f
Automatic quantization config. (#1719)
Narsil Apr 9, 2024
2b2f4de
Adding Llava-Next (Llava 1.6) with full support. (#1709)
Narsil Apr 9, 2024
a1b65e5
fix: fix CohereForAI/c4ai-command-r-plus (#1707)
OlivierDehaene Apr 10, 2024
d1d0b3c
hotfix: mixtral
OlivierDehaene Apr 10, 2024
d9dcfe4
Update libraries (#1713)
abhishekkrthakur Apr 11, 2024
e428c7c
Easier defaults for models stemmed from configs.
Narsil Apr 11, 2024
c4ee0a6
Revert "Easier defaults for models stemmed from configs."
Narsil Apr 11, 2024
194fcb4
Dev/mask ldconfig output v2 (#1716)
oOraph Apr 11, 2024
935d56a
Fp8 Support (#1726)
Narsil Apr 12, 2024
0707a09
Upgrade EETQ (Fixes the cuda graphs). (#1729)
Narsil Apr 12, 2024
e6421f6
fix(router): fix a possible deadlock in next_batch (#1731)
OlivierDehaene Apr 12, 2024
7a62f74
chore(cargo-toml): apply lto fat and codegen-units of one (#1651)
somehowchris Apr 12, 2024
661081d
Improve the defaults for the launcher (#1727)
Narsil Apr 12, 2024
f6d5c2e
feat: medusa v2 (#1734)
OlivierDehaene Apr 12, 2024
6ad5aa7
Fix typo in guidance.md (#1735)
eltociear Apr 12, 2024
c6a31b9
v2.0.0 (#1736)
OlivierDehaene Apr 12, 2024
600d033
Merge branch 'habana-main' into rebase_tgi_2.0
kdamaszk Apr 29, 2024
3d78027
A patch to address HPU Graphs issue with DILL
yafshar Apr 23, 2024
0bbec63
Update README example commands
kdamaszk May 6, 2024
10 changes: 5 additions & 5 deletions .github/ISSUE_TEMPLATE/bug-report.yml
@@ -5,14 +5,14 @@ body:
id: system-info
attributes:
label: System Info
description: |
description: |
Please share your system info with us (`text-generation-launcher --env` if installed locally).
The full command line used that causes issues:
The full command line used that causes issues:
OS version:
Rust version (if self-compiling, `cargo version`):
Model being used (`curl 127.0.0.1:8080/info | jq`):
If local model please explicit the kind of model and/or equivalents.
Hardware used (GPUs, how many, on which cloud) (`nvidia-smi`):
Hardware used (GPUs, how many, on which cloud) (`nvidia-smi`):
Deployment specificities (Kubernetes, EKS, AKS, any particular deployments):
The current version being used:

@@ -52,11 +52,11 @@ body:

placeholder: |
Steps to reproduce the behavior:

1.
2.
3.


- type: textarea
id: expected-behavior
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature-request.yml
@@ -19,7 +19,7 @@ body:
label: Motivation
description: |
Please outline the motivation for the proposal. Is your feature request related to a problem? e.g., I'm always frustrated when [...]. If this is related to another GitHub issue, please link here too.


- type: textarea
id: contribution
6 changes: 3 additions & 3 deletions .github/workflows/autodocs.yml
@@ -6,15 +6,15 @@ on:
jobs:
update_docs:
runs-on: ubuntu-latest

steps:
- name: Checkout code
uses: actions/checkout@v2

- name: Install Launcher
id: install-launcher
run: cargo install --git https://github.com/${{ github.repository }} --branch ${{ github.head_ref }} text-generation-launcher

- name: Check launcher Docs are up-to-date
run: |
echo text-generation-launcher --help
78 changes: 40 additions & 38 deletions .github/workflows/build.yaml
@@ -146,11 +146,50 @@ jobs:
cache-from: type=registry,ref=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:cache,mode=min
cache-to: type=registry,ref=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:cache,mode=min

integration-tests:
concurrency:
group: ${{ github.workflow }}-${{ github.job }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
needs:
- start-runner
- build-and-push-image # Wait for the docker image to be built
runs-on: ${{ needs.start-runner.outputs.label }} # run the job on the newly created runner
env:
DOCKER_VOLUME: /cache
steps:
- uses: actions/checkout@v2
- name: Inject slug/short variables
uses: rlespinasse/github-slug-action@v4.4.1
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.9
- name: Tailscale
uses: tailscale/github-action@7bd8039bf25c23c4ab1b8d6e2cc2da2280601966
with:
authkey: ${{ secrets.TAILSCALE_AUTHKEY }}
- name: Prepare disks
run: |
sudo mkfs -t ext4 /dev/nvme1n1
sudo mkdir ${{ env.DOCKER_VOLUME }}
sudo mount /dev/nvme1n1 ${{ env.DOCKER_VOLUME }}
- name: Install
run: |
make install-integration-tests
- name: Run tests
run: |
export DOCKER_IMAGE=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sha-${{ env.GITHUB_SHA_SHORT }}
export HUGGING_FACE_HUB_TOKEN=${{ secrets.HUGGING_FACE_HUB_TOKEN }}
pytest -s -vv integration-tests

build-and-push-image-rocm:
concurrency:
group: ${{ github.workflow }}-build-and-push-image-rocm-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
needs: start-runner # required to start the main job when the runner is ready
needs:
- start-runner
- build-and-push-image # Wait for the main docker image to be built
- integration-tests # Wait for the main integration-tests
runs-on: ${{ needs.start-runner.outputs.label }} # run the job on the newly created runner
permissions:
contents: write
@@ -235,43 +274,6 @@ jobs:
cache-from: type=registry,ref=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:cache-rocm,mode=min
cache-to: type=registry,ref=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:cache-rocm,mode=min

integration-tests:
concurrency:
group: ${{ github.workflow }}-${{ github.job }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
needs:
- start-runner
- build-and-push-image # Wait for the docker image to be built
- build-and-push-image-rocm
runs-on: ${{ needs.start-runner.outputs.label }} # run the job on the newly created runner
env:
DOCKER_VOLUME: /cache
steps:
- uses: actions/checkout@v2
- name: Inject slug/short variables
uses: rlespinasse/github-slug-action@v4.4.1
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.9
- name: Tailscale
uses: tailscale/github-action@7bd8039bf25c23c4ab1b8d6e2cc2da2280601966
with:
authkey: ${{ secrets.TAILSCALE_AUTHKEY }}
- name: Prepare disks
run: |
sudo mkfs -t ext4 /dev/nvme1n1
sudo mkdir ${{ env.DOCKER_VOLUME }}
sudo mount /dev/nvme1n1 ${{ env.DOCKER_VOLUME }}
- name: Install
run: |
make install-integration-tests
- name: Run tests
run: |
export DOCKER_IMAGE=registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sha-${{ env.GITHUB_SHA_SHORT }}
export HUGGING_FACE_HUB_TOKEN=${{ secrets.HUGGING_FACE_HUB_TOKEN }}
pytest -s -vv integration-tests

stop-runner:
name: Stop self-hosted EC2 runner
needs:
2 changes: 1 addition & 1 deletion .github/workflows/build_pr_documentation.yml
@@ -16,4 +16,4 @@ jobs:
commit_sha: ${{ github.event.pull_request.head.sha }}
pr_number: ${{ github.event.number }}
package: text-generation-inference
additional_args: --not_python_module
additional_args: --not_python_module
12 changes: 0 additions & 12 deletions .github/workflows/delete_doc_comment.yml

This file was deleted.

14 changes: 14 additions & 0 deletions .github/workflows/stale.yml
@@ -0,0 +1,14 @@
name: 'Close stale issues and PRs'
on:
schedule:
- cron: '30 1 * * *'

jobs:
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v8
with:
stale-issue-message: 'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.'
days-before-stale: 30
days-before-close: 5
18 changes: 12 additions & 6 deletions .github/workflows/tests.yaml
@@ -33,11 +33,18 @@ jobs:
- name: Install Rust
uses: actions-rs/toolchain@v1
with:
toolchain: 1.71.0
# Released on: 28 December, 2023
# Branched from master on: 10 November, 2023
# https://releases.rs/docs/1.75.0/
toolchain: 1.75.0
override: true
components: rustfmt, clippy
- name: Install Protoc
uses: arduino/setup-protoc@v1
- name: Clean unused files
run: |
sudo rm -rf /usr/local/lib/android # will release about 10 GB if you don't need Android
sudo rm -rf /usr/share/dotnet # will release about 20GB if you don't need .NET
- name: Install sccache
run: |
curl -fsSL https://github.com/mozilla/sccache/releases/download/v$SCCACHE/sccache-v$SCCACHE-x86_64-unknown-linux-musl.tar.gz | tar -xzv --strip-components=1 -C /usr/local/bin sccache-v$SCCACHE-x86_64-unknown-linux-musl/sccache
@@ -68,12 +75,11 @@ jobs:
pip install pytest
export HUGGING_FACE_HUB_TOKEN=${{ secrets.HUGGING_FACE_HUB_TOKEN }}
pytest -s -vv server/tests
- name: Run Rust fmt
run: |
cargo fmt --check
- name: Run Rust clippy
- name: Pre-commit checks
run: |
cargo clippy
pip install pre-commit
pre-commit install
pre-commit run --all-files
- name: Run Rust tests
run: |
cargo test
2 changes: 1 addition & 1 deletion .github/workflows/upload_pr_documentation.yml
@@ -13,4 +13,4 @@ jobs:
package_name: text-generation-inference
secrets:
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
9 changes: 9 additions & 0 deletions .gitignore
@@ -2,3 +2,12 @@
target
router/tokenizer.json
*__pycache__*

# ROCm auto-generated files
*.hip
server/exllamav2_kernels/exllamav2_kernels/hip/
server/exllama_kernels/exllama_kernels/hip/
server/exllama_kernels/exllama_kernels/hip_func/
*_hip.cuh
server/exllama_kernels/exllama_kernels/hip_buffers.cuh
server/exllama_kernels/exllama_kernels/exllama_ext_hip.cpp
18 changes: 18 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,18 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: check-yaml
- id: end-of-file-fixer
- id: trailing-whitespace
exclude: docs/source/basic_tutorials/launcher.md
- repo: https://github.com/psf/black
rev: 24.2.0
hooks:
- id: black
- repo: https://github.com/doublify/pre-commit-rust
rev: v1.0
hooks:
- id: fmt
- id: cargo-check
- id: clippy