exllama(v2): fix exllamav1, add exllamav2 #1384

Merged: mudler merged 2 commits into master from fixups_backends_v2 on Dec 5, 2023

Conversation

mudler
Owner

@mudler mudler commented Dec 4, 2023

Description

This PR fixes #1053. It is a first stab at defining an exllamav2 backend.

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler added the enhancement New feature or request label Dec 4, 2023

netlify bot commented Dec 4, 2023

Deploy Preview for localai canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | 13418fc |
| 🔍 Latest deploy log | https://app.netlify.com/sites/localai/deploys/656e12f52b5f720009316e40 |

@mudler mudler merged commit 2b2d667 into master Dec 5, 2023
24 checks passed
@mudler mudler deleted the fixups_backends_v2 branch December 5, 2023 07:15
@mudler mudler mentioned this pull request Dec 15, 2023
98 tasks
truecharts-admin referenced this pull request in truecharts/public Dec 18, 2023
…1.0 by renovate (#16284)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [quay.io/go-skynet/local-ai](https://togithub.com/mudler/LocalAI) | minor | `v2.0.0-cublas-cuda11` -> `v2.1.0-cublas-cuda11` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency
Dashboard for more information.

---

### Release Notes

<details>
<summary>mudler/LocalAI (quay.io/go-skynet/local-ai)</summary>

### [`v2.1.0`](https://togithub.com/mudler/LocalAI/releases/tag/v2.1.0)

[Compare
Source](https://togithub.com/mudler/LocalAI/compare/v2.0.0...v2.1.0)

<!-- Release notes generated using configuration in .github/release.yml
at master -->

#### What's Changed

##### Breaking Changes 🛠

- feat(alias): alias llama to llama-cpp, update docs by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1448](https://togithub.com/mudler/LocalAI/pull/1448)

##### Bug fixes 🐛

- fix(piper): pin petals, phonemize and espeak by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1393](https://togithub.com/mudler/LocalAI/pull/1393)
- update(llama.cpp): update server, correctly propagate LLAMA_VERSION by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1440](https://togithub.com/mudler/LocalAI/pull/1440)

##### Exciting New Features 🎉

- Added Check API KEYs file to API.go by
[@&#8203;lunamidori5](https://togithub.com/lunamidori5) in
[https://github.com/mudler/LocalAI/pull/1381](https://togithub.com/mudler/LocalAI/pull/1381)
- exllama(v2): fix exllamav1, add exllamav2 by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1384](https://togithub.com/mudler/LocalAI/pull/1384)
- Fix: API Key / JSON Fast Follow
[#&#8203;1](https://togithub.com/mudler/LocalAI/issues/1) by
[@&#8203;dave-gray101](https://togithub.com/dave-gray101) in
[https://github.com/mudler/LocalAI/pull/1388](https://togithub.com/mudler/LocalAI/pull/1388)
- feat: add transformers-musicgen backend by
[@&#8203;dave-gray101](https://togithub.com/dave-gray101) in
[https://github.com/mudler/LocalAI/pull/1387](https://togithub.com/mudler/LocalAI/pull/1387)
- feat(diffusers): update, add autopipeline, controlnet by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1432](https://togithub.com/mudler/LocalAI/pull/1432)
- feat(img2vid,txt2vid): Initial support for img2vid,txt2vid by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1442](https://togithub.com/mudler/LocalAI/pull/1442)

##### 👒 Dependencies

- ⬆️ Update ggerganov/whisper.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1378](https://togithub.com/mudler/LocalAI/pull/1378)
- ⬆️ Update ggerganov/llama.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1379](https://togithub.com/mudler/LocalAI/pull/1379)
- ⬆️ Update ggerganov/whisper.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1430](https://togithub.com/mudler/LocalAI/pull/1430)
- ⬆️ Update mudler/go-piper by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1441](https://togithub.com/mudler/LocalAI/pull/1441)
- ⬆️ Update ggerganov/whisper.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1434](https://togithub.com/mudler/LocalAI/pull/1434)

##### Other Changes

- ⬆️ Update ggerganov/llama.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1385](https://togithub.com/mudler/LocalAI/pull/1385)
- docs: site update fixing old image text / How To update updating GPU
and CPU docker pages by
[@&#8203;lunamidori5](https://togithub.com/lunamidori5) in
[https://github.com/mudler/LocalAI/pull/1399](https://togithub.com/mudler/LocalAI/pull/1399)
- feat: cuda transformers by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1401](https://togithub.com/mudler/LocalAI/pull/1401)
- feat(entrypoint): optionally prepare extra endpoints by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1405](https://togithub.com/mudler/LocalAI/pull/1405)
- ⬆️ Update ggerganov/whisper.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1390](https://togithub.com/mudler/LocalAI/pull/1390)
- ⬆️ Update mudler/go-piper by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1400](https://togithub.com/mudler/LocalAI/pull/1400)
- tests: add diffusers tests by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1419](https://togithub.com/mudler/LocalAI/pull/1419)
- ⬆️ Update ggerganov/whisper.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1418](https://togithub.com/mudler/LocalAI/pull/1418)
- How To Updates / Model Used Switched / Removed "docker-compose" (RIP)
by [@&#8203;lunamidori5](https://togithub.com/lunamidori5) in
[https://github.com/mudler/LocalAI/pull/1417](https://togithub.com/mudler/LocalAI/pull/1417)
- fix(transformers\*): add sentence-transformers and
transformers-musicgen tests, fix musicgen wrapper by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1420](https://togithub.com/mudler/LocalAI/pull/1420)
- extras: add vllm,bark,vall-e-x tests, bump diffusers by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1422](https://togithub.com/mudler/LocalAI/pull/1422)
- Documentation for Hipblas by
[@&#8203;sfxworks](https://togithub.com/sfxworks) in
[https://github.com/mudler/LocalAI/pull/1425](https://togithub.com/mudler/LocalAI/pull/1425)
- ⬆️ Update ggerganov/llama.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1391](https://togithub.com/mudler/LocalAI/pull/1391)
- docs: add aikit to integrations by
[@&#8203;sozercan](https://togithub.com/sozercan) in
[https://github.com/mudler/LocalAI/pull/1412](https://togithub.com/mudler/LocalAI/pull/1412)
- ⬆️ Update ggerganov/llama.cpp by
[@&#8203;localai-bot](https://togithub.com/localai-bot) in
[https://github.com/mudler/LocalAI/pull/1429](https://togithub.com/mudler/LocalAI/pull/1429)
- docs(mixtral): add mixtral example by
[@&#8203;mudler](https://togithub.com/mudler) in
[https://github.com/mudler/LocalAI/pull/1449](https://togithub.com/mudler/LocalAI/pull/1449)

#### New Contributors

- [@&#8203;sozercan](https://togithub.com/sozercan) made their first
contribution in
[https://github.com/mudler/LocalAI/pull/1412](https://togithub.com/mudler/LocalAI/pull/1412)

**Full Changelog**:
mudler/LocalAI@v2.0.0...v2.1.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 10pm on monday" in timezone
Europe/Amsterdam, Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Renovate
Bot](https://togithub.com/renovatebot/renovate).

truecharts-admin referenced this pull request in truecharts/public Dec 18, 2023
…1.0@f0b3afa by renovate (#16304)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [quay.io/go-skynet/local-ai](https://togithub.com/mudler/LocalAI) | minor | `v2.0.0-cublas-cuda12-ffmpeg` -> `v2.1.0-cublas-cuda12-ffmpeg` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency
Dashboard for more information.

Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat(exllama2): Add support to exllama2
1 participant