
Feature request: Docker Image #59

Closed
api-haus opened this issue Mar 29, 2023 · 16 comments

@api-haus

No description provided.

@bstadt
Contributor

bstadt commented Mar 29, 2023

We don't have any plans to do this at the moment.

@mudler
Contributor

mudler commented Mar 30, 2023

Maybe this can help you; I've added support for gpt4all too: https://github.com/go-skynet/llama-cli

@faroukellouze

@mudler docker: Error response from daemon: unknown: Tag v0.3 was deleted or has expired.

@mudler
Contributor

mudler commented Mar 30, 2023

> @mudler docker: Error response from daemon: unknown: Tag v0.3 was deleted or has expired.

Use latest, going to tag a new release soon.
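
For example (a minimal sketch; the registry path below is an assumption, so check the llama-cli README for the exact image name):

```bash
# pull by the `latest` tag instead of the expired v0.3 tag
# (image path is an assumption; use whatever the llama-cli README documents)
docker pull quay.io/go-skynet/llama-cli:latest
```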

@BenMcLean

I would like to try this as a Docker image, ideally paired with some kind of web interface.

@iQuickDev

Would love this feature; it would allow the project to be run easily on any machine without any hassle.

@BenMcLean

Upon further research into this, it appears that the llama-cli project is already capable of bundling gpt4all into a docker image with a CLI and that may be why this issue is closed so as to not re-invent the wheel.

However, I'm not seeing a docker-compose for it, nor good instructions for less experienced users to try it out.

I'm also a bit nervous about hardware requirements. It isn't made very clear what you really need to run this in terms of hardware. If I store the model on an HDD, would it be bad for the long term health of the HDD? Lots of questions like that need answering.

@iQuickDev

> Upon further research into this, it appears that the llama-cli project is already capable of bundling gpt4all into a docker image with a CLI and that may be why this issue is closed so as to not re-invent the wheel.
>
> However, I'm not seeing a docker-compose for it, nor good instructions for less experienced users to try it out.
>
> I'm also a bit nervous about hardware requirements. It isn't made very clear what you really need to run this in terms of hardware. If I store the model on an HDD, would it be bad for the long term health of the HDD? Lots of questions like that need answering.

HDDs do not deteriorate with write cycles the way SSDs do, so even if it does lots of writes there is no problem.

@BenMcLean

> HDDs do not deteriorate with write cycles the way SSDs do, so even if it does lots of writes there is no problem.

I guess maybe I shouldn't be using this issue as a forum, but I am curious why the model would be doing lots of writes, or any writes at all. If I'm just trying to run the model to generate text and am not working on tuning it, then the program's access to the model data should be read-only, at least in theory, right?

@iQuickDev

> HDDs do not deteriorate with write cycles the way SSDs do, so even if it does lots of writes there is no problem.
>
> I guess maybe I shouldn't be using this issue as a forum, but I am curious why the model would be doing lots of writes, or any writes at all. If I'm just trying to run the model to generate text and am not working on tuning it, then the program's access to the model data should be read-only, at least in theory, right?

Yes, it will only be reads. I mentioned writes because you were asking whether the HDD's health would deteriorate in the long term; I don't think it will.

@mudler
Contributor

mudler commented Apr 4, 2023

> Upon further research into this, it appears that the llama-cli project is already capable of bundling gpt4all into a docker image with a CLI and that may be why this issue is closed so as to not re-invent the wheel.
>
> However, I'm not seeing a docker-compose for it, nor good instructions for less experienced users to try it out.

Care to open an issue on llama-cli? We can tackle it from there.

Edit: I'm not bundling the model in the image due to #75
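
In the meantime, something along these lines is roughly what a compose file would wrap (a sketch only: the image path, subcommand, flag names, and port are assumptions, so check the llama-cli README for the real ones):

```bash
# run llama-cli with the model mounted from the host instead of baked into the image
# (image path, `api` subcommand, --model flag, and port are assumptions; see the project README)
docker run --rm -p 8080:8080 \
  -v /path/to/models:/models \
  quay.io/go-skynet/llama-cli:latest \
  api --model /models/gpt4all.bin
```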

@BenMcLean

I just had another option recommended to me on Discord: Serge provides a Docker image with a web interface. No offense, but it seems closer to what I had in mind for the specific goofy nonsense I'm just playing around with than llama-cli, but thanks anyway.

Also, it absolutely makes sense to not bundle the actual model with the application now that I think about it. That would be like bundling movies with the Jellyfin Docker image.

@mudler
Contributor

mudler commented Apr 4, 2023

> I just had another option recommended to me on Discord: Serge provides a Docker image with a web interface. No offense, but it seems closer to what I had in mind for the specific goofy nonsense I'm just playing around with than llama-cli, but thanks anyway.

Sure thing! I'm happy you found your way around it! llama-cli is more suitable if you need to, e.g., embed it in some application, as it provides just a raw RESTful API and a simple web page to interact with as a playground. It's by no means UX friendly, but rather developer friendly.

The main difference with Serge is that llama-cli doesn't shell out to a separate process as Serge does, but rather uses llama.cpp straight from the C++ code and as such keeps the model in memory between requests. That makes it faster for iterating.
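
For illustration, a raw request against that playground API might look roughly like this (the endpoint path and JSON fields are assumptions, not the documented interface; see the README):

```bash
# hypothetical request shape; endpoint and field names are assumptions
curl -s http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "What is the capital of France?", "temperature": 0.7}'
```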

> Also, it absolutely makes sense to not bundle the actual model with the application now that I think about it. That would be like bundling movies with the Jellyfin Docker image.

👍

@BenMcLean

BenMcLean commented Apr 4, 2023

> The main difference with Serge is that llama-cli doesn't shell out to a separate process as Serge does, but rather uses llama.cpp straight from the C++ code and as such keeps the model in memory between requests. That makes it faster for iterating.

Seems like the best of both worlds would be to bundle llama-cli with an optional chat-style web interface.

To slightly extend my metaphor of different models for the app being like different movies for a media player, it seems like it would be nice for these applications not to be restricted to just one model per app. Requests could go to the same app for any number of models, specifying which one as part of the request.

@mudler
Contributor

mudler commented Apr 4, 2023

> The main difference with Serge is that llama-cli doesn't shell out to a separate process as Serge does, but rather uses llama.cpp straight from the C++ code and as such keeps the model in memory between requests. That makes it faster for iterating.
>
> Seems like the best of both worlds would be to bundle llama-cli with an optional chat-style web interface.
>
> To slightly extend my metaphor of different models for the app being like different movies for a media player, it seems like it would be nice for these applications not to be restricted to just one model per app. Requests could go to the same app for any number of models, specifying which one as part of the request.

Very good points. I'd like to iterate on this; I'm not a frontend developer, but I guess it shouldn't be too hard.

Re models: correct, that's where I'd like to go too in the long run, although reloading models comes with the price of loading them back into RAM each time they're instantiated.
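
Purely as a sketch of what per-request model selection could look like (a hypothetical request shape, not an existing endpoint; the field names are assumptions):

```bash
# hypothetical: a "model" field selects which loaded model serves the request
curl -s http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt4all", "text": "Write a haiku about containers."}'
```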

@BenMcLean

BenMcLean commented Apr 4, 2023

> Re models: correct, that's where I'd like to go too in the long run, although reloading models comes with the price of loading them back into RAM each time they're instantiated.

Well, I think the only real restriction there would be how much RAM you have. Like, maybe given how much RAM you have, you'd have a choice of running two small models or one big one. Maybe different models could be started or stopped with different settings with respect to RAM vs. storage.
