
[Feature Request]: VRAM optimization for older consumer high-end GPUs (e.g., the 2080 Ti with 11 GB of video memory) #6424

@1433132084

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Is your feature request related to a problem?

On a 2080 Ti with 11 GB of VRAM, loading deepseek-r1:14b already uses about 9 GB. Once the knowledge base's embedding model is also loaded, video memory is nearly full, and it is no longer possible to chat with the bot or perform other operations.

Describe the feature you'd like

Could the embedding model and the inference model be kept in system RAM and moved into video memory only when needed for computation? This would be a good approach for consumer users with small GPU memory but large system memory.
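The swapping idea above can be sketched in pure Python. This is only an illustration of the eviction logic, not Dify's implementation: the `Model` class, the model names (`bge-m3` for the embedder is a hypothetical placeholder; the request does not name the embedding model), and the `device` field are stand-ins for real framework calls such as PyTorch's `module.to("cuda")` / `module.to("cpu")`.

```python
class Model:
    """Hypothetical stand-in for a loaded model with a known VRAM footprint."""
    def __init__(self, name, size_gb):
        self.name = name
        self.size_gb = size_gb
        self.device = "ram"          # models start in system RAM

class VramManager:
    """Keep at most `budget_gb` worth of models resident in VRAM,
    evicting the least-recently-used model back to system RAM."""
    def __init__(self, budget_gb):
        self.budget_gb = budget_gb
        self.resident = []           # models in VRAM, oldest first

    def used_gb(self):
        return sum(m.size_gb for m in self.resident)

    def acquire(self, model):
        # Evict least-recently-used models until the requested one fits.
        while model not in self.resident and \
                self.used_gb() + model.size_gb > self.budget_gb:
            evicted = self.resident.pop(0)
            evicted.device = "ram"   # real code: evicted.to("cpu")
        if model not in self.resident:
            model.device = "vram"    # real code: model.to("cuda")
            self.resident.append(model)
        else:                        # mark as most recently used
            self.resident.remove(model)
            self.resident.append(model)
        return model

# Scenario from this report: an 11 GB card, a ~9 GB LLM, plus an embedder.
llm = Model("deepseek-r1:14b", 9.0)
embedder = Model("bge-m3", 2.5)      # name and size are assumptions
mgr = VramManager(budget_gb=11.0)

mgr.acquire(llm)                     # chat: LLM occupies VRAM
mgr.acquire(embedder)                # 9.0 + 2.5 > 11, so the LLM is
                                     # evicted to RAM before embedding
```

In real deployments this trade-off already exists in some runtimes (e.g., Ollama unloads idle models after a timeout); the request here is for on-demand swapping between system RAM and VRAM so both models remain "warm" without exceeding the 11 GB budget.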

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response


Labels

💞 feature — Feature request, pull request that fulfills a new feature.
