Add deepseek vl #1335

Merged
merged 14 commits into from
Apr 2, 2024

AllentDan
Collaborator

lvhan028 requested a review from irexyc on March 25, 2024, 07:33
lvhan028 added the "enhancement" (New feature or request) label on Mar 25, 2024
lmdeploy/model.py (outdated review thread, resolved)
Collaborator

irexyc commented Mar 28, 2024

@zhoujh1113

Please make the following two changes and check whether it still crashes or hangs. cache_max_entry_count can also be set to a smaller value.

backend_config=TurbomindEngineConfig(tp=2, session_len=8192, cache_max_entry_count=0.5)

def __init__(self, model_path, device='cuda'):

Change the default device here to cuda:0.

https://github.com/InternLM/lmdeploy/blob/c9b61e354b473de5e3d7c319aa3f053ef9bd54f3/lmdeploy/vl/engine.py#L101C2-L108C23
Change this part to:

with torch.device('cuda:0'):
    time_start = time.perf_counter()
    outputs = self.model.forward(inputs)
    time_end = time.perf_counter()
    logger.info(f'ImageEncoder forward {len(inputs)} images, '
                f'cost {time_end - time_start:.3f}s')
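The timing pattern in the suggestion above could be factored into a small helper. This is only a sketch using the standard library: `log_elapsed` and its `log` parameter are hypothetical names, not part of lmdeploy, and the body of the `with` block stands in for the real `self.model.forward(inputs)` call.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def log_elapsed(label, log=logger.info):
    """Report how long the enclosed block took, in seconds.

    `log` is any callable that accepts one string (hypothetical parameter,
    defaults to this module's logger).
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so slow failures are still reported.
        elapsed = time.perf_counter() - start
        log(f'{label} cost {elapsed:.3f}s')


# Stand-in workload; in the PR this would be the image encoder forward pass.
messages = []
with log_elapsed('ImageEncoder forward 2 images,', log=messages.append):
    total = sum(range(1000))
```

Passing `messages.append` as `log` collects the report in a list, which makes the helper easy to check without configuring logging.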

Comment on lines 33 to 35
with torch.device('cpu'):
    model = AutoModelForCausalLM.from_pretrained(
        self.model_path, trust_remote_code=True)
Collaborator

Consider using init_empty_weights to accelerate loading.

Collaborator Author

I tried that, but the model's output seemed to be wrong.

Collaborator

Accelerating model loading is very important. Please investigate.
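For reference, the init_empty_weights approach discussed in this thread usually looks like the sketch below. It assumes the `accelerate` and `transformers` packages are installed; `build_empty_model` is a hypothetical helper name and this is not the code that was merged in the PR. The parameters created this way live on the meta device and hold no data, so the real checkpoint weights still have to be loaded before the model can produce meaningful output.

```python
def build_empty_model(model_path):
    """Instantiate the model skeleton without allocating real weights.

    Hypothetical helper for illustration. Imports are local so that this
    module can be imported even when the optional packages are missing.
    """
    from accelerate import init_empty_weights
    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
    with init_empty_weights():
        # Parameters are placed on the 'meta' device: shapes only, no storage.
        model = AutoModelForCausalLM.from_config(
            config, trust_remote_code=True)
    return model
```

After this step, `all(p.is_meta for p in model.parameters())` should hold, which is a quick way to confirm that no weights have been materialized yet.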

Collaborator

lvhan028 commented Apr 2, 2024

ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library.

@AllentDan
Collaborator Author

Which versions of torch and torchvision are you using?

torch 2.1.2+cu118
torchvision 0.16.2
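A quick way to collect this information when reporting such an error is to read the installed package metadata. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and it reports what is installed without diagnosing whether the two builds are compatible.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=('torch', 'torchvision')):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == '__main__':
    for name, ver in installed_versions().items():
        print(f'{name} {ver or "not installed"}')
```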

lvhan028 merged commit 9b8ebc1 into InternLM:main on Apr 2, 2024
3 of 5 checks passed
Labels
enhancement New feature or request
3 participants