Could you please provide a simple script to use your multimodal model, like Hugging Face or other multimodal models? #14
Comments
Here is a simplified script if you do not need model parallel:

```python
import argparse

import torch

from models.cogvlm_model import CogVLMModel
from utils.language import llama2_tokenizer, llama2_text_processor_inference
from utils.vision import get_image_processor
from utils.chat import chat
from sat.model.mixins import CachedAutoregressiveMixin

# load model (single GPU, bf16 inference)
model, model_args = CogVLMModel.from_pretrained(
    "cogvlm-chat",
    args=argparse.Namespace(
        deepspeed=None,
        local_rank=0,
        rank=0,
        world_size=1,
        model_parallel_size=1,
        mode='inference',
        skip_init=True,
        fp16=False,
        bf16=True,
        use_gpu_initialization=True,
        device='cuda',
    ))
model = model.eval()

# tokenizer, image preprocessor, and text processor for chat-style inference
tokenizer = llama2_tokenizer("lmsys/vicuna-7b-v1.5", signal_type="chat")
image_processor = get_image_processor(model_args.eva_args["image_size"][0])
model.add_mixin('auto-regressive', CachedAutoregressiveMixin())
text_processor_infer = llama2_text_processor_inference(tokenizer, None, model.image_length)

# single-turn query over one image
with torch.no_grad():
    response, history, cache_image = chat(
        "fewshot-data/kobe.png",
        model,
        text_processor_infer,
        image_processor,
        "Describe the image.",
        history=[],
        max_length=2048,
        top_p=0.4,
        temperature=0.8,
        top_k=1,
        invalid_slices=text_processor_infer.invalid_slices,
        no_prompt=False
    )
print(response)
```
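As a usage note, `chat` returns the updated `history` and a cached image, which the repo's `cli_demo.py` feeds back in for multi-turn conversation. A sketch along those lines (the `image=` keyword is assumed to match your checkout of `utils/chat.py`):

```python
# Second turn: reuse the conversation history and the cached image tensor
# (pattern borrowed from cli_demo.py; `image=` is assumed to be the kwarg
# that skips re-encoding the picture in your version of utils/chat.py).
with torch.no_grad():
    response, history, cache_image = chat(
        "fewshot-data/kobe.png",
        model,
        text_processor_infer,
        image_processor,
        "What is the person in the image doing?",
        history=history,          # carry over the first turn
        image=cache_image,        # reuse the already-processed image
        max_length=2048,
        top_p=0.4,
        temperature=0.8,
        top_k=1,
        invalid_slices=text_processor_infer.invalid_slices,
        no_prompt=False
    )
print(response)
```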
Thank you very much. I will try it to caption images collected from the internet.
How should I change the script to run inference on multiple GPUs (2×4090)?
You can try simplifying it further if you think it is not simple enough.
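For the multi-GPU question above: a minimal sketch of splitting the weights with sat's model parallel, assuming the torchrun launch convention (argument names follow the script above; details may differ in your version of the repo):

```python
# Launch with: torchrun --standalone --nnodes=1 --nproc-per-node=2 infer.py
# Assumes sat reads rank/world_size from the torchrun environment variables.
import os
import argparse
from models.cogvlm_model import CogVLMModel

model, model_args = CogVLMModel.from_pretrained(
    "cogvlm-chat",
    args=argparse.Namespace(
        deepspeed=None,
        local_rank=int(os.environ.get("LOCAL_RANK", 0)),
        rank=int(os.environ.get("RANK", 0)),
        world_size=int(os.environ.get("WORLD_SIZE", 1)),
        model_parallel_size=2,  # split the weights across the 2 visible GPUs
        mode='inference',
        skip_init=True,
        fp16=False,
        bf16=True,
        use_gpu_initialization=True,
        device='cuda',
    ))
```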
It seems your CUDA driver is too old. Your PyTorch build must match the CUDA version installed on your machine.
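To confirm the mismatch, you can compare the CUDA version PyTorch was built against with what the driver supports (reported by `nvidia-smi`):

```python
# Quick sanity check: the CUDA toolkit version PyTorch was built with must
# be supported by your driver (compare with what `nvidia-smi` reports).
import torch
print(torch.__version__)          # PyTorch build
print(torch.version.cuda)         # CUDA toolkit version PyTorch was built against
print(torch.cuda.is_available())  # False often indicates a driver/toolkit mismatch
```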
Thanks a lot! I have fixed this problem. By the way, does CogVLM support multiple images as input?
FYI: #38
Can you provide a faster version, such as 4-bit/8-bit quantization or multi-GPU inference?
FYI: #75
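As a stopgap, 4-bit loading via bitsandbytes is a common pattern for Hugging Face checkpoints. This is only a sketch and assumes an HF-format CogVLM checkpoint exists; the checkpoint name below is an assumption, not something confirmed in this thread, and it does not apply to the sat checkpoint used above:

```python
# Sketch only: assumes an HF-format CogVLM checkpoint is available
# ("THUDM/cogvlm-chat-hf" is an assumption, not confirmed in this thread).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
```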
In this script, how do I choose which GPU the model is loaded on? I want to load all model parameters onto one GPU card so that I can caption multiple images using multiple GPUs (one model per card). However, I tried many settings for local_rank, rank, and device, but the parameters are still loaded on GPU 0. Can you provide some advice?
You should set device to the card you want, e.g. device='cuda:1'. Moreover, if you set your visible devices to a single card (e.g. CUDA_VISIBLE_DEVICES=1), the parameters can only be loaded on that card.
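Concretely, pinning each process to one card before any CUDA call is the usual way to run one caption worker per GPU (a sketch; the variable must be set before torch initializes CUDA):

```python
# Pin this process to a single card; must be set before CUDA is initialized.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # physical GPU 1

import torch
# Inside this process the selected card now appears as cuda:0.
print(torch.cuda.device_count())  # -> 1
```

Run one such process per GPU, each with a different CUDA_VISIBLE_DEVICES value, to caption images in parallel.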
Yes, you are right, respect!!!!
Thank you! Amazing results, and I hope to use your model to caption images!