wrong output on phi3 vision model #447

Closed
adrid opened this issue Jun 18, 2024 · 3 comments
Labels: bug, resolved


adrid commented Jun 18, 2024

Describe the bug
I've started the server:
cargo run --release --features cuda -- --port 1234 vision-plain -m microsoft/Phi-3-vision-128k-instruct -a phi3v

Then I ran the Python request example from:
https://github.com/EricLBuehler/mistral.rs/blob/master/docs/PHI3V.md

I'm getting:
The image shows a snow-covered mountain with a clear sky above and trees at the base. There appears to be a path or trail leading up the mountain, and some structures can be seen on the peak.

Which is correct.

But when I change the image URL to something else, for example https://onnxruntime.ai/images/coffee.png, the request runs for a very long time and then fails with an out-of-memory error:

"/home/adrian/miniconda3/lib/python3.11/site-packages/openai/_base_client.py", line 1020, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'message': 'DriverError(CUDA_ERROR_OUT_OF_MEMORY, "out of memory")', 'partial_response': {'id': '2', 'choices': [{'finish_reason': 'error', 'index': 0, 'message': {'content': "<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>...
...it continues

Another image: https://onnxruntime.ai/images/table.png

For some images generation stops, but the output is garbled:

The 00012, and0s12 - 00 2000 (   . Currently. There 121 and a
319 to a 0s - It - , the - . The  . ( [ [ 300 isution, in, and   In....9, the
  2odans, 1200 for 20 111,  working and0 years119. Ins10 a in

s [ . In, the Currently. The   on a the - .   and is a the primary, based, and in on - at a an, primarily.  , 20 (   for [ution.s ands for 11 role. It20s
., 02, working at 0101,  -.
 [ [ in for the the  .
0, and in .s, it399 before a a prior on  1.     , currently, ands... [
s 0 iss In....ia. Prior, it25. Currently, I  ' 0 and 
 [ [
    112. Currently, the .0 0 . It .0s is 1 .           , jobs0 to role
 [  .   and, thes - a3, a The9. and1,  working - 20 0.
[ 
0
     0 [, the20 ' '
   \ .   .  - prior, before 2s and . (100 1 .   , the5, prior, full. The1. Its -...utod  -s in0s [200s
3, current, the, and, and.
  

 2, 29, 0 .    10ed        [ is1^^ The^ [s11 for a. It  - a time. (, 10, I. in, it, it
, the

s.  [ and, the
...it continues

I can't get it to work on any image other than the one from the example.
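For anyone trying to reproduce this, the request can be varied per image with a small helper. This is a minimal sketch using only the Python standard library, assuming the server above is listening on port 1234 and exposes the OpenAI-compatible chat/completions route. The model name phi3v and the <|image_1|> prompt marker mirror the client code used elsewhere in this thread; the helper names build_payload and describe are hypothetical. The max_tokens cap only bounds how long a degenerate generation can run; whether it avoids the out-of-memory error is not established here.

```python
import json
import urllib.request

# Assumption: the server started above listens on this address.
BASE_URL = "http://localhost:1234/v1/"

# The two images reported as failing in this issue.
IMAGE_URLS = [
    "https://onnxruntime.ai/images/coffee.png",
    "https://onnxruntime.ai/images/table.png",
]


def build_payload(image_url, max_tokens=64):
    """Build a chat completion request body for a single image."""
    return {
        "model": "phi3v",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {
                        "type": "text",
                        "text": "<|image_1|>\nWhat is shown in this image?",
                    },
                ],
            }
        ],
        # Keep generation short so a bad run finishes quickly.
        "max_tokens": max_tokens,
    }


def describe(image_url, timeout=120):
    """POST one request and return the generated text.

    Raises urllib.error.HTTPError on a 500 response such as the one above.
    """
    request = urllib.request.Request(
        BASE_URL + "chat/completions",
        data=json.dumps(build_payload(image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling describe(url) for each entry in IMAGE_URLS, wrapped in a try/except, shows which images fail without waiting for a long generation.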

Latest commit
3a79137

adrid added the bug label Jun 18, 2024
adrid changed the title from "wrong output on phi3 vision model image" to "wrong output on phi3 vision model" Jun 18, 2024
EricLBuehler (Owner) commented

Hi @adrid! Thank you for raising this, I'll take a look.

EricLBuehler (Owner) commented

Hi @adrid! I have fixed this now in #459. After you run git pull and rebuild (for Rust) / reinstall (for Python), it should work.

When I run:

cargo run --release --features cuda -- --port 1234 --isq Q4K vision-plain -m microsoft/Phi-3-vision-128k-instruct -a phi3v

and then this Python client:

import openai

openai.api_key = "EMPTY"
openai.base_url = "http://localhost:1234/v1/"

completion = openai.chat.completions.create(
    model="phi3v",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://onnxruntime.ai/images/coffee.png"
                    },
                },
                {
                    "type": "text",
                    "text": "<|image_1|>\nWhat is shown in this image? Write a detailed response analyzing the scene.",
                },
            ],
        },
    ],
    max_tokens=256,
    frequency_penalty=1.0,
    top_p=0.1,
    temperature=0,
)
resp = completion.choices[0].message.content
print(resp)

It gives:

The image captures a moment of tranquility, featuring a white cup filled with coffee. The cup, which is the central focus of the image, is placed on a wooden surface that adds a rustic charm to the scene. 

The coffee inside the cup has been meticulously prepared into an intricate latte art design. This design is composed of three delicate leaves, each symmetrically arranged around a central point. The leaves are white in color, contrasting beautifully with the dark brown foam that forms their base. 

The image does not contain any discernible text or additional objects. The relative position of the objects is such that the cup is at the center, with its contents spread out around it. The wooden surface on which the cup rests provides a natural and warm backdrop to this inviting scene. 

This image exudes a sense of calm and enjoyment, as if inviting one to take a moment to appreciate a well-crafted cup of coffee. It's a simple yet captivating snapshot of everyday life.

If you could confirm that it works for you that would be great!
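When checking several images, a rough sanity check on the returned text can flag the two failure modes reported above (runs of <unk> and mostly non-word output). The heuristic below is an illustrative assumption, not part of mistral.rs; the function name looks_degenerate and the 0.6 threshold are hypothetical and may need tuning.

```python
import re

# A token counts as a "word" if it is alphabetic (apostrophes and hyphens
# allowed), optionally followed by one punctuation mark.
WORD_RE = re.compile(r"[A-Za-z][A-Za-z'-]*[.,;:!?]?")


def looks_degenerate(text, unk_token="<unk>", min_word_ratio=0.6):
    """Return True if the text is empty, contains <unk>, or is mostly non-words."""
    if not text.strip() or unk_token in text:
        return True
    tokens = text.split()
    words = [t for t in tokens if WORD_RE.fullmatch(t)]
    return len(words) / len(tokens) < min_word_ratio
```

Passing the resp string from the client above to looks_degenerate gives a quick pass/fail signal per image; a human read of the description is still needed to judge whether it is actually correct.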

adrid (Author) commented Jun 22, 2024

It works great now, @EricLBuehler! I've checked a couple more images and the results are now correct. Thank you for looking into this!

adrid closed this as completed Jun 22, 2024