GenerationResponse final_data passes None creating NaN values #12

Open
calebsheridan opened this issue Apr 25, 2024 · 3 comments

@calebsheridan (Contributor)
The results metadata shows NaN values. The root cause is in how responses from the server are handled:

  • Debug 1 shows final_data with expected values
  • Debug 2 shows final_data with None value

As seen in UI:

Created at: Thu, 25 Apr 2024 06:14:54 GMT
Prompt Eval Count: tokens
Prompt Eval Duration: NaNh, NaNm, NaNs
Eval Count: tokens
Eval Duration: NaNh, NaNm, NaNs
Inference Duration (prompt + eval): NaNh, NaNm, NaNs
Total Duration: NaNh, NaNm, NaNs
Throughput (tokens/total_duration): NaN tokens/s
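The NaN strings appear when the UI formats durations that were never received. A minimal sketch (this helper is hypothetical, not code from the repo) of formatting an optional nanosecond duration defensively, so a missing value renders as "n/a" instead of NaN:

```rust
// Sketch (assumed helper, not from the repo): format an optional
// nanosecond duration so a missing value never produces NaN output.
fn format_duration_ns(ns: Option<u64>) -> String {
    match ns {
        None => "n/a".to_string(),
        Some(ns) => {
            let total_secs = ns / 1_000_000_000;
            let h = total_secs / 3600;
            let m = (total_secs % 3600) / 60;
            let s = total_secs % 60;
            format!("{h}h, {m}m, {s}s")
        }
    }
}
```

For example, the `total_duration: 7342334375` from Debug 1 below would format as "0h, 0m, 7s", while the missing durations from Debug 2 would format as "n/a".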

Debug 1:

[src/commands.rs:185:5] &res = Ok(
    GenerationResponse {
        model: "phi3:instruct",
        created_at: "2024-04-25T06:14:53.635775Z",
        response: "The sun sets gently, painting the sky with hues of pink and orange.",
        done: true,
        final_data: Some(
            GenerationFinalResponseData {
                context: GenerationContext(
                    [
                        32010,
                        13,
                        6113,
                        263,
                        3273,
                        10541,
                        29991,
                        32007,
                        13,
                        32001,
                        13,
                        1576,
                        6575,
                        6166,
                        330,
                        2705,
                        29892,
                        20413,
                        278,
                        14744,
                        411,
                        298,
                        1041,
                        310,
                        282,
                        682,
                        322,
                        24841,
                        29889,
                        32007,
                        13,
                    ],
                ),
                total_duration: 7342334375,
                prompt_eval_count: 12,
                prompt_eval_duration: 182371000,
                eval_count: 19,
                eval_duration: 417887000,
            },
        ),
    },
)

Debug 2:

[src/commands.rs:185:5] &res = Ok(
    GenerationResponse {
        model: "phi3:instruct",
        created_at: "2024-04-25T06:14:54.062179Z",
        response: "The sun sets gently, painting the sky with hues of pink and orange.",
        done: true,
        final_data: None,
    },
)
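Since Debug 1 and Debug 2 show the same prompt producing one response with `final_data: Some(..)` and another with `final_data: None`, one client-side mitigation would be to keep only a response that actually carries final data. A sketch, using minimal stand-in types (assumed shapes, not the real ollama-rs structs):

```rust
// Minimal stand-ins for the ollama-rs types (assumed shapes, for illustration).
#[derive(Debug)]
struct FinalData {
    total_duration: u64,
}

#[derive(Debug)]
struct GenerationResponse {
    done: bool,
    final_data: Option<FinalData>,
}

// Mitigation sketch: when multiple responses arrive for a non-streaming
// request, keep only the first one that is done AND carries final_data.
fn pick_complete(responses: Vec<GenerationResponse>) -> Option<GenerationResponse> {
    responses
        .into_iter()
        .find(|r| r.done && r.final_data.is_some())
}
```

This papers over the symptom rather than fixing the server behavior, but it would stop `None` from reaching the metadata display.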
@dezoito (Owner)

dezoito commented Apr 25, 2024

Yes, I've documented this issue with Ollama-rs (which is what the "backend" uses to interact with the ollama server):

pepperoni21/ollama-rs#27

Like you said, it looks like an Ollama server issue.

The `keep_alive` workaround I mentioned merely mitigates the issue, but I don't see a way to fix it from the client.

I would appreciate any suggestions.

@calebsheridan (Contributor, Author)

We get back multiple responses, even though stream is set to false. Is there a way to attach an id to each request, so that we can ignore any invalid responses associated with that id?
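The request-id idea could be sketched like this (purely hypothetical, not implemented in the repo): tag each generation request with a monotonically increasing id, and discard any response whose id is no longer the latest.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sketch of the request-id idea: each new generation request
// bumps a global counter, and only responses matching the latest id are kept.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);
static LATEST_ID: AtomicU64 = AtomicU64::new(0);

/// Allocate a fresh request id and mark it as the current one.
fn start_request() -> u64 {
    let id = NEXT_ID.fetch_add(1, Ordering::SeqCst);
    LATEST_ID.store(id, Ordering::SeqCst);
    id
}

/// A response is only accepted if its request id is still the latest.
fn is_current(id: u64) -> bool {
    LATEST_ID.load(Ordering::SeqCst) == id
}
```

Note this only filters stale responses from superseded requests; it would not, by itself, distinguish a valid from an invalid duplicate response to the same request.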

@dezoito (Owner)

dezoito commented Apr 25, 2024

That's surprising to me.

Let me try to get responses without using ollama-rs (I'll just use Rust's reqwest crate, like I did to get the Ollama version).

This may take some time, as I am quite busy, but I think it would be faster and simpler than the request-id option.
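Bypassing ollama-rs could look like the sketch below: build the non-streaming `/api/generate` JSON body by hand and POST it with reqwest. The endpoint, model name, and prompt are illustrative; only the body-building helper is shown as runnable code, the network call is an untested outline.

```rust
// Sketch: construct the JSON body for a non-streaming Ollama
// /api/generate request by hand (no external crates needed).
fn generate_body(model: &str, prompt: &str) -> String {
    format!(r#"{{"model":"{model}","prompt":"{prompt}","stream":false}}"#)
}

// The actual call might look like this (requires the reqwest crate with
// the "blocking" feature enabled; untested outline):
//
//     let resp = reqwest::blocking::Client::new()
//         .post("http://localhost:11434/api/generate")
//         .header("Content-Type", "application/json")
//         .body(generate_body("phi3:instruct", "Write a short sentence!"))
//         .send()?
//         .text()?;
```

Comparing the raw response here against what ollama-rs returns would show whether the spurious second response originates from the server or from the client library.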
