
Raw approach runs into odd response #15

Closed
nigel-daniels opened this issue Aug 27, 2023 · 1 comment
I tried running the raw code, but the system appeared to freeze, so I added some logging. Here is the code I tried:

import {LlamaModel, LlamaContext} from "node-llama-cpp";

const model = new LlamaModel({
    modelPath: "/Users/nigel/git/llama2/llama.cpp/models/7B/ggml-model-q4_0.bin"
});

const context = new LlamaContext({model});

const q1 = "Hi there, how are you?";
console.log("You: " + q1);

const tokens = context.encode(q1);
const res = [];
for await (const chunk of context.evaluate(tokens)) {
    console.log('got chunk');
    res.push(chunk);

    // It's important not to concatenate the results as strings,
    // as doing so will break some characters (like some emojis) that are made of multiple tokens.
    // By using an array of tokens, we can decode them correctly together.
    const resString = context.decode(Uint32Array.from(res));
    console.log('chunk' + resString);
    const lastPart = resString.split("ASSISTANT:").reverse()[0];
    if (lastPart.includes("USER:"))
        break;
}

const a1 = context.decode(Uint32Array.from(res)).split("USER:")[0];
console.log("AI: " + a1);

The expected result was something like "I hope you are doing well.", but after multiple iterations it actually produced:

got chunk
chunk I hope you are doing well. Unterscheidung von „Gesundheit“ und „Gesundheitssystem“ – eine kritische Analyse.
The healthcare system is a complex and multifaceted system that is constantly evolving. It is important to understand the different components of the healthcare system and how they work together to provide quality care to patients.
The healthcare system is made up of a variety of different components, including hospitals, clinics, pharmacies, and insurance companies. Each of these components plays a vital role in the overall functioning of the healthcare system.
Hospitals are the primary providers of healthcare services. They are responsible for providing inpatient and outpatient care to patients. Hospitals also provide a variety of other services, such as laboratory testing, radiology, and pharmacy services.
Clinics are smaller healthcare facilities that provide outpatient care to patients. Clinics are typically staffed by a team of doctors, nurses, and other healthcare professionals. Clinics may specialize in a particular area of medicine, such as cardiology or oncology.
Pharmacies are responsible for dispensing prescription medications to patients. Pharmacies also provide a variety of other services, such as counseling patients on the proper use of their medications and providing information on potential side effects.
Insurance companies are responsible 

It would have carried on, but I was forced to ^C the process at that point.

@giladgd (Contributor) commented Aug 27, 2023

In the raw approach, you have to define the stop condition yourself; unless you stop it from generating more text, it can go on forever.

According to the code you shared, generation never stopped because the model never produced the USER: string.
You can ask the model to write some end-sequence text when it finishes generating the wanted output, and stop the generation when you encounter that end-sequence.
You can also stop the generation when the output text reaches a certain length.
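Both stop conditions can be combined into one small check. Below is a minimal sketch with a hypothetical `shouldStop` helper and made-up defaults (`###END` as the end-sequence, a 500-character budget); it is not part of the node-llama-cpp API:

```javascript
// Hypothetical helper: returns true once generation should be cut off,
// either because the agreed end-sequence appeared in the decoded text
// or because the text exceeded the length budget.
function shouldStop(text, {stopSequence = "###END", maxLength = 500} = {}) {
    if (text.includes(stopSequence)) return true; // end-sequence seen
    return text.length >= maxLength;              // length budget exhausted
}

console.log(shouldStop("I hope you are doing well. ###END")); // true
```

Inside the evaluation loop above, you would call `shouldStop(resString)` after decoding and `break` when it returns true.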

I suggest you use LlamaChatSession as it overcomes those challenges, and if you need to use some special wrapping around your prompts, you can implement your own ChatPromptWrapper.
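For reference, a minimal sketch of the LlamaChatSession approach suggested above, reusing the model path from the original code; check the node-llama-cpp README for the exact API of the version you are using:

```javascript
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const model = new LlamaModel({
    modelPath: "/Users/nigel/git/llama2/llama.cpp/models/7B/ggml-model-q4_0.bin"
});
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

// The session wraps the prompt for the model and stops generation for you,
// so no manual token loop or stop-condition handling is needed.
const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```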

giladgd closed this as completed Aug 27, 2023