Python cannot use custom ollama model #188

Open
BowenKwan opened this issue Jun 13, 2024 · 3 comments

@BowenKwan

I have created a custom model using ollama create custom_model -f modelfile. The custom model is based on codellama, and the modelfile provides some examples and context. In the CLI, the custom model ran well, giving answers based on the examples.

However, when I try to use the custom model in Python with ollama.chat(model='custom_model', ...), the same user prompt gives answers as if I were using the original codellama model with no training examples.

I have tried using a wrong model name (custommodel instead of custom_model), and it is detected that no such model exists, so I think Python is picking up the custom model; it just seems like the context is left out. Also, when I run ollama run custom_model, the context provided in the model file is always printed out. Does that mean the model is retrained from the original codellama model every time I run it, instead of being the same trained custom model every time?

It seems to me that the few-shot examples provided in the modelfile used to build custom_model are not passed to the custom model when using ollama.chat. The result looks just like using the base model that the custom model was built on.

Is there any way to use the custom model properly in Python, just like running it in the CLI?
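For reference, the setup described above amounts to roughly the following (a sketch; the prompt text is a placeholder, and the response is accessed as a dict, which is how the Python client returned results at the time of this thread):

import ollama

# The custom model was created beforehand from a modelfile that sets
# codellama as the base and adds the few-shot examples, e.g.:
#   ollama create custom_model -f modelfile
prompt = "Add the HLS pragmas to this loop ..."  # placeholder prompt

response = ollama.chat(
    model="custom_model",
    messages=[{"role": "user", "content": prompt}],
)
print(response["message"]["content"])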

@mxyng
Collaborator

mxyng commented Jun 13, 2024

Can you provide the content (or some representative sample) of your modelfile?

I suspect this is due to a behavioural difference between the CLI and the API. In the CLI, a model containing MESSAGE entries will have those messages prepended to user inputs. The API, and therefore this library, does not do this.
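A sketch of one way to replicate the CLI behaviour from Python, given that explanation: fetch whatever the model stores and prepend it yourself. Depending on the Ollama version, ollama.show() may expose the MESSAGE turns (e.g. under a messages key); if it does not, the same turns can be copied out of the modelfile by hand.

import ollama

# Ask the server what the custom model contains. On newer Ollama versions the
# response may include the MESSAGE turns; fall back to an empty list otherwise.
show = ollama.show("custom_model")
stored_messages = show.get("messages") or []

# Prepend the stored turns to the actual user prompt, mirroring what the CLI
# does for a model whose modelfile contains MESSAGE entries.
user_turn = {"role": "user", "content": "Add the HLS pragmas to this loop ..."}  # placeholder
response = ollama.chat(model="custom_model", messages=list(stored_messages) + [user_turn])
print(response["message"]["content"])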

@BowenKwan
Author

I am giving some prompts in the model file to teach codellama to write high-level synthesis (HLS) pragma code. Here is an example of the prompt, which basically asks the model to add the highlighted lines to the code.
[Screenshot: example few-shot prompt from the modelfile]

If no examples are provided, codellama returns a very C-like syntax.

After feeding it a few examples via the CLI, it can return the correct HLS code.

The same examples are used in the model file to create a new custom model.

When using the custom model in the CLI, I can see all the user and assistant prompts being loaded first, and the model returns the correct HLS syntax.

However, when I try to use the custom model in Python,
response = ollama.chat(model="custom_model", messages=[{"role": "user", "content": FEW_SHOT_PROMPT_USER}])

it returns the C-like syntax as if none of the HLS examples were given. The result is similar to running the original codellama model instead of the custom model.

When using the codellama model in Python, with the examples given explicitly,
response = ollama.chat(model='codellama:70b', messages=[{"role": "user", "content": FEW_SHOT_PROMPT_1}, {"role": "assistant", "content": FEW_SHOT_ANSWER_1}, {"role": "user", "content": FEW_SHOT_PROMPT_2}, {"role": "assistant", "content": FEW_SHOT_ANSWER_2}, {"role": "user", "content": FEW_SHOT_PROMPT_USER}])
the model can also return the correct HLS syntax.

It appears to me that the prompts declared in the model file are only loaded when the custom model is run from the CLI, but not included when it is called from Python. Is there any way to call the custom model so that the prompts in the model file are also passed?
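As a workaround for the question above, and assuming (per the maintainer's comment) that the library does not prepend the MESSAGE turns itself, a small hypothetical helper can bundle the few-shot examples with every call so Python behaves like the CLI:

import ollama

# Placeholder strings standing in for the few-shot prompts/answers from the modelfile.
FEW_SHOT_PROMPT_1 = "..."
FEW_SHOT_ANSWER_1 = "..."

# The same examples that the modelfile declares with MESSAGE, written out
# explicitly so the API sees them on every request.
FEW_SHOT_MESSAGES = [
    {"role": "user", "content": FEW_SHOT_PROMPT_1},
    {"role": "assistant", "content": FEW_SHOT_ANSWER_1},
]

def chat_with_examples(user_prompt, model="custom_model"):
    # Hypothetical helper: prepend the examples to the user's prompt,
    # mirroring what ollama run does in the CLI.
    messages = FEW_SHOT_MESSAGES + [{"role": "user", "content": user_prompt}]
    return ollama.chat(model=model, messages=messages)

response = chat_with_examples("Add the HLS pragmas to this loop ...")  # placeholder prompt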

Another question: it seems like creating a new model does not actually "make a new model", but rather pre-loads the same prompts into the base model. So when we load the custom model, does it just learn the prompts again on the spot? If so, is there any way to save a custom model after learning? I would assume that even given the same base model and the same prompts, some resulting models would perform better than others.

Thank you very much for your help.

@BowenKwan
Author

BowenKwan commented Jun 27, 2024

Also, I am wondering whether the ollama.chat calls are independent.

I repeat the exact same ollama.chat call multiple times in a Python script, yet each time it gives a different (wrong) result. The quality of the answer improves until the correct answer is given on the 4th call.

For example,

response = ollama.chat(model='codellama:70b', messages ... )
response = ollama.chat(model='codellama:70b', messages ... )
response = ollama.chat(model='codellama:70b', messages ... )
response = ollama.chat(model='codellama:70b', messages ... )

With the exact same function call and messages, the outputs are
response = a1
response = a2
response = a3
response = a4
respectively (and a4 is the correct one).

If I change the variable name from a to b and rerun the script, the answer returned is not b4 right away, but goes through the cycle b1, b2, b3, b4. And if I change the variable back to a, I do not get a4 directly either; I need to go through the a1-to-a4 cycle again before the correct answer is returned.

Are the individual ollama.chat calls independent, or is knowledge from one call passed to the next? If they are independent, why does the answer seem to go through the same pattern before converging on the correct one? And if knowledge is retained, why does switching a variable name and switching it back seem to wipe all the memory of the system?
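One way to check this is to pin the sampling, assuming the client's options parameter accepts the usual Ollama sampling options: with a fixed temperature and seed, repeated identical calls should return the same text, which would suggest the variation comes from sampling rather than from state carried between calls (a sketch; option values are illustrative).

import ollama

messages = [{"role": "user", "content": "Add the HLS pragmas to this loop ..."}]  # placeholder

for i in range(4):
    # Identical, stateless requests: nothing from a previous call is passed in
    # unless it is added to messages explicitly.
    response = ollama.chat(
        model="codellama:70b",
        messages=messages,
        options={"temperature": 0, "seed": 42},
    )
    print(i, response["message"]["content"][:80])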
