Better inference based on starcoder2-3b model #13

Open
HeroSong666 opened this issue Mar 12, 2024 · 1 comment

@HeroSong666

I am new to StarCoder2.

When I run the following demo:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# local checkpoint directory for bigcode/starcoder2-3b
checkpoint = "./starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)

inputs = tokenizer.encode("def is_prime(n):", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

It returns:

def is_prime():
    """
    This function checks if a number is prime or not.
    """

It doesn't finish, so I set max_length=120, and then it returns:

def is_prime():
    """
    This function checks if a number is prime or not.
    """
    num = int(input("Enter a number: "))
    if num > 1:
        for i in range(2, num):
            if (num % i) == 0:
                print(num, "is not a prime number")
                break
        else:
            print(num, "is a prime number")
    else:
        print(num, "is not a prime number")


is_prime()
<file_sep>/README.md
# Python-

The part

is_prime()
<file_sep>/README.md
# Python-

is redundant. Now my solution is:

generated_code = tokenizer.decode(outputs[0])
# keep only the text before the first <file_sep> marker
if "<file_sep>" in generated_code:
    generated_code = generated_code.split("<file_sep>")[0]
print(generated_code)

But I don't think that is a good idea. I want the model to return the result in one go, without generating redundant parts. How can I do that? Could you give me some advice?
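
One way to get this in a single call, assuming <file_sep> is a special token in the StarCoder2 tokenizer (it appears in the generated output as a single marker), is to pass its id as an additional end-of-sequence id so that generate stops as soon as it is produced. A minimal sketch:

# sketch: treat <file_sep> as an extra end-of-sequence token,
# assuming it maps to a single id in the StarCoder2 vocabulary
file_sep_id = tokenizer.convert_tokens_to_ids("<file_sep>")

outputs = model.generate(
    inputs,
    max_new_tokens=120,
    eos_token_id=[tokenizer.eos_token_id, file_sep_id],
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With skip_special_tokens=True the marker itself is also dropped from the decoded text, so no post-hoc string splitting is needed.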

@HeroSong666 (Author)

Alternatively, I noticed that on https://huggingface.co/bigcode/starcoder2-3b the Inference API can generate code piece by piece, each time I press Compute. How can I implement such functionality?
(For example, in Python: every time I send a request, the model returns a portion of the result. The next time I send a request, it continues from the previous prompt plus the previously returned result. In this way, the code can be completed step by step without creating redundant parts.)
Many thanks for your advice!
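
A minimal sketch of that step-by-step pattern, assuming the same tokenizer and model as in the first comment (the chunk size of 32 tokens and the round limit of 5 are illustrative choices, not part of the thread):

# sketch: extend the completion a fixed number of tokens per round,
# feeding prompt + everything generated so far back in each round
prompt = "def is_prime(n):"
file_sep_id = tokenizer.convert_tokens_to_ids("<file_sep>")  # assumed special token

for _ in range(5):  # up to 5 rounds of 32 new tokens each
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=32,
        eos_token_id=[tokenizer.eos_token_id, file_sep_id],
    )
    # the decoded output becomes the prompt for the next round
    prompt = tokenizer.decode(outputs[0], skip_special_tokens=True)
    if outputs[0][-1].item() in (tokenizer.eos_token_id, file_sep_id):
        break  # the model finished the snippet on its own

print(prompt)

Each round re-encodes the full text so far, so this trades extra compute for the ability to inspect or stop the completion between chunks, much like pressing Compute repeatedly in the Inference API widget.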
