
Add phi3-mini #1461

Open
kyakuno opened this issue Apr 24, 2024 · 12 comments

kyakuno commented Apr 24, 2024

Microsoft's mini-sized LLM:
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct


kyakuno commented Apr 24, 2024

An official ONNX release may be coming:
https://onnxruntime.ai/blogs/accelerating-phi-3


kyakuno commented Apr 24, 2024

Official ONNX models have now been released:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx
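One way to fetch just the int4 CPU variant locally (a sketch assuming the huggingface_hub package; the allow_patterns filter matches the repository's directory layout):

from huggingface_hub import snapshot_download

# Download only the int4 CPU files instead of the whole repository.
local_dir = snapshot_download(
    "microsoft/Phi-3-mini-128k-instruct-onnx",
    allow_patterns=["cpu_and_mobile/cpu-int4-rtn-block-32/*"],
)
print(local_dir)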


kyakuno commented Apr 24, 2024

The generate API needs to be written by hand in Python.
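At its core that is just a greedy next-token loop. A minimal sketch, where forward is a hypothetical stand-in for an ONNX Runtime session.run() call that maps the current token ids to logits:

import numpy as np

def greedy_generate(forward, input_ids, eos_token_id, max_new_tokens=128):
    # forward: (1, seq_len) int64 ids -> (1, seq_len, vocab_size) logits
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = forward(np.array([ids], dtype=np.int64))
        next_id = int(np.argmax(logits[0, -1]))  # greedy pick of the last position
        ids.append(next_id)
        if next_id == eos_token_id:
            break
    return ids

(A real implementation would also feed the KV cache back in rather than re-running the full sequence each step.)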


kyakuno commented Apr 24, 2024

Example inference code:
microsoft/onnxruntime#20448


kyakuno commented Apr 24, 2024

With the beta version of onnxruntime (the onnxruntime-genai package), the following works:

import time

import onnxruntime_genai as og

# int4 CPU model from the Hugging Face repo above (forward slashes avoid
# Python treating "\P" etc. in the original Windows path as escape sequences).
model = og.Model("./Phi-3-mini-128k-instruct-onnx/cpu_and_mobile/cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()


def input_llm(text):
    print("Question:", text)
    input_tokens = tokenizer.encode(text)
    params = og.GeneratorParams(model)
    params.try_use_cuda_graph_with_max_batch_size(1)
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    return generator


def output_llm(generator):
    print("Answer:")
    stt = time.time()
    list_error = []
    list_sentence = []
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        new_token = generator.get_next_tokens()[0]
        if new_token not in list_error:
            try:
                list_sentence.append(tokenizer_stream.decode(new_token))
            except Exception:
                # The streaming decoder can fail on tokens that are part of a
                # multi-byte sequence; remember the raw id and keep going.
                list_error.append(new_token)
                list_sentence.append(new_token)
    print(list_sentence)
    fin = time.time()
    print(fin - stt)
    return list_error
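A hypothetical call, wrapping the question in Phi-3's chat template (the <|user|> / <|end|> / <|assistant|> tags):

# Assumes the model directory above has been downloaded.
prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
generator = input_llm(prompt)
errors = output_llm(generator)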


kyakuno commented Apr 24, 2024

Source code for onnxruntime-genai:
https://github.com/microsoft/onnxruntime-genai


kyakuno commented Apr 24, 2024

generate is written in C++, so it seems better to port the PyTorch-side implementation instead.
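As a rough sketch of what that PyTorch-side loop looks like (a much-simplified version of what transformers' generate() does internally, reusing the KV cache; hypothetical, not the actual onnxruntime-genai code):

import torch

@torch.no_grad()
def generate(model, tokenizer, prompt, max_new_tokens=128):
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    out_ids = ids
    cur = ids
    past = None
    for _ in range(max_new_tokens):
        out = model(input_ids=cur, past_key_values=past, use_cache=True)
        past = out.past_key_values  # reuse the KV cache across steps
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)  # greedy
        if next_id.item() == tokenizer.eos_token_id:
            break
        out_ids = torch.cat([out_ids, next_id], dim=-1)
        cur = next_id  # only the new token is fed in on the next step
    return tokenizer.decode(out_ids[0], skip_special_tokens=True)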


kyakuno commented Apr 24, 2024

For now, using transformers for the tokenizer seems like the way to go:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

# The Phi-3 checkpoint shipped custom model code at release, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct


kyakuno commented Apr 24, 2024

It looks like a standard SentencePiece tokenizer.
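A quick check of that (assuming transformers is installed): SentencePiece-style pieces carry the U+2581 "▁" word-boundary marker:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")
print(tok.tokenize("Hello world"))  # e.g. ['▁Hello', '▁world']
print(tok.encode("Hello world"))    # corresponding token ids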
