# llama2-wrapper

Online guide: https://pypi.org/project/llama2-wrapper

In [1]:
import platform

if platform.system() == "Windows":
    MODELPATH = "E:/THUDM/llama2/model/llama-2-7b-chat"
    tokenizer_path = "E:/THUDM/llama2/model/tokenizer.model"
else:
    MODELPATH = "/opt/Data/THUDM/llama2.hf/llama-2-7b-chat-hf"
    tokenizer_path = "/opt/Data/THUDM/llama2/tokenizer.model"

# 1 Loading weights

In [2]:
from llama2_wrapper import LLAMA2_WRAPPER, get_prompt, get_prompt_for_dialog

llm = LLAMA2_WRAPPER(
    model_path = MODELPATH,
    backend_type = "transformers",
    # load_in_8bit = True
)

Running on GPU with backend torch transformers.


Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Some weights of LlamaForCausalLM were not initialized from the model checkpoint at /opt/Data/THUDM/llama2.hf/llama-2-7b-chat-hf and are newly initialized: ['model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.24.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq', 'model.layers.14.

# 2 completion

In [3]:
prompt = "I feel good."
answer = llm(get_prompt(prompt), temperature=0.9)
print(answer)



 Great to hear! It's important to take care of yourself and prioritize your well-being. Is there anything in particular that you're feeling good about today?


# 3 chat_completion

In [5]:
dialog = [
    {
        "role":"system",
        "content":"You are a helpful, respectful and honest assistant. "
    },{
        "role":"user",
        "content":"Hi do you know Pytorch?",
    },
]
result = llm.chat_completion(dialog)

In [15]:
print(result["choices"][0]["message"]["content"])

 Hello! Yes, I'm familiar with PyTorch. PyTorch is an open-source deep learning framework that is widely used for developing and training neural networks. It is known for its simplicity, flexibility, and ease of use, making it a popular choice among researchers and developers.
PyTorch is built on top of the Python programming language and provides a dynamic computation graph, which allows for more flexible and efficient computation during training. It also has a Pythonic API, which makes it easy to use and integrate with other Python libraries.
Some of the key features of PyTorch include:

* Dynamic computation graph: PyTorch's computation graph is dynamic, which means that it can be built and modified during runtime. This allows for more flexible and efficient computation during training.
* Pythonic API: PyTorch's API is built on top of Python, which makes it easy to use and integrate with other Python libraries.
* Tensor computation: PyTorch provides a tensor computation system that 

# 4 generate

In [None]:
prompt = get_prompt("Hi do you know Pytorch?")
for response in llm.generate(prompt):
    print(response)

# 5 run

run() is similar to generate(), but run() can also accept chat_history and system_prompt from the user.

It formats the input message with the llama2 prompt template, together with chat_history and system_prompt, for a chatbot-like app.
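
When the conversation is already stored as an OpenAI-style dialog, it has to be split into the pieces run() takes. The sketch below is one way to do that. `split_dialog` is a hypothetical helper, not part of llama2-wrapper, and it assumes that chat_history is a list of (user, assistant) pairs; check the package documentation for the exact format your version expects.

```python
from typing import Dict, List, Tuple


def split_dialog(
    dialog: List[Dict[str, str]],
) -> Tuple[str, List[Tuple[str, str]], str]:
    """Split an OpenAI-style dialog into (system_prompt, chat_history, message).

    Assumption: chat_history is a list of (user, assistant) pairs and the
    dialog ends with the user message that should be answered next.
    """
    turns = list(dialog)
    system_prompt = ""
    # An optional leading system message becomes the system prompt.
    if turns and turns[0]["role"] == "system":
        system_prompt = turns[0]["content"]
        turns = turns[1:]

    if not turns or turns[-1]["role"] != "user":
        raise ValueError("dialog must end with a user message")
    message = turns[-1]["content"]

    earlier = turns[:-1]
    if len(earlier) % 2 != 0:
        raise ValueError("earlier turns must come in user/assistant pairs")

    chat_history = []
    for user_turn, assistant_turn in zip(earlier[0::2], earlier[1::2]):
        if user_turn["role"] != "user" or assistant_turn["role"] != "assistant":
            raise ValueError("earlier turns must alternate user/assistant")
        chat_history.append((user_turn["content"], assistant_turn["content"]))

    return system_prompt, chat_history, message
```

The three returned values could then be handed to run() as the message, chat_history and system_prompt arguments, assuming those keyword names match the installed version.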

# 6 get_prompt

get_prompt() formats the input message into a llama2 prompt, together with chat_history and system_prompt, for chatbot use.

By default, chat_history and system_prompt are empty, and get_prompt() wraps your message in the llama2 prompt template:

In [23]:
prompt = get_prompt("Hi do you know Pytorch?")
print(prompt)

[INST] <<SYS>>

<</SYS>>

Hi do you know Pytorch? [/INST]


In [27]:
prompt = get_prompt("Hi do you know Pytorch?", system_prompt="You are a helpful, respectful and honest assistant. ")
print(prompt)

[INST] <<SYS>>
You are a helpful, respectful and honest assistant. 
<</SYS>>

Hi do you know Pytorch? [/INST]
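
To make the template explicit, here is a minimal pure-Python sketch that reproduces the layout of the two prompts printed above. `sketch_get_prompt` is a hypothetical stand-in written for illustration, not the library's implementation, and it leaves out chat_history.

```python
def sketch_get_prompt(message: str, system_prompt: str = "") -> str:
    """Wrap one user message in the llama2 chat template.

    Illustrative only: mirrors the printed layout, with the system prompt
    between <<SYS>> and <</SYS>> and the message closed by [/INST].
    """
    return (
        f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{message.strip()} [/INST]"
    )


# An empty system prompt leaves a blank line between the SYS markers.
print(sketch_get_prompt("Hi do you know Pytorch?"))
```

The system block is emitted even when system_prompt is empty, which is why the first printed prompt contains an empty `<<SYS>>` section.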


# 7 get_prompt_for_dialog

get_prompt_for_dialog() converts a dialog (chat history) into a llama2 prompt for the OpenAI-compatible API endpoint /v1/chat/completions.

In [34]:
dialog = [
    {
        "role":"system",
        "content":"You are a helpful, respectful and honest assistant. "
    },{
        "role":"user",
        "content":"Hi do you know Pytorch?",
    },
    {
        "role":"assistant",
        "content":"Yes, I know",
    },
    {
        "role":"user",
        "content":"How are you?",
    },
]

prompt = get_prompt_for_dialog(dialog)

print(prompt)

[INST] <<SYS>>
You are a helpful, respectful and honest assistant. 
<</SYS>>

Hi do you know Pytorch? [/INST] Yes, I know [INST] How are you? [/INST]
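
The way the turns are laid out in this prompt can be sketched in plain Python. `sketch_prompt_for_dialog` below is a hypothetical function written only to mirror the printed output; the library's own implementation may differ in details such as whitespace handling or special tokens.

```python
from typing import Dict, List


def sketch_prompt_for_dialog(dialog: List[Dict[str, str]]) -> str:
    """Flatten an OpenAI-style dialog into a llama2-style prompt string.

    Illustrative only: an optional leading system message goes into the
    <<SYS>> block, user turns are closed with [/INST], and assistant
    replies follow them directly.
    """
    turns = list(dialog)
    system_prompt = ""
    if turns and turns[0]["role"] == "system":
        system_prompt = turns[0]["content"]
        turns = turns[1:]

    parts = [f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"]
    for index, turn in enumerate(turns):
        content = turn["content"].strip()
        if turn["role"] == "user":
            # The first user turn shares the opening [INST]; later user
            # turns each open a new [INST] block.
            prefix = "" if index == 0 else " [INST] "
            parts.append(f"{prefix}{content} [/INST]")
        elif turn["role"] == "assistant":
            parts.append(f" {content}")
        else:
            raise ValueError(f"unexpected role: {turn['role']!r}")
    return "".join(parts)
```

Called on the dialog above, this sketch yields the same turn layout as the printed prompt: each user message ends with `[/INST]`, and each assistant reply sits between one `[/INST]` and the next `[INST]`.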
