#  PyTorch

Llama 2 model weights, downloaded from the Meta website.

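<b>GPU memory requirements</b>

Full-precision (32-bit) llama2 7B, minimum GPU memory: 28GB<br>
Full-precision (32-bit) llama2 13B, minimum GPU memory: 52GB<br>
Full-precision (32-bit) llama2 70B, minimum GPU memory: 280GB

16-bit llama2 7B, minimum GPU memory for inference: 14GB<br>
16-bit llama2 13B, minimum GPU memory for inference: 26GB<br>
16-bit llama2 70B, minimum GPU memory for inference: 140GB

8-bit llama2 7B, minimum GPU memory for inference: 7GB<br>
8-bit llama2 13B, minimum GPU memory for inference: 13GB<br>
8-bit llama2 70B, minimum GPU memory for inference: 70GB

4-bit llama2 7B, minimum GPU memory for inference: 3.5GB<br>
4-bit llama2 13B, minimum GPU memory for inference: 6.5GB<br>
4-bit llama2 70B, minimum GPU memory for inference: 35GB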
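
The figures above follow from multiplying the parameter count by the bytes stored per parameter. A minimal sketch of that arithmetic, assuming the nominal parameter counts (7, 13, and 70 billion) and counting the weights only; `weight_memory_gb` is a name chosen here for illustration:

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    # (params_billion * 1e9 parameters) * bytes_per_param / (1e9 bytes per GB)
    return params_billion * bytes_per_param


for bits in (32, 16, 8, 4):
    row = ", ".join(
        f"{size}B: {weight_memory_gb(size, bits):g}GB" for size in (7, 13, 70)
    )
    print(f"{bits}-bit -> {row}")
```

These are lower bounds: activations and any cache allocated at load time come on top of the weights.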

# 1 Loading with PyTorch

## 1.1 Configure environment variables

In [1]:
import torch.distributed as dist
import platform
import os
import sys

from typing import List

sys.path.append('../')
from llama import Llama, Dialog

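# Rendezvous address and port read by init_method='env://'.
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '5678'
# os.environ["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
# Initialize a single-process group (rank 0 of 1) on the gloo backend.
dist.init_process_group(backend='gloo', init_method='env://', rank=0, world_size=1)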

# Use the same variable names on both platforms; later cells read ckptDir and tokenizerPath.
if platform.system() == "Windows":
    ckptDir = "E:/THUDM/llama2/model/llama-2-7b-chat"
    tokenizerPath = "E:/THUDM/llama2/model/tokenizer.model"
else:
    ckptDir = "/opt/Data/THUDM/llama2/llama-2-7b-chat"
    tokenizerPath = "/opt/Data/THUDM/llama2/tokenizer.model"
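
Before building the model it can help to confirm that the configured paths point at a usable download. The sketch below is a hypothetical helper (`missing_model_files` is not part of the `llama` package); it assumes the checkpoint directory holds `*.pth` weight files next to a `params.json`, which is how the Meta download is laid out as far as we know:

```python
from pathlib import Path
from typing import List


def missing_model_files(ckpt_dir: str, tokenizer_path: str) -> List[str]:
    """Return human-readable problems; an empty list means the paths look usable."""
    problems = []
    ckpt = Path(ckpt_dir)
    if not ckpt.is_dir():
        problems.append(f"checkpoint directory not found: {ckpt}")
    else:
        # Assumption: weights are stored as *.pth files next to params.json.
        if not list(ckpt.glob("*.pth")):
            problems.append(f"no *.pth checkpoint files in {ckpt}")
        if not (ckpt / "params.json").is_file():
            problems.append(f"params.json missing in {ckpt}")
    if not Path(tokenizer_path).is_file():
        problems.append(f"tokenizer model not found: {tokenizer_path}")
    return problems
```

Calling `missing_model_files(ckptDir, tokenizerPath)` and printing the result gives a clearer message than a failure deep inside model loading.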

## 1.2 Load the model weights

Loading uses 21GB of GPU memory on a P40.

In [2]:
generator = Llama.build(
    ckpt_dir=ckptDir,
    tokenizer_path=tokenizerPath,
    max_seq_len=4096,
    max_batch_size=4,
    model_parallel_size=1
)

> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1


  _C._set_default_tensor_type(t)


Loaded in 16.89 seconds
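
One plausible reason the 21GB figure exceeds the 14GB estimate for 16-bit 7B weights is the key/value cache, which the reference implementation appears to allocate up front for `max_batch_size` x `max_seq_len` tokens. A rough estimate, assuming the 7B configuration has 32 layers and a hidden size of 4096 and that the cache is stored as 16-bit floats (these values are assumptions, not taken from this notebook):

```python
def kv_cache_bytes(max_batch_size: int, max_seq_len: int, n_layers: int,
                   dim: int, bytes_per_value: int = 2) -> int:
    """Size of the preallocated key and value caches across all layers."""
    per_layer = max_batch_size * max_seq_len * dim * bytes_per_value
    return 2 * n_layers * per_layer  # 2 = one key cache plus one value cache


# Assumed 7B configuration: 32 layers, hidden size 4096, 16-bit cache entries.
cache = kv_cache_bytes(max_batch_size=4, max_seq_len=4096, n_layers=32, dim=4096)
weights = 14e9  # 16-bit weight estimate for the 7B model
print(f"KV cache: {cache / 2**30:.1f} GiB")
print(f"weights + cache: {(weights + cache) / 2**30:.1f} GiB")
```

Under these assumptions, lowering `max_seq_len` or `max_batch_size` in `Llama.build` shrinks the cache proportionally.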


# 2 Text completion

Function definition

In [4]:
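def runCompletion(prompts: List[str]):
    # Generate one continuation per prompt; the list is processed as a single batch.
    results = generator.text_completion(
        prompts,
        max_gen_len=4096,
        temperature=0.6,
        top_p=0.9
    )

    for prompt, result in zip(prompts, results):
        # print(f"user> {prompt}")
        print(f"assistant > {result['generation']}")

    # Return the raw results so callers can assign them.
    return results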
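
The `temperature` and `top_p` arguments control sampling: logits are divided by the temperature before the softmax, and only the most probable tokens whose cumulative mass covers `top_p` stay eligible. The pure-Python sketch below illustrates the idea; it is not the library's code (which works on tensors), and all function names are chosen here for illustration:

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Lower temperatures sharpen the distribution; higher ones flatten it."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs: List[float], p: float) -> List[float]:
    """Zero out tokens outside the nucleus and renormalize the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    mass_before = 0.0
    for i in order:
        # Drop a token once the mass ranked above it already exceeds p.
        if mass_before > p:
            break
        kept[i] = probs[i]
        mass_before += probs[i]
    total = sum(kept)
    return [x / total for x in kept]


def sample_top_p(probs: List[float], p: float, rng: random.Random) -> int:
    """Draw one token index from the filtered distribution."""
    filtered = top_p_filter(probs, p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]


probs = softmax_with_temperature([2.0, 1.0, 0.5, -1.0], temperature=0.6)
print(top_p_filter(probs, 0.9))
print(sample_top_p(probs, 0.9, random.Random(0)))
```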

<b>One batch, 2 requests</b> [complex questions] -> <b>meaning of life, explanation of relativity</b>

In [5]:
prompts: List[str] = [
    # For these prompts, the expected answer is the natural continuation of the prompt
    "I believe the meaning of life is",
    "Simply put, the theory of relativity states that "
]

runCompletion(prompts)

assistant > to find purpose, happiness, and fulfillment. Here are some reasons why:

1. Purpose: Having a sense of purpose gives life meaning and direction. It helps individuals set goals and work towards achieving them, which can lead to a sense of fulfillment and satisfaction.
2. Happiness: Happiness is a fundamental human need, and it is essential for overall well-being. Pursuing activities and experiences that bring joy and happiness can enhance life satisfaction and overall quality of life.
3. Fulfillment: Fulfillment is the feeling of accomplishment that comes from achieving one's goals and pursuing one's passions. It is the sense of satisfaction and contentment that comes from living a life that is true to oneself.
4. Personal growth: Personal growth and self-improvement are essential for a fulfilling life. Learning new skills, developing new habits, and overcoming challenges can help individuals grow and develop as people.
5. Relationships: Strong relationships with family, fri

<b>One batch, 1 request</b>

In [5]:
prompts: List[str] = [
    "I'm happy today",
]

result = runCompletion(prompts)

assistant > because I've got a new computer! It's a lovely shiny MacBook Pro and I'm excited to start using it. Here's a picture of me with my new computer:

📸

I'm so glad I was able to get my hands on this amazing device. It's going to make my life so much easier and more enjoyable. I can't wait to start working on my projects and creating new things. This computer is going to be a game changer for me! 😍

What do you think of my new computer? Let me know in the comments below! 💬

👉 Follow me for more tech-related content and updates! 📱💻
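
Both batches above stay within the `max_batch_size=4` passed to `Llama.build`. For a longer prompt list, one option is to split it into chunks no larger than that limit and call `runCompletion` once per chunk. A minimal sketch; `batched` is a hypothetical helper, not part of the `llama` package:

```python
from typing import Iterator, List


def batched(items: List[str], max_batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


prompts = [f"prompt {i}" for i in range(10)]
for batch in batched(prompts, max_batch_size=4):
    # Each batch could be passed to runCompletion(batch) here.
    print(len(batch), batch)
```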


# 3 Chat completion

Function definition

In [3]:
def runChat(dialogs: List[Dialog]):
    # Use the argument itself rather than a global variable named dialogs.
    results = generator.chat_completion(
        dialogs,
        max_gen_len=4096,
        temperature=0.6,
        top_p=0.9
    )

    for dialog, result in zip(dialogs, results):
        # for msg in dialog:
        #     print(f"{msg['role'].capitalize()}>: {msg['content']}")

        print(f"\n{result['generation']['role'].capitalize()}>: {result['generation']['content']}")

    # Return every result, not only the last one from the loop.
    return results
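
Each dialog is a list of role/content messages that gets flattened into a single prompt before generation. The sketch below renders that prompt as text, following the Llama 2 chat template as far as we know it; `render_dialog` is a hypothetical helper, and `<s>` / `</s>` stand in for the BOS/EOS tokens that the real tokenizer inserts as token ids rather than text:

```python
from typing import Dict, List

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def render_dialog(dialog: List[Dict[str, str]]) -> str:
    """Render a dialog as Llama 2 chat text (illustrative, not token-exact)."""
    if dialog[0]["role"] == "system":
        # The system prompt is folded into the first user message.
        merged = B_SYS + dialog[0]["content"] + E_SYS + dialog[1]["content"]
        dialog = [{"role": dialog[1]["role"], "content": merged}] + dialog[2:]
    roles_ok = all(
        m["role"] == ("user" if i % 2 == 0 else "assistant")
        for i, m in enumerate(dialog)
    )
    if not roles_ok or dialog[-1]["role"] != "user":
        raise ValueError("roles must alternate user/assistant and end with user")
    text = ""
    # Completed turns: a user prompt plus the assistant answer that followed it.
    for user, answer in zip(dialog[0::2], dialog[1::2]):
        text += (f"<s>{B_INST} {user['content'].strip()} {E_INST} "
                 f"{answer['content'].strip()} </s>")
    # The final user message is left open for the model to answer.
    text += f"<s>{B_INST} {dialog[-1]['content'].strip()} {E_INST}"
    return text


print(render_dialog([
    {"role": "system", "content": "Always answer with Haiku"},
    {"role": "user", "content": "I am going to Paris, what should I see?"},
]))
```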

<b>One batch, 3 requests</b>

In [23]:
dialogs: List[Dialog] = [
    [{"role": "user", "content": "what is the recipe of mayonnaise?"}],
    [
        {"role": "user", "content": "I am going to Paris, what should I see?"},
        {
            "role": "assistant",
            "content": """
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.""",
        },
        {"role": "user", "content": "What is so great about #1?"},
    ],
    [
        {"role": "system", "content": "Always answer with Haiku"},
        {"role": "user", "content": "I am going to Paris, what should I see?"},
    ]
]

result = runChat(dialogs)


Assistant>:  Mayonnaise is a thick, creamy condiment made from a mixture of egg yolks, oil, vinegar or lemon juice, and seasonings. Here is a basic recipe for homemade mayonnaise:
Ingredients:
* 2 egg yolks
* 1/2 cup (120 ml) neutral-tasting oil, such as canola or grapeseed
* 1 tablespoon (15 ml) vinegar or lemon juice
* 1/2 teaspoon (2.5 ml) salt
* 1/4 teaspoon (1.25 ml) sugar (optional)
Instructions:
1. In a small bowl, whisk together the egg yolks and salt until well combined.
2. Slowly pour the oil into the egg yolk mixture while continuously whisking. The mixture should thicken and emulsify as you add the oil.
3. Once you have added about half of the oil, add the vinegar or lemon juice and continue whisking until the mixture is smooth and creamy.
4. Taste and adjust the seasoning as needed. If the mayonnaise is too thick, add a little more oil. If it's too thin, add a little more vinegar or lemon juice.
5. Cover the bowl with plastic wrap and refrigerate the mayonnaise for at lea

<b>One batch, 1 request</b>

In [22]:
dialogs: List[Dialog] = [
    [
        {"role": "system", "content": "Always answer with Haiku"},
        {"role": "user", "content": "I am going to Paris, what should I see?"}
    ]
]

result = runChat(dialogs)


Assistant>:  Eiffel Tower high
Love locks on bridge embrace
City of light, beauty


<b>One batch, 1 request</b>

In [4]:
dialogs: List[Dialog] = [
    [
        {"role": "system", "content": "Answer the question carefully"},
        {"role": "user", "content": "I am going to Nanjing, what should I see?"}
    ]
]

result = runChat(dialogs)


Assistant>:  Nanjing, the capital city of Jiangsu Province in Eastern China, is a city with a rich history and cultural heritage. Here are some of the top attractions and experiences you should consider adding to your itinerary when visiting Nanjing:
1. The Purple Mountain (Zhongshan Scenic Area) - This mountain range is home to numerous historical sites, temples, and shrines, including the Ming Xiaoling Mausoleum, the tomb of the Ming dynasty's first emperor, Zhu Yuanzhang.
2. The Nanjing City Wall - This well-preserved ancient city wall is one of the best-preserved in China and offers stunning views of the city. You can walk or bike along the wall for a glimpse into Nanjing's history.
3. The Confucius Temple (Fuzimiao) - This historic temple is dedicated to the famous Chinese philosopher Confucius and features a number of impressive buildings, including the largest Confucian temple in China.
4. The Ming Xiaoling Mausoleum - This UNESCO World Heritage Site is the tomb of the Ming dyn