# Llama 3 Text Generation
- The same code is available in [example_text_completion.py](example_text_completion.py); the .py file can be run under a debugger.

### ① Set the working directory and environment variables

In [1]:
import os
# Get the current working directory
current_directory = os.getcwd()
print("Current working directory:", current_directory)

# Switch to the project directory
new_directory = "/mnt/d/code/llama3-from-scratch"
os.chdir(new_directory)

# Read the working directory again to confirm the change took effect
current_directory = os.getcwd()
print("New working directory:", current_directory)

# Environment variables for a single-process torch.distributed setup
os.environ['RANK'] = '0'
os.environ['WORLD_SIZE'] = '1'
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '5678'

Current working directory: /home/liangxianbing
New working directory: /mnt/d/code/llama3-from-scratch
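
These four variables are the ones `torch.distributed` reads when it initializes from the environment, so setting them by hand lets the notebook act as a one-process group. A minimal sketch, assuming you would rather keep any values a launcher such as `torchrun` has already exported; the helper name `set_single_process_env` is hypothetical and not part of the llama package:

```python
import os

def set_single_process_env(port: str = "5678") -> dict:
    """Fill in single-process distributed settings without overwriting existing ones."""
    defaults = {
        "RANK": "0",
        "WORLD_SIZE": "1",
        "MASTER_ADDR": "localhost",
        "MASTER_PORT": port,
    }
    for key, value in defaults.items():
        # setdefault only writes the value when the variable is not set yet
        os.environ.setdefault(key, value)
    # Report the values that are actually in effect
    return {key: os.environ[key] for key in defaults}

env = set_single_process_env()
print(env)
```

If port 5678 is already taken on your machine, passing a different `port` before the model is built should be enough.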


### ② Build the Llama 3 model


In [2]:
from typing import List, Optional
# import fire
from llama import Dialog, Llama

In [3]:
ckpt_dir = 'Meta-Llama-3-8B/original'
tokenizer_path = 'Meta-Llama-3-8B/original/tokenizer.model'
max_seq_len = 512       # maximum sequence length in tokens (prompt tokens included)
max_batch_size = 6      # maximum batch size, i.e. how many prompts can be completed at once
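
These two limits interact with the number of prompts and with the generation length chosen later. A rough sketch of that bookkeeping, assuming the generator caps the total length at `max_seq_len` and rejects batches larger than `max_batch_size`; the helper `generation_budget` and the token counts are hypothetical and only illustrate the arithmetic:

```python
from typing import List

def generation_budget(prompt_lens: List[int], max_gen_len: int,
                      max_seq_len: int = 512, max_batch_size: int = 6) -> int:
    """Return the total tokens per sequence (prompt + generated) that fit the limits."""
    if len(prompt_lens) > max_batch_size:
        raise ValueError(
            f"{len(prompt_lens)} prompts exceed max_batch_size={max_batch_size}")
    longest = max(prompt_lens)
    if longest > max_seq_len:
        raise ValueError(
            f"prompt of {longest} tokens exceeds max_seq_len={max_seq_len}")
    # Generation stops once the sequence reaches max_seq_len
    return min(max_seq_len, longest + max_gen_len)

print(generation_budget([8, 12, 20, 35], max_gen_len=64))  # 99
print(generation_budget([500], max_gen_len=64))            # 512
```

In the second call the prompt leaves room for only 12 new tokens, so a long prompt silently shortens the completion.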

In [4]:
# Build the model
generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )
print('model built done!')

> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1


  _C._set_default_tensor_type(t)


Loaded in 84.79 seconds


### ③ Build the prompts

In [5]:
prompts: List[str] = [
        # For these prompts, the expected answer is the natural continuation of the prompt
        "I believe the meaning of life is",
        "Simply put, the theory of relativity states that ",
        """A brief message congratulating the team on the launch:

        Hi everyone,

        I just """,
        # Few shot prompt (providing a few examples before asking model to complete more);
        """Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>""",
    ]
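
The few-shot prompt follows a fixed pattern: a task line, a blank line, `source => target` pairs, and an open-ended final line for the model to complete. A small sketch of building such a prompt from data; `build_few_shot_prompt` is a hypothetical helper, not part of the llama package, and unlike the triple-quoted strings it emits no leading indentation:

```python
from typing import List, Tuple

def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Join a task description, worked examples, and an unanswered query."""
    lines = [f"{task}:", ""]
    lines += [f"{src} => {tgt}" for src, tgt in examples]
    # Leave the last line open so the model supplies the target
    lines.append(f"{query} =>")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French",
    [
        ("sea otter", "loutre de mer"),
        ("peppermint", "menthe poivrée"),
        ("plush girafe", "girafe peluche"),
    ],
    "cheese",
)
print(prompt)
```

Whether the extra indentation in the hand-written prompts affects the completions is not something this notebook measures; the helper simply avoids the question.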

### ④ Text generation
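- How `temperature` is used: the logits are divided by `temperature` before the softmax, which changes how flat the distribution is. A larger temperature flattens the distribution and gives more diverse output; a smaller one sharpens it and makes the output more deterministic.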
    ```
    if temperature > 0:
        probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
    ```
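- How `top_p` is used: larger values give more diversity, smaller values give more deterministic output.
  1. Apply softmax to the logits to get `probs`, then sort `probs` in descending order.
  2. Compute the cumulative sum at each position.
  3. Build a mask that is true wherever (cumulative sum - prob) > p, i.e. wherever the tokens ranked ahead of this one already carry more than p of the mass.
  4. Set the probabilities at the masked positions to 0.
  5. Renormalize the remaining probabilities so they sum to 1 again.
  6. Sample once from the renormalized distribution, then map the sampled position back to the original token index.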
    ```
    def sample_top_p(probs, p):
        """
        Perform top-p (nucleus) sampling on a probability distribution.
    
        Args:
            probs (torch.Tensor): Probability distribution tensor.
            p (float): Probability threshold for top-p sampling.
    
        Returns:
            torch.Tensor: Sampled token indices.
    
        Note:
            Top-p sampling selects the smallest set of tokens whose cumulative probability mass
            exceeds the threshold p. The distribution is renormalized based on the selected tokens.
        """
        probs_sort, probs_idx = torch.sort(probs, dim=-1, descending=True)
        probs_sum = torch.cumsum(probs_sort, dim=-1)
        mask = probs_sum - probs_sort > p
        probs_sort[mask] = 0.0
        probs_sort.div_(probs_sort.sum(dim=-1, keepdim=True))
        next_token = torch.multinomial(probs_sort, num_samples=1)
        next_token = torch.gather(probs_idx, -1, next_token)    # map back to the original token index (undo the sort)
        return next_token
    ```
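
To see the masking rule at work, the same steps can be traced on a toy distribution in plain Python. This is a sketch that mirrors `sample_top_p` for a single distribution without needing `torch`; the name `top_p_filter` and the example probabilities are made up for illustration, and it returns the renormalized distribution instead of sampling so that the result is deterministic:

```python
from typing import Dict, List

def top_p_filter(probs: List[float], p: float) -> Dict[int, float]:
    """Return {original_index: renormalized_prob} for the tokens kept by top-p."""
    # Sort token indices by probability, highest first
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: Dict[int, float] = {}
    mass_before = 0.0  # equals cumsum - prob for the current token
    for idx in order:
        # Same test as the mask: drop the token once the mass ahead of it exceeds p.
        # mass_before never decreases, so every later token is dropped as well.
        if mass_before > p:
            break
        kept[idx] = probs[idx]
        mass_before += probs[idx]
    total = sum(kept.values())
    return {idx: prob / total for idx, prob in kept.items()}

probs = [0.1, 0.5, 0.05, 0.3, 0.05]
print(top_p_filter(probs, p=0.6))    # keeps tokens 1 and 3
print(top_p_filter(probs, p=0.85))   # keeps tokens 1, 3 and 0
```

Because the comparison uses the mass ahead of each token rather than the mass including it, the most likely token is always kept, even with `p = 0`.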

In [6]:
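max_gen_len = 64    # maximum number of tokens to generate
temperature = 0.6   # logits are divided by this before softmax; higher = flatter, more diverse; lower = sharper, more deterministic
top_p = 0.6         # nucleus threshold; higher considers more tokens (more diverse), lower keeps only high-probability tokens (more deterministic)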
results = generator.text_completion(
        prompts,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )
print('text generation done!')

text generation done!
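
The effect of `temperature` on the distribution can be checked on toy logits. A minimal sketch in plain Python; the helper `softmax_with_temperature` and the logits are illustrative and not taken from the model:

```python
import math
from typing import List

def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits / temperature; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
for t in (0.6, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(q, 3) for q in probs])
```

The probability of the top token shrinks as the temperature grows, which is the flattening described in this section. The `if temperature > 0` guard in the quoted snippet suggests that a temperature of 0 is handled separately, presumably by picking the most likely token directly.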


In [8]:
# Print the results
for prompt, result in zip(prompts, results):
    print(prompt)
    print(f"> {result['generation']}")
    print("\n==================================\n")

I believe the meaning of life is
>  to be happy. I believe that we are all born with a purpose, and that we have the ability to choose our own path. I believe that we are all here to learn and grow, and that we should never stop learning. I believe that we are all connected, and that we should always be kind and compassionate


Simply put, the theory of relativity states that 
> 1) the laws of physics are the same for all non-accelerating observers and 2) the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source. The first part of the theory is known as the principle of relativity. The second part is known


A brief message congratulating the team on the launch:

        Hi everyone,

        I just 
>  wanted to take a moment to congratulate you all on the launch of the
        new website.  It's a great accomplishment and I'm sure it will be a great
        resource for the community.  Keep up the good work!

        Best regards,
    