<a href="https://colab.research.google.com/github/runfuture/tigerbot/blob/main/test_tigerbot_7b_sft.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install transformers
!pip install accelerate

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.29.2-py3-none-any.whl (7.1 MB)
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers)
  Downloading huggingface_hub-0.15.1-py3-none-any.whl (236 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.15.1 tokenizers-0.13.3 transformers-4.29.2
Looking in i

In [2]:
import os

import torch
import readline  # side-effect import: enables line editing and history for input()
from accelerate import infer_auto_device_map, dispatch_model
from accelerate.utils import get_balanced_memory
from transformers import AutoTokenizer, AutoModelForCausalLM

os.environ["TOKENIZERS_PARALLELISM"] = "false"

tok_ins = "\n\n### Instruction:\n"
tok_res = "\n\n### Response:\n"
prompt_input = tok_ins + "{instruction}" + tok_res


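def get_model(model):
    """Load the checkpoint in fp16, skipping random weight initialization."""

    def skip(*args, **kwargs):
        pass

    # The checkpoint overwrites every weight, so the default random
    # initializers are wasted work; replacing them with no-ops speeds up loading.
    # Note that this patches torch.nn.init for the rest of the process.
    torch.nn.init.kaiming_uniform_ = skip
    torch.nn.init.uniform_ = skip
    torch.nn.init.normal_ = skip
    model = AutoModelForCausalLM.from_pretrained(model, torch_dtype=torch.float16)
    return model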


def main(
        model_path: str,
        max_input_length: int = 512,
        max_generate_length: int = 1024,
):
    print(f"loading model: {model_path}...")

    model = get_model(model_path)
    # Balance the model across the visible devices, keeping each BloomBlock whole.
    max_memory = get_balanced_memory(model)
    device_map = infer_auto_device_map(model, max_memory=max_memory,
                                       no_split_module_classes=["BloomBlock"])
    print("Using the following device map for the model:", device_map)
    model = dispatch_model(model, device_map=device_map, offload_buffers=True)

    # Inputs are moved to the current CUDA device, so a GPU runtime is required.
    device = torch.cuda.current_device()

    tokenizer = AutoTokenizer.from_pretrained(
        model_path,
        cache_dir=None,
        model_max_length=max_generate_length,
        padding_side="left",
        truncation_side='left',
        padding=True,
        truncation=True
    )
    if tokenizer.model_max_length is None or tokenizer.model_max_length > 1024:
        tokenizer.model_max_length = 1024

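    # top_p and temperature only take effect when sampling is enabled
    # (do_sample=True, here or in the model's generation config), and
    # early_stopping only applies to beam search.
    generation_kwargs = {
        "top_p": 0.95,
        "temperature": 0.8,
        "max_length": max_generate_length,  # total length: prompt + generated tokens
        "eos_token_id": tokenizer.eos_token_id,
        "pad_token_id": tokenizer.pad_token_id,
        "early_stopping": True,
        "no_repeat_ngram_size": 4,  # never repeat any 4-gram in the output
    }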

    sess_text = ""
    while True:
        raw_text = input("prompt(\"exit\" to end, \"clear\" to clear session) >>> ")
        if not raw_text.strip():
            print('prompt should not be empty!')
            continue
        if raw_text.strip() == "exit":
            print('session ended.')
            break
        if raw_text.strip() == "clear":
            print('session cleared.')
            sess_text = ""
            continue

        query_text = raw_text.strip()
        sess_text += tok_ins + query_text
        input_text = prompt_input.format_map({'instruction': sess_text.split(tok_ins, 1)[1]})
        inputs = tokenizer(input_text, return_tensors='pt', truncation=True, max_length=max_input_length)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        output = model.generate(**inputs, **generation_kwargs)
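        # Decode only the newly generated ids, in a single call. Decoding one id
        # at a time can split a multi-byte character across tokens and garble it.
        prompt_len = inputs['input_ids'].shape[1]
        answer = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)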

        sess_text += tok_res + answer

        print("=" * 100)
        print(answer)
        print("=" * 100)
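
The chat loop keeps the whole conversation in `sess_text` and rebuilds the prompt on every turn. The sketch below isolates that string handling so it can be checked without loading the model. The helper names `build_prompt` and `record_answer` are hypothetical and do not appear in the notebook; the markers are copied from `tok_ins` and `tok_res`.

```python
from typing import Tuple

TOK_INS = "\n\n### Instruction:\n"
TOK_RES = "\n\n### Response:\n"


def build_prompt(sess_text: str, query: str) -> Tuple[str, str]:
    """Append a user turn; return (updated session text, prompt for the model)."""
    sess_text += TOK_INS + query.strip()
    # Drop the leading instruction marker; it is added back exactly once below.
    history = sess_text.split(TOK_INS, 1)[1]
    prompt = TOK_INS + history + TOK_RES
    return sess_text, prompt


def record_answer(sess_text: str, answer: str) -> str:
    """Append the model's reply so the next turn sees it as context."""
    return sess_text + TOK_RES + answer


sess = ""
sess, p1 = build_prompt(sess, "Hello")
sess = record_answer(sess, "Hi there")
sess, p2 = build_prompt(sess, "Who are you?")
print(repr(p1))
print(repr(p2))
```

Because earlier turns stay in the prompt, it grows with every exchange; with `truncation_side='left'` the oldest text is what gets cut once `max_input_length` is exceeded.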


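Decoding generated ids one at a time is risky when the tokenizer is byte-level BPE, because a single Chinese character can span more than one token. The toy example below does not use the real tokenizer: it assumes a byte-level vocabulary and imitates a token boundary by slicing raw UTF-8 bytes inside a character, which is enough to show why piecewise decoding can produce replacement characters while decoding the full sequence does not.

```python
text = "多巴胺"
raw = text.encode("utf-8")  # 9 bytes: each of these characters takes 3 bytes

# Stand-in for two "tokens": the boundary falls inside the second character.
pieces = [raw[:4], raw[4:]]

# Decoding each piece on its own cannot reassemble the split character.
piecewise = "".join(p.decode("utf-8", errors="replace") for p in pieces)

# Decoding the whole byte sequence at once recovers the original text.
whole = b"".join(pieces).decode("utf-8")

print(piecewise == text)  # False
print(whole == text)      # True
```

Whether a given model output actually hits this case depends on the vocabulary, so treat it as a failure mode to guard against rather than something that always occurs.
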
In [None]:
#main(model_path='TigerResearch/tigerbot-7b-base')
main(model_path='TigerResearch/tigerbot-7b-sft')

loading model: TigerResearch/tigerbot-7b-sft...


Using the following device map for the model: {'': 0}


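prompt("exit" to end, "clear" to clear session) >>> Generate a paper outline on the topic "The effects of dopamine on the brain"
Paper title: The Role and Effects of Dopamine in the Brain
I. Introduction
   A. Background
   B. Research objectives
   C. Research questions
   D. Research methods
II. Distribution of dopamine in gross anatomy
   A. Dopa-amino-acid neurons
   B. Neuronal connections of dopa-ammonia
   C. Dopa-amino receptors
III. The role of dopa-amino-acid in the brain
   A. Dopamine levels in the brain
   B. The role of dopamine in the brain
   C. Dopamine metabolism in the brain
IV. The effects of dopa-polyamine on the brain
   A. Links to mood and emotion
   B. Links to memory and learning
   C. Links to reward and to reward and punishment
   D. Relationship to attention, movement, and behavioral control
V. The role and effects of dopa-peptide in the brain
   A. The relationship between dopa-peptide and dopamine
   B. Neuropeptide Y and dopamine
   C. Intraneuronal dopamine and interneuronal dopamine
VI. The role and effects of dopa-acid in the brain
   A. The relationship between dopa-acid and dopa-polyamine
   C. Dopa-acid and mood and emotion
   B. Dopa-polyacid and memory and learning
   D. Dopamine and reward and punishment
VII. Dopa's dopamine and neurodegenerative diseases
   A. Alzheimer's disease
   B. Parkinson's disease
   C. Huntington's disease
VIII. Conclusion
   A. Summary of research findings
   B. Outlook for research on dopamine
   C. Outlook for research through multidisciplinary collaboration
IX. References
   A. Citations
   B. References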
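
A long reply like this one is bounded by `max_length`, which in `generate()` counts the prompt and the generated tokens together. The arithmetic below is a minimal sketch of the remaining reply budget under the notebook's default of 1024; `new_token_budget` is a hypothetical helper, not part of the notebook or of transformers.

```python
def new_token_budget(prompt_tokens: int, max_length: int = 1024) -> int:
    """Tokens left for the reply when max_length caps prompt + reply together."""
    return max(max_length - prompt_tokens, 0)


# A short first-turn prompt versus one truncated to max_input_length=512.
for n in (20, 512):
    print(n, new_token_budget(n))
```

As the session history grows toward 512 prompt tokens, the room left for the answer shrinks toward 512. Passing `max_new_tokens` instead of `max_length` would bound the reply independently of the prompt length.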
