## LangChain PROMPT - Llama-2-7b-chat-hf

1. Llama-2-7b-chat-hf DEMO
- https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

2. LangChain documentation
- https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/

3. Learning LangChain: LLM + prompt

## Initial environment setup

In [None]:
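import os
from pathlib import Path

# Add ~/.local/bin to PATH so command-line tools installed by pip can be found
HOME = str(Path.home())
Add_Binarry_Path = HOME + '/.local/bin'
os.environ['PATH'] = os.environ['PATH'] + ':' + Add_Binarry_Path

# Record the current working directory (the IPython `!` syntax returns a list of lines)
current_foldr = !pwd
current_foldr = current_foldr[0]
current_foldr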

## Check the CUDA version and whether a GPU is available
If no GPU is available, click Connected on the right -> Change runtime type -> T4 GPU.

In [None]:
!nvidia-smi
import torch
torch.cuda.is_available()

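## Install packages
After installation completes, it is recommended to select Runtime -> Restart session from the top menu to make sure the libraries are reloaded.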

In [None]:
!pip install cohere gdown kaleido langchain openai pyngrok pypdf python-dotenv sentence-transformers tiktoken -q
!pip install accelerate bitsandbytes hf_transfer huggingface_hub optimum transformers -q 

### HF_TOKEN

In [None]:
# HF_TOKEN method 1: write the token to a .env file and load it with python-dotenv

!echo "HF_TOKEN=hf_xxxxxxx" > .env
from dotenv import load_dotenv
load_dotenv() # loads env variables

In [None]:
# HF_TOKEN method 2: set the environment variable directly

import os
os.environ["HF_TOKEN"] = "hf_xxxxxxx"

In [None]:
# HF_TOKEN method 3: enter the token interactively so it is not stored in the notebook

import os
from getpass import getpass
os.environ["HF_TOKEN"] = getpass()
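Whichever method is used, it can help to confirm that the token actually reached the environment before starting the download. The following is a minimal sketch; `describe_token` is a hypothetical helper (not part of any library) that reports whether `HF_TOKEN` is set without printing the full secret.

```python
import os

def describe_token(env_var: str = "HF_TOKEN") -> str:
    """Report whether the variable is set, showing only a short prefix and the length."""
    token = os.environ.get(env_var, "")
    if not token:
        return f"{env_var} is not set"
    # Only the first three characters are shown; the rest of the secret stays hidden.
    return f"{env_var} is set ({token[:3]}..., {len(token)} characters)"

print(describe_token())
```

If the message says the variable is not set, rerun one of the three cells above before downloading the model.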

### LOAD LIBRARY

In [None]:
# LOAD LIBRARY
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
import torch

### Download model

In [None]:
%%bash
# Download model
mkdir -p /content/Llama-2-7b-chat-hf
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download meta-llama/Llama-2-7b-chat-hf --local-dir /content/Llama-2-7b-chat-hf  --local-dir-use-symlinks False

### Load Model
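The lower the temperature value, the more deterministic the model's output. Raising it makes the large language model return more random results, which can produce more varied or more creative output.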
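To illustrate the effect of temperature, here is a minimal sketch of a temperature-scaled softmax over a few made-up token scores. The logits and the helper name `softmax_with_temperature` are assumptions for illustration only; actual decoding in transformers also applies other steps (for example repetition penalty), so this shows the principle rather than the library's implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # sharper distribution
high = softmax_with_temperature(logits, 2.0)  # flatter distribution
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With the low temperature almost all probability mass sits on the highest-scoring token, while the high temperature spreads it more evenly, which is why sampled output becomes more varied.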

In [None]:
#################################################################
# Tokenizer
#################################################################

MODEL_ID = "/content/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
#tokenizer.pad_token = tokenizer.eos_token
#tokenizer.padding_side = "right"

#################################################################
# bitsandbytes parameters
#################################################################

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

#################################################################
# Set up quantization config
#################################################################
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

#################################################################
# Load the pre-trained model with the 4-bit quantization config
#################################################################
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
)


#################################################################
# Pipeline
#################################################################
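pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.7,
    repetition_penalty=1.1,
)

# Wrap the pipeline for LangChain; generation settings are the ones configured on `pipe` above
llm = HuggingFacePipeline(pipeline=pipe)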
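As a rough illustration of why 4-bit loading matters on a T4, the weight footprint can be estimated from the parameter count and the bits stored per weight. This sketch assumes a nominal 7 billion parameters (taken from the "7b" in the model name, not measured) and ignores quantization constants, activations, and the KV cache, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

NOMINAL_PARAMS = 7e9  # assumption: "7b" in the model name

fp16_gb = weight_memory_gb(NOMINAL_PARAMS, 16)
nf4_gb = weight_memory_gb(NOMINAL_PARAMS, 4)
print(f"float16 weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:   ~{nf4_gb:.1f} GB")
```

Under these assumptions the float16 weights alone nearly fill a 16 GB card, whereas the 4-bit weights leave room for activations and generation.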

### QUESTION to Model

In [None]:
# Question (Traditional Chinese): "What is federated learning?"
response = llm("什麼是聯邦式學習?")

print(response)

## LANGCHAIN PROMPT

In [None]:
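# Three prompt formats wrapping the same system message and {question}:
#   template01: Llama-2 chat format ([INST] ... [/INST] with a <<SYS>> block)
#   template02: ChatML-style format (<|im_start|> / <|im_end|>)
#   template03: Alpaca-style format (### Instruction / ### Response)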
template01="""[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
{question}[/INST]

"""

template02="""<|im_start|>system
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
"""


template03="""You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

### Instruction:
{question}

### Response:
"""


prompt01 = PromptTemplate(template=template01, input_variables=["question"])
prompt02 = PromptTemplate(template=template02, input_variables=["question"])
prompt03 = PromptTemplate(template=template03, input_variables=["question"])

In [None]:
# PROMPT RESULT
# Question (Traditional Chinese): "What is federated learning?"
question = "什麼是聯邦式學習?"
print(prompt01.format(question=question))

print(prompt02.format(question=question))

print(prompt03.format(question=question))
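For a template with a single `{question}` placeholder, the formatting step is essentially string substitution. The sketch below reproduces it in plain Python without LangChain; `short_template` is a shortened, hypothetical stand-in for `template01` (same layout, shorter system text), and `fill` is a hypothetical helper that only approximates what `PromptTemplate.format` does with its default settings.

```python
# Shortened, hypothetical stand-in for template01 (same layout, shorter system text).
short_template = """[INST] <<SYS>>
You are a helpful assistant.
<</SYS>>
{question}[/INST]
"""

def fill(template: str, **values: str) -> str:
    """Substitute {name} placeholders using Python's built-in str.format."""
    return template.format(**values)

filled = fill(short_template, question="What is federated learning?")
print(filled)
```

Printing the filled string before sending it to the model is a quick way to check that the special markers and the question end up where the chosen format expects them.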

## LANGCHAIN LLM+PROMPT

In [None]:
## Create Chain (prompt + model)
chain01 = LLMChain(llm=llm, prompt=prompt01)

chain02 = LLMChain(llm=llm, prompt=prompt02)

chain03 = LLMChain(llm=llm, prompt=prompt03)

### LangChain QUESTION

In [None]:
# Question (Traditional Chinese): "What is federated learning?"
question = "什麼是聯邦式學習?"
print(chain01.run(question=question))

In [None]:
# Same question, sent through the ChatML-style prompt
question = "什麼是聯邦式學習?"
print(chain02.run(question=question))

In [None]:
# Same question, sent through the Alpaca-style prompt
question = "什麼是聯邦式學習?"
print(chain03.run(question=question))