## LangChain PROMPT - OpenHermes-2.5-Mistral-7B-GPTQ

1. GPTQ DEMO
- https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ

2. LangChain documentation
- https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/

3. Learn LangChain: llm + prompt

## Initial environment setup

In [None]:
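import os
from pathlib import Path

# Add ~/.local/bin to PATH so command-line tools installed by pip can be found
HOME = str(Path.home())
Add_Binarry_Path=HOME+'/.local/bin'
os.environ['PATH']=os.environ['PATH']+':'+Add_Binarry_Path

# Record the current working directory (the IPython shell capture returns a list of lines)
current_foldr=!pwd
current_foldr=current_foldr[0]
current_foldr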

## Check the CUDA version and whether a GPU is available
If no GPU is available, click on the right side -> Connected -> Change runtime type -> T4 GPU

In [None]:
!nvidia-smi
import torch
torch.cuda.is_available()
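
The check above returns a bare boolean. A minimal sketch of a slightly more descriptive check follows; `describe_device` is a hypothetical helper name, and the import guard is only there so the snippet also runs where PyTorch is not installed.

```python
def describe_device():
    """Return a short description of the compute device PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # Index 0 is the first visible GPU (a T4 on the Colab runtime chosen above)
        return "cuda: " + torch.cuda.get_device_name(0)
    return "cpu only"

print(describe_device())
```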

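## Install packages
After installation finishes, it is recommended to select Runtime -> Restart session from the top menu so that the libraries are reloaded.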

In [None]:
!pip install cohere gdown langchain openai pypdf python-dotenv tiktoken -q
!pip install accelerate bitsandbytes hf_transfer huggingface_hub optimum transformers -q 
!pip install auto-gptq
#!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/  -q # Use cu117 if on CUDA 11.7

### LOAD LIBRARY

In [None]:
# LOAD LIBRARY
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

### Download model

In [None]:
%%bash
# Download model
mkdir -p /content/OpenHermes-2.5-Mistral-7B-GPTQ
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ --local-dir /content/OpenHermes-2.5-Mistral-7B-GPTQ --local-dir-use-symlinks False

### Load Model
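The lower the `temperature` value, the more deterministic the model's output. Raising it lets the model return more random results, which can make the output more diverse or more creative.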
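
The effect of temperature can be illustrated without loading a model. The sketch below assumes the usual formulation, in which the raw token scores (logits) are divided by the temperature before the softmax; the three logits are made-up values and `softmax_with_temperature` is a hypothetical helper, not part of transformers or LangChain.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass sits on the highest-scoring token, while a higher temperature spreads it across the alternatives, which is why sampled output becomes more varied.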


In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

MODEL_ID = "/content/OpenHermes-2.5-Mistral-7B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1
)

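# Note: sampling settings come from `pipe` above. model_kwargs passed alongside a ready-made
# pipeline does not appear to override them (assumption; verify for your LangChain version).
llm = HuggingFacePipeline(pipeline=pipe, model_kwargs={"temperature": 0.0})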

### QUESTION to Model

In [None]:
# Question: "What is federated learning?"
response=llm("什麼是聯邦式學習?")

print(response)

## LANGCHAIN PROMPT

In [None]:
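# Three prompt templates wrapping the same system message, for comparison:
# template01: Llama-2 chat format ([INST] / <<SYS>>)
# template02: ChatML format (<|im_start|> / <|im_end|>), the format listed on the OpenHermes-2.5 model card
# template03: Alpaca-style format (### Instruction / ### Response)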
template01="""[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
{question}[/INST]

"""

template02="""<|im_start|>system
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
"""


template03="""You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

### Instruction:
{question}

### Response:
"""


prompt01 = PromptTemplate(template=template01, input_variables=["question"])
prompt02 = PromptTemplate(template=template02, input_variables=["question"])
prompt03 = PromptTemplate(template=template03, input_variables=["question"])

In [None]:
# PROMPT RESULT: print each template filled in with the question ("What is federated learning?")
question = "什麼是聯邦式學習?"
print(prompt01.format(question=question))

print(prompt02.format(question=question))

print(prompt03.format(question=question))
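
The second prompt printed above follows the ChatML layout, in which each turn is wrapped in `<|im_start|>role` and `<|im_end|>`. As a rough sketch of how that layout could be produced from a list of messages, the helper below (`build_chatml`, a hypothetical name that is not part of LangChain or transformers) joins the turns and leaves the assistant turn open for the model to complete.

```python
def build_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in ChatML layout."""
    parts = []
    for message in messages:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # shortened system message
    {"role": "user", "content": "什麼是聯邦式學習?"},
]
print(build_chatml(messages))
```

Recent versions of transformers also provide `tokenizer.apply_chat_template`, which renders the template shipped with a model's tokenizer; whether this checkpoint ships one is not confirmed here.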

## LANGCHAIN LLM+PROMPT

In [None]:
## Create Chain (prompt + model)
chain01 = LLMChain(llm=llm, prompt=prompt01)

chain02 = LLMChain(llm=llm, prompt=prompt02)

chain03 = LLMChain(llm=llm, prompt=prompt03)

### LangChain QUESTION

In [None]:
question = "什麼是聯邦式學習?"
print(chain01.run(question=question))

In [None]:
question = "什麼是聯邦式學習?"
print(chain02.run(question=question))

In [None]:
question = "什麼是聯邦式學習?"
print(chain03.run(question=question))
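
The three cells above repeat the same call with a different template each time. A minimal sketch of the same comparison as one loop is shown below. `compare_templates` and `fake_generate` are hypothetical names, the templates are shortened versions without the system message, and the stand-in generator only exists so the snippet runs without a model. In the notebook, the `llm` object, which is called directly as `llm(...)` earlier, could presumably be passed in its place.

```python
def compare_templates(templates, question, generate):
    """Fill each template with the question and collect the generated text per template."""
    results = {}
    for name, template in templates.items():
        prompt = template.format(question=question)
        results[name] = generate(prompt)
    return results

def fake_generate(prompt):
    # Stand-in for a real model call: reports only the prompt length
    return "(" + str(len(prompt)) + " prompt characters received)"

templates = {
    "llama2": "[INST] {question}[/INST]",
    "chatml": "<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n",
    "alpaca": "### Instruction:\n{question}\n\n### Response:\n",
}

for name, output in compare_templates(templates, "什麼是聯邦式學習?", fake_generate).items():
    print(name, "->", output)
```

Keeping the outputs in one dictionary makes it easier to see side by side how the prompt format alone changes the answer.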