##### Chains combine small modules into a longer pipeline.
##### There are several ways to build a chain:
##### - Option 1: `LLMChain`, which takes the modules as constructor parameters
##### - Option 2: the `|` operator, which links modules one after another
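To make the second option concrete, here is a minimal standard-library sketch of how a `|` operator can compose steps. It is a toy illustration, not LangChain's implementation: the `Step` class and the `toy_*` names are hypothetical, and the "model" is a placeholder function.

```python
class Step:
    """Toy stand-in for a chain component (hypothetical, not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three stand-in stages mirroring prompt -> model -> parser.
toy_prompt = Step(lambda inputs: f"Answer the user query\n{inputs['query']}\n")
toy_llm = Step(lambda text: text.upper())  # placeholder for a real model call
toy_parser = Step(lambda text: text.strip().splitlines())

toy_chain = toy_prompt | toy_llm | toy_parser
toy_result = toy_chain.invoke({"query": "tell me a joke"})
print(toy_result)
```

Each stage only needs an `invoke` method, so any stage can be swapped out without touching the others.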


In [None]:
!pip install -r requirements.txt

In [2]:
from configparser import ConfigParser
import os
config = ConfigParser()
config.read("./config.ini")
os.environ["OPENAI_API_KEY"]=config["KEY"]["openai_key"]

In [11]:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)

In [9]:
model_name = "microsoft/phi-2"

## Quantization Configuration

In [5]:
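bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit precision
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matrix multiplications
)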
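As a rough back-of-the-envelope check of what 4-bit loading saves, the sketch below estimates the memory taken by the weights alone. The parameter count (about 2.7 billion for phi-2) is an assumption based on the model's public description, and the estimate ignores quantization constants, activations, and the KV cache.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PHI2_PARAMS = 2.7e9  # assumption: approximate parameter count of phi-2

fp16_gb = weight_memory_gb(PHI2_PARAMS, 16)
nf4_gb = weight_memory_gb(PHI2_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gb:.2f} GB")
print(f"4-bit weights:  ~{nf4_gb:.2f} GB")
```

Actual usage will be somewhat higher than the 4-bit figure, since not every tensor is quantized.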


## Create Model

In [6]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    low_cpu_mem_usage=True,
)




## Create Tokenizer

In [13]:
tokenizer = AutoTokenizer.from_pretrained(model_name)


## Create Pipeline

In [17]:
model_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,                   # cap on generated tokens per call
    pad_token_id=tokenizer.eos_token_id,  # reuse EOS as the padding token
)

## Create HuggingFace Pipeline

In [14]:
from langchain.llms.huggingface_pipeline import HuggingFacePipeline

In [18]:
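# Note: this wrapper may not forward `model_kwargs` to generation when it is
# given a pre-built pipeline; if the temperature seems to have no effect,
# set it on the transformers pipeline instead (together with do_sample=True).
llm = HuggingFacePipeline(
    pipeline=model_pipeline,
    model_kwargs={"temperature": 0.2},
)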

## Create Output Parser

In [36]:
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser

In [37]:
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

parser = JsonOutputParser(pydantic_object=Joke)
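To make the parser's role concrete, here is a minimal standard-library sketch of pulling a JSON object out of model text that may wrap it in a fenced block. It is an illustration only and not how `JsonOutputParser` is implemented; the function name `extract_json`, the regular expression, and the placeholder strings are all hypothetical.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so this example contains no literal fence


def extract_json(text: str) -> dict:
    """Parse the first fenced block as JSON, or the whole text if there is no fence."""
    pattern = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    candidate = match.group(1) if match else text
    return json.loads(candidate.strip())


# Placeholder model reply: some chatter followed by a fenced JSON object.
sample_reply = "Sure!\n" + FENCE + 'json\n{"setup": "s", "punchline": "p"}\n' + FENCE
extracted = extract_json(sample_reply)
print(extracted)
```

Note that this sketch picks up the first fenced block it finds, so any JSON that appears earlier in the text wins over the intended answer.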

## Create Prompt Template

In [19]:
from langchain.prompts import PromptTemplate

In [38]:
prompt = PromptTemplate(
    template="Answer the user query \n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

In [40]:
prompt.format(query="tell me a joke")

'Answer the user query \nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "question to set up a joke", "title": "Setup", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}\n```\ntell me a joke\n'

## Create Chain: PromptTemplate | LLM | OutputParser

In [43]:
chain = prompt | llm | parser

In [46]:
output = chain.invoke({"query":"tell me a joke story"})

In [47]:
output

{'properties': {'setup': {'description': 'question to set up a joke',
   'title': 'Setup',
   'type': 'string'},
  'punchline': {'description': 'answer to resolve the joke',
   'title': 'Punchline',
   'type': 'string'}},
 'required': ['setup', 'punchline']}
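
Note that the dictionary above is the JSON schema from the format instructions, not an object with `setup` and `punchline` values. One plausible explanation (an assumption, not verified here) is that the text reaching the parser still contains the prompt, so the first JSON block found is the schema; the `return_full_text=False` option of the transformers text-generation pipeline is one setting to experiment with. Whatever the cause, the result can be checked before it is used. The helper below is a hypothetical, standard-library-only sketch that distinguishes an echoed schema from a filled-in joke:

```python
def classify_output(parsed: dict) -> str:
    """Return 'joke', 'schema', or 'unknown' for a parsed chain result.

    Hypothetical helper: the field names follow the Joke model defined earlier.
    """
    required = {"setup", "punchline"}
    if required.issubset(parsed) and all(isinstance(parsed[k], str) for k in required):
        return "joke"
    # A JSON schema keeps the field definitions under "properties".
    properties = parsed.get("properties")
    if isinstance(properties, dict) and required.issubset(properties):
        return "schema"
    return "unknown"


# Same shape as the result shown above (descriptions omitted for brevity).
echoed_schema = {
    "properties": {
        "setup": {"title": "Setup", "type": "string"},
        "punchline": {"title": "Punchline", "type": "string"},
    },
    "required": ["setup", "punchline"],
}
# Hand-written placeholder showing the shape the chain is meant to return.
expected_shape = {"setup": "placeholder question", "punchline": "placeholder answer"}

schema_label = classify_output(echoed_schema)
joke_label = classify_output(expected_shape)
print(schema_label, joke_label)
```

A check like `classify_output(output) == "joke"` can guard downstream code until the chain reliably returns the expected fields.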