##### There are several ways to use a model, depending on which third-party integration you call
##### Most third-party integrations are used by importing their class
##### Hugging Face requires HuggingFacePipeline

In [1]:
from configparser import ConfigParser
import os

# Load the API key from a local config file so it stays out of the notebook
config = ConfigParser()
config.read("./config.ini")
os.environ["OPENAI_API_KEY"] = config["KEY"]["openai_key"]
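
The cell above expects a `config.ini` with a `[KEY]` section that holds an `openai_key` option. Note that `ConfigParser.read` silently skips files it cannot find, so a wrong path only surfaces later as a `KeyError`. The following is a minimal sketch of a more defensive lookup; the file contents and the `sk-placeholder` value are made up for illustration, and `read_string` stands in for reading the real file.

```python
from configparser import ConfigParser

# Hypothetical file contents; a real key would replace the placeholder.
SAMPLE_INI = """
[KEY]
openai_key = sk-placeholder
"""

config = ConfigParser()
# read_string stands in for config.read("./config.ini") in this sketch.
config.read_string(SAMPLE_INI)

# With a fallback, a missing section or option returns None instead of raising.
api_key = config.get("KEY", "openai_key", fallback=None)
if api_key is None:
    raise RuntimeError("openai_key is missing from the [KEY] section")

# Print only a short prefix so the key never lands in the cell output.
print(api_key[:3] + "...")
```

Checking for `None` right after the read makes a missing file or section fail early, with a message that names the problem.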

## Using ChatOpenAI

In [None]:
!pip install langchain-openai

In [7]:
from langchain_openai import ChatOpenAI

In [8]:
llm = ChatOpenAI(
        api_key=os.environ["OPENAI_API_KEY"],
        temperature=0.2,
        max_tokens=100
)

In [9]:
output = llm.invoke("what is transformers architecture")

In [10]:
output.content

'The Transformers architecture is a type of deep learning model that has been widely used in natural language processing tasks, such as language translation, text generation, and sentiment analysis. It was introduced in a paper titled "Attention is All You Need" by Vaswani et al. in 2017.\n\nThe key innovation of the Transformers architecture is the self-attention mechanism, which allows the model to weigh the importance of different words in a sentence when making predictions. This mechanism enables the model to capture long-range'

## Using HuggingFace + Quantization

### Quantization

In [11]:
import torch
from transformers import BitsAndBytesConfig

In [12]:
model_name = "microsoft/phi-2"

nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16
)
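
To give a rough sense of why 4-bit loading helps, the sketch below estimates the memory needed for the weights alone. It assumes roughly 2.7 billion parameters for `microsoft/phi-2` and ignores activations, the KV cache, and the quantization constants. Layers that are left unquantized would push the real figure higher, so treat the numbers as a lower bound rather than a measurement.

```python
# Approximate parameter count of microsoft/phi-2 (an assumption for this sketch).
PARAMS = 2.7e9


def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(PARAMS, 16)  # half-precision baseline
nf4_gib = weight_gib(PARAMS, 4)    # 4-bit storage, as with load_in_4bit=True

print(f"fp16: {fp16_gib:.2f} GiB, 4-bit: {nf4_gib:.2f} GiB")
```

The 4-bit estimate is a quarter of the half-precision one, which is what lets a model of this size fit on a small GPU.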

### HuggingFace

In [13]:
from transformers import (
        AutoTokenizer,
        AutoModelForCausalLM,
        pipeline
)

In [14]:
model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=nf4_config,
        low_cpu_mem_usage=True
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [25]:
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [26]:
pipeline_model = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=1024,
    pad_token_id=tokenizer.eos_token_id
)

In [29]:
# The langchain.llms import path is deprecated; the class lives in langchain_community
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

In [34]:
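llm = HuggingFacePipeline(
    pipeline=pipeline_model,
    # Note: these model_kwargs do not appear to reach generation when a
    # ready-made pipeline is passed in; the output below runs well past
    # 100 tokens. Hugging Face generation also expects "max_new_tokens".
    model_kwargs={
        "temperature":0.2,
        "max_tokens":100
    }
)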

In [35]:
output = llm.invoke("what is transformers architecture")

In [36]:
output

'what is transformers architecture?\n\nA transformer is a device that can change the voltage and current of an alternating current (AC) signal. Transformers are widely used in power transmission and distribution, as well as in electronic devices such as radios, televisions, and computers. Transformers have two main parts: a primary coil and a secondary coil. The primary coil is connected to the AC source, and the secondary coil is connected to the load. The AC source produces an alternating current that flows through the primary coil, creating a changing magnetic field. The changing magnetic field induces an alternating current in the secondary coil, which can be adjusted by changing the number of turns in the coils. Transformers can also have a third coil, called a neutral or ground coil, that connects the secondary circuit to the ground.\n\nWhat are the types of transformers?\n\nTransformers can be classified into two main types: step-up and step-down. A step-up transformer increases
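
The output above begins with the prompt itself, because the text-generation pipeline returns the prompt followed by the continuation. One option that may avoid this is passing `return_full_text=False` when building the pipeline. The sketch below shows a plain-Python alternative instead; `strip_prompt` is a hypothetical helper and not part of LangChain or Transformers.

```python
def strip_prompt(generated: str, prompt: str) -> str:
    """Remove the echoed prompt from the start of a generation, if present."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    # Drop any leading whitespace left behind after the prompt is removed.
    return generated.lstrip()


prompt = "what is transformers architecture"
# Shortened copy of the output above, used only for illustration.
raw = "what is transformers architecture?\n\nA transformer is a device"

answer = strip_prompt(raw, prompt)
print(repr(answer))
```

The model continued the prompt with a question mark, so the stripped text still starts with `?`. A base model like this one completes text rather than answering a question, which is also why the reply drifts to electrical transformers.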