# Hugging Face
Hugging Face provides two ways to invoke an LLM:
1. Through the hosted Inference API, using an API token
2. By running a model locally
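
The cells that follow use the API token route. For the second option, a minimal sketch of loading a model locally might look like the following. It assumes the `HuggingFacePipeline` wrapper from `langchain_community` and reuses the `google/flan-t5-base` checkpoint that appears later in this notebook; the helper name `build_local_llm` and the `max_new_tokens` default are illustrative choices, not part of the original notebook.

```python
def build_local_llm(model_id: str = "google/flan-t5-base", max_new_tokens: int = 64):
    """Load a model locally and wrap it for LangChain (weights download on first use)."""
    # Imported lazily so that defining this helper does not pull in the heavy
    # dependencies until a local model is actually requested.
    from langchain_community.llms import HuggingFacePipeline

    return HuggingFacePipeline.from_model_id(
        model_id=model_id,
        task="text2text-generation",  # flan-t5 is an encoder-decoder model
        pipeline_kwargs={"max_new_tokens": max_new_tokens},
    )
```

Calling `build_local_llm()` should return an object that can stand in for the `HuggingFaceHub` instance used below, at the cost of downloading and running the weights on the local machine.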


## Install libraries

In [23]:
!pip install langchain huggingface_hub transformers sentence_transformers accelerate bitsandbytes

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple


In [24]:
from getpass import getpass
HUGGINGFACEHUB_API_TOKEN = getpass()

In [25]:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN
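
When the notebook is re-run often, it can be convenient to prompt for the token only if it is not already set. The sketch below shows one way to do that; `get_hf_token` is a hypothetical helper name, and the prompt text is an assumption.

```python
import os
from getpass import getpass


def get_hf_token(env_var: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Return the token from the environment, prompting only when it is missing."""
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the typed value so it does not end up in the cell output
        token = getpass("Hugging Face API token: ")
        os.environ[env_var] = token
    return token
```

With the variable already exported in the shell, `get_hf_token()` returns immediately and no prompt appears.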

In [26]:
from langchain_community.llms import HuggingFaceHub
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

In [27]:
# Create the prompt template
question = "Where is the capital of China?"
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
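
To see the text the model actually receives, the template can be rendered by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`, which is a simplification; `render_prompt` is an illustrative name.

```python
# Same template string as in the cell above
template = """Question: {question}
Answer: Let's think step by step."""


def render_prompt(question: str) -> str:
    """Substitute the question into the template, as the chain does before calling the LLM."""
    return template.format(question=question)


rendered = render_prompt("Where is the capital of China?")
print(rendered)
```

The trailing "Let's think step by step." is part of the prompt itself, which is why the model's answer below reads as a short chain of reasoning.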

In [28]:
repo_id = "google/flan-t5-base"

In [29]:
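# Remote model served through the Hugging Face Hub; the token is read from the environment
llm = HuggingFaceHub(repo_id=repo_id)
# llm_kwargs are forwarded to the model call on every run of the chain
llm_chain = LLMChain(prompt=prompt, llm=llm, llm_kwargs={"temperature": 0, "max_length": 512})
print(llm_chain.run(question))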


China is located in the north of the world. The capital of China is Beijing. The answer: Beijing.


## Create RAG pipeline

In [30]:
! pip install pypdf faiss-cpu

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple


In [31]:
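from langchain_community.document_loaders import PyPDFLoader

# Load the PDF; each page becomes one Document
loader = PyPDFLoader("data/Eng_11_Qiulu.pdf")
pages = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split into chunks of at most 300 characters, with 50 characters shared between neighbours
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50,
)
# Only the first four pages are indexed
docs = text_splitter.split_documents(pages[:4])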
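
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a much simpler fixed-window splitter. This is not how `RecursiveCharacterTextSplitter` works internally (it prefers to break on separators such as paragraphs and lines); it is only a sketch of the overlap idea, and `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(700))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps retrieval later on.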

In [32]:
# pip install --upgrade langchain_community
#! pip uninstall InstructorEmbedding

In [33]:
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.vectorstores import FAISS



In [34]:
# HuggingFaceInferenceAPIEmbeddings takes the API token explicitly through api_key
embedding = HuggingFaceInferenceAPIEmbeddings(
    api_key=HUGGINGFACEHUB_API_TOKEN,
    model_name="sentence-transformers/all-MiniLM-L6-v2",
)

# Assumes docs already holds the list of document chunks
db = FAISS.from_documents(docs, embedding)

# Run a similarity search and keep the three closest chunks
query = "What's the person name?"
result_slmii = db.similarity_search(query, k=3)
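
Conceptually, the similarity search embeds the query and returns the `k` chunks whose vectors lie closest to it. The brute-force sketch below uses Euclidean distance on toy 2-D vectors; the distance metric is an assumption about the default configuration, the vectors are made up, and `top_k_l2` is an illustrative name. FAISS does the same job with optimized index structures.

```python
import math


def top_k_l2(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors closest to query_vec by Euclidean distance."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]


# Toy embeddings standing in for real sentence-transformer vectors
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
nearest = top_k_l2([1.0, 0.0], doc_vecs, k=3)
print(nearest)
```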

In [35]:
source_knowledge = "\n".join([x.page_content for x in result_slmii])

In [36]:
augmented_prompt = """Using the contexts below, answer the query.
contexts:
{source_knowledge}
query:
{query}
"""

In [37]:
prompt = PromptTemplate(template=augmented_prompt, input_variables=["source_knowledge", "query"])

llm_chain = LLMChain(prompt=prompt, llm=llm, llm_kwargs={"temperature": 0, "max_length": 1024})


In [38]:
print(llm_chain.run({"source_knowledge": source_knowledge,"query":query}))

Qiu lu
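
The retrieve, augment, and generate steps above can be folded into one function. The sketch below keeps the retriever and the model as plain callables so that it runs without any external service; `answer_with_context` and both callable signatures are assumptions made for illustration.

```python
from typing import Callable, List

AUGMENTED_TEMPLATE = """Using the contexts below, answer the query.
contexts:
{source_knowledge}
query:
{query}
"""


def answer_with_context(
    query: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    k: int = 3,
) -> str:
    """Fetch k chunks, place them in the prompt, and pass the prompt to the model."""
    chunks = retrieve(query, k)
    source_knowledge = "\n".join(chunks)
    prompt = AUGMENTED_TEMPLATE.format(source_knowledge=source_knowledge, query=query)
    return generate(prompt)
```

In this notebook, `retrieve` could wrap `db.similarity_search` and return each result's `page_content`, while `generate` could wrap the `llm` object created earlier.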


In [39]:
augmented_prompt_2 = f"""Using the contexts below, answer the query.
contexts:
{source_knowledge}
query:
{query}
"""

In [40]:
print(augmented_prompt_2)

Using the contexts below, answer the query.
contexts:
automated testing. 
 
April 2019 – DEC 2023: Thales Facial Recognition Platform (FRP) 
Role: Technical Lead for Product Security Solutions 
Project Description: Leading a cross-disciplinary team to develop a multifunctional
platform using Thales' advanced facial recognition algorithms. Applications include 
airport automatic boarding, customs face recognition verification, and building access 
control systems. 
Responsibilities and Achievements:
Personal Information 
Name: Qiu lu 
Gender: Male 
Contact Number: 13552136227 
Location: Beijing 
Email: tjqiulu@hotmail.com 
 
 
Professional Skills 
Information Security: 
In-depth knowledge and practical experience in the field of information security.
query:
What's the person name?

