In [25]:
!pip install sentence-transformers langchain langchain-community faiss-cpu pypdf openai transformers

### 1. Load a PDF file or Webpage

In [None]:
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("Interview Questions.pdf")
pages = loader.load()

[Document(page_content='Qualcomm GenAI Preparation\nModel Optimization & Deployment\n1. What are quantization and pruning? How do they help optimize AI models?\nThese are model compression techniques used to reduce the size, latency, and \npower consumption of deep learning models - especially important for deploying \nmodels on edge devices like smartphones and IoT.\n🔹 Quantization\nQuantization reduces the precision of model parameters (weights and \nactivations), typically from 32-bit floating-point (float32) to 8-bit integers (int8), \nwithout significantly impacting accuracy.\n✅ Benefits:\n• Smaller model size\n• Faster inference (especially on mobile/edge devices)\n• Reduced memory usage\n🛠 Types:\n• Post-training quantization: Apply after training (easy, minimal extra steps)\n• Quantization-aware training (QAT): Simulates quantization dur

### 2. Split long texts into chunks

In [9]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 50)
document = splitter.split_documents(pages)
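
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking. It is a simplified fixed-width window, not LangChain's actual recursive algorithm (which also tries to break on separators such as paragraphs and sentences). The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbours share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy input: 1200 digits, so the window boundaries are easy to inspect.
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

The overlap keeps a sentence that straddles a boundary available in both neighbouring chunks, at the cost of some duplicated text in the index.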

### 3. Convert to embeddings

In [None]:
from langchain.vectorstores import FAISS
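from langchain.embeddings import HuggingFaceEmbeddings

# all-MiniLM-L6-v2 is a general sentence-transformers model, not a BGE model,
# so use the plain HuggingFaceEmbeddings wrapper (no BGE query instruction).
embedding_model = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')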
vector_base = FAISS.from_documents(document, embedding_model)

<langchain_community.vectorstores.faiss.FAISS at 0x120325ad0>

### 4. Ask a question

In [22]:
query = "What is AWS?"
docs = vector_base.similarity_search(query, k=3)
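
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that idea as a brute-force search, assuming a Euclidean (L2) metric. The two-dimensional vectors and chunk labels are made up for illustration, and the helper names are hypothetical. FAISS itself uses optimized index structures rather than a Python sort.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, index, k=3):
    """Return the texts of the k entries in index closest to query_vec.

    index is a list of (vector, text) pairs.
    """
    ranked = sorted(index, key=lambda item: l2_distance(query_vec, item[0]))
    return [text for _, text in ranked[:k]]


# Made-up 2-D "embeddings"; real models produce hundreds of dimensions.
toy_index = [
    ([1.0, 0.0], "chunk about cloud platforms"),
    ([0.9, 0.1], "chunk about scaling AI"),
    ([0.0, 1.0], "chunk about quantization"),
    ([0.1, 0.9], "chunk about pruning"),
]
results = top_k([1.0, 0.05], toy_index, k=2)
print(results)
```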

### 5. Generate an answer using an LLM (OpenAI or a local Hugging Face model)

In [26]:

from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from langchain.chains.question_answering import load_qa_chain

# Load GPT-2 via HuggingFace Transformers
generator = pipeline("text-generation", model="gpt2", max_new_tokens=200)

# Wrap in LangChain's LLM interface
llm = HuggingFacePipeline(pipeline=generator)

# Load the basic "stuff" QA chain, which puts all retrieved chunks into one prompt.
# GPT-2 is a small base model, not instruction-tuned, so expect weak answers.
chain = load_qa_chain(llm, chain_type="stuff")

# Run QA with your retrieved documents
response = chain.run(input_documents=docs, question=query)

print(response)

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Airflow / Kubeflow Pipelines | Define and manage complex ML workflows
SageMaker Studio / GCP AI Workbench / Azure ML Studio | Full cloud-based notebooks with logging and deployment
Neptune.ai / Comet.ml | Experiment management and comparison
Summary of Section 6
Topic | Key Points
Cloud Platforms | AWS = flexibility, GCP = AI-native, Azure = enterprise MS integration
Scaling AI | Use data/model parallelism, auto-scaling, serverless inference

• Data Management: DVC, Delta Lake
• Experiment Tracking: MLflow, Weights & Biases
• Pipelines: Kubeflow, Airflow, TFX
• Deployment: Seldon, BentoML, SageMaker
• Monitoring: Prometheus, Grafana, Fiddler
✅ MLOps ensures reproducibility, reliability, and scalability of machine learning systems in production environments.
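
The output above begins with the prompt itself, which shows what the "stuff" chain does: it concatenates the retrieved chunks into one context block and places the question after it. A minimal sketch of that assembly follows. The instruction sentence is copied from the output above. The `Question:` / `Helpful Answer:` suffix, the `Doc` class, and `build_stuff_prompt` are assumptions made for illustration, not LangChain's actual code.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


# The first sentence matches the prompt echoed in the output above.
# The Question / Helpful Answer suffix is an assumed layout.
PROMPT_TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Helpful Answer:"
)


def build_stuff_prompt(docs, question):
    """Join all chunks into one context block and fill the template."""
    context = "\n\n".join(d.page_content for d in docs)
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_stuff_prompt(
    [Doc("AWS offers flexibility."), Doc("GCP is AI-native.")],
    "What is AWS?",
)
print(prompt)
```

Because every chunk goes into a single prompt, the total must fit in the model's context window. GPT-2 accepts at most 1024 tokens, so `k` and `chunk_size` should stay small with this setup.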