# Llama-2-7B using LangChain

### Installing the required libraries

In [None]:
!pip install -q transformers einops accelerate langchain bitsandbytes sentence_transformers pypdf chromadb xformers

### Logging in to the Hugging Face Hub

In [None]:

!git config --global credential.helper store
!huggingface-cli login --token 'your-hugging-face-api'

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
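
Pasting a token directly into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming the token has been exported in an environment variable named `HF_TOKEN`, the hypothetical helper `read_hf_token` below reads it at runtime instead:

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in `env_var`, or raise if it is missing.

    `read_hf_token` is a hypothetical helper, not part of any library.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before logging in.")
    return token
```

The returned string could then be passed to `huggingface_hub.login(token=...)` rather than being written into the cell.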


In [None]:
import torch
from langchain import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline


Loading the Llama-2-7B model and tokenizer

In [None]:
MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)

In [None]:
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
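
Loading with `torch_dtype=torch.float16` stores each weight in 2 bytes, which gives a quick way to estimate how much memory the weights alone will need. The sketch below assumes roughly 6.74 billion parameters for Llama-2-7B (an approximation) and ignores activations and the KV cache; `weights_gb` is a hypothetical helper.

```python
def weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumed parameter count for Llama-2-7B (approximate).
LLAMA2_7B_PARAMS = 6.74e9

fp16_size = weights_gb(LLAMA2_7B_PARAMS)        # 2 bytes per weight
fp32_size = weights_gb(LLAMA2_7B_PARAMS, 4)     # 4 bytes per weight
print(f"fp16: {fp16_size:.2f} GB, fp32: {fp32_size:.2f} GB")
```

Under these assumptions the half-precision weights take about half the space of full precision, which is why `float16` is a common choice on a single Colab GPU.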


Defining generation parameters

In [None]:
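generation_config = GenerationConfig.from_pretrained(MODEL_NAME)
generation_config.max_new_tokens = 1024  # upper bound on generated tokens per call
generation_config.temperature = 0.0001  # near-zero temperature: almost deterministic output
generation_config.top_p = 0.95  # nucleus sampling threshold
generation_config.do_sample = True  # sampling must be on for temperature/top_p to apply
generation_config.repetition_penalty = 1.15  # values above 1 discourage repeated tokens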
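
A temperature of 0.0001 combined with `do_sample = True` behaves almost like greedy decoding. The following sketch, using made-up logits, shows why: dividing the logits by a tiny temperature before the softmax pushes nearly all probability onto the highest-scoring token. The function name `softmax_with_temperature` is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over `logits` after dividing each one by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative values, not real model output
print(softmax_with_temperature(logits, 1.0))     # probability spread over tokens
print(softmax_with_temperature(logits, 0.0001))  # mass concentrated on the top token
```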


Creating the Hugging Face pipeline

In [None]:
text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    generation_config=generation_config,
)

llm = HuggingFacePipeline(pipeline=text_pipeline, model_kwargs={"temperature": 0})

Asking the model a question

In [None]:

result = llm(prompt="Explain me the difference between Data Lakehouse and Data Warehouse.")


In [None]:
print(result)


A data lakehouse is a centralized repository that stores all of an organization's structured and unstructured data, including raw, unprocessed data from various sources, as well as processed and transformed data in various formats. A data warehouse, on the other hand, is a repository specifically designed to store data in a structured format for querying and analysis.
Here are some key differences between a data lakehouse and a data warehouse:
1. Structure: A data lakehouse stores data in its native format without any predefined schema or structure, while a data warehouse stores data in a structured format with a defined schema.
2. Data Types: A data lakehouse can store both structured and unstructured data, such as text files, images, videos, etc., while a data warehouse typically only stores structured data like relational databases.
3. Processing: A data lakehouse allows for more flexible processing and transformation of data through distributed computing frameworks like Apache Spa

Using Prompts 

In [None]:
from langchain import PromptTemplate

Defining prompt template

In [None]:

template = """
<s>[INST] <<SYS>>
you are my financial advisor provide me the financial advices based on my queries, if you dont know the about the query provide me generic response based on your knowledge
<</SYS>>

{text} [/INST]
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

In [None]:
prompt

PromptTemplate(input_variables=['text'], template='\n<s>[INST] <<SYS>>\nyou are my financial advisor provide me the financial advices based on my queries, if you dont know the of the query provide me generic response based on your knowledge\n<</SYS>>\n\n{text} [/INST]\n')

In [None]:
text = "2023 indian health budget"
print(prompt.format(text=text))


<s>[INST] <<SYS>>
you are my financial advisor provide me the financial advices based on my queries, if you dont know the of the query provide me generic response based on your knowledge
<</SYS>>

2023 indian health budget [/INST]
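
The formatted prompt follows the Llama 2 chat layout: the system message sits between `<<SYS>>` and `<</SYS>>`, and the whole turn is wrapped in `[INST] ... [/INST]`. As a minimal sketch of the same layout without LangChain, the hypothetical helper `build_llama2_prompt` assembles it with plain string formatting:

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user message in the Llama 2 chat markers."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


example = build_llama2_prompt(
    system="You are a helpful assistant.",  # illustrative system message
    user="2023 indian health budget",
)
print(example)
```

Keeping the markers in one function avoids small formatting drift between templates when several prompts are defined in the same notebook.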



In [None]:
result = llm(prompt.format(text=text))
print(result)

In [None]:
print(result)

Thank you for reaching out to me! I'm happy to help you with any questions or concerns you may have regarding the Indian Health Budget. However, I must inform you that as an AI language model, I do not have access to real-time information or updates on the current Indian Health Budget. The Union Budget is typically presented by the Finance Minister in Parliament every year, and it includes details on government spending and revenue projections for various sectors, including healthcare.

That being said, here are some general points about the Indian Health Budget:

1. Allocation: The Indian Health Budget has been increasing steadily over the years, with a significant increase of around 14% in the last budget (FY 2022-23). The total allocation for the Ministry of Health and Family Welfare in the FY 2022-23 was around INR 69,000 crore.
2. Priorities: The Indian Health Budget prioritizes primary healthcare, preventive care, and rural health initiatives. A significant portion of the budget 

### Using LLM Chains

In [None]:
from langchain.chains import LLMChain

In [None]:
chain = LLMChain(llm=llm, prompt=prompt)
result = chain.run(text)
print(result)

### Using Sequential chains

In [None]:
template = "<s>[INST] Use the summary {summary} and give 2 examples of practical applications with 1 sentence explaining each [/INST]"

examples_prompt = PromptTemplate(
    input_variables=["summary"],
    template=template,
)
examples_chain = LLMChain(llm=llm, prompt=examples_prompt)

In [None]:
from langchain.chains import SimpleSequentialChain

multi_chain = SimpleSequentialChain(chains=[chain, examples_chain], verbose=True)
result = multi_chain.run(text)
print(result.strip())

> Entering new SimpleSequentialChain chain...
Thank you for reaching out to me! I'm happy to help you with any questions or concerns you may have regarding the Indian Health Budget. However, I must inform you that as an AI language model, I do not have access to real-time information or updates on the current Indian Health Budget. The Indian government has not yet released the official budget for 2023, and it is important to rely on credible sources for accurate information.

That being said, here are some general insights into the Indian healthcare system and budget:

1. India's healthcare expenditure: According to a report by the World Bank, India's total healthcare expenditure was around $75 billion in 2020. This accounts for approximately 4% of the country's GDP.
2. Government spending: The Indian government has been increasing its allocation towards healthcare over the years. In 2020-21, the government allocated around 3.6% of its total budget towards health

In [None]:
print(result.strip())

Sure, here are two examples of practical applications based on the information provided:

Example 1: Improving Access to Essential Medicines
The Indian government could allocate more funds towards improving the supply chain management of essential medicines, particularly in rural areas where access to these medicines is often limited. By doing so, they can ensure that people have access to life-saving medications at an affordable price, reducing the burden of non-communicable diseases.

Example 2: Strengthening Healthcare Workforce
India could invest in training and capacity building programs for healthcare professionals, particularly in underserved areas. By increasing the number of skilled healthcare workers, the government can improve the quality of care and reduce the disparities in healthcare access between urban and rural areas, ultimately leading to better health outcomes for all Indians.
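
Conceptually, a simple sequential chain feeds the output of each step into the next one as its only input. The sketch below mimics that flow with ordinary functions standing in for the two LLM calls; `run_sequential`, `summarize`, and `give_examples` are hypothetical names, and no model is invoked.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], text: str) -> str:
    """Pass `text` through each step in order, returning the final output."""
    for step in steps:
        text = step(text)
    return text


def summarize(query: str) -> str:
    # Stand-in for the first chain (the advisor prompt).
    return f"summary of ({query})"


def give_examples(summary: str) -> str:
    # Stand-in for the second chain (the examples prompt).
    return f"examples based on [{summary}]"


final = run_sequential([summarize, give_examples], "2023 indian health budget")
print(final)
```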


#### Using RAG: creating a document Q&A chain

In [None]:
from langchain.document_loaders import PyPDFLoader #UnstructuredMarkdownLoader

# loader = UnstructuredMarkdownLoader("bitcoin.md")
loader = PyPDFLoader('/content/CTGAN.pdf')
docs = loader.load()
len(docs)

15

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=64)
texts = text_splitter.split_documents(docs)
len(texts)

49
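
The `chunk_overlap` setting makes neighbouring chunks share some text so that a sentence cut at a boundary still appears whole in one of them. The sketch below is a deliberately simplified fixed-window splitter, not the algorithm `RecursiveCharacterTextSplitter` uses (which tries to break on separators such as paragraphs and newlines first); the name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Cut `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

Each chunk here starts with the last character of the previous one, which is the same idea as the 64-character overlap configured above, only at toy scale.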

In [None]:
texts[0]

Document(page_content='Modeling Tabular Data using Conditional GAN\nLei Xu\nMIT LIDS\nCambridge, MA\nleix@mit.eduMaria Skoularidou\nMRC-BSU, University of Cambridge\nCambridge, UK\nms2407@cam.ac.uk\nAlfredo Cuesta-Infante\nUniversidad Rey Juan Carlos\nMóstoles, Spain\nalfredo.cuesta@urjc.esKalyan Veeramachaneni\nMIT LIDS\nCambridge, MA\nkalyanv@mit.edu\nAbstract\nModeling the probability distribution of rows in tabular data and generating realistic\nsynthetic data is a non-trivial task. Tabular data usually contains a mix of discrete\nand continuous columns. Continuous columns may have multiple modes whereas\ndiscrete columns are sometimes imbalanced making the modeling difﬁcult. Existing\nstatistical and deep neural network models fail to properly model this type of data.\nWe design CTGAN , which uses a conditional generator to address these challenges.\nTo aid in a fair and thorough comparison, we design a benchmark with 7 simulated\nand 8 real datasets and several Bayesian network b

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2",
    model_kwargs={"device": "cuda"},
    encode_kwargs={"normalize_embeddings": True},
)
# thenlper/gte-large
# sentence-transformers/all-mpnet-base-v2
query_result = embeddings.embed_query(texts[0].page_content)
print(len(query_result))

768
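
Setting `normalize_embeddings` to `True` scales every vector to unit length, and for unit vectors the plain dot product equals the cosine similarity. The sketch below checks this with two small made-up vectors in place of real 768-dimensional embeddings; all function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale `vec` to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]  # illustrative vectors
normalized_dot = dot(l2_normalize(a), l2_normalize(b))
print(normalized_dot, cosine(a, b))
```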


In [None]:
from langchain.vectorstores import Chroma

db = Chroma.from_documents(texts, embeddings, persist_directory="db")

In [None]:
results = db.similarity_search("what is gans", k=2)
print(results[0].page_content)

work, we would derive a theoretical justiﬁcation on why GANs can work on a distribution with both
discrete and continuous data.
Acknowledgements
This paper is partially supported by the National Science Foundation Grants ACI-1443068. We
(authors from MIT) also acknowledge generous support provided by Accenture for the synthetic
data generation project. Dr. Cuesta-Infante is funded by the Spanish Government research fundings
RTI2018-098743-B-I00 (MICINN/FEDER) and Y2018/EMT-5062 (Comunidad de Madrid).
References
[1]Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial
networks. In International Conference on Machine Learning , 2017.
9
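
The top hit for this query is the acknowledgements page, which is a reminder that similarity search only ranks chunks by vector closeness and does not guarantee relevance. As a toy illustration of the ranking step, assuming unit-length vectors scored by dot product (Chroma's actual index and distance metric may differ), the hypothetical `top_k` function below returns the indices of the `k` best-scoring chunks:

```python
def top_k(query_vec, doc_vecs, k):
    """Return indices of the `k` document vectors with the highest dot product."""
    scored = [
        (sum(q * d for q, d in zip(query_vec, vec)), idx)
        for idx, vec in enumerate(doc_vecs)
    ]
    scored.sort(reverse=True)  # highest score first
    return [idx for _, idx in scored[:k]]


# Illustrative 2-dimensional "embeddings".
docs = [[0.0, 1.0], [1.0, 0.0], [0.6, 0.8]]
best = top_k([1.0, 0.0], docs, k=2)
print(best)
```

Inspecting the scores alongside the retrieved text is a simple way to spot weak matches before they are passed to the model.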


In [None]:
from langchain.chains import RetrievalQA

template = """
<s>[INST] <<SYS>>
Act as a financial expert. Use the following information to answer the question at the end.
<</SYS>>

{context}

{question} [/INST]
"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])


PromptTemplate(input_variables=['context', 'question'], template='\n<s>[INST] <<SYS>>\nAct as a financial expert. Use the following information to answer the question at the end.\n<</SYS>>\n \n{context}\n \n{question} [/INST]\n')

In [None]:
text = "2023 indian health budget"
print(prompt.format(context=text, question=text))


<s>[INST] <<SYS>>
Act as a financial expert. Use the following information to answer the question at the end.
<</SYS>>
 
2023 indian health budget
 
2023 indian health budget [/INST]



In [None]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)
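
With `chain_type="stuff"`, the retrieved chunks are concatenated and placed into the `{context}` slot of the prompt in a single model call. The sketch below reproduces that idea with plain strings; `stuff_prompt` is a hypothetical helper, and the blank-line separator is an assumption about how the chunks are joined.

```python
def stuff_prompt(template: str, documents: list, question: str) -> str:
    """Join all retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(documents)  # assumed separator between chunks
    return template.format(context=context, question=question)


qa_template = "<s>[INST] Use the context to answer.\n\n{context}\n\n{question} [/INST]"
retrieved = ["First retrieved chunk.", "Second retrieved chunk."]  # illustrative chunks
filled = stuff_prompt(qa_template, retrieved, "what is ctgan")
print(filled)
```

Because every chunk lands in one prompt, `k` and `chunk_size` together determine how much of the context window the retrieved text consumes.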

In [None]:
result = qa_chain(
    "How does proof-of-work solves the majority decision making problem? Explain like I am five."
)
print(result["result"].strip())


Hello there! *adjusts glasses* So, you want to know how proof-of-work solves the majority decision-making problem? Well, let me tell you a story! 😊

Imagine you're at a big party with all your friends, and you need to decide which game to play next. You have a bunch of different games to choose from, but some of them are really popular, and others aren't so popular. How do you decide which game to play? That's where proof-of-work comes in! 🎮

Proof-of-work is like a special kind of magic that helps your friends agree on what game to play. Here's how it works:

First, everyone at the party has their own idea of what game they want to play. Some people might want to play soccer, while others might want to play Monopoly. But here's the thing – not everyone will agree on the same game! 🤔

So, to make sure everyone agrees on the same game, we need something called "consensus." Consensus means that everyone has to agree on the same thing, like a majority vote. But sometimes, it's hard for ev

In [None]:
result

{'query': 'How does proof-of-work solves the majority decision making problem? Explain like I am five.',
 'result': '\nHello there! *adjusts glasses* So, you want to know how proof-of-work solves the majority decision-making problem? Well, let me tell you a story! 😊\n\nImagine you\'re at a big party with all your friends, and you need to decide which game to play next. You have a bunch of different games to choose from, but some of them are really popular, and others aren\'t so popular. How do you decide which game to play? That\'s where proof-of-work comes in! 🎮\n\nProof-of-work is like a special kind of magic that helps your friends agree on what game to play. Here\'s how it works:\n\nFirst, everyone at the party has their own idea of what game they want to play. Some people might want to play soccer, while others might want to play Monopoly. But here\'s the thing – not everyone will agree on the same game! 🤔\n\nSo, to make sure everyone agrees on the same game, we need something cal

In [None]:
result = qa_chain(
    "What is CTGAN? How can I implement it? Provide me with Python code."
)
print(result["result"].strip())