<a href="https://colab.research.google.com/github/Dimildizio/DS_course/blob/main/Neural_networks/NLP/Langchain/Langchain_hf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangChain with Hugging Face

## Install libraries

In [1]:
!pip -q install transformers huggingface-hub langchain

## Imports

In [18]:
import os
import torch
from langchain import PromptTemplate, HuggingFaceHub, LLMChain, HuggingFacePipeline
from langchain.embeddings import HuggingFaceEmbeddings

## Get the Hugging Face API key

In [3]:
with open('./hf_api.txt') as f:
  # strip() removes the trailing newline, which would otherwise end up in the token
  key = f.readline().strip()

In [4]:
os.environ['HUGGINGFACEHUB_API_TOKEN'] = key
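
A token file often ends with a newline, and stray whitespace in the token can make authentication fail. The sketch below is one possible way to load the token defensively. `load_hf_token` is a hypothetical helper, not part of LangChain or huggingface-hub, and it assumes the token sits on the first line of the file.

```python
import os
from pathlib import Path


def load_hf_token(path="./hf_api.txt", env_var="HUGGINGFACEHUB_API_TOKEN"):
    """Return the token from the environment if set, else from the first line of `path`."""
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    # Read only the first line and drop the trailing newline and spaces.
    lines = Path(path).read_text().splitlines()
    token = lines[0].strip() if lines else ""
    if not token:
        raise ValueError(f"No token found in {path}")
    return token
```

In this notebook it could be used as `os.environ['HUGGINGFACEHUB_API_TOKEN'] = load_hf_token()`.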

## Create a prompt for messages

In [5]:
template = """Question: {question}
Answer: Let's think step by step.
"""
prompt = PromptTemplate(template=template, input_variables=["question"])
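
The names in `input_variables` have to match the `{placeholders}` in the template. The sketch below uses only the standard library to list the placeholders and render the text, as a rough stand-in for what `PromptTemplate` does with its default format-string style. `placeholders` is a hypothetical helper, not a LangChain function.

```python
from string import Formatter

template = """Question: {question}
Answer: Let's think step by step.
"""


def placeholders(text):
    """Names of the {fields} that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(text) if name]


# These names are what input_variables should list.
print(placeholders(template))

# Plain str.format stands in for prompt.format(question=...).
rendered = template.format(question="Capital of Russia?")
print(rendered)
```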

## Run online

### Connect to a hosted model

In [None]:
repo_id = "google/flan-t5-base"
model = HuggingFaceHub(repo_id=repo_id, model_kwargs={'temperature':0.5, 'max_length':256})


In [7]:
llm_chain = LLMChain(prompt=prompt, llm=model)

### Execute

In [10]:
question = "How to make a sandwich?"
q = 'Capital of Russia?'

In [9]:
result = llm_chain.run(question)
print(len(result), result)

108 A sandwich is a sandwich that is made with bread, cheese, lettuce, tomato, and cheese. The answer: sandwich.


## Run locally

### Import additional classes and functions from HF

In [22]:
#! pip install -q accelerate bitsandbytes

In [20]:
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM, pipeline

### Load the model into memory

In [None]:
model_id = 'google/flan-t5-large'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

### Create the pipeline

In [9]:
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_length=128)
llm = HuggingFacePipeline(pipeline=pipe)

### Execute

In [19]:
llm(q)

'st petersburg'

### Create a LangChain chain

In [17]:
llm_chain = LLMChain(prompt=prompt, llm=llm)

In [20]:
print(llm_chain.run(q))

Moscow is the capital of Russia. Moscow is the capital of Russia. So the answer is Moscow.
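
The two calls send different text to the same model: `llm(q)` passes the bare question, while the chain fills the template first. That is a plausible reason the answers differ. The sketch below makes the difference visible with a stand-in model; `echo_llm` and `run_chain` are hypothetical names for illustration, not LangChain APIs.

```python
def run_chain(llm, template, question):
    """Fill the template with the question, then pass the full text to the model."""
    return llm(template.format(question=question))


def echo_llm(text):
    # Stand-in model for illustration: it returns exactly what it received.
    return text


template = "Question: {question}\nAnswer: Let's think step by step.\n"
q = "Capital of Russia?"

direct = echo_llm(q)                         # text seen by the model for llm(q)
chained = run_chain(echo_llm, template, q)   # text seen by the model for the chain
print(repr(direct))
print(repr(chained))
```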


In [7]:
def load_model(name, model_type='decoder'):
  model_id = name
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  if model_type == 'decoder':
    model = AutoModelForCausalLM.from_pretrained(model_id)
    task = 'text-generation'
  else:
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    task = 'text2text-generation'
  pipe = pipeline(task, model=model, tokenizer=tokenizer, max_length=128)
  llm = HuggingFacePipeline(pipeline=pipe)
  return llm

def get_answer(llm, prompt, question):
  print(llm(question))
  llm_chain = LLMChain(prompt=prompt, llm=llm)
  result = llm_chain.run(question)
  print(result)
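
`load_model` relies on the caller knowing whether a checkpoint is decoder-only or encoder-decoder. Transformers configs carry an `is_encoder_decoder` flag, so the choice could in principle be derived from `AutoConfig.from_pretrained(name).is_encoder_decoder` instead. The sketch below shows only the selection logic; `pick_loader` is a hypothetical helper, and fetching the config is left out because it needs network access.

```python
def pick_loader(is_encoder_decoder):
    """Map the config flag to the (auto class name, pipeline task) pair used in load_model."""
    if is_encoder_decoder:
        return "AutoModelForSeq2SeqLM", "text2text-generation"
    return "AutoModelForCausalLM", "text-generation"


# Assumed flag values: False for a GPT-2 style model, True for a T5 style model.
print(pick_loader(False))
print(pick_loader(True))
```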

In [None]:
llm1 = load_model('gpt2-medium')

In [27]:
get_answer(llm1, prompt, q)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.




The word, "Russian," was introduced into Russian in 1794 by Alexander von Humboldt. The word comes from the Latin word "rhaeus" (the god) and was originally composed from the Latin word "re" (to live). The Russian word rhaeus, when added to Spanish, literally means "God."

When we say "The City of Heaven" (Moscow) was the first capital we hear for a second time, we can see from this that St. Petersburg and its area today, have their roots in the ancient Roman cities. In fact the city does
It takes 3 1/2 years from start of construction until completion.  The cost of the site has been around 3 USD MWh.  That's over 6 Million USD to construct, 5 million USD to operate and 2 million USD to sell. For Russia's largest metro system, the cost of the train tunnel (main tunnel) is around 5.5 billion Russian rubles ($1 billion).
In its first year, it cost over 3 million rubles to construct the tunnel alone and 4 million rubles to provide an emergency lift during


### Try blenderbot

In [8]:
bot_id = 'facebook/blenderbot-1B-distill'
blenderbot = load_model(bot_id, 'ed')

In [11]:
get_answer(blenderbot, prompt, q)

 The capital of Russia is Moscow. It is the most populous city in the world.
 Yes, the capital of Russia is Moscow, located in the southeastern part of the country.


## Get embeddings

In [19]:
! pip -q install sentence_transformers


In [23]:
emb_name = 'sentence-transformers/all-mpnet-base-v2'

In [24]:
hfemb = HuggingFaceEmbeddings(model_name=emb_name)

In [None]:
hfemb.embed_query(q)
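
Embedding vectors are usually compared rather than read directly, and cosine similarity is a common choice. The sketch below is a plain-Python version with toy vectors standing in for real embeddings; `cosine_similarity` is a hypothetical helper written for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors stand in for real embeddings.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 0.0]
print(cosine_similarity(v1, v2))
```

With the objects from this notebook, the same function could be called as `cosine_similarity(hfemb.embed_query(q), hfemb.embed_query(question))`, since `embed_query` returns a list of floats.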

In [27]:
hfemb.embed_documents([q, question])

[[0.027558671310544014,
  -0.041417233645915985,
  -0.017652276903390884,
  0.028651589527726173,
  0.0032571759074926376,
  0.016865737736225128,
  -0.015705812722444534,
  -0.05212003365159035,
  0.032898418605327606,
  0.020734786987304688,
  0.05036183446645737,
  -0.0648278295993805,
  0.01860487461090088,
  -0.015274914912879467,
  0.03353781998157501,
  -0.07157767564058304,
  -0.0022415427956730127,
  -0.05910788103938103,
  -0.005687318742275238,
  -0.01653829962015152,
  -0.026945127174258232,
  0.020301835611462593,
  -0.03379145637154579,
  0.0021648199763149023,
  0.013282177969813347,
  0.0564885213971138,
  -0.021916832774877548,
  -0.016664599999785423,
  -0.022723939269781113,
  0.04977184161543846,
  0.056581951677799225,
  -0.04610530287027359,
  -0.022694159299135208,
  0.008785776793956757,
  1.334754642812186e-06,
  -0.017687182873487473,
  -0.04196169227361679,
  0.020545171573758125,
  0.026250043883919716,
  0.00814784225076437,
  0.03872555121779442,
  -0.0207