# Using LangChain with a local or API-based LLM

See also
https://colab.research.google.com/drive/1h2505J5H4Y9vngzPD08ppf1ga8sWxLvZ?usp=sharing#scrollTo=GMg2xiRnfm21

### Loading a local LLM

In [58]:
from langchain.llms import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# FLAN-T5 is an encoder-decoder (seq2seq) model, hence AutoModelForSeq2SeqLM.
# Pick a smaller checkpoint if you don't have enough VRAM.
model_id = 'google/flan-t5-large'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
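
As a rough guide for choosing a checkpoint that fits in memory, the weights alone need about (number of parameters) × (bytes per parameter). The sketch below is only a back-of-the-envelope estimate: the parameter counts are the approximate published sizes of the FLAN-T5 family, the helper name `weight_memory_gib` is hypothetical, and activations and generation buffers are not counted.

```python
# Approximate parameter counts for the FLAN-T5 checkpoints (rounded figures).
APPROX_PARAMS = {
    "google/flan-t5-small": 80e6,
    "google/flan-t5-base": 250e6,
    "google/flan-t5-large": 780e6,
    "google/flan-t5-xl": 3e9,
}

def weight_memory_gib(n_params, bytes_per_param=4):
    """Estimate memory for the weights only (float32 uses 4 bytes, float16 uses 2)."""
    return n_params * bytes_per_param / 1024**3

for name, n_params in APPROX_PARAMS.items():
    fp32 = weight_memory_gib(n_params, 4)
    fp16 = weight_memory_gib(n_params, 2)
    print(f"{name}: ~{fp32:.1f} GiB in float32, ~{fp16:.1f} GiB in float16")
```

Actual usage will be higher than these numbers, so treat them as a lower bound.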

### Playing around with the language model

In [59]:
input_context = "The weather is really nice today. I'm thinking about going for a"
input_ids = tokenizer.encode(input_context, return_tensors="pt")

# Sample 10 continuations; the high temperature (2.0) makes the outputs more varied.
for _ in range(10):
    output = model.generate(input_ids, max_length=50, do_sample=True, temperature=2.0)
    tokens = output[0]  # generate() returns a batch; take its single sequence
    print(tokenizer.decode(tokens, skip_special_tokens=True), len(tokens))

# Inspect the last sample token by token, including the special tokens.
for i, t in enumerate(tokens):
    print(i, '  ', t, '  ', tokenizer.decode(t))


play or have picnic. 7
running run but probably one at most to have great exercise. 14
hike 3
exercise 3
workout 3
exercise run first in light headed gusts like yesterday. 13
hike or running/riding 8
hiking 3
run after school. But it gets windier during the run as each lap passes by in an amazing array of motion. Sometimes it is much closer than any aural music to music — that, being itself at times, provides 50
swim? 4
0    tensor(0)    <pad>
1    tensor(9728)    swim
2    tensor(58)    ?
3    tensor(1)    </s>
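
The variety in the samples above comes from `do_sample=True` combined with `temperature=2.0`. With sampling enabled, the next-token scores are divided by the temperature before the softmax, so values above 1 flatten the distribution and values below 1 sharpen it. A minimal sketch of that effect, using made-up scores for three candidate tokens (the `logits` values and the helper name are illustrative, not taken from the model):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the higher temperature the top token loses probability mass to the alternatives, which is consistent with the long, rambling sample in the output above.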


### Using langchain prompt template

In [60]:
pipe = pipeline(
    "text2text-generation",
    model=model, 
    tokenizer=tokenizer, 
    max_length=100
)

llm = HuggingFacePipeline(pipeline=pipe)

In [61]:
from langchain import PromptTemplate, LLMChain

template = """
Question: {question}
Let's think step by step. Provide the answer to the question in the following way:
Answer:
"""

prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, 
                     llm=llm,
                     verbose=True
                     )

question = "What is the capital of England?"

print(llm_chain.run(question))



> Entering new LLMChain chain...
Prompt after formatting:

Question: What is the capital of England?
Let's think step by step. Provide the answer to the question in the following way:
Answer:


> Finished chain.
The capital of England is London. London is the capital of England. The answer: London.
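
The "Prompt after formatting" text in the verbose output is simply the template with `{question}` filled in. A minimal sketch of that substitution using plain `str.format` as a stand-in for `PromptTemplate` (the helper `format_prompt` is hypothetical, and it assumes the template uses the default brace-style placeholders):

```python
template = """
Question: {question}
Let's think step by step. Provide the answer to the question in the following way:
Answer:
"""

def format_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in the template with the given values."""
    return template.format(**variables)

prompt_text = format_prompt(template, question="What is the capital of England?")
print(prompt_text)
```

The chain then passes this formatted string to the LLM and returns the generated text.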


In [62]:
llm_chain.run("Oliver has 10 tennis balls. He loses 1 ball every 2 weeks, after 4 weeks he buys 2 packs each having 2 balls. How many balls does he have after 8 weeks? ")



> Entering new LLMChain chain...
Prompt after formatting:

Question: Oliver has 10 tennis balls. He loses 1 ball every 2 weeks, after 4 weeks he buys 2 packs each having 2 balls. How many balls does he have after 8 weeks? 
Let's think step by step. Provide the answer to the question in the following way:
Answer:


> Finished chain.


'Oliver loses 1 ball every 2 weeks so he loses 1 * 2 = 2 balls every 2 weeks. He buys 2 packs of tennis balls so he buys 2 * 2 = 4 packs of tennis balls. He loses 1 ball every 2 weeks so he loses 1 * 2 = 2 balls every 2 weeks. He buys 4 packs of tennis balls so he buys 4 * 4 = 32 balls. He has 32 balls and loses 1 ball every'
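
The generated answer above goes astray in its arithmetic and appears to be cut off mid-sentence, most likely by the pipeline's `max_length=100`. For comparison, a minimal sketch of the intended calculation, assuming one ball is lost at the end of every 2-week period and the packs are bought once, after week 4 (the function and parameter names are hypothetical):

```python
def balls_after(weeks, start=10, loss_interval=2, buy_week=4, packs=2, balls_per_pack=2):
    """Count the balls left after the given number of weeks."""
    lost = weeks // loss_interval  # one ball lost per completed interval
    bought = packs * balls_per_pack if weeks >= buy_week else 0  # one-time purchase
    return start - lost + bought

print(balls_after(8))  # 10 - 4 + 4 = 10
```

Under these assumptions Oliver ends up with 10 balls, which gives a reference point for judging the chain's output.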