# Local Llama

There are many ways to run Llama locally: HuggingFace (if you have a GPU), [Ollama](https://ollama.com/), [vLLM](https://github.com/vllm-project/vllm), [Llamafile](https://github.com/Mozilla-Ocho/llamafile) or [llama.cpp](https://github.com/ggerganov/llama.cpp).

These notes show how to use llama.cpp through a local server and through the LangChain `LlamaCpp` API.

To start, install `llama-cpp-python` (see [here](https://python.langchain.com/docs/integrations/llms/llamacpp/)). Then download a model from HuggingFace. Choose one that fits in your machine's memory, for example [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0). Download the `.gguf` file (the quantized model), then load it.
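
As a rough sketch of the download step, the snippet below fetches a single GGUF file with `huggingface_hub`. The repository `TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF` and the `../model` target directory are assumptions chosen to match the path used later in these notes, and `download_model` is a hypothetical helper; substitute whichever quantized build you prefer.

```python
from pathlib import Path

# Assumed values: a community GGUF conversion of TinyLlama and a local folder.
REPO_ID = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
FILENAME = "tinyllama-1.1b-chat-v1.0.Q2_K.gguf"
MODEL_DIR = Path("../model")


def download_model(repo_id: str = REPO_ID, filename: str = FILENAME,
                   model_dir: Path = MODEL_DIR) -> Path:
    """Return the local path of the GGUF file, downloading it only if missing."""
    target = model_dir / filename
    if target.exists():
        return target
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    return Path(hf_hub_download(repo_id=repo_id, filename=filename,
                                local_dir=str(model_dir)))
```

Calling `download_model()` once should leave the file under `../model`, which is the location the `LlamaCpp(model_path=...)` line below expects.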

In [74]:
from langchain_community.llms import LlamaCpp, Llamafile
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate

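# Option 1: LlamaCpp loads the .gguf file in-process (uncomment to use)
# callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])  # stream generated tokens to stdout
# llm = LlamaCpp(model_path='../model/tinyllama-1.1b-chat-v1.0.Q2_K.gguf', callback_manager=callback_manager, verbose=True)

# Option 2: Llamafile connects to a llamafile server that is already running
# (by default at http://localhost:8080)
llm = Llamafile()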
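
`Llamafile()` talks to an already running llamafile server rather than loading weights itself, so a missing server only shows up as a connection error at the first call. As a hedged sketch, the check below tests whether anything is listening before the chain is built. The address `localhost:8080` is assumed to be the default used by both llamafile and the LangChain wrapper, and `server_is_up` is a hypothetical helper, not part of either library.

```python
import socket


def server_is_up(host: str = "localhost", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

If `server_is_up()` returns `False`, start the llamafile executable first and then re-run the cell above. The check only confirms that a port is open, not that the process behind it is a llamafile.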

In [70]:

# Some LLMs expect a specific prompt template (matching how they were trained), so follow it to get better results
template = """
<|system|>
You are a helpful assistant that answers questions with simple answers.
Answer only the question asked.
<|user|>
{question}
<|assistant|>
"""
prompt = PromptTemplate.from_template(template)

s = (prompt | llm).invoke('Why Germany invade France in WW2? and why France can\'t defend it?')
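
To see exactly what text reaches the model, the template can also be rendered by hand. The sketch below assumes the Zephyr-style layout described on the TinyLlama chat model card (role tags `<|system|>`, `<|user|>` and `<|assistant|>`, with each turn closed by `</s>`). `build_prompt` and `DEFAULT_SYSTEM` are hypothetical names, and other models will need a different layout.

```python
DEFAULT_SYSTEM = "You are a helpful assistant. Answer only the question asked."


def build_prompt(question: str, system: str = DEFAULT_SYSTEM) -> str:
    """Render a single-turn prompt in the assumed TinyLlama (Zephyr-style) chat format."""
    return (
        f"<|system|>\n{system.strip()}</s>\n"
        f"<|user|>\n{question.strip()}</s>\n"
        "<|assistant|>\n"  # left open so the model writes the assistant turn
    )


print(build_prompt("How to create and build habit?"))
```

The returned string can be passed straight to `llm.invoke(...)` in place of the `prompt | llm` chain, which makes it easier to compare outputs with and without the template.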

In [71]:
print(s)

Germany's invasion of France was ordered by Adolf Hitler, who saw the Allied Powers (British, French, Soviet) as a potential threat to his growing power. The Germans believed that their strategy of massive bombing attacks on cities, along with the threat of German military might, would weaken their enemies and result in a humiliating defeat. In response, France invaded Germany in 1940. However, the French were outnumbered by many times and had to rely heavily on air power to overcome the enemy's defensive tactics. The Allied Powers responded with massive bombing campaigns that devastated German infrastructure, causing significant damage to the war effort.


In [76]:
s = (prompt | llm).invoke('How to create and build habit?')

In [77]:
print(s)

Habits can be created or broken through consistency in effortless actions. Here's how you can create and build habits:

1. Identify your goal (or goals) - this will help guide your actions.
2. Create the actionable steps/steps for achieving your goal(s).
3. Establish a routine or schedule for those steps.
4. Set small, achievable goals within that routine/schedule.
5. Practice regularly in achieving these small goals.
6. Reward yourself with something you enjoy (like coffee) after achieving each goal(s).
7. Make this process a habit - a habit is a routine or schedule, repeated actions over time.
8. Consistent effort (like consistently putting in the same amount of work every day/week).
9. Regularity and persistence are key to creating and building habits.
10. Finally, make it fun! 

Hope this helps!
