## Running Llama 3.1 on Mac, Windows or Linux
This notebook shows how to set up and run Llama 3.1 locally on Mac, Windows or Linux using [Ollama](https://ollama.com/).

### Steps at a glance:
1. Download and install Ollama.
2. Download and test run Llama 3.1.
3. Use local Llama 3.1 via Python.
4. Use local Llama 3.1 via LangChain.


#### 1. Download and install Ollama

On Mac or Windows, go to the [Ollama download page](https://ollama.com/download), select your platform, and download the installer. Then double-click the downloaded file to install Ollama.

On Linux, run `curl -fsSL https://ollama.com/install.sh | sh` in a terminal to download and install Ollama.
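Before moving on, it can help to confirm that the install worked. The sketch below is one possible check, not part of Ollama itself: the helper names `ollama_installed` and `ollama_running` are made up for illustration, and it assumes the server listens on the default address `http://localhost:11434` used later in this notebook.

```python
import shutil
import urllib.error
import urllib.request

# Assumption: Ollama's default local address, the same one used later in this notebook.
OLLAMA_URL = "http://localhost:11434"

def ollama_installed() -> bool:
    """Return True if an `ollama` executable is found on the PATH."""
    return shutil.which("ollama") is not None

def ollama_running(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if a server answers with HTTP 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors
        return False
```

Calling `ollama_installed()` and `ollama_running()` right after installation should both return `True` once the Ollama app or service has started. If the second one returns `False`, the server is probably not running yet.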

#### 2. Download and test run Llama 3.1

In a terminal or console, run `ollama pull llama3.1` to download the Llama 3.1 8b chat model in 4-bit quantized format (about 4.7 GB).

Run `ollama pull llama3.1:70b` to download the Llama 3.1 70b chat model, also in 4-bit quantized format (about 39 GB).
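The download sizes above can be sanity-checked with a little arithmetic. The sketch below assumes nominal parameter counts of 8 billion and 70 billion (round figures, not exact counts) and uses two illustrative helper functions. It shows that pure 4-bit weights would come to about 4 GB and 35 GB, so the actual downloads work out to somewhat more than 4 bits per weight. One plausible reason is that some tensors and metadata are stored at higher precision.

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Size in GB (10^9 bytes) of n_params weights stored at bits_per_weight each."""
    return n_params * bits_per_weight / 8 / 1e9

def effective_bits_per_weight(size_gb: float, n_params: float) -> float:
    """Average number of bits per weight implied by a file of size_gb."""
    return size_gb * 1e9 * 8 / n_params

# Assumption: nominal parameter counts, rounded to 8e9 and 70e9.
for name, n_params, download_gb in [("8b", 8e9, 4.7), ("70b", 70e9, 39.0)]:
    ideal = quantized_size_gb(n_params, 4)
    bits = effective_bits_per_weight(download_gb, n_params)
    print(f"{name}: pure 4-bit = {ideal:.1f} GB, download = {download_gb} GB "
          f"(~{bits:.2f} bits per weight)")
```

The same estimate is a quick way to judge whether a model will fit in memory: the 70b download alone is larger than the 32GB RAM of the machine mentioned below.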

Then run `ollama run llama3.1` and ask Llama 3.1 questions such as "who wrote the book godfather?" or "who wrote the book godfather? answer in one sentence." You can also try `ollama run llama3.1:70b`, but inference will most likely be too slow for interactive use. For example, on an Apple M1 Pro with 32GB RAM, Llama 3.1 70b chat takes over 10 seconds to generate one token, whereas Llama 3.1 8b chat generates over 10 tokens per second.

You can also test Llama 3.1 8b chat through Ollama's local HTTP API with the following command:
```
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "who wrote the book godfather?"
    }
  ],
  "stream": false
}'
```

The complete Ollama API doc is [here](https://github.com/ollama/ollama/blob/main/docs/api.md).
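To measure generation speed on your own machine, such as the tokens-per-second figures quoted earlier, you can use the timing fields in a non-streaming `/api/chat` response. The API doc describes `eval_count` as the number of generated tokens and `eval_duration` as the generation time in nanoseconds. The helper `tokens_per_second` below is an illustrative name, and the sample values are made up rather than measured.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed computed from a non-streaming /api/chat response body."""
    seconds = response["eval_duration"] / 1e9  # eval_duration is in nanoseconds
    return response["eval_count"] / seconds

# Made-up sample values for illustration only (not a real measurement):
sample = {"eval_count": 120, "eval_duration": 10_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Pass the parsed JSON body of a real response (for example, the output of the curl command above) to get the figure for your hardware.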

#### 3. Use local Llama 3.1 via Python

The Python code below is a port of the curl command above.

In [1]:
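import requests

url = "http://localhost:11434/api/chat"

def llama3(prompt):
    # Same request body as the curl command above
    data = {
        "model": "llama3.1",
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ],
        "stream": False  # return one complete JSON object instead of a stream
    }

    # json=data serializes the body and sets the Content-Type header
    response = requests.post(url, json=data)
    response.raise_for_status()  # fail loudly on HTTP errors, e.g. an unknown model name

    # The reply text is in the "content" field of the returned "message"
    return response.json()["message"]["content"]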

In [2]:
response = llama3("who wrote the book godfather")
print(response)

The book "The Godfather" was written by Mario Puzo, an American author. The novel was published in 1969 and it's a fictional story about the Corleone crime family, but it's heavily influenced by real-life events and figures from the Italian-American Mafia.

Mario Puzo is credited with creating the iconic characters of Don Vito Corleone (also known as "The Godfather") and his children, particularly Michael Corleone. The novel was a massive success, and it won the Pulitzer Prize for Fiction in 1973.

Puzo's book was later adapted into the famous film directed by Francis Ford Coppola, which became one of the greatest films of all time. The movie starred Marlon Brando as Don Vito Corleone, Al Pacino as Michael Corleone, and James Caan as Sonny Corleone.

Mario Puzo wrote two more books in the Godfather series: "The Sicilian" (1984) and "The Fourth K" (1992). However, his later works did not quite match the impact of his first novel.
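The function above waits for the whole answer because it sets `"stream": False`. With `"stream": True`, Ollama instead returns one JSON object per line, each carrying a fragment of the reply, and the last one has `"done": true`. The following is a minimal sketch of reading that stream. The names `iter_chunks` and `stream_llama3` are illustrative, and `requests` is imported inside the function only so that the parsing helper can be used on its own.

```python
import json

def iter_chunks(lines):
    """Yield the text fragment from each streamed JSON line, stopping at "done"."""
    for line in lines:
        if not line:
            continue  # skip blank keep-alive lines
        part = json.loads(line)
        yield part.get("message", {}).get("content", "")
        if part.get("done"):
            break

def stream_llama3(prompt, url="http://localhost:11434/api/chat"):
    """Print the reply fragment by fragment as the model generates it."""
    import requests  # imported here so iter_chunks works without requests installed

    data = {
        "model": "llama3.1",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    with requests.post(url, json=data, stream=True) as response:
        response.raise_for_status()
        for chunk in iter_chunks(response.iter_lines()):
            print(chunk, end="", flush=True)
    print()
```

Calling `stream_llama3("who wrote the book godfather?")` should then print the answer incrementally, which feels more responsive on slower hardware.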


#### 4. Use local Llama 3.1 via LangChain

In [3]:
%pip install langchain langchain_community

Note: you may need to restart the kernel to use updated packages.


In [4]:
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0)
response = llm.invoke("who wrote the book godfather?")
print(response.content)


The novel "The Godfather" was written by Mario Puzo. It was published in 1969 and became a huge success, leading to the famous film adaptation directed by Francis Ford Coppola in 1972.

Mario Puzo (1920-1999) was an American author of Italian descent, born in New York City's Little Italy neighborhood. He wrote several novels, but "The Godfather" remains his most famous and enduring work. The book is a sprawling epic that explores the world of organized crime, family loyalty, and the American Dream.

Interestingly, Puzo also co-wrote the screenplay for the film adaptation with Coppola, which won several Academy Awards, including Best Picture in 1973.

Puzo went on to write two more novels set in the same universe: "The Last Don" (1996) and "Omertà" (2000), both of which were also adapted into films.
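The answer above is fairly long. One way to get the shorter, "answer in one sentence" style mentioned in step 2 is to pass a system message along with the question. The sketch below assumes that `ChatOllama.invoke` accepts a list of `(role, content)` tuples, as LangChain chat models generally do. The helper names `build_messages` and `ask` are made up for illustration.

```python
def build_messages(question, system="Answer in one sentence."):
    """Build a LangChain-style message list as (role, content) tuples."""
    return [("system", system), ("human", question)]

def ask(question):
    """Send the question, with the system instruction, to the local Llama 3.1."""
    # Imported here so build_messages can be used without LangChain installed
    from langchain_community.chat_models import ChatOllama

    llm = ChatOllama(model="llama3.1", temperature=0)
    return llm.invoke(build_messages(question)).content
```

With Ollama running, `print(ask("who wrote the book godfather?"))` should return a much shorter reply than the one above, though the exact wording depends on the model.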
