# Running an LLM on your own laptop
In this notebook, we're going to learn how to run a Hugging Face LLM on our own machine.

## Download the LLM
We're going to download each of the model's files from the Hugging Face Hub with `hf_hub_download`, which saves them in a local cache directory.

In [13]:
import os
from huggingface_hub import hf_hub_download

In [14]:
# Read the access token from the environment; this is None if the variable isn't set
HUGGING_FACE_API_KEY = os.environ.get("HUGGING_FACE_API_KEY")

In [15]:
model_id = "lmsys/fastchat-t5-3b-v1.0"
filenames = [
    "pytorch_model.bin", "added_tokens.json", "config.json", "generation_config.json",
    "special_tokens_map.json", "spiece.model", "tokenizer_config.json"
]

In [16]:
for filename in filenames:
    downloaded_model_path = hf_hub_download(
        repo_id=model_id,
        filename=filename,
        token=HUGGING_FACE_API_KEY
    )
    print(downloaded_model_path)

/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/pytorch_model.bin
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/added_tokens.json
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/config.json
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/generation_config.json
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/special_tokens_map.json
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf1f18a81024/spiece.model
/Users/markhneedham/.cache/huggingface/hub/models--lmsys--fastchat-t5-3b-v1.0/snapshots/0b1da230a891854102d749b93f7ddf
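
The paths printed above follow the Hub cache layout: a `models--<org>--<name>` directory, then `snapshots`, then the commit hash of the revision, then the file name. As a minimal sketch, the helper below splits one of those paths back into its parts. `describe_cached_file` is a hypothetical name of our own, not part of `huggingface_hub`, and it assumes the repository name itself doesn't contain `--`.

```python
from pathlib import PurePosixPath


def describe_cached_file(path: str) -> dict:
    """Split a Hugging Face cache path into repo id, revision, and file name.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    parts = PurePosixPath(path).parts
    # The repository directory is named "models--<org>--<name>"
    repo_dir = next(p for p in parts if p.startswith("models--"))
    # The directory right after "snapshots" is the commit hash of the revision
    snapshots_idx = parts.index("snapshots")
    return {
        "repo_id": "/".join(repo_dir.split("--")[1:]),
        "revision": parts[snapshots_idx + 1],
        "filename": "/".join(parts[snapshots_idx + 2:]),
    }


# One of the paths printed by the download cell above
example = (
    "/Users/markhneedham/.cache/huggingface/hub/"
    "models--lmsys--fastchat-t5-3b-v1.0/snapshots/"
    "0b1da230a891854102d749b93f7ddf1f18a81024/config.json"
)
info = describe_cached_file(example)
print(info)
```

Knowing the layout makes it easier to check by hand which files we already have locally before we go offline.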

## Run the LLM
Now let's try running the model. Before we do that, let's disable the Wi-Fi so we can be sure that everything is loaded from the local cache.

In [17]:
from utils import check_connectivity, toggle_wifi
import time

print(check_connectivity())
toggle_wifi("off")
time.sleep(0.5)
print(check_connectivity())

You have Network Connectivity (IP Address: 192.168.86.249)
WiFi disabled.
No Network Connectivity (IP Address: 127.0.0.1)
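
The `utils` module isn't shown in this notebook, so here's a rough sketch of how a connectivity check like this could be written with the standard library. Everything in it is an assumption on our part: the function name `check_connectivity_sketch`, the probe address `8.8.8.8`, and the approach of reading the local IP address from a UDP socket. The real `check_connectivity` may work differently. Keep in mind that having a non-loopback IP address only tells us that a network interface is up, not that the internet is reachable.

```python
import socket


def check_connectivity_sketch(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Report the local IP address, treating loopback as 'no connectivity'.

    Hypothetical stand-in for utils.check_connectivity, for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only picks the
        # outgoing interface, whose address we can then read back
        sock.connect((probe_host, probe_port))
        ip_address = sock.getsockname()[0]
    except OSError:
        # No route to the probe address: fall back to loopback
        ip_address = "127.0.0.1"
    finally:
        sock.close()

    if ip_address.startswith("127."):
        return f"No Network Connectivity (IP Address: {ip_address})"
    return f"You have Network Connectivity (IP Address: {ip_address})"


message = check_connectivity_sketch()
print(message)
```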


In [18]:
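from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers import pipeline as hf_pipeline  # aliased so the `pipeline` variable below doesn't shadow it

# Load the tokenizer and model; with the Wi-Fi off, the files come from the local cache
tokenizer = AutoTokenizer.from_pretrained(model_id, legacy=False)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# device=-1 runs the pipeline on the CPU
pipeline = hf_pipeline("text2text-generation", model=model, device=-1, tokenizer=tokenizer, max_length=1000)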

'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: f38c0ab3-254b-48dd-9ca8-8b65f92bdb17)')' thrown while requesting HEAD https://huggingface.co/lmsys/fastchat-t5-3b-v1.0/resolve/main/tokenizer_config.json
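
The message above is a timed-out request to huggingface.co, yet the cells that follow still run, so the files were picked up from the cache anyway. If we'd rather skip the network check altogether, one option is to pass `local_files_only=True` to `from_pretrained`, and another is to set the `HF_HUB_OFFLINE` environment variable. The sketch below shows both. We haven't run it in this notebook, `load_from_cache` is a hypothetical helper name, and the environment variable is read when the libraries are imported, so it only takes effect if it's set before `huggingface_hub` or `transformers` is first imported (in this notebook that would mean restarting the kernel).

```python
import os

# Ask the Hugging Face libraries not to make network calls.
# This has to be set before huggingface_hub / transformers are first imported.
os.environ["HF_HUB_OFFLINE"] = "1"


def load_from_cache(model_id: str):
    """Load tokenizer and model from the local cache only (hypothetical helper)."""
    # Imported lazily so that defining this function doesn't load the library
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # local_files_only=True raises an error instead of contacting the Hub
    # when a file is missing from the cache
    tokenizer = AutoTokenizer.from_pretrained(
        model_id, legacy=False, local_files_only=True
    )
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id, local_files_only=True)
    return tokenizer, model


# Usage (needs the model files from the download step to be in the cache):
# tokenizer, model = load_from_cache("lmsys/fastchat-t5-3b-v1.0")
```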


In [19]:
pipeline("What are competitors to Apache Kafka?")

[{'generated_text': '<pad> Apache Kafka is a popular open source message broker that is used for streaming and aggregating data from multiple sources. Some of its competitors include Apache Spark, Apache Storm, Apache Flume, and Apache Flink.'}]

In [20]:
pipeline("""My name is Mark.
I have brothers called David and John and my best friend is Michael.
Using only the context above. Do you know if I have a sister?    
""")

[{'generated_text': '<pad> No,  I  do  not  have  a  sister.\n'}]
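
The raw outputs above start with a `<pad>` token, and the last one has doubled spaces between words. If we want to show the text to a user, we could tidy it up first. Here's a minimal sketch of that, assuming we only need to drop the `<pad>` marker and collapse runs of whitespace. `clean_generated_text` is a hypothetical helper of our own, not something the pipeline provides.

```python
import re


def clean_generated_text(text: str) -> str:
    """Remove the <pad> marker and collapse repeated whitespace (hypothetical helper)."""
    text = text.replace("<pad>", "")
    # Collapse any run of spaces or newlines into a single space
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# The output of the last cell, copied from above
raw_outputs = [
    {'generated_text': '<pad> No,  I  do  not  have  a  sister.\n'},
]
cleaned = [clean_generated_text(item["generated_text"]) for item in raw_outputs]
print(cleaned)
```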