In [1]:
%pip install llama-index llama-index-llms-huggingface llama-index-embeddings-huggingface llama-index-readers-web transformers accelerate -q

Note: you may need to restart the kernel to use updated packages.


In [3]:
from llama_index.core import ServiceContext
from llama_index.core import VectorStoreIndex
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.prompts.base import PromptTemplate
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.readers.web import BeautifulSoupWebReader
from os import environ
import torch

In [4]:
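# Expose only the first AMD GPU to the process (HIP is the ROCm runtime).
environ["HIP_VISIBLE_DEVICES"] = "0"

# On ROCm builds of PyTorch, the torch.cuda API reports HIP devices.
use_cuda = torch.cuda.is_available()
if use_cuda:
    print('__CUDNN VERSION:', torch.backends.cudnn.version())
    print('__Number CUDA Devices:', torch.cuda.device_count())
    count = torch.cuda.device_count()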

__CUDNN VERSION: 3001000
__Number CUDA Devices: 1


In [5]:
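def messages_to_prompt(messages):
  # Render chat messages in the <|role|> ... </s> layout used for Zephyr.
  prompt = ""
  for message in messages:
    if message.role == 'system':
      prompt += f"<|system|>\n{message.content}</s>\n"
    elif message.role == 'user':
      prompt += f"<|user|>\n{message.content}</s>\n"
    elif message.role == 'assistant':
      prompt += f"<|assistant|>\n{message.content}</s>\n"

  # Ensure the prompt starts with a system turn; insert an empty one if needed.
  if not prompt.startswith("<|system|>\n"):
    prompt = "<|system|>\n</s>\n" + prompt

  # End with an open assistant turn so the model generates the reply.
  return prompt + "<|assistant|>\n"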
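
To make the chat layout concrete, here is a minimal standalone sketch that renders a single user message the same way. `Message` is a stand-in dataclass and `format_zephyr_prompt` is a hypothetical name; neither comes from llama-index, and the sketch assumes the prompt should begin with an empty system turn and end with an open assistant turn, matching the `query_wrapper_prompt` used in this notebook.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Stand-in for a chat message (assumption: only role and content matter)."""
    role: str
    content: str


def format_zephyr_prompt(messages):
    """Render messages in the <|role|> ... </s> layout used in this notebook."""
    prompt = ""
    for message in messages:
        # Roles other than these three are skipped.
        if message.role in ("system", "user", "assistant"):
            prompt += f"<|{message.role}|>\n{message.content}</s>\n"
    # Start with a (possibly empty) system turn.
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # Leave the assistant turn open so the model writes the reply.
    return prompt + "<|assistant|>\n"


example = format_zephyr_prompt([Message("user", "What is hard work?")])
print(example)
```

Printing the rendered string is a quick way to check that every turn is closed with `</s>` and that the final line is the open `<|assistant|>` tag.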

In [7]:
llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-alpha",
    tokenizer_name="HuggingFaceH4/zephyr-7b-alpha",
    query_wrapper_prompt=PromptTemplate("<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"use_safetensors": False},
    # tokenizer_kwargs={},
    generate_kwargs={"do_sample":True, "temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    device_map="cuda",
)

Downloading shards: 100%|██████████| 8/8 [06:28<00:00, 48.56s/it]
Loading checkpoint shards: 100%|██████████| 8/8 [00:11<00:00,  1.48s/it]


In [8]:
question = "How does Paul Graham recommend to work hard? Can you list it as steps"
response = llm.complete(question)
print(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Paul Graham's recommendations for working hard are as follows:

1. Work on what interests you: Choose a project or task that excites you and makes you want to put in extra effort.

2. Make it a habit: Set aside a specific time each day or week for working on your project. This will help make it a habit and a part of your daily routine.

3. Set goals: Set specific and measurable goals for what you want to achieve. This will help you stay focused and motivated.

4. Eliminate distractions: Remove any distractions that might prevent you from working effectively. This could mean turning off your phone, closing unnecessary tabs on your computer, or finding a quiet space to work.

5. Work smarter, not harder: Focus on working efficiently and effectively, rather than simply working longer hours. This will help you get more done in less time.

6. Take breaks: Take regular breaks to recharge your batteries and avoid burnout. This will help you stay focused and productive over the long term.

7. 

In [9]:
url = "https://paulgraham.com/hwh.html"

documents = BeautifulSoupWebReader().load_data([url])

In [11]:
context = documents[0].text
prompt = f"""Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: {context}

Question: {question}

Answer: """
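
The template above pastes the whole page into the prompt, which can exceed the model's `context_window`. The sketch below wraps the same template in a helper that trims an overlong context first. `build_grounded_prompt` and `max_context_chars` are hypothetical names, and the default of 12000 characters is an arbitrary placeholder; a more careful version would count tokens with the model's tokenizer rather than characters.

```python
def build_grounded_prompt(context, question, max_context_chars=12000):
    """Fill the context/question template, trimming an overlong context.

    max_context_chars is an illustrative character budget, not a token count.
    """
    if len(context) > max_context_chars:
        # Cut at the last space before the limit to avoid splitting a word.
        cut = context.rfind(" ", 0, max_context_chars)
        context = context[: cut if cut > 0 else max_context_chars]
    return (
        "Answer the question based on the context below. If the\n"
        "question cannot be answered using the information provided answer\n"
        'with "I don\'t know".\n\n'
        f"Context: {context}\n\n"
        f"Question: {question}\n\n"
        "Answer: "
    )


demo = build_grounded_prompt("alpha beta gamma", "Which letters appear?", max_context_chars=12)
print(demo)
```

With the notebook's variables, the call would be `build_grounded_prompt(documents[0].text, question)`, and the result can be passed to `llm.complete` as before.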

In [12]:
response = llm.complete(prompt)
print(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Paul Graham suggests the following steps to work hard:

1. Learn what real work is by distinguishing it from the distorted versions of it you may have encountered in school or in certain types of work.

2. Find the limit of working hard by noticing if the quality of the work decreases beyond a certain number of hours per day.

3. Constantly judge both how hard you're trying and how well you're doing, and aim toward the center of the problem, rather than merely pushing yourself to work.

4. Figure out which type of work you're suited for by understanding the shape of real work and which kind interests you, rather than just following your talents.

5. Figure out what to work on by being honest with yourself about your interests, rather than just following the current consensus about which problems are most important.

6. Give yourself time to get going when working on a new type of work, but also be prepared to switch fields if you're not getting good enough results.

7. Find what you fi