In [1]:
# Import transformer classes for generation
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer, BitsAndBytesConfig
# Import torch for datatype attributes
import torch

In [2]:
# Hugging Face repo ID of the Llama 2 based weights to load
name = "dwightf/berkshireGPT"
# Set auth token variable from hugging face


In [3]:
# Create model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(name, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(name)

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
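To give a rough sense of why the 4-bit config above matters, the sketch below estimates the memory needed for the weights alone at two precisions. The 7-billion parameter count is an assumption for illustration only (the checkpoint's size is not stated in this notebook), and the estimate ignores quantization constants, activations, and the KV cache.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Hypothetical parameter count, used only for illustration.
ASSUMED_PARAMS = 7_000_000_000

fp16_gib = weight_memory_gib(ASSUMED_PARAMS, 16)
nf4_gib = weight_memory_gib(ASSUMED_PARAMS, 4)

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")
print(f"4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the fp16 footprint, which is what makes a model of this class fit on a single consumer GPU.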

In [5]:
# Setup a prompt
prompt = """### Instruction:
You are a value investor giving your advice on stocks. And choosing whether to buy, sell, or hold them.

### Text:
Q - What information do you need to buy or sell a stock?

"""
# Pass the prompt to the tokenizer
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Setup the text streamer
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

In [6]:
# Actually run the thing
# max_new_tokens must be a finite integer; 1024 matches the limit used for the LlamaIndex wrapper below
output = model.generate(**inputs, streamer=streamer,
                        use_cache=True, max_new_tokens=1024)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


A - As a value investor, I consider several key factors before deciding whether to buy, sell, or hold a stock. These include:
1. Fundamental Analysis: This involves examining a company's financial statements, such as its income statement, balance sheet, and cash flow statement. I look for signs of strong revenue growth, healthy profit margins, low debt levels, and a solid return on equity.
2. Economic Moat: I assess the company's competitive advantage or "moat" to determine if it can sustain its profitability over time. This could be due to factors like brand recognition, patents, economies of scale, or efficient distribution channels.
3. Valuation: I analyze the stock's valuation metrics, such as price-to-earnings (P/E) ratio, price-to-book (P/B) ratio, and dividend yield. These ratios help me determine if the stock is undervalued or overvalued relative to its peers and industry averages.
4. Industry Trends: I evaluate the broader industry trends and outlook to understand how the comp

In [7]:
# Convert the output tokens back to text
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
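For a decoder-only model, the sequence returned by `generate` starts with the prompt tokens, so the decoded text includes the prompt as well as the answer. The sketch below shows the slicing idea on toy token IDs; `new_token_ids` is a hypothetical helper and the IDs are made up.

```python
def new_token_ids(output_ids: list[int], prompt_length: int) -> list[int]:
    """Drop the echoed prompt tokens and keep only the generated continuation."""
    return output_ids[prompt_length:]


# Toy token IDs standing in for inputs["input_ids"][0] and output[0].
toy_prompt_ids = [1, 835, 2799]
toy_output_ids = toy_prompt_ids + [319, 448, 2]

continuation = new_token_ids(toy_output_ids, len(toy_prompt_ids))
print(continuation)
```

In the notebook itself, the same idea would likely be `tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.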

In [8]:
# Import the prompt wrapper...but for llama index
from llama_index.prompts.prompts import SimpleInputPrompt
# Create a system prompt
system_prompt = """### Instruction:
You are a quant investor giving your advice on stocks. And choosing whether to buy, sell, or hold them.

### Text:

"""
# Throw together the query wrapper
query_wrapper_prompt = SimpleInputPrompt("### Text:\n{query_str} ")

In [9]:
# Complete the query prompt
query_wrapper_prompt.format(query_str='hello')

'### Text:\nhello '
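To preview what the model might receive once the system prompt and the query wrapper are combined, the sketch below assembles them with plain string formatting. It assumes the wrapper prepends the system prompt to the formatted query with a single space; check the installed `llama_index` version before relying on that detail. `assemble_prompt` is a hypothetical helper.

```python
SYSTEM_PROMPT = (
    "### Instruction:\n"
    "You are a quant investor giving your advice on stocks. "
    "And choosing whether to buy, sell, or hold them.\n"
    "\n"
    "### Text:\n"
    "\n"
)
QUERY_TEMPLATE = "### Text:\n{query_str} "


def assemble_prompt(query: str) -> str:
    # Assumption: the two pieces are joined with a single space.
    return f"{SYSTEM_PROMPT} {QUERY_TEMPLATE.format(query_str=query)}"


full_prompt = assemble_prompt("hello")
print(full_prompt)
print("'### Text:' headers:", full_prompt.count("### Text:"))
```

Under this assumption the `### Text:` header appears twice, once from each piece, so dropping it from one of them may give a cleaner prompt.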

In [10]:
# Import the llama index HF Wrapper
from llama_index.llms import HuggingFaceLLM
# Create a HF LLM using the llama index wrapper
llm = HuggingFaceLLM(context_window=6144,
                    max_new_tokens=1024,
                    system_prompt=system_prompt,
                    query_wrapper_prompt=query_wrapper_prompt,
                    model=model,
                    tokenizer=tokenizer)
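The two numbers passed above imply a budget for everything that goes into the prompt. As a rough sketch (assuming the space reserved for generation is simply subtracted from the context window), the remaining room for the system prompt, retrieved context, and the query is:

```python
CONTEXT_WINDOW = 6144  # from the HuggingFaceLLM call above
MAX_NEW_TOKENS = 1024  # from the HuggingFaceLLM call above


def prompt_token_budget(context_window: int, max_new_tokens: int) -> int:
    """Tokens left for the system prompt, retrieved context, and the query."""
    return context_window - max_new_tokens


budget = prompt_token_budget(CONTEXT_WINDOW, MAX_NEW_TOKENS)
print(budget)
```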

In [11]:
# Bring in embeddings wrapper
from llama_index.embeddings import LangchainEmbedding
# Bring in HF embeddings - need these to represent document chunks
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

In [12]:
# Create the embeddings instance (downloads the model on first use)
embeddings=LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
)

In [13]:
# Bring in stuff to change service context
from llama_index import set_global_service_context
from llama_index import ServiceContext

In [14]:
# Create new service context instance
service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
    embed_model=embeddings
)
# And set the service context
set_global_service_context(service_context)
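The `chunk_size` setting controls how documents are cut into pieces before they are embedded. The sketch below illustrates fixed-size chunking with overlap on a toy word list. It is a simplification: the real splitter counts tokenizer tokens and respects sentence boundaries, and both `chunk_words` and the overlap value are hypothetical.

```python
def chunk_words(words: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split a word list into windows of chunk_size that share `overlap` words."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


toy_words = [f"w{i}" for i in range(10)]
toy_chunks = chunk_words(toy_words, chunk_size=4, overlap=1)
for chunk in toy_chunks:
    print(chunk)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of some duplicated text.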

In [15]:
# Import deps to load documents
from llama_index import VectorStoreIndex, download_loader, SimpleDirectoryReader
from pathlib import Path

In [55]:
# Load every file in the local "pdfs" directory as documents
documents = SimpleDirectoryReader("pdfs").load_data()


In [56]:
# Create an index - we'll be able to query this in a sec
index = VectorStoreIndex.from_documents(documents)

In [57]:
# Setup index query engine using LLM
query_engine = index.as_query_engine()
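Conceptually, the query engine embeds the question, finds the most similar document chunks in the index, and hands them to the LLM as context. The sketch below shows only the similarity-ranking step, using hand-made 3-dimensional vectors; the chunk labels, vectors, and `k` are all invented for illustration, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int) -> list[str]:
    """Return the labels of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda label: cosine_similarity(query_vec, chunk_vecs[label]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors standing in for embedded document chunks.
toy_chunk_vectors = {
    "dividend policy": [0.9, 0.1, 0.0],
    "sales guidance": [0.1, 0.9, 0.1],
    "legal notice": [0.0, 0.1, 0.9],
}
toy_query_vector = [0.8, 0.3, 0.0]

best = top_k(toy_query_vector, toy_chunk_vectors, k=2)
print(best)
```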

In [60]:
# Test out a query in natural language
response = query_engine.query("These documents are for Hyundai. What are the pros and cons of their forecast?")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [61]:
print(response)


The provided context mainly outlines Hyundai Motor Company's 2024 Annual Guidance Update. It offers insights into their financial projections and plans for the upcoming year. However, to analyze the pros and cons of this forecast, we would need more specific details about their targets, goals, and potential challenges.

From the given information, we can identify some key points:

Pros:
1. Revenue growth through increased wholesale units (4.24 million units).
2. Investment plan, which may lead to future growth and expansion.
3. Quarterly dividend distribution starting from 2023 Q2, indicating a commitment to shareholder returns.
4. Cancellation of 1% of treasury stock annually for the next three years (2024-2026), potentially improving the company's capital structure.
5. Announced annual dividend payout ratio of 25% or above, suggesting a consistent return to shareholders.

Cons (or potential risks/challenges):
1. The document does not provide specific financial targets for earnings, 