In [1]:
%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
    !pip install --no-deps unsloth

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


The path below points to the folder that contains the adapter weights. Set it to the location where you saved them during fine-tuning.

In [3]:
SAVE_PATH = "/content/drive/MyDrive/HPML_Project/copy_unsloth_a100_8"

# Make sure this path matches the folder used when saving the adapter.
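A wrong path usually surfaces only later, when the adapter is loaded. The sketch below checks the folder up front. The helper name `check_adapter_dir` is hypothetical, and the file names are an assumption based on what PEFT conventionally writes when saving an adapter; adjust them if your save step produced something different.

```python
from pathlib import Path

# Assumed file names: what PEFT conventionally writes when an adapter is saved.
CONFIG_NAME = "adapter_config.json"
WEIGHT_NAMES = ("adapter_model.safetensors", "adapter_model.bin")


def check_adapter_dir(path: str) -> list[str]:
    """Return a list of problems; an empty list means the folder looks usable."""
    folder = Path(path)
    if not folder.is_dir():
        return [f"{path} is not a directory"]

    problems = []
    if not (folder / CONFIG_NAME).is_file():
        problems.append(f"missing {CONFIG_NAME}")
    if not any((folder / name).is_file() for name in WEIGHT_NAMES):
        problems.append("missing adapter weights (" + " or ".join(WEIGHT_NAMES) + ")")
    return problems


# In the notebook, after SAVE_PATH is defined:
# print(check_adapter_dir(SAVE_PATH) or "adapter folder looks complete")
```

The check only confirms that the expected files exist; it does not validate their contents.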

In [4]:
# inference.py

import torch
from unsloth import FastLanguageModel
from peft import PeftModel


# Load the 4-bit quantized base model
base, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/Llama-3.2-1B-bnb-4bit",  # base model
    max_seq_length = 128,
    dtype          = None,                   # None = auto-detect
    load_in_4bit   = True,
    device_map     = "auto",
)

# Use EOS as the padding token and enable the KV cache for generation
tokenizer.pad_token = tokenizer.eos_token
base.config.pad_token_id = tokenizer.pad_token_id
base.config.use_cache    = True

# Note: prepare_model_for_kbit_training() is a training-time helper
# (it sets up gradient checkpointing and input gradients), so it is not
# called here; this notebook only runs inference.

# Attach the fine-tuned LoRA adapter to the base model
model = PeftModel.from_pretrained(
    base,
    SAVE_PATH,                  # folder where you saved adapters + tokenizer
    device_map="auto",          # place on the available GPU automatically
)

# Enable Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)


def answer(prompt: str,
           max_new_tokens: int = 128,
           temperature: float = 0.2,
           top_p: float = 0.7,
           repetition_penalty: float = 1.2,
           no_repeat_ngram_size: int = 3):
    """Generate a completion for `prompt` and return only the new text."""
    # 1) tokenize the prompt and move it to the model's device
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=512
    ).to(model.device)

    input_ids = inputs["input_ids"]

    # 2) generate with anti-repetition tweaks
    #    (early_stopping only affects beam search, so it is not passed here)
    outputs = model.generate(
        **inputs,
        max_new_tokens       = max_new_tokens,
        temperature          = temperature,
        top_p                = top_p,
        do_sample            = True,
        repetition_penalty   = repetition_penalty,
        no_repeat_ngram_size = no_repeat_ngram_size,
        eos_token_id         = tokenizer.eos_token_id,
        pad_token_id         = tokenizer.pad_token_id,
    )

    # 3) strip off prompt tokens and decode only the new ones
    gen_ids = outputs[0][input_ids.shape[-1]:]
    return tokenizer.decode(gen_ids, skip_special_tokens=True).strip()




🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.4.7: Fast Llama patching. Transformers: 4.51.3.
   \\   /|    NVIDIA A100-SXM4-40GB. Num GPUs = 1. Max memory: 39.557 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


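The `answer` function above samples with `temperature = 0.2` and `top_p = 0.7`. To make the effect of those two settings concrete, here is a simplified, library-free sketch of temperature scaling followed by nucleus (top-p) filtering over a toy list of logits. The function name `top_p_filter` is hypothetical, and this is an illustration of the idea, not the code that `generate` actually runs.

```python
import math


def top_p_filter(logits, temperature=0.2, top_p=0.7):
    """Return {token_index: probability} for tokens that survive top-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; sampling would draw from these.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.0]
print(top_p_filter(toy_logits, temperature=1.0, top_p=0.7))  # two candidates survive
print(top_p_filter(toy_logits, temperature=0.2, top_p=0.7))  # only the top token survives
```

With the low temperature used in this notebook, almost all probability mass sits on the top token for these toy logits, so sampling behaves close to greedy decoding.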

In [5]:
print(answer("How is starbucks stock doing?"))

Starbucks (SBUX) shares are trading higher today after the company reported better-than-expected first-quarter earnings and raised its full-year profit outlook. The coffee giant also announced a $1 billion share buyback program, which could boost investor confidence in SBUX's future performance.
The news was welcomed by Wall Street analysts who have been cautious about their forecasts for this iconic brand of American fast food due to macroeconomic challenges like inflation and rising interest rates. However, they remain optimistic that Starbucks can navigate these headwinds with strategic initiatives such as expanding into new markets or enhancing digital capabilities while maintaining strong customer loyalty through consistent product innovation and pricing strategies


In [6]:
print(answer("What is the best performing stock?"))


What stocks are cheap and what should you buy now to make a profit in 2025?
The market has been volatile since President Trump’s election. The S&P500 index fell by more than 10% from its peak on Election Day, but it recovered quickly as investors rushed into high-quality companies that were expected to benefit from economic growth under his administration.
However, this recovery was short-lived: In early February, fears of recession began to grow after China announced tariffs on US goods worth $162 billion. This led to an unexpected sell-off across all major indices – including those with strong fundamentals like Apple (AAPL), Microsoft(MS


In [7]:
print(answer("What is the news on Lockheed Martin?"))


The company has been in a bit of an uproar lately, with investors taking to social media platforms like Twitter and Reddit to express their frustration over what they see as poor leadership. This latest episode began when CEO Marillyn Hewson announced that she would be stepping down from her position at the end of 2025 after more than two decades leading one of America’s largest defense contractors.
The announcement came just days before another major event for Lockheed: its annual shareholder meeting was scheduled for April 24th this year. However, it quickly became clear that there were some significant issues brewing behind closed doors between management and shareholders alike – including concerns


In [8]:
print(answer("What is a stock option?"))


A stock options are the right to buy or sell shares of a company at an agreed-upon price on a specific date. They can be used as compensation for employees, executives and other key personnel in companies that go public.
How do you use Stock Options?
Stock Option Basics: What Are The Benefits Of Using Them In Your Business?  The most common way people invest their money today involves buying stocks from major corporations like Apple (AAPL), Microsoft (MSFT) and Amazon.com (AMZN). These three tech giants have been around since before anyone could remember – they were founded decades ago! But what if there was another type


#### Timing the inference

In [9]:
def benchmark(prompt: str,
           max_new_tokens: int = 128,
           temperature: float    = 0.2,
           top_p: float          = 0.7,
           repetition_penalty: float = 1.2,
           no_repeat_ngram_size: int = 3):
    """Like answer(), but also prints the latency of one timed generate() call."""
    # 1) tokenize
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=512
    ).to(model.device)

    input_ids = inputs["input_ids"]
    starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)

    # Sampling settings shared by the warm-up and the timed run
    # (early_stopping only affects beam search, so it is not passed here)
    gen_kwargs = dict(
        max_new_tokens       = max_new_tokens,
        temperature          = temperature,
        top_p                = top_p,
        do_sample            = True,
        repetition_penalty   = repetition_penalty,
        no_repeat_ngram_size = no_repeat_ngram_size,
        eos_token_id         = tokenizer.eos_token_id,
        pad_token_id         = tokenizer.pad_token_id,
    )

    # Warm-up: three untimed runs so one-time setup costs stay out of the measurement
    with torch.no_grad():
        for _ in range(3):
            model.generate(**inputs, **gen_kwargs)

    # 2) generate with anti-repetition tweaks, timed with CUDA events
    starter.record()
    with torch.no_grad():
        outputs = model.generate(**inputs, **gen_kwargs)
    ender.record()
    torch.cuda.synchronize()  # wait for the GPU to finish before reading the timer
    print("Latency:", starter.elapsed_time(ender), "ms")

    # 3) strip off prompt tokens and decode only the new ones
    gen_ids = outputs[0][input_ids.shape[-1]:]
    return tokenizer.decode(gen_ids, skip_special_tokens=True).strip()
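The latency printed by `benchmark` covers the whole `generate` call, so it depends on how many tokens were actually produced (generation can stop early at EOS). Normalizing by the number of new tokens makes runs easier to compare. The helper below is a hypothetical sketch, and the numbers in the example are illustrative rather than measured values.

```python
def throughput(latency_ms: float, new_tokens: int) -> dict:
    """Convert a total latency and a generated-token count into per-token figures."""
    if latency_ms <= 0 or new_tokens <= 0:
        raise ValueError("latency_ms and new_tokens must be positive")
    return {
        "ms_per_token": latency_ms / new_tokens,
        "tokens_per_s": new_tokens * 1000.0 / latency_ms,
    }


# Illustrative numbers only: a 4000 ms run that produced 128 new tokens.
print(throughput(4000.0, 128))
```

Inside `benchmark`, the inputs would be `starter.elapsed_time(ender)` for the latency and `gen_ids.shape[-1]` for the token count.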



In [10]:
print(benchmark("How is starbucks stock doing?"))

Latency: 3893.335205078125 ms
Starbucks (SBUX) shares are trading at $80.50, up 0.9% as of this writing on April 2nd after the company reported better-than-expected first-quarter results and issued a strong outlook for fiscal year 2025. The consensus estimate was for earnings per share to be $1.31 while sales were expected to come in at $17 billion. Shares have risen more than 20% over the past six months.
The news helped boost investor sentiment toward SBUX despite some concerns about its competitive position against Amazon.com’s (AMZN), which has been expanding into coffee shops with new stores opening every


In [11]:
print(benchmark("What is the best performing stock?"))

Latency: 3860.179931640625 ms
What are some of the most popular stocks to invest in right now?
The answer depends on your perspective. If you’re a value investor, then it’s probably going to be companies that have been trading at low prices for years and which offer good long-term growth potential.
If you prefer momentum investing instead, then there will likely be more exciting opportunities available today because so many investors were caught off guard by last week’s market crash (and still haven’t recovered). However, if history repeats itself again next time around – meaning another bear market followed by an even bigger one later down the road–then those same old favorites might not hold up as


In [12]:
print(benchmark("What is the news on Lockheed Martin?"))


Latency: 3933.564697265625 ms
The company has been in a bit of an uproar lately, with investors concerned about its future. On Tuesday morning, however, things took a turn for the better when it announced that it would be buying back $1 billion worth of stock from shareholders at 50 cents per share.
This move comes after months of speculation and rumors surrounding what was seen as a potential buyout by private equity firm Blackstone Group Inc (NYSE: BX). However, this latest announcement suggests something more concrete than just speculation – one analyst believes there could actually be value behind these talks! In fact, he thinks they might even help boost shares over time!
The


In [13]:
print(benchmark("What is a stock option?"))

Latency: 3901.0634765625 ms
What are the benefits of options trading?
Options Trading: A Beginner’s Guide to Options and How They Can Help You Make Money in Stocks
Stocks, bonds, commodities – these three major asset classes have been at the forefront of financial markets for decades. But what if you could add another layer of complexity by taking advantage of an additional market that has historically underperformed but can now be used as a hedge against inflation or other economic risks? That's where options come into play.
An Option Is Like Buying a Share of Stock - Here Are 5 Ways It Helps Investors Today (And Tomorrow) An Option Is like buying a share of
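Each latency above comes from a single timed run, and the four values range from about 3860 ms to 3934 ms. To get a feel for run-to-run noise, the measurement can be repeated and summarized. The following is a minimal sketch using wall-clock time from the standard library rather than CUDA events; the helper name `time_repeated` and its parameters are hypothetical.

```python
import statistics
import time


def time_repeated(fn, warmup=3, repeats=5, sync=None):
    """Run fn() `warmup` times untimed, then `repeats` times timed; return ms stats."""
    def wait():
        # Optional device barrier, e.g. torch.cuda.synchronize on a GPU.
        if sync is not None:
            sync()

    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        wait()
        start = time.perf_counter()
        fn()
        wait()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": len(samples),
        "mean_ms": statistics.mean(samples),
        "stdev_ms": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min_ms": min(samples),
    }


# In the notebook (assuming `answer` and `torch` are already defined):
# time_repeated(lambda: answer("What is a stock option?"), sync=torch.cuda.synchronize)
```

Because sampling is enabled, each run may generate a different number of tokens, so some of the spread reflects output length rather than hardware noise.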
