<a href="https://colab.research.google.com/github/Amir-Fasil/Prompt_Vs_finetuning/blob/main/Zero_shot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install langchain transformers torch sentence-transformers langchain-community

In [None]:
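# Import the wrapper from langchain-community (installed above); the langchain.llms path is deprecated
from langchain_community.llms import HuggingFacePipeline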
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from google.colab import userdata
from huggingface_hub import login
import os
import torch

**Log in to Hugging Face and initialize the Llama 2 model and tokenizer**

In [None]:
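# Read the Hugging Face token from Colab secrets; the Llama 2 weights are gated and require an authenticated account
hf_token = userdata.get('HF_TOKEN')
login(token=hf_token, add_to_git_credential=True)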

In [None]:
# Initialize the Llama 2 model and tokenizer
def load_llama2_model():
    """Load Llama 2 7B chat in float16 and wrap it as a LangChain LLM."""
    model_name = "meta-llama/Llama-2-7b-chat-hf"

    # Load tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Load model
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        torch_dtype=torch.float16,
    )

    # Create text generation pipeline
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=512,
        do_sample=True,  # make sampling explicit so temperature and top_p apply
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.15,
    )

    # Wrap in LangChain HuggingFacePipeline
    llm = HuggingFacePipeline(pipeline=pipe)

    return llm

llm = load_llama2_model()
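
The `temperature` and `top_p` arguments passed to the pipeline control how the next token is sampled. The following is a rough pure-Python illustration of the two ideas, not the transformers implementation: the function names and the toy logits are assumptions made for this sketch. Temperature rescales the logits before the softmax, and top-p (nucleus) filtering keeps the smallest set of most likely tokens whose cumulative probability reaches `top_p`.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.95):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1
    return {i: probs[i] / total for i in kept}


# Toy example with hypothetical logits for a four-token vocabulary
probs = softmax_with_temperature([2.0, 1.0, 0.5, -1.0], temperature=0.7)
nucleus = top_p_filter(probs, top_p=0.95)
print(nucleus)
```

With `temperature=0.7` the distribution is sharper than at 1.0, so fewer tokens are needed to reach the `top_p` mass and the output is less varied.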

**Zero-shot prompt**

In [None]:
def create_zero_shot_prompt():
    """Build a zero-shot prompt in the Llama 2 chat format ([INST] and <<SYS>> markers)."""
    template = """<s>[INST] <<SYS>>
    You are an AI assistant. Answer the question based on the context below. If you can't answer the question,
    reply 'I don't know' instead of making up an answer.
    <</SYS>>

    {question} [/INST]"""

    return PromptTemplate(
        input_variables=["question"],
        template=template
    )

prompt_template = create_zero_shot_prompt()
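
Because the template is an indented triple-quoted string, the four leading spaces on each line become part of the prompt the model receives. As a minimal sketch, the same Llama 2 chat layout can be assembled without the indentation. The helper name `build_llama2_prompt` and the constant `SYSTEM_PROMPT` are hypothetical, and the sketch omits the leading `<s>` on the assumption that the tokenizer adds the beginning-of-sequence token itself.

```python
# Hypothetical constant holding the system instruction for this sketch
SYSTEM_PROMPT = (
    "You are an AI assistant. If you can't answer the question, "
    "reply 'I don't know' instead of making up an answer."
)


def build_llama2_prompt(question: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble a single-turn Llama 2 chat prompt with no stray indentation."""
    return (
        f"[INST] <<SYS>>\n{system_prompt.strip()}\n<</SYS>>\n\n"
        f"{question.strip()} [/INST]"
    )


print(build_llama2_prompt("What is the capital of France?"))
```

The resulting string could be used as the `template` of a `PromptTemplate` by passing the literal `{question}` placeholder as the question.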

**Wrapping the prompt and model in a chain**

In [None]:
# Create a zero-shot chain
def create_zero_shot_chain(llm, prompt_template):
    return LLMChain(
        llm=llm,
        prompt=prompt_template,
        verbose=True,  # Set to True to see the prompt and response details
    )

zero_shot_chain = create_zero_shot_chain(llm, prompt_template)
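
By default a text-generation pipeline returns the prompt followed by the completion, so the chain's answer can begin with an echo of the formatted prompt. A minimal sketch of a post-processing step, assuming the prompt ends with the `[/INST]` marker used in the template; the helper name `strip_echoed_prompt` is hypothetical.

```python
def strip_echoed_prompt(generated_text: str, marker: str = "[/INST]") -> str:
    """Return only the model's completion, dropping an echoed Llama 2 prompt."""
    _before, found, completion = generated_text.partition(marker)
    # If the marker is absent, assume nothing was echoed and keep the text as-is
    return completion.strip() if found else generated_text.strip()


raw = "<s>[INST] <<SYS>>\nsystem text\n<</SYS>>\n\nWhat is the capital of France? [/INST]  Paris."
print(strip_echoed_prompt(raw))
```

The helper can be applied to each response before printing it. Passing `return_full_text=False` when creating the transformers pipeline is another option, although how the LangChain wrapper handles the prompt may vary between versions.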

**Execution**

In [None]:
def main(zero_shot_chain):
    # Example questions
    questions = [
        "Explain the concept of gravitational waves in simple terms.",
        "What is the capital of France?",
        "How do I bake a chocolate cake without eggs?",
        "What is the meaning of life?",
        "Who won the 2024 Indonesian presidential election?",
        "Which team won Super Bowl LVIII in 2024?",
    ]

    # Get answers
    for question in questions:
        print(f"\nQuestion: {question}")
        # invoke() replaces the deprecated run(); LLMChain returns its output under the "text" key
        response = zero_shot_chain.invoke({"question": question})["text"]
        print(f"Answer: {response}")


main(zero_shot_chain)


Question: Explain the concept of gravitational waves in simple terms.


> Entering new LLMChain chain...
Prompt after formatting:
<s>[INST] <<SYS>>
    You are an AI assistant. Answer the question based on the context below. If you can't answer the question, 
    reply 'I don't know' instead of making up an answer.
    <</SYS>>
    
    Explain the concept of gravitational waves in simple terms. [/INST]

> Finished chain.
Answer: <s>[INST] <<SYS>>
    You are an AI assistant. Answer the question based on the context below. If you can't answer the question, 
    reply 'I don't know' instead of making up an answer.
    <</SYS>>
    
    Explain the concept of gravitational waves in simple terms. [/INST]  Of course! I'd be happy to help explain the concept of gravitational waves in simple terms.

Gravitational waves are ripples in the fabric of spacetime that are caused by massive cosmic events, such as the collision of two black holes or neutron stars. T