# Prompt Engineering with Open LLMs
This project explores different prompt engineering techniques using an **open-source LLM** served through the Hugging Face Inference API.

## Objectives
We will implement:
- **Zero-shot prompting** (No examples provided).
- **Few-shot prompting** (Providing examples for better responses).
- **Chain-of-thought reasoning** (Step-by-step explanations for improved answers).

We will use an **Open LLM from Hugging Face** to test how different prompt styles affect responses.


## Install Required Dependencies

In [None]:
!pip install requests

Load the API key:

In [2]:
import config

API_KEY = config.HUGGINGFACE_API_KEY
MODEL_NAME = config.MODEL_NAME

print("API key loaded successfully")

API key loaded successfully


In [3]:
import importlib
import config

importlib.reload(config)  # Force reload the config file

# Now, reassign values
API_KEY = config.HUGGINGFACE_API_KEY
MODEL_NAME = config.MODEL_NAME

print("🔍 Debug: API URL -", f"https://api-inference.huggingface.co/models/{MODEL_NAME}")


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct


## API Key Setup
To interact with the Hugging Face API, we need an **API key**.
- The key is securely stored in `config.py` to avoid exposing it in the notebook.
- We load the key in the notebook using `import config`.
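- The model used in the runs shown here is **Falcon-7B-Instruct** (`tiiuae/falcon-7b-instruct`, as the debug output confirms). It is set through `MODEL_NAME` in `config.py` and can be changed.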
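The notebook does not show `config.py` itself. As a minimal sketch of what such a file might look like, the version below reads both values from environment variables so the key never appears in a tracked file. The environment variable names (`HUGGINGFACE_API_KEY`, `HF_MODEL_NAME`) are assumptions made for illustration; only the two module attributes match what the notebook imports.

```python
# config.py (hypothetical layout; the original file is not shown in this notebook)
import os

# Read the token from the environment instead of hard-coding it.
# The environment variable name is an assumption made for this sketch.
HUGGINGFACE_API_KEY = os.environ.get("HUGGINGFACE_API_KEY", "")

# Fall back to the model that appears in this notebook's debug output.
MODEL_NAME = os.environ.get("HF_MODEL_NAME", "tiiuae/falcon-7b-instruct")

if not HUGGINGFACE_API_KEY:
    print("Warning: HUGGINGFACE_API_KEY is not set; requests will carry an empty token.")
```

Keeping the key out of the source file also means `config.py` can be shared without leaking credentials.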

### ✅ Next Step: Writing the API Request Function
Now that our API key is ready, we will write a function to send a request to the Hugging Face model.


In [4]:
import requests

def query_hugging_face(prompt, max_length=200, temperature=0.7):
    """
    Sends a request to the Hugging Face API and returns the generated response.

    Parameters:
    - prompt (str): The input text for the model.
    - max_length (int): Maximum number of tokens in the response.
    - temperature (float): Controls randomness (higher = more creative).

    Returns:
    - str: The model's response.
    """
    API_URL = f"https://api-inference.huggingface.co/models/{MODEL_NAME}"
    print(f"🔍 Debug: API URL - {API_URL}") 
    headers = {"Authorization": f"Bearer {API_KEY}"}

    payload = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_length,
            "temperature": temperature
        }
    }

    try:
        response = requests.post(API_URL, headers=headers, json=payload, timeout=10)  # 10s timeout
        response.raise_for_status()  # Raise an error for HTTP issues

        response_json = response.json()  # Convert response to JSON
        generated_text = response_json[0].get('generated_text', "No response generated.")
    
    except requests.exceptions.Timeout:
        generated_text = "Error: The server took too long to respond. Try again later."
    
    except requests.exceptions.ConnectionError:
        generated_text = "Error: Could not connect to Hugging Face API. Check your internet or try later."
    
    except requests.exceptions.HTTPError as http_err:
        generated_text = f"HTTP Error: {http_err}"
    
    except Exception as e:
        generated_text = f"Error: {str(e)}"
    return generated_text  # Return the final text output
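The function above assumes the decoded JSON is a list whose first item carries a `generated_text` key. As a sketch of a more defensive parse step, the hypothetical helper below (`extract_generated_text` is not part of the original notebook) also handles an empty list and a dict carrying an `error` message. That error shape is an assumption about how the API reports failures, not something this notebook demonstrates.

```python
def extract_generated_text(response_json, default="No response generated."):
    """Hypothetical helper: pull the generated text out of a decoded JSON body."""
    if isinstance(response_json, dict):
        # Assumed error shape: a dict carrying an "error" message.
        if "error" in response_json:
            return f"API Error: {response_json['error']}"
        return response_json.get("generated_text", default)

    # Shape used by query_hugging_face: a non-empty list of dicts.
    if isinstance(response_json, list) and response_json:
        first = response_json[0]
        if isinstance(first, dict):
            return first.get("generated_text", default)

    # Anything else (empty list, unexpected type) falls back to the default.
    return default


print(extract_generated_text([{"generated_text": "Hello from the model."}]))
print(extract_generated_text({"error": "model is loading"}))
print(extract_generated_text([]))
```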
    


In [7]:
import textwrap

test_prompt = "What are some interesting facts about space?"
response = query_hugging_face(test_prompt)

# Wrap text to 80 characters per line for better readability
wrapped_response = "\n".join(textwrap.wrap(response, width=80))

print("\n🔹 Generated Response:\n")
print(wrapped_response)


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Generated Response:

What are some interesting facts about space? There are many interesting facts
about space! Some of the most fascinating include that the universe is believed
to be around 13.8 billion years old, that there is evidence of water on the
moon, that there are black holes in the universe, and that the first satellite,
Sputnik 1, was launched into orbit in 1957. Here are just a few more examples:
- The largest asteroid in the solar system is called Ceres, and it's over 260
miles wide! - Saturn's moon, Enceladus, has an underground ocean of liquid
water. - The universe contains far more dark matter than light matter, and
scientists are still working to understand its properties. - The most distant
galaxies have been discovered, and they're thought to be over 13.8 billion years
old! - The first satellite, Sputnik 1, was launched into orbit in 1957. - There
are over 1
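The response above begins with the prompt itself, because the returned text includes the input. A small sketch of removing that echo before display follows; `strip_prompt_echo` is a hypothetical helper, not part of the original notebook.

```python
def strip_prompt_echo(prompt, generated_text):
    """Hypothetical helper: drop the prompt if the model output starts with it."""
    prompt = prompt.strip()
    text = generated_text.strip()
    if prompt and text.startswith(prompt):
        # Keep only what the model added after the echoed prompt.
        return text[len(prompt):].lstrip()
    # No echo found: return the text unchanged (apart from outer whitespace).
    return text


sample_prompt = "What are some interesting facts about space?"
sample_output = sample_prompt + " There are many interesting facts about space!"
print(strip_prompt_echo(sample_prompt, sample_output))
```

Applied to the cell above, the call would be `strip_prompt_echo(test_prompt, response)` before wrapping the text.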


## Zero-Shot Prompting


In [8]:
test_p2 = "explain zero-shot prompting in LLMs?"
response2 = query_hugging_face(test_p2)

# Wrap text to 80 characters per line for better readability
wrapped_response2 = "\n".join(textwrap.wrap(response2, width=80))

print("\n🔹 Generated Response:\n")
print(wrapped_response2)

🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Generated Response:

explain zero-shot prompting in LLMs? Zero-shot prompting in LLMs is used to
train a language model on a dataset that does not contain any specific labels or
categories. Instead, the model is asked to predict the next word or sentence
that should come after the input text. This is challenging for a language model
because it requires the model to understand the context of the text without any
prior knowledge of the content. This type of prompting can be effective in tasks
such as text classification and translation, where the goal is to understand the
meaning of a sentence or text in a given language.


In [9]:
test_p3 = "In simple terms, what is zero-shot prompting in LLMs? Explain without technical jargon."
response3 = query_hugging_face(test_p3)

wrapped_response3 = "\n".join(textwrap.wrap(response3, width=80))

print("\n🔹 Improved Response:\n")
print(wrapped_response3)


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Improved Response:

In simple terms, what is zero-shot prompting in LLMs? Explain without technical
jargon. Zero-shot prompting is a technique in natural language processing where
the system tries to predict the next word or phrase based on the words that have
been seen so far in a conversation or text, without being given any prior
context about those words.


In [10]:
test_p4 = "Explain zero-shot prompting in LLMs. Focus on how it works when a user asks a question without providing examples."
response4 = query_hugging_face(test_p4)

wrapped_response4 = "\n".join(textwrap.wrap(response4, width=80))

print("\n🔹 Final Refined Response:\n")
print(wrapped_response4)


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Final Refined Response:

Explain zero-shot prompting in LLMs. Focus on how it works when a user asks a
question without providing examples. Zero-shot prompting is a technique in
natural language processing (NLP) that allows a machine learning model (such as
a neural network) to learn from a user's question without being explicitly told
what the user wants to know. The model uses the question as a prompt to generate
a response, which it then evaluates for accuracy. In essence, the model learns
to generate a response based on a single example, rather than being trained on a
list of examples with associated labels.
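The three zero-shot cells above differ only in the wording of the prompt. A sketch of running such variants side by side is shown below. `compare_prompts` is a hypothetical helper that takes the query function as an argument, and `fake_query` is an offline stand-in used here so the example runs without network access; in the notebook, `query_hugging_face` would be passed instead.

```python
import textwrap


def compare_prompts(prompts, query_fn, width=80):
    """Hypothetical helper: run each labelled prompt and collect wrapped responses."""
    results = {}
    for label, prompt in prompts.items():
        response = query_fn(prompt)
        results[label] = textwrap.fill(response, width=width)
    return results


def fake_query(prompt):
    """Offline stand-in for query_hugging_face, used only to show the call pattern."""
    return f"(model output for: {prompt})"


# The three prompt wordings used in the cells above.
zero_shot_variants = {
    "baseline": "explain zero-shot prompting in LLMs?",
    "plain language": (
        "In simple terms, what is zero-shot prompting in LLMs? "
        "Explain without technical jargon."
    ),
    "focused": (
        "Explain zero-shot prompting in LLMs. Focus on how it works when "
        "a user asks a question without providing examples."
    ),
}

for label, text in compare_prompts(zero_shot_variants, fake_query).items():
    print(f"\n[{label}]\n{text}")
```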


## Few-Shot Prompting

In [11]:
test_few_shot = """Classify the sentiment of the following sentences as Positive or Negative:

Sentence: "I love this movie!"
Sentiment: Positive

Sentence: "This is the worst experience ever."
Sentiment: Negative

Sentence: "The food was delicious and the service was great."
Sentiment:
"""

response_few_shot = query_hugging_face(test_few_shot)

wrapped_response_few_shot = "\n".join(textwrap.wrap(response_few_shot, width=80))

print("\n🔹 Few-Shot Response:\n")
print(wrapped_response_few_shot)


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Few-Shot Response:

Classify the sentiment of the following sentences as Positive or Negative:
Sentence: "I love this movie!" Sentiment: Positive  Sentence: "This is the worst
experience ever." Sentiment: Negative  Sentence: "The food was delicious and the
service was great." Sentiment: Positive
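The few-shot prompt above follows a fixed pattern: an instruction, labelled examples, then an unlabelled query. As a sketch, that pattern can be generated from a list of examples, and the model's prediction can be read from whatever follows the last `Sentiment:` marker. Both helpers below (`build_few_shot_prompt`, `extract_last_label`) are hypothetical and not part of the original notebook.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Hypothetical helper: labelled examples followed by an unlabelled query."""
    parts = [instruction, ""]
    for sentence, label in examples:
        parts.append(f'Sentence: "{sentence}"')
        parts.append(f"Sentiment: {label}")
        parts.append("")
    parts.append(f'Sentence: "{query}"')
    parts.append("Sentiment:")
    return "\n".join(parts)


def extract_last_label(response_text, marker="Sentiment:"):
    """Return the text after the final marker (the whole text if no marker exists)."""
    return response_text.rsplit(marker, 1)[-1].strip()


prompt = build_few_shot_prompt(
    "Classify the sentiment of the following sentences as Positive or Negative:",
    [
        ("I love this movie!", "Positive"),
        ("This is the worst experience ever.", "Negative"),
    ],
    "The food was delicious and the service was great.",
)
print(prompt)

# The response shown above ends with the model's label for the final sentence.
print(extract_last_label(prompt + " Positive"))
```

If the model keeps generating after the label, `extract_last_label` returns that extra text as well, so a stricter check may be needed in practice.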


## Chain-of-Thought (CoT) Prompting

In [14]:
test_CoT = """I have to move out in 30 minutes to catch a cricket practice session. 
Before leaving, I need to complete the following tasks:
- Take a leak (5 to 7 minutes)
- Freshen up (5 to 7 minutes)
- Get clothes from the closet (5 to 7 minutes)
- Get ready (5 to 7 minutes)

Each task takes between **5 to 7 minutes**. 

Step-by-step, calculate:
1. The **average time per task** (midpoint of 5 to 7).
2. The **total time needed** for all tasks.
3. The **remaining buffer time** before leaving.

Show your work step by step before giving the final answer.

"""

response_CoT = query_hugging_face(test_CoT)

wrapped_response_CoT = "\n".join(textwrap.wrap(response_CoT, width=80))

print("\n🔹 Chain-of-Thought Response:\n")
print(wrapped_response_CoT)


🔍 Debug: API URL - https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

🔹 Chain-of-Thought Response:

I have to move out in 30 minutes to catch a cricket practice session.  Before
leaving, I need to complete the following tasks: - Take a leak (5 to 7 minutes)
- Freshen up (5 to 7 minutes) - Get clothes from the closet (5 to 7 minutes) -
Get ready (5 to 7 minutes)  Each task takes between **5 to 7 minutes**.   Step-
by-step, calculate: 1. The **average time per task** (midpoint of 5 to 7). 2.
The **total time needed** for all tasks. 3. The **remaining buffer time** before
leaving.  Show your work step by step before giving the final answer.  <p>Task
1: 5 minutes (average time per task). Task 2: 7 minutes (average time per task).
Task 3: 5 minutes (average time per task). Remaining buffer time: 5 + 5 + 7 = 17
minutes.</p>  <p>Therefore, the total time needed will be:</p>  <p>Remaining
buffer time (17 minutes) + Freshen up (7 minutes) + Take a leak (5 minutes) +
Get read
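For reference, the arithmetic requested in the prompt can be checked directly. The sketch below follows the prompt's three steps using the figures stated there (four tasks, 5 to 7 minutes each, 30 minutes available); the variable names are illustrative.

```python
low, high = 5, 7   # minutes per task, from the prompt
num_tasks = 4      # tasks listed in the prompt
available = 30     # minutes before leaving

average_per_task = (low + high) / 2          # step 1: midpoint of the range
total_needed = num_tasks * average_per_task  # step 2: time for all tasks
buffer_time = available - total_needed       # step 3: time left over

print(f"Average per task: {average_per_task} min")
print(f"Total needed: {total_needed} min")
print(f"Remaining buffer: {buffer_time} min")
```

This gives 6 minutes per task, 24 minutes in total, and a 6-minute buffer, which is the result a correct step-by-step answer should reach.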