# Module 3 Project 1: Prompt Engineering
- Create a basic prompt engineering loop with different variations of the same prompt.
- Run the loop over the examples and evaluate the results yourself (automated eval comes later).


## STEP 1: IMPORTS
- Import [LangChain](https://www.langchain.com/) for its `llama-cpp-python` wrapper (it makes our life simpler for now)
- Import `re` for prompt extraction and `time` for rate limiting (pausing between calls gives the machine some time to cool down)

In [None]:
from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

import re
import time

## STEP 2: LOAD THE MODEL
- Load the model from your file path (`.gguf` format)
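- The parameters below target Apple Silicon. Assuming `llama-cpp-python` was built with Metal support, `n_gpu_layers = 1` offloads work to the GPU; set it to `0` to run on the CPU only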

In [None]:
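# Number of layers to offload to the GPU (0 keeps everything on the CPU)
n_gpu_layers = 1
# Number of prompt tokens processed per batch; should be between 1 and n_ctx
n_batch = 4096
# Stream generated tokens to stdout as they arrive
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])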

# Instantiate the model
language_model = LlamaCpp(
    model_path="/path/to/your/model.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    n_ctx=4096,
    f16_kv=True,
    callback_manager=callback_manager,
    verbose=False,
    echo=False
)
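Before loading, it can help to confirm that the model path actually points at a `.gguf` file, since a wrong path is an easy mistake to make here. The following is a minimal sketch of such a check; `check_model_path` is a hypothetical helper, not part of LangChain or `llama-cpp-python`, and it only inspects the file name and whether the file exists.

```python
from pathlib import Path


def check_model_path(model_path):
    """Return a list of problems; an empty list means the path looks usable."""
    path = Path(model_path)
    problems = []
    # The notebook expects a model in .gguf format
    if path.suffix.lower() != ".gguf":
        problems.append(f"expected a .gguf file, got suffix {path.suffix!r}")
    # The file has to exist before LlamaCpp can load it
    if not path.is_file():
        problems.append(f"no file found at {path}")
    return problems


# Placeholder path from the cell above; replace it with your own
for problem in check_model_path("/path/to/your/model.gguf"):
    print("Problem:", problem)
```

If the loop prints nothing, the path passed both checks and can be handed to `LlamaCpp` as `model_path`.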


## STEP 3: LOAD THE PROMPTS
- We want to load our prompt file, where each prompt is delimited with "==="
- Read in the file and split out the prompts

In [None]:
# File path for prompt list separated by '==='
file_path = "prompts_codegen.txt"

def read_file(file_path):
    with open(file_path, 'r') as file:
        return file.read()

text = read_file(file_path)

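# Read the prompts into a list: split on the '===' delimiter and drop empty chunks.
# Splitting (rather than matching '===...===' pairs) keeps every prompt even when
# consecutive prompts share a single delimiter.
results = [chunk.strip() for chunk in re.split(r'===', text) if chunk.strip()]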
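To make the expected file layout concrete, here is a minimal sketch that parses an in-memory string instead of `prompts_codegen.txt`. The sample prompts and the `split_prompts` helper are hypothetical and only illustrate one way to handle the `===` delimiter; the sketch assumes that nothing other than prompts appears between delimiters.

```python
def split_prompts(text, delimiter="==="):
    """Split text on the delimiter and drop empty or whitespace-only chunks."""
    return [chunk.strip() for chunk in text.split(delimiter) if chunk.strip()]


# Hypothetical file contents: two variations of the same prompt
sample = """===
Write a Python function that reverses a string.
===
Write a Python function that reverses a string. Include a docstring.
===
"""

prompts = split_prompts(sample)
for number, prompt in enumerate(prompts, start=1):
    print(f"Prompt {number}: {prompt}")
```

Because empty chunks are discarded, the same helper works whether each prompt is wrapped in its own pair of delimiters or consecutive prompts share a single delimiter line.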

## STEP 4: ITERATE
- Given our list of prompts, run each one in turn, pausing 5 seconds after each call
- Output is streamed to stdout by the callback handler attached to `LlamaCpp`

In [None]:
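# Iterate over the prompts and send each one to the LLM
for prompt in results:
    prompt = prompt.strip()
    print("Prompt: \n" + prompt)
    print("\n")

    # The callback handler streams the completion to stdout as it is generated
    output = language_model(prompt)
    # Pause between calls so the machine can cool down
    time.sleep(5)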
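Since the results are evaluated by hand, it may be convenient to keep each prompt next to its output instead of relying only on the streamed text. The sketch below shows one possible way to do that. `run_prompts` and `echo_llm` are hypothetical names; `echo_llm` is a stand-in so the example runs without a model, and the sketch assumes the model object can be called with a prompt string and returns the completion as a string.

```python
import time


def run_prompts(llm, prompts, pause_seconds=5.0):
    """Call llm on each prompt and collect prompt, output, and elapsed time."""
    records = []
    for index, prompt in enumerate(prompts):
        start = time.perf_counter()
        output = llm(prompt)
        elapsed = time.perf_counter() - start
        records.append({"prompt": prompt, "output": output, "seconds": elapsed})
        # Pause between calls, but not after the last one
        if index < len(prompts) - 1:
            time.sleep(pause_seconds)
    return records


def echo_llm(prompt):
    """Stand-in for the real model: returns the prompt with a prefix."""
    return "echo: " + prompt


records = run_prompts(echo_llm, ["Say hi.", "Say hi politely."], pause_seconds=0)
for record in records:
    print(record["prompt"], "->", record["output"])
```

In the notebook, `language_model` would take the place of `echo_llm` and `results` the place of the hard-coded list, leaving `pause_seconds` at its default.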