# Self-Consistency Prompting

One of the more advanced techniques in prompt engineering is self-consistency, introduced by Wang et al. (2022).

The method aims to replace the greedy decoding typically used in chain-of-thought (CoT) prompting.

The core idea is to sample multiple diverse reasoning paths through few-shot CoT and then select the final answer that those paths agree on most often, typically by majority vote. The technique improves CoT prompting, particularly on tasks that require arithmetic and commonsense reasoning.
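The sample-then-vote idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the paper: `extract_final_answer` and `self_consistency` are hypothetical helper names, the "last number in the text" extraction rule is an assumption that only suits arithmetic-style answers, and the canned strings stand in for model outputs sampled at a temperature above zero.

```python
import re
from collections import Counter
from typing import Callable, List, Optional


def extract_final_answer(reasoning: str) -> Optional[str]:
    """Return the last number mentioned in a reasoning path, or None.

    Assumption: the final answer is the last number in the text.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", reasoning.replace(",", ""))
    return numbers[-1] if numbers else None


def self_consistency(sample_fn: Callable[[], str], n_samples: int = 5) -> Optional[str]:
    """Sample n reasoning paths and return the most frequent final answer."""
    answers: List[str] = []
    for _ in range(n_samples):
        answer = extract_final_answer(sample_fn())
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    # Majority vote over the extracted final answers
    return Counter(answers).most_common(1)[0][0]


# Canned reasoning paths standing in for sampled model outputs (illustrative only)
_canned = iter([
    "Income 7000 minus expenses 4000 leaves 3000. The answer is 3000.",
    "7000 - 4000 = 3000, so the monthly savings are 3000.",
    "Expenses are 4000, so 7000 - 4000 = 2000. The answer is 2000.",
])

result = self_consistency(lambda: next(_canned), n_samples=3)
print(result)  # 3000 (two of the three paths agree)
```

The vote discards the reasoning text and compares only the extracted answers, which is why one faulty path does not change the result here.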

## References:
* [Wang et al. (2022)](https://arxiv.org/abs/2203.11171)

## Running this code on MyBinder.org

Note: remember that you will need to **adjust CONFIG** with **proper URL and API_KEY**!

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/GenILab-FAU/prompt-eng/HEAD?urlpath=%2Fdoc%2Ftree%2Fprompt-eng%2Fself_consistency.ipynb)



In [None]:
from _pipeline import create_payload, model_req

#### (1) Adjust the inbound Prompt, simulating inbound requests from users or other systems
MESSAGE = "I have a monthly income of $7000, expenses of $4000, and want to invest wisely for retirement. What should I do?"

#### (2) Adjust the Prompt Engineering Technique to be applied, simulating Workflow Templates
SELF_CONSISTENCY = """
You are a financial planner. To ensure accuracy, generate multiple independent solutions and find the most consistent recommendation.
1. Income: The client earns $7000 per month.
2. Expenses: The client spends $4000 per month.
3. Savings Potential: $7000 - $4000 = $3000 per month.
4. Generate multiple financial strategies based on different risk levels.
5. Identify the most frequently recommended strategy.

What is the most consistent financial strategy for the client?
Provide only the final recommendation, no explanation!
"""

# MESSAGE is not appended here; the template above already restates its figures.
PROMPT = SELF_CONSISTENCY

#### (3) Configure the Model request, simulating Workflow Orchestration
# Documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
payload = create_payload(target="open-webui",
                         model="gemma2",
                         prompt=PROMPT,
                         temperature=0.7,  # Above 0 so repeated calls can return different answers
                         num_ctx=200,  # Context window size in tokens
                         num_predict=100)  # Maximum number of tokens to generate

### YOU DON'T NEED TO CONFIGURE ANYTHING ELSE FROM THIS POINT
# Send out to the model
time, response = model_req(payload=payload)
print(response)
if time: print(f'Time taken: {time}s')


!!ERROR!! HTTP Response=400, {"detail":"404: Model not found"}
Time taken: -1s
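The cell above sends a single request, so any voting happens inside the model's one response. A closer match to the sampling step of self-consistency would issue the same payload several times and vote over the replies. The sketch below is one hedged way to do that. `vote_over_requests` is a hypothetical helper, `request_fn` is assumed to return an `(elapsed, response)` pair like `model_req`, and a negative elapsed time is assumed to signal a failed request, as the output above suggests. A fake request function with canned replies is used so the example runs without a model server.

```python
from collections import Counter
from typing import Any, Callable, Optional, Tuple


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period before voting."""
    return " ".join(text.lower().split()).rstrip(".")


def vote_over_requests(
    request_fn: Callable[..., Tuple[float, str]],
    payload: Any,
    n_samples: int = 5,
) -> Optional[str]:
    """Send the same payload n times and return the most common reply."""
    replies = []
    for _ in range(n_samples):
        elapsed, response = request_fn(payload=payload)
        # Assumption: a negative elapsed time marks a failed request
        if elapsed is None or elapsed < 0 or not response:
            continue
        replies.append(normalize(response))
    if not replies:
        return None
    return Counter(replies).most_common(1)[0][0]


# Canned replies standing in for real model calls (illustrative only)
_replies = iter([
    (1.2, "Invest in a low-cost index fund."),
    (-1, "!!ERROR!! HTTP Response=400"),
    (0.9, "invest in a low-cost  index fund. "),
    (1.1, "Pay down debt first."),
])


def fake_model_req(payload: Any) -> Tuple[float, str]:
    return next(_replies)


best = vote_over_requests(fake_model_req, payload={}, n_samples=4)
print(best)
```

Exact-match voting only works when replies are short and near-identical. Free-form recommendations would likely need a stricter output format in the prompt or a looser similarity measure before voting.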
