# Statistics ChatBot LLM
***This LLM chatbot has two modes: it determines whether a question is basic or advanced and responds accordingly.***


In [96]:
# Sample student questions used to test the router:
# "What's mean and variance?"
# "In which cases we apply Chi-Square?"

In [64]:
from langchain.prompts import ChatPromptTemplate,PromptTemplate
from langchain.chat_models import ChatOpenAI

In [2]:
path = "C:\\Users\\Burk\\Desktop\\Folders\\openai api.txt"
# Read the key and strip surrounding whitespace such as a trailing newline.
with open(path) as f:
    api_key = f.read().strip()

basic_question_template = """You are a statistics teacher who explains terms so they are easy to understand. You assume no prior knowledge. Here is your question:\n{input}"""
advanced_question_template = """You are a statistics professor who explains to advanced audience members. You can assume anyone you answer has a PhD
in Statistics. Here's your question:\n{input}"""
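
Both templates end in an `{input}` placeholder, which is replaced by the student's question when a chain runs. A minimal sketch of that substitution using plain `str.format`; the shortened template text and the `fill_template` helper are illustrative assumptions, not part of the notebook or of LangChain.

```python
# Hypothetical stand-in for the notebook's basic template; only the {input}
# placeholder matters for this illustration.
basic_template = (
    "You are a statistics teacher. You assume no prior knowledge. "
    "Here is your question:\n{input}"
)


def fill_template(template: str, question: str) -> str:
    """Substitute the question for the {input} placeholder."""
    return template.format(input=question)


filled = fill_template(basic_template, "What's mean and variance?")
print(filled)
```

The persona text stays fixed while only the question changes, which is why one template per audience level is enough.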

In [68]:
prompt_infos = [
    {
        "name": "basic statistics",
        "description": "Answers basic statistics questions",
        "template": basic_question_template,
    },
    {
        "name": "advanced statistics",
        "description": "Answers advanced statistics questions",
        "template": advanced_question_template,
    },
]

In [70]:
from langchain.chains import LLMChain
llm = ChatOpenAI(api_key = api_key)
destination_chains = {}
for p_info in prompt_infos:
    name = p_info['name']
    prompt_template = p_info['template']
    # Build the destination prompt from this entry's own template.
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    chain = LLMChain(llm=llm, prompt=prompt)
    destination_chains[name] = chain
# Fallback chain for questions the router does not match to any destination.
default_prompt = ChatPromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)

In [71]:
from langchain.chains.router.multi_prompt_prompt import MULTI_PROMPT_ROUTER_TEMPLATE
destinations = [f"{p['name']} : {p['description']}" for p in prompt_infos]
destinations_str = "\n".join(destinations)
print(MULTI_PROMPT_ROUTER_TEMPLATE)

Given a raw text input to a language model select the model prompt best suited for the input. You will be given the names of the available prompts and a description of what the prompt is best suited for. You may also revise the original input if you think that revising it will ultimately lead to a better response from the language model.

<< FORMATTING >>
Return a markdown code snippet with a JSON object formatted to look like:
```json
{{{{
    "destination": string \ name of the prompt to use or "DEFAULT"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```

REMEMBER: "destination" MUST be one of the candidate prompt names specified below OR it can be "DEFAULT" if the input is not well suited for any of the candidate prompts.
REMEMBER: "next_inputs" can just be the original input if you don't think any modifications are needed.

<< CANDIDATE PROMPTS >>
{destinations}

<< INPUT >>
{{input}}

<< OUTPUT (must include ```json at the start of the respon
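
The printed template uses two layers of brace escaping: `{destinations}` is filled first, and `{{input}}` only becomes a live `{input}` placeholder after that first pass. The sketch below illustrates this with a shortened, hypothetical stand-in for the template, then parses a made-up reply in the requested fenced-JSON shape. `parse_router_reply` is an illustrative helper and is not LangChain's `RouterOutputParser`.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so the example does not nest literal fences

# Shortened, hypothetical stand-in for MULTI_PROMPT_ROUTER_TEMPLATE.
mini_template = (
    "Return a JSON object:\n"
    "{{{{\n"
    '    "destination": string\n'
    "}}}}\n"
    "<< CANDIDATE PROMPTS >>\n"
    "{destinations}\n"
    "<< INPUT >>\n"
    "{{input}}\n"
)

# First pass: fill in the candidates. Every doubled brace collapses once,
# so "{{{{" becomes "{{" and "{{input}}" becomes "{input}".
router_template = mini_template.format(
    destinations="basic statistics : Answers basic statistics questions"
)

# Second pass: render the prompt with the student's question.
rendered = router_template.format(input="What's mean and variance?")


def parse_router_reply(reply: str) -> dict:
    """Extract the JSON object from a fenced json snippet."""
    pattern = FENCE + r"json\s*(.*?)" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced JSON snippet found")
    return json.loads(match.group(1))


# Made-up model reply in the format the template asks for.
reply = (
    FENCE + "json\n"
    '{"destination": "basic statistics", '
    '"next_inputs": "What\'s mean and variance?"}\n'
    + FENCE
)
decision = parse_router_reply(reply)
print(decision["destination"])
```

The `destination` value selects which chain answers, and `next_inputs` is the (possibly reworded) question passed to it.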

In [92]:
from langchain.chains.router.llm_router import LLMRouterChain,RouterOutputParser
from langchain.chains.router import MultiPromptChain
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations_str)
router_prompt = PromptTemplate(template=router_template,
                              input_variables = ['input'],
                              output_parser = RouterOutputParser())
router_chain = LLMRouterChain.from_llm(llm,router_prompt)
chain = MultiPromptChain(router_chain=router_chain,
                        destination_chains = destination_chains,
                        default_chain=default_chain,verbose=True)
print(chain.run("What's mean and variance?"))



> Entering new MultiPromptChain chain...
basic statistics: {'input': "What's mean and variance?"}
> Finished chain.
Mean is a measure of central tendency that represents the average value of a dataset. It is calculated by adding up all the values in the dataset and dividing by the total number of values.

Variance is a measure of dispersion that represents how spread out the values in a dataset are from the mean. It is calculated by taking the average of the squared differences between each value and the mean. A higher variance indicates that the values in the dataset are more spread out, while a lower variance indicates that the values are closer to the mean.
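
The definitions in the answer above can be checked numerically. A minimal sketch using Python's standard `statistics` module; the data values are made up for illustration and do not come from the notebook.

```python
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

avg = mean(data)

# Population variance: the average of the squared differences from the mean,
# which matches the description in the chatbot's answer.
pop_var = pvariance(data)

# Sample variance divides by n - 1 instead of n.
sample_var = variance(data)

# The same population variance computed by hand.
manual = sum((x - avg) ** 2 for x in data) / len(data)

print(avg, pop_var, sample_var, manual)
```

The answer describes the population form; the sample form is the usual choice when the data are a sample from a larger population.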


In [94]:
print(chain.run("In which cases we apply Chi-Square?"))



> Entering new MultiPromptChain chain...
advanced statistics: {'input': 'In which cases do we apply the Chi-Square test?'}
> Finished chain.
The Chi-Square test is typically used in the following cases:

1. To determine if there is a significant association between two categorical variables.
2. To test for independence between two categorical variables.
3. To compare observed frequencies with expected frequencies in a contingency table.
4. To assess goodness of fit between observed data and expected data in a hypothesis test.
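
As an illustration of the independence case listed above, the Pearson chi-square statistic for a contingency table can be sketched in plain Python. The counts are made up, and `chi_square_statistic` is an illustrative helper rather than a library function.

```python
def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


observed = [[10, 20], [20, 10]]  # made-up counts for two categorical variables
stat, dof = chi_square_statistic(observed)
print(round(stat, 3), dof)
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom (about 3.84 at the 5% level with one degree of freedom).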
