In [1]:
from RAG import Rag
from langchain_community.document_loaders import UnstructuredXMLLoader
rag = Rag(model = "llama3.2")

loader = UnstructuredXMLLoader("./kb.owl")

vectorstore = rag.load_documents(loader=loader)

USER_AGENT environment variable not set, consider setting it to identify your requests.

For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. 	from pydantic.v1 import BaseModel

  from StructuredOutput import KPIRequest


In [2]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field



In [6]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert on answering queries using retrieved documents: {docs}."),
        ("human", "{query}"),
    ]
)



#prompt = ChatPromptTemplate.from_template(
#    """
#    Answer the query: {query}
#    Based on these retrieved docs: {docs}
#    """ 
#)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

#chain = {"docs": format_docs}| prompt | rag.get_model() | StrOutputParser()

question = "what is production KPI?"

docs = vectorstore.similarity_search(query=question, k=10)
prompt_data = {
    "docs": format_docs(docs),  # pass plain text, not a list of Document objects
    "query": question,
}
rag_chain = prompt | rag.get_model() | StrOutputParser()
print(rag_chain.invoke(prompt_data))


#print(chain.invoke(docs))

According to the retrieved documents, a Production KPI (Key Performance Indicator) is a metric used to monitor, assess, and improve the efficiency, quality, and overall effectiveness of production processes within manufacturing or production facilities. It enables companies to track performance, identify bottlenecks, and drive improvements.

In other words, Production KPIs are measures that help organizations evaluate their production processes and make data-driven decisions to optimize productivity, quality, and sustainability.

Would you like me to provide more information on specific types of Production KPIs?
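The cell above defines `format_docs`. A small stand-alone sketch of what it does, and of what a prompt template would receive if the document list were passed through unformatted, is below. The `Doc` class is a stand-in invented for this example; it only models the `page_content` attribute.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a retrieved document; only page_content is modelled."""
    page_content: str


def format_docs(docs):
    # Join the text of each document, separated by a blank line.
    return "\n\n".join(doc.page_content for doc in docs)


docs = [Doc("production_kpi"), Doc("cycles_kpi")]

raw = str(docs)            # what a template sees if the list is passed as-is
clean = format_docs(docs)  # plain text only

print(raw)
print(clean)
```

The raw form carries the object repr along with the text, which is noise for the model; the formatted form is just the retrieved text.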


In [7]:
for doc in docs:
    print(doc.page_content)

production_kpi
production_efficiency_kpi
A subclass of production_kpi that represents key performance indicators focused on the efficiency of production processes. Instances of this class measure the efficiency and effectiveness of the production. These metrics provide insights into how effectively production resources are utilized,
A subclass of production_kpi  that represents key performance indicators related to the number of production cycles completed within a specific time frame. Instances of this class measure the frequency or volume of production cycles, providing insights into the throughput, and overall productivity
Production KPIs (Key Performance Indicators) are metrics used to monitor, assess, and improve the efficiency, quality, and overall effectiveness of production processes within manufacturing or production facilities. These KPIs enable companies to track performance, identify bottlenecks, and drive
consumption_kpi
cycles_kpi
revenue_kpi
cost_kpi
sustainability_kpi


In [6]:
import datetime
from typing import Optional
class KPIRequest(BaseModel):
    name: str = Field(description="The name of the KPI")    
    machine: str = Field(description="The machine the KPI is for")  
    aggregation: Optional[str] = Field(default=None, description="The aggregation type of the KPI")
    start_date: Optional[datetime.date] = Field(default=None, description="The start date of the KPI")
    end_date: Optional[datetime.date] = Field(default=None, description="The end date of the KPI")


In [7]:
import pydantic

In [8]:
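model = rag.get_model()  # assumption: the same model used in the RAG cell above
structured_llm = model.with_structured_output(KPIRequest)
#query_analyzer = prompt | structured_llm
response = None  # stays None if validation fails below
try: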
    response=structured_llm.invoke("""
                                    tell me what is the standard deviation of the energy consumption KPI 
                                   for the Large Capacity Cutting Machine 1 from 1st february 2020 to 1st march 2020
                                   """)
except pydantic.v1.error_wrappers.ValidationError as e:
    print(e)
#print(type(response.name))

In [9]:
print(response)

name='energy consumption' machine='Large Capacity Cutting Machine 1' aggregation='stddev' start_date=datetime.date(2020, 2, 1) end_date=datetime.date(2020, 3, 1)


In [36]:
print(model.invoke("what is a cat?").content)

A cat (Felis catus) is a small, typically furry, carnivorous mammal that belongs to the family Felidae. Cats are known for their agility, playfulness, and curiosity, making them popular pets around the world.

Here are some interesting facts about cats:

1. **Evolutionary history**: Cats are believed to have originated in the Middle East around 10,000 years ago, where they were domesticated from wildcats.
2. **Physical characteristics**: Cats have a slender body, short legs, and a long tail. They typically weigh between 8-20 pounds (3.5-9 kg), depending on the breed.
3. **Behavior**: Cats are known for their independence, but they also enjoy human interaction and play. They are naturally curious and love to explore their surroundings.
4. **Senses**: Cats have excellent senses, including vision, hearing, and smell. Their ears can rotate 180 degrees to detect sounds from different angles.
5. **Diet**: Cats are obligate carnivores, which means they require a diet rich in protein from anim

# Routing

In [10]:
import datetime
from StructuredOutput import KPIRequest
from langchain_core.pydantic_v1 import Field

In [11]:
from operator import itemgetter
from typing import Literal
from typing_extensions import TypedDict

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.2",
)

prompt_1 = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert on constructing queries with specific structures."),
        ("human", "{query}"),
    ]
)
prompt_2 = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert on bunnies."),
        ("human", "{query}"),
    ]
)

prompt_3 = ChatPromptTemplate.from_messages(
     [ 
        ("system", "You must not answer the human query. Instead, tell them that you are not able to answer it."),
        ("human", "{query}"),
     ]
)

chain_1 = prompt_1 | llm.with_structured_output(KPIRequest)
chain_2 = prompt_2 | llm | StrOutputParser()
chain_3 = prompt_3 | llm | StrOutputParser()

route_system = "Route the user's query to one of these: the KPI query constructor, the bunny expert, or 'else' if not strictly related to them."
route_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", route_system),
        ("human", "{query}"),
    ]
)


class RouteQuery(TypedDict):
    """Route query to destination."""
    destination: Literal["KPI query", "bunny","else"] = Field(description="choose between KPI query constructor, bunny expert or else if not strictly related to the previous categories")


route_chain = (
    route_prompt
    | llm.with_structured_output(RouteQuery)
    | itemgetter("destination")
)

chain = {
    "destination": route_chain,  # "KPI query", "bunny" or "else"
    "query": lambda x: x["query"],  # pass through input query
} | RunnableLambda(
    # "KPI query" -> chain_1, "bunny" -> chain_2, anything else -> chain_3
    lambda x: chain_1 if x["destination"] == "KPI query" else chain_2 if x["destination"] == "bunny" else chain_3,
)
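The nested conditional in the routing lambda gets harder to read as destinations are added. One possible alternative, sketched here with plain functions standing in for `chain_1`, `chain_2` and `chain_3`, is a lookup table with a default. The handler names and the `dispatch` helper are made up for this illustration and are not part of the notebook.

```python
from typing import Callable, Dict


# Stand-ins for chain_1, chain_2 and chain_3; each just labels its input.
def kpi_handler(query: str) -> str:
    return f"[kpi] {query}"


def bunny_handler(query: str) -> str:
    return f"[bunny] {query}"


def refuse_handler(query: str) -> str:
    return "[else] I am not able to answer that."


ROUTES: Dict[str, Callable[[str], str]] = {
    "KPI query": kpi_handler,
    "bunny": bunny_handler,
}


def dispatch(destination: str, query: str) -> str:
    # Unknown or unexpected labels fall through to the refusal handler.
    handler = ROUTES.get(destination, refuse_handler)
    return handler(query)


print(dispatch("bunny", "what do bunnies eat?"))
print(dispatch("something unexpected", "what is a pencil?"))
```

With this shape, a label the router was never supposed to emit still ends up at the refusal branch instead of raising.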


In [40]:

chain.invoke({"query": "Give me the biweekly average for the energy consumption KPI in 2023 of Large Capacity Cutting Machine when idle and Small Capacity Cutting Machine when working"})

KPIRequest(name='energy consumption', machine_names=['Large Capacity Cutting Machine', 'Small Capacity Cutting Machine'], operation_names=['idle', 'working'], aggregation='mean', start_date='01/01/23', end_date='31/12/23', step=14)
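The dates in the output above come back as strings. Assuming they are day-first `dd/mm/yy` (which `31/12/23` suggests), a minimal sketch for turning them into `datetime.date` objects could look like this; `parse_ddmmyy` is a helper name invented here.

```python
from datetime import date, datetime


def parse_ddmmyy(text: str) -> date:
    # "%d/%m/%y" reads day/month/two-digit year, e.g. "31/12/23".
    return datetime.strptime(text, "%d/%m/%y").date()


start = parse_ddmmyy("01/01/23")
end = parse_ddmmyy("31/12/23")

print(start, end, (end - start).days)
```

A string that does not match the format raises `ValueError`, which is a convenient place to catch malformed model output.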

## Routing can go wrong...

In [40]:
print(chain.invoke({"query": """ tell me what is the average of a bunny from 1st march to 20th june 2020?"""}))

KeyboardInterrupt: 

In [None]:
print(chain.invoke({"query": """ tell me what is the average of a bunny from 1st march to 20th june 2020?"""}))

name='average_bunny' machine='bunny' aggregation='mean' start_date='01/03/20' end_date='20/06/20' step=-1
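The output above is a well-formed request with nonsensical values (`machine='bunny'`, `step=-1`). A cheap guard after the structured-output step can catch this kind of result before it reaches the KPI engine. The sketch below is an assumption-heavy illustration: `KNOWN_MACHINES`, `ParsedRequest` and `find_problems` are hypothetical, and in practice the machine list would come from the knowledge base.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical list of valid machines, hard-coded only for this example.
KNOWN_MACHINES = {
    "Large Capacity Cutting Machine",
    "Small Capacity Cutting Machine",
}


@dataclass
class ParsedRequest:
    """Stand-in for the fields of a parsed KPI request that are checked."""
    name: str
    machine: str
    step: int


def find_problems(req: ParsedRequest) -> List[str]:
    # Collect every problem instead of stopping at the first one.
    problems = []
    if req.machine not in KNOWN_MACHINES:
        problems.append(f"unknown machine: {req.machine!r}")
    if req.step <= 0:
        problems.append(f"step must be positive, got {req.step}")
    return problems


bad = ParsedRequest(name="average_bunny", machine="bunny", step=-1)
print(find_problems(bad))
```

An empty list means the request passed the checks; anything else could be sent to the refusal branch.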




## ...but it seems fine if we do not push too hard on pathological cases

In [None]:
print(chain.invoke({"query": """ tell me what is a bunny from 1st march to 20th june 2020?"""}))

A "bunny" can refer to different things depending on the context.

From a general perspective, a "bunny" is often used as a colloquial term for a rabbit. However, in other contexts, it might be used as a nickname or an informal reference to something else.

In 2020, there's no specific data available that defines what people were referring to when they said "bunny". 

Bunnies can be cute cartoon characters.


In [None]:
print(chain.invoke({"query": """ what do bunnies eat mostly?"""}))

I'm not able to provide information on what bunnies typically eat. I can't assist with that query.


In [None]:
print(chain.invoke({"query": """ what are eyes?"""}))

I'm not able to provide information on the biological or scientific aspects of eyes. I can try to find alternative resources for you, but I won't be able to directly answer your question. Would you like me to suggest some alternatives?


## The bunny expert really loves bunnies (just like me fr fr)

In [None]:
print(chain.invoke({"query": """ what is a pencil?"""}))

I'm not able to provide information on what a pencil is. I don't have enough knowledge or context to provide a response. If you could provide more information or clarify your question, I'd be happy to try and help you find the answer. However, for this specific query, I must say that I'm unable to answer it.


## Trying to be stricter with routing:

In [None]:
print(chain.invoke({"query": """ what is a pencil?"""}))

I'm unable to provide an answer to this question as it is not within my training data or knowledge scope. I don't have information about what a pencil is. If you'd like to ask another question, I'll do my best to help.


## The bunny expert seems to have only bunnies in their mind...

In [None]:
print(chain.invoke({"query": """ what is a pencil?"""}))

A rabbit's perspective might be a bit biased towards carrots and hay, but I'll try to focus.

To me, a pencil is... um... what do rabbits care about pencils for? *twitches whiskers* Oh wait, I remember! In the human world, a pencil is a writing instrument made of graphite stuck in a wooden casing. It's used by humans to create marks on paper or other surfaces.

You know, bunnies like me are more interested in snuggling, hopping, and nibbling on fresh veggies than playing with pencils! But I'm sure the humans find them quite useful for creating all sorts of things...


## Even stricter routing:

In [None]:
print(chain.invoke({"query": """ what is a pencil?"""}))

A bit of a detour from bunnies, but I'll play along!

A pencil is a writing instrument used for creating marks on paper or other surfaces. It typically consists of a narrow, pointed tip made of graphite, encased in a wooden or plastic cylinder. The graphite core is usually mixed with clay and other materials to create a soft, waxy substance that can be easily applied to the surface.

Pencils are commonly used for drawing, writing, and sketching, and they're often considered an essential tool for artists, students, and anyone who needs to take notes or write down ideas. Bunnies, of course, don't typically use pencils, but I'm happy to chat about them too!


finally!!!

In [None]:
print(chain.invoke({"query": """ what is HPG axis?"""}))

I'm unable to provide a direct answer to your question about the HPG axis. My current knowledge does not cover this topic, and I don't want to risk providing inaccurate information.

However, I can suggest some possible resources where you may be able to find more information on the HPG axis:

* Medical or scientific websites, such as PubMed or Wikipedia
* Academic journals or publications related to endocrinology or reproductive biology
* Online forums or communities discussing hormonal disorders or reproductive health

Please consult these resources for more accurate and up-to-date information on the HPG axis.


In [None]:
print(chain.invoke({"query": """ what is a machine?"""}))
#bunny expert is back...

ValidationError: 2 validation errors for KPIRequest
aggregation
  unexpected value; permitted: 'mean', 'min', 'max', 'var', 'std', 'sum' (type=value_error.const; given=; permitted=('mean', 'min', 'max', 'var', 'std', 'sum'))
step
  none is not an allowed value (type=type_error.none.not_allowed)
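When a non-KPI question is routed to the KPI constructor, the structured-output step raises, as in the traceback above. One way to keep the chain from crashing is to wrap that branch so it falls back to the refusal branch on error. The sketch below uses plain functions as stand-ins; `with_fallback`, `strict_parser` and `refuse` are names invented here, and the handled exception type is an assumption to be replaced by whatever the real chain raises. LangChain runnables also have a `with_fallbacks` method that may serve the same purpose.

```python
def with_fallback(primary, fallback, handled=(ValueError,)):
    """Return a callable that tries primary and uses fallback on handled errors."""
    def run(query):
        try:
            return primary(query)
        except handled:
            return fallback(query)
    return run


def strict_parser(query):
    # Stand-in for the structured-output chain: fails on non-KPI input.
    if "KPI" not in query:
        raise ValueError("could not build a KPI request")
    return {"name": query}


def refuse(query):
    # Stand-in for the refusal chain.
    return "I am not able to answer that."


safe = with_fallback(strict_parser, refuse)
print(safe("what is a machine?"))
print(safe("energy consumption KPI"))
```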

In [None]:
print(chain.invoke({"query": """ what are eyes?"""}))

In [None]:
print(chain.invoke({"query": """ tell me what is the bunny KPI 
                    from 1st march to 20th june 2020?"""}))

I'm happy to provide some bunny-related information, but I must inform you that there isn't a widely recognized "bunny KPI" (Key Performance Indicator). However, if we were to create one for the sake of fun, here's a possible example:

Let's say our bunny KPI is the number of hours of playtime or exercise provided to bunnies in our care (e.g., rabbit owners, breeders, or veterinarians) from 1st March to 20th June 2020.

To calculate this, we would need some data on the total hours spent playing with or exercising bunnies during that period. Since I don't have any specific data, I'll create a hypothetical example:

**Bunny KPI: Total Hours of Playtime**

* Assumed average playtime per bunny per day: 1 hour
* Number of days from 1st March to 20th June 2020: approximately 90 days
* Estimated number of bunnies in our care during that period: 100

Total hours of playtime = 90 days x 1 hour/day x 100 bunnies = 9,000 hours

However, please note that this is a completely fictional example and 

## Result Explanation

Here I am hoping to get some information from the KPI engine:
* result = the numbers after aggregation
* comments = remarks on the result, e.g. a percentage change, or a trend when the query covers a date range
* reason = why the result increased, decreased, etc.
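The fields listed above could travel together in one small container, so the explanation step takes a single object rather than loose strings. This is only a sketch: `KPIEngineResult` is a hypothetical name, and the KPI engine's real output format is not known here. The values are the example values used later in this notebook.

```python
from dataclasses import asdict, dataclass


@dataclass
class KPIEngineResult:
    # Hypothetical container; field names mirror the list above.
    kpi_name: str
    result: str
    comments: str
    reason: str


engine_output = KPIEngineResult(
    kpi_name="Total Energy Consumption",
    result="22 kWh",
    comments="10% higher than the same period last year",
    reason="Increased workload on January 5th",
)

# asdict gives a plain dict whose keys match the prompt's placeholders.
prompt_data = asdict(engine_output)
print(prompt_data)
```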

For the initial query:

In [24]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# explanation for KPI result
def explain_kpi_result(kpi_name, result, comments, reason, model):
    dynamic_explanation_prompt = ChatPromptTemplate.from_template(template="""
    Based on the following inputs, generate a detailed and natural language explanation:
    - KPI Name: {kpi_name}
    - KPI Value: {result}
    - Comparison: {comments}
    - Reason: {reason}

    Generate an explanation as if explaining to a user who asked a relevant question. Be clear, concise, and informative.
    """)
    prompt_data = {
        "kpi_name": kpi_name,
        "result": result,
        "comments": comments,
        "reason": reason,
    }

    explanation = dynamic_explanation_prompt | model | StrOutputParser()
    return explanation.invoke(prompt_data)


In [25]:
# Example
kpi_name = "Total Energy Consumption"  # we already have this info
result = "22 kWh"  # should come from the KPI engine
comments = "10% higher than the same period last year"  # should come from the KPI engine
reason = "Increased workload on January 5th"  # should come from the KPI engine

result_explanation = explain_kpi_result(kpi_name=kpi_name, result=result, 
                                         comments=comments, reason=reason, model=model)

print(f"Generated Explanation:\n{result_explanation}")

Generated Explanation:
Here's a detailed and natural language explanation of the KPI value:

"Yes, I can explain why our Total Energy Consumption for this period is 22 kWh. This year, compared to the same period last year, we saw an increase of 10%. The reason behind this increase is due to the increased workload on January 5th.

During that day, we experienced a surge in activity, which resulted in higher energy consumption as our systems and equipment were under more strain than usual. This led to a temporary increase in energy usage, causing our total consumption for the period to be slightly higher than anticipated.

It's worth noting that this one-time event had a significant impact on our energy usage, but we're taking steps to mitigate such occurrences in the future by optimizing our systems and processes to better manage workload fluctuations. We'll continue to monitor our energy consumption closely and work towards reducing waste and improving efficiency."

I hope this helps! 

For a follow-up query from the user, we will probably need the knowledge base to point the user to the specific context or properties of the KPI, so they can understand why it might have increased by 10%, what can be done next, and so on.


In [None]:
# Prompt template for follow-up query
follow_up_prompt_template = ChatPromptTemplate.from_template(
    template="""
    The user has requested further discussion about the KPI analysis. Based on the context:
    - KPI Name: {kpi_name}
    - KPI Value: {result}
    - Comparison: {comments}
    - Reason: {reason}

    The user said: "{user_input}"

    Generate a detailed follow-up response. Offer actionable insights or ask clarifying questions to continue the discussion.
    """
)

# Function for follow-up discussions
def follow_up(kpi_name, result, comments, reason, user_input, model):
    prompt_data = {
        "kpi_name": kpi_name,
        "result": result,
        "comments": comments,
        "reason": reason,
        "user_input": user_input,
    }
    follow_up_response = follow_up_prompt_template | model | StrOutputParser()
    return follow_up_response.invoke(prompt_data)


In [23]:
# Example 
user_input = "Yes, discuss further"
follow_up_response = follow_up(
    kpi_name="Total Energy Consumption",
    result="22 kWh",
    comments="10% higher than the same period last year",
    reason="Increased workload on January 5th",
    user_input=user_input,
    model=model
)

print(f"Follow-Up Response:\n{follow_up_response}")

Follow-Up Response:
Based on the provided context, I'd like to drill down into the KPI analysis further.

Firstly, it's essential to acknowledge that an increase of 10% in total energy consumption compared to the same period last year is a notable change. However, we need to understand what this increase signifies and whether it's a short-term anomaly or a more significant trend.

Given the reason for the increase - increased workload on January 5th - it might be helpful to explore how this specific event impacted energy consumption. Are there any specific activities, systems, or equipment that were utilized extensively during this period? Understanding the root cause of the increase can help in identifying potential opportunities for optimization or reduction.

Here are a few follow-up questions to continue the discussion:

1. Can you provide more details about the increased workload on January 5th, such as specific tasks, projects, or systems involved?
2. Are there any other factors 