### Packages

In [17]:
# !pip install langchain langchain_community langchain_openai pymupdf chromadb tiktoken

### Imports
Note: the chat models in `langchain_community` are deprecated. Use the `langchain_openai` library instead. I'm still using `langchain_community.chat_models.ChatOpenAI` because my `langchain_openai` installation is corrupted.

In [18]:
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.chat_models import ChatOpenAI  # Deprecated
# from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, PromptTemplate
import tiktoken
from IPython.display import display, Markdown

### Load documents

In [19]:
local_path = "docs/fpc-manual.pdf"

# Load the local PDF file (one document per page)
if local_path:
    loader = PyMuPDFLoader(local_path)
    data = loader.load()
else:
    print("Upload a PDF file")

In [20]:
# Preview 1st page
data[0]



### Convert the whole document into a single text
We will leverage the model's long context window to feed the LLM the whole document as context, along with the user query.

In [21]:
document_text = "\n".join([document.page_content for document in data])
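To make the joining step concrete, here is a minimal sketch that uses a stand-in `Page` class in place of the documents returned by the loader. `Page`, `join_pages`, and the sample strings are hypothetical; only the `page_content` attribute mirrors the objects in `data`.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page; only page_content is mirrored."""
    page_content: str


def join_pages(pages, separator="\n"):
    """Concatenate the text of every page into one string."""
    return separator.join(page.page_content for page in pages)


pages = [Page("Hot holding"), Page("Cold holding"), Page("Pest control")]
joined = join_pages(pages)

# Report the size of the combined text, which drives the prompt size later on.
print(f"{len(pages)} pages, {len(joined)} characters")
```

Checking the character count at this stage gives an early hint of how large the final prompt will be.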

In [22]:
# View the first 1500 characters
document_text[:1500]



### Create a local chatbot based on Llama 3.1 8B running locally

The following code assumes that Ollama is up and running a local Llama 3.1 8B model.

Create a LangChain `ChatOpenAI` instance that points at your local Llama 3.1 model. The 8B version runs well on a reasonably powerful personal computer. On a more modest machine, pull and use a smaller model such as [Gemma 2 2B](https://ollama.com/library/gemma2).<br>
You can still use a hosted OpenAI GPT model if you prefer. In that case, use your API key and do not provide a base URL.

In [23]:
local_model = "llama3.1:8b"
# local_model = "gemma2:2b"

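llm = ChatOpenAI(
    model=local_model,
    temperature=0,
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="NA"  # placeholder; no real OpenAI key is needed for the local model
)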

prompt_template = """
You are an assistant for question-answering tasks.
Use the following context to answer the question.
If you don't know the answer, say that you don't know.

<context>
    {context}
</context>

Question: {input}
"""

prompt = ChatPromptTemplate.from_template(prompt_template)
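The `{context}` and `{input}` placeholders are filled in when the prompt is formatted. The sketch below illustrates that substitution with plain `str.format` rather than `ChatPromptTemplate`, so it runs without LangChain. The template is a shortened copy and the sample values are made up; the string produced by `ChatPromptTemplate.format` may also carry role labels, which this sketch does not reproduce.

```python
# Shortened copy of the notebook's template (illustration only).
template = """Use the following context to answer the question.

<context>
    {context}
</context>

Question: {input}
"""

filled = template.format(
    context="Hot foods must be held at 140°F or higher.",  # made-up sample context
    input="At what temperature should hot food be held?",  # made-up sample question
)
print(filled)
```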

### Question answering and token usage

#### Functions to estimate token usage
tiktoken is a library that breaks text down into tokens. It encodes text strings into tokens, so it can be used to estimate the cost of API calls when the encoding name for the model is known. It is built for OpenAI language models such as GPT-3.5.
Even though Llama 3.1 has its own tokenizer, tiktoken can still give a rough estimate of token usage.

In [24]:
def estimate_tokens(text):
    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")  # Replace with your OpenAI LLM's encoding
    return len(encoding.encode(text))

def token_usage(formatted_prompt, ai_message):
    """
    Estimates the token usage for a given prompt and AI response.
    Args:
        formatted_prompt (str): The formatted prompt sent to the AI.
        ai_message (object): An object containing the AI's response, with a 'content' attribute.
    Returns:
        None: Prints the token counts to the console. 
    """

    prompt_tokens = estimate_tokens(formatted_prompt)

    response_tokens = estimate_tokens(ai_message.content)

    print(f"Prompt Tokens: {prompt_tokens}")
    print(f"Completion Tokens: {response_tokens}")
    print(f"Total Tokens: {prompt_tokens + response_tokens}")
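If tiktoken is unavailable, a character-based rule of thumb can serve as an even rougher fallback. The sketch below assumes about four characters per token, a commonly quoted approximation for English text; `rough_token_estimate` and its default ratio are assumptions for illustration, not part of tiktoken.

```python
import math


def rough_token_estimate(text, chars_per_token=4.0):
    """Estimate tokens from the character count (rule of thumb, not a tokenizer)."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


sample = "What is the minimum safe cooking temperature for chicken?"
print(rough_token_estimate(sample))
```

The ratio varies with language and content, so treat the result as an order-of-magnitude figure only.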

#### Query 1: Getting the minimum safe cooking temperature for chicken

In [25]:

user_query = "What is the minimum safe cooking temperature for chicken?"

formatted_prompt = prompt.format(input=user_query, context=document_text)

ai_message = llm.invoke(formatted_prompt)

display(Markdown(ai_message.content))

token_usage(formatted_prompt, ai_message)


According to the provided text, there is no specific information about the minimum safe cooking temperature for chicken. However, it does mention that all foods should be kept at 140°F (60°C) or higher and that food temperatures should be checked with an accurate food thermometer.

Typically, the recommended internal temperature for cooked chicken is 165°F (74°C). If you're looking for specific guidelines on cooking temperatures, I'd be happy to provide more general information.

Prompt Tokens: 80184
Completion Tokens: 93
Total Tokens: 80277
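Since this approach relies on the whole prompt fitting into the model's context window, it can help to compare the estimate against the window size before trusting an answer. The sketch below is a simple check; `fits_in_context` is a hypothetical helper, and the 128K figure is Llama 3.1's advertised context length. The window actually in effect depends on how the model is configured in your runtime, so treat the value as an assumption.

```python
def fits_in_context(prompt_tokens, context_window, reserve_for_completion=512):
    """Return True if the prompt plus a completion budget fits in the window."""
    return prompt_tokens + reserve_for_completion <= context_window


prompt_tokens = 80184           # estimate printed for Query 1
advertised_window = 128 * 1024  # assumption: Llama 3.1's advertised 128K context

print(fits_in_context(prompt_tokens, advertised_window))
print(fits_in_context(prompt_tokens, 8192))  # a smaller, hypothetical window
```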


#### Query 2: FIFO in food safety

In [26]:
user_query = "What is FIFO in food safety?"

formatted_prompt = prompt.format(input=user_query, context=document_text)

ai_message = llm.invoke(formatted_prompt)

display(Markdown(ai_message.content))

token_usage(formatted_prompt, ai_message)



In the context of food safety, FIFO stands for "First In, First Out." This principle ensures that the oldest items in stock are used or sold before newer ones. This helps prevent older products from spoiling and being served to customers, which can lead to foodborne illnesses.

FIFO is an important concept in food safety because it helps maintain the quality and safety of perishable foods by ensuring that they are consumed within a reasonable time frame after their production or receipt. By following FIFO, businesses can minimize the risk of serving spoiled or contaminated food to customers.

In the provided text, the importance of maintaining high temperatures for hot holding foods is emphasized, but the concept of FIFO is not explicitly mentioned. However, it is implied in the instructions to "Stir frequently to evenly distribute the temperature throughout the food" and to "Record temperatures," which suggests that the goal is to ensure that all food items are stored and served at a safe temperature, regardless of their age or date of receipt.

Prompt Tokens: 80181
Completion Tokens: 198
Total Tokens: 80379
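Each query repeats the same steps: format the prompt, invoke the model, report token usage. A small helper could wrap them. The sketch below takes the formatting function, the model call, and the token counter as parameters so that it can be exercised without a running model. `ask`, `fake_format`, `fake_invoke`, and `word_count` are hypothetical stand-ins; in the notebook you would pass `prompt.format`, `llm.invoke`, and `estimate_tokens` instead.

```python
from types import SimpleNamespace


def ask(user_query, context, format_prompt, invoke, count_tokens):
    """Format the prompt, call the model, and return the answer with token counts."""
    formatted_prompt = format_prompt(input=user_query, context=context)
    ai_message = invoke(formatted_prompt)
    prompt_tokens = count_tokens(formatted_prompt)
    completion_tokens = count_tokens(ai_message.content)
    return {
        "answer": ai_message.content,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


# Hypothetical stand-ins so the sketch runs without Ollama or LangChain.
def fake_format(input, context):
    return f"Context: {context}\nQuestion: {input}"


def fake_invoke(formatted_prompt):
    return SimpleNamespace(content="First In, First Out")


def word_count(text):
    return len(text.split())


result = ask("What is FIFO?", "Rotate stock.", fake_format, fake_invoke, word_count)
print(result)
```

Returning a dictionary instead of printing makes it easy to collect usage across several queries.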


#### Query 3: Summarization

In [27]:
user_query = "Summarize the document in context"

formatted_prompt = prompt.format(input=user_query, context=document_text)

ai_message = llm.invoke(formatted_prompt)

display(Markdown(ai_message.content))

token_usage(formatted_prompt, ai_message)



The document appears to be a manual for food safety and sanitation procedures, specifically for establishments in New York City. It covers various aspects of maintaining a clean and safe environment for food preparation and handling.

Some key points from the document include:

* Maintaining proper temperatures for hot holding foods (140°F or higher)
* Using accurate thermometers to check food temperatures
* Preheating equipment before adding food
* Recording temperatures
* Stirring frequently to evenly distribute temperature throughout food

The manual also covers procedures for manual and mechanical dishwashing, including the use of sanitizing solutions and proper rinse temperatures.

Additionally, the document discusses pest control measures, such as:

* Eliminating holes, cracks, and crevices in food storage, preparation, and handling areas
* Storing food in vermin-proof containers with tightly fitted lids
* Using a licensed exterminator for any extermination done on the premises
* Keeping outside areas of the establishment clean to discourage pests

The document also includes a temperature log template for tracking temperatures.

Overall, the manual provides guidelines for maintaining a safe and sanitary environment for food preparation and handling in New York City establishments.

Prompt Tokens: 80182
Completion Tokens: 231
Total Tokens: 80413
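Because the whole document is resent with every question, prompt tokens dominate the usage. The sketch below totals the three runs above. The per-token prices are hypothetical placeholders (a locally hosted model has no per-token fee); they only show how a cost estimate could be derived if the same prompts were sent to a paid API.

```python
# (prompt_tokens, completion_tokens) as printed for Queries 1 to 3.
runs = [(80184, 93), (80181, 198), (80182, 231)]

total_prompt = sum(p for p, _ in runs)
total_completion = sum(c for _, c in runs)
total_tokens = total_prompt + total_completion

# Hypothetical prices in USD per 1,000 tokens; replace with your provider's rates.
PRICE_PER_1K_PROMPT = 0.001
PRICE_PER_1K_COMPLETION = 0.002

estimated_cost = (
    total_prompt / 1000 * PRICE_PER_1K_PROMPT
    + total_completion / 1000 * PRICE_PER_1K_COMPLETION
)

print(f"Prompt tokens: {total_prompt}")
print(f"Completion tokens: {total_completion}")
print(f"Total tokens: {total_tokens}")
print(f"Share of tokens spent on the prompt: {total_prompt / total_tokens:.2%}")
print(f"Estimated cost (hypothetical rates): ${estimated_cost:.4f}")
```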
