# AG DECISION ENGINE

The Ag Decision Engine accepts inputs from a user (Crop Type, Location and Timeframe) and develops a customized crop planning and protection plan for farmland owners or operators across North Carolina. The decision engine offers a basic interface for <em><u>user input</u></em> and leverages outputs from a <em><u>crop performance prediction model</u></em> and a RAG-enhanced LLM for recommendation building.

## 1.0 User Interface

**OVERVIEW**\
To use the tool, a user must input a county, the crops to consider and a target planting season.
\
The user interface, built using Gradio, is designed to accept three sets of user inputs: 
* INPUT 1: “Select County” - user selects an NC county from a dropdown list
* INPUT 2: “Crops To Consider” - user selects crop types by ticking the appropriate checkboxes
* INPUT 3: “Planting Season(s) and Year (‘YYYY’)” - user selects the relevant seasons using a set of checkboxes and enters a 4-digit year (between 2025 and 2035) in a number field
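
The three inputs above could be checked before they reach the prediction model. The following is a minimal sketch, not part of the notebook: `validate_inputs` and the shortened `VALID_*` sets are hypothetical names used for illustration, and only the 2025 to 2035 year range comes from the description above.

```python
from typing import List

# Hypothetical, shortened lists for illustration; the notebook defines the full ones.
VALID_COUNTIES = {"Wake", "Durham", "Orange"}
VALID_CROPS = {"Corn", "Soybeans", "Wheat"}
MIN_YEAR, MAX_YEAR = 2025, 2035  # year range stated in the overview


def validate_inputs(county, crop_list, selected_seasons, year) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs are valid."""
    problems = []
    if county not in VALID_COUNTIES:
        problems.append(f"Unknown county: {county!r}")
    if not crop_list:
        problems.append("Select at least one crop.")
    else:
        unknown = sorted(set(crop_list) - VALID_CROPS)
        if unknown:
            problems.append(f"Unknown crops: {', '.join(unknown)}")
    if not selected_seasons:
        problems.append("Select at least one planting season.")
    try:
        year_int = int(year)  # a number field may deliver a float such as 2025.0
    except (TypeError, ValueError):
        problems.append("Year must be a 4-digit number.")
    else:
        if not MIN_YEAR <= year_int <= MAX_YEAR:
            problems.append(f"Year must be between {MIN_YEAR} and {MAX_YEAR}.")
    return problems
```

Returning a list of messages rather than raising on the first problem lets the interface show every issue to the user at once.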

\
**CODE CONTENT**\
1.0 - U/I Dependencies\
1.1 - Input Definition\
1.2 - U/I Design

**U/I DEPENDENCIES**

In [3]:
# Uses Gradio to build interface
import gradio as gr

# Removes unnecessary warnings
import warnings
warnings.filterwarnings('ignore')

## 1.1 Input Definition

In [4]:
# Defines the inputs for counties, crops and seasons
counties = ["Alamance", "Alexander", "Alleghany", "Anson", "Ashe", "Avery", "Beaufort", "Bertie", "Bladen", "Brunswick",
            "Buncombe", "Burke", "Cabarrus", "Caldwell", "Camden", "Carteret", "Caswell", "Catawba", "Chatham",
            "Cherokee", "Chowan", "Clay", "Cleveland", "Columbus", "Craven", "Cumberland", "Currituck", "Dare",
            "Davidson", "Davie", "Duplin", "Durham", "Edgecombe", "Forsyth", "Franklin", "Gaston", "Gates", "Graham",
            "Granville", "Greene", "Guilford", "Halifax", "Harnett", "Haywood", "Henderson", "Hertford", "Hoke", "Hyde",
            "Iredell", "Jackson", "Johnston", "Jones", "Lee", "Lenoir", "Lincoln", "Macon", "Madison", "Martin",
            "McDowell", "Mecklenburg", "Mitchell", "Montgomery", "Moore", "Nash", "New Hanover", "Northampton",
            "Onslow", "Orange", "Pamlico", "Pasquotank", "Pender", "Perquimans", "Person", "Pitt", "Polk", "Randolph",
            "Richmond", "Robeson", "Rockingham", "Rowan", "Rutherford", "Sampson", "Scotland", "Stanly", "Stokes",
            "Surry", "Swain", "Transylvania", "Tyrrell", "Union", "Vance", "Wake", "Warren", "Washington", "Watauga",
            "Wayne", "Wilkes", "Wilson", "Yadkin", "Yancey"]

crops = ['Barley', 'Corn', 'Cotton', 'Hay', 'Oats', 'Peanuts', 'Bell Peppers', 'Pumpkins', 'Soybeans', 'Squash',
         'Sweet Potatoes', 'Tobacco', 'Wheat']

seasons = ['Spring', 'Summer', 'Fall']

# Placeholder for the crop performance prediction model
def crop_prediction(county, crop_list, selected_seasons, year):
    # Simulates crop prediction by returning fixed dummy values, one per selected crop
    crop_yields = [1.0] * len(crop_list)
    crop_values = [2.0] * len(crop_list)
    confidence_levels = [0.8] * len(crop_list)
    return crop_yields, crop_values, confidence_levels
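
The three parallel lists returned by `crop_prediction` are easier to present once they are paired up per crop. A minimal sketch, assuming a hypothetical helper `rank_predictions` (not in the notebook); the units of yield and value are not defined in the source, so none are assumed here.

```python
def rank_predictions(crop_list, crop_yields, crop_values, confidence_levels):
    """Pair each crop with its predicted figures and sort by value, then confidence."""
    rows = [
        {"crop": crop, "yield": y, "value": v, "confidence": conf}
        for crop, y, v, conf in zip(crop_list, crop_yields, crop_values, confidence_levels)
    ]
    # Highest predicted value first; confidence breaks ties
    return sorted(rows, key=lambda r: (r["value"], r["confidence"]), reverse=True)


# Example with made-up numbers
ranked = rank_predictions(["Corn", "Wheat"], [1.0, 1.0], [2.0, 3.0], [0.8, 0.8])
```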



## 1.2 U/I Design

In [5]:
# Function defining Gradio Interface
# Gradio passes one argument per input component, so the function also accepts
# the value of the leading gr.Image component (unused here).
def user_interface(farm_image, county, crop_list, selected_seasons, year):
    # Call the crop prediction model with user inputs
    crop_yields, crop_values, confidence_levels = crop_prediction(county, crop_list, selected_seasons, year)

    # Display results (for now, just showing the inputs for demonstration)
    return f"County: {county}\nCrops: {', '.join(crop_list)}\nSeasons: {', '.join(selected_seasons)}\nYear: {year}"

# Define the Gradio interface
inputs = [
    gr.Image(value="Images/ui_image.png", label="Farm Image"),  
    gr.Dropdown(choices=counties, label="Select County"),
    gr.CheckboxGroup(choices=crops, label="Crops to Consider"),
    gr.CheckboxGroup(choices=seasons, label="Planting Season(s)", value=seasons),
    gr.Number(label="Planting Year (YYYY)", value=2025, minimum=2025, maximum=2035)
]

# Defines Output Design
outputs = gr.Textbox(label="Planting and Protection Recommendations")

# Launches the Gradio interface
gr.Interface(fn=user_interface, inputs=inputs, outputs=outputs, title="Crop Planning and Protection Plan Generator").launch(share=True)

Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://994eeedba894e3b2e3.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## 2.0 Retrieval Augmented Generation (RAG) Support

**OVERVIEW**\
To improve the Ag Decision Engine's recommendation building, we use a retrieval augmented generation (RAG) design to enhance our existing LLM with USDA, NC Department of Agriculture and other related content.\
\
The multimodal RAG is designed to perform the following tasks:  
* Parsing text, tables and images from pdf documents using [Unstructured](https://unstructured.io/) 
* Generating text summaries from parsed content using [OpenAI GPT-4V](https://openai.com/index/gpt-4v-system-card/) 
* Embedding summarized content using [LangChain multi-vector retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/multi_vector) 
* Storing and retrieving embedded summaries (with a reference to any raw images) using [Chroma](https://www.trychroma.com/) 
* Generating LLM query response from text chunks and raw images using [OpenAI GPT-4V](https://openai.com/index/gpt-4v-system-card/) 
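
The core idea behind the tasks above is that summaries are what gets searched, while the raw content is what gets returned. The sketch below illustrates that pattern with the standard library only. It is a toy, not the LangChain or Chroma implementation: the bag-of-words `embed` function stands in for a real embedding model, and `ToyMultiVectorIndex` is a hypothetical name.

```python
import math
import uuid
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMultiVectorIndex:
    def __init__(self):
        self.vectors = []   # (doc_id, summary embedding): what is searched
        self.docstore = {}  # doc_id -> raw content: what is returned

    def add(self, summaries, raw_contents):
        for summary, raw in zip(summaries, raw_contents):
            doc_id = str(uuid.uuid4())  # shared key linking summary and raw content
            self.vectors.append((doc_id, embed(summary)))
            self.docstore[doc_id] = raw

    def retrieve(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.vectors, key=lambda item: cosine(q, item[1]), reverse=True)
        return [self.docstore[doc_id] for doc_id, _ in ranked[:k]]


index = ToyMultiVectorIndex()
index.add(
    ["corn planting dates", "cotton pest control"],
    ["RAW CORN TABLE", "RAW COTTON TEXT"],
)
result = index.retrieve("when to start corn planting")
```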

\
**CODE CONTENT**\
2.0 - RAG Dependencies\
2.1 - RAG ChatModel\
2.2 - RAG Document Loader\
2.3 - RAG Text Summarizer\
2.4 - RAG VectorStore and Retriever

\
**ATTRIBUTIONS**\
RAG code adapted from [LangChain Cookbook: Multi-modal RAG](https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb).


**RAG DEPENDENCIES (OPEN SOURCE PACKAGES)**

In [6]:
# ! pip install pdf2image # (requires poppler, see below)
# ! pip install -U langchain openai langchain-chroma langchain-experimental # (requires newest version for multi-modal)
# ! pip install "unstructured[all-docs]" pillow pydantic lxml matplotlib chromadb tiktoken
# ! pip install -U langchain-openai

Additional packages may be required _(see installation instructions linked)_
* [poppler](https://pdf2image.readthedocs.io/en/latest/installation.html) required for pdf handling
* [tesseract](https://tesseract-ocr.github.io/tessdoc/Installation.html) required for OCR text recognition
* [pytorch](https://pytorch.org/get-started/locally/#windows-installation) required when ingesting data via the [Unstructured](https://docs.unstructured.io/welcome) package

**RAG DEPENDENCIES (LIBRARIES)**

In [7]:
# Uses OpenAI LLM
import os # Needed to access API key
import openai

# LLM Chat
from langchain_openai import ChatOpenAI # Needed for chat model functionality
from langchain.prompts import ChatPromptTemplate # Needed for using LangChain prompt templates
from langchain_core.messages import HumanMessage

# Data Loading
from langchain_text_splitters import CharacterTextSplitter # Needed for data loading, partitioning PDFs, tables, text, and images
from unstructured.partition.pdf import partition_pdf # Needed for partitioning PDFs, tables, text, and images

# Text, Table and Image Summaries
from langchain_core.output_parsers import StrOutputParser # Needed for summary generation
import base64

# Vectorstores
import uuid # Needed for generating document ids
from langchain.retrievers.multi_vector import MultiVectorRetriever # Needed to perform vector retrieval
from langchain.storage import InMemoryStore # Needed to store the raw content (docstore)
from langchain_chroma import Chroma # Needed for accessing vector database
from langchain_core.documents import Document # Needed to wrap summaries as documents for the vectorstore
from langchain_openai import OpenAIEmbeddings # Needed to generate vector embeddings

# RAG Retrieval
import io
import re
from IPython.display import HTML, display
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from PIL import Image

In [8]:
# Helper function for loading API key
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # reads local .env file

openai.api_key = os.environ['OPENAI_API_KEY']
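
Reading `os.environ['OPENAI_API_KEY']` directly raises a bare `KeyError` when the key is missing. A minimal sketch of a friendlier guard, assuming a hypothetical helper named `require_env` (not part of the notebook or of any library):

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your local .env file."
        )
    return value
```

With this helper, the assignment above could be written as `openai.api_key = require_env("OPENAI_API_KEY")`.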

## 2.1 RAG ChatModel Setup

**LLM Setup (Cloud)**

In [9]:
# OpenAI model selection
llm_model = "gpt-4o-mini"

# Uses temperature = 0.0 to reduce randomness in retrieved responses
chat = ChatOpenAI(temperature=0.0, model=llm_model)

# Optional - shows custom chat settings
chat

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x000001A93ECAD600>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x000001A93ECAEEF0>, root_client=<openai.OpenAI object at 0x000001A93E6A3610>, root_async_client=<openai.AsyncOpenAI object at 0x000001A93ECAD630>, model_name='gpt-4o-mini', temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy='')

**LLM Setup (Local)**

## 2.2 RAG Document Loader

**Partitioning Functions**

In [6]:
# Extracts elements from PDF
def extract_pdf_elements(path, fname):
    """
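    Extract tables and chunked text from a PDF file (image extraction is disabled via extract_images_in_pdf=False).
    path: Directory containing the PDF, also used to dump images (.jpg) when image extraction is enabled
    fname: File name
    """
    return partition_pdf(
        filename=os.path.join(path, fname),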
        extract_images_in_pdf=False,
        infer_table_structure=True,
        chunking_strategy="by_title",
        max_characters=4000,
        new_after_n_chars=3800,
        combine_text_under_n_chars=2000,
        image_output_dir_path=path,
    )


# Categorizes elements by type
def categorize_elements(raw_pdf_elements):
    """
    Categorize extracted elements from a PDF into tables and texts.
    raw_pdf_elements: List of unstructured.documents.elements
    """
    tables = []
    texts = []
    for element in raw_pdf_elements:
        if "unstructured.documents.elements.Table" in str(type(element)):
            tables.append(str(element))
        elif "unstructured.documents.elements.CompositeElement" in str(type(element)):
            texts.append(str(element))
    return texts, tables


In [None]:
# Accesses documents
# File path
fpath = "/Users/rlm/Desktop/cj/"
fname = "cj.pdf"

# Get elements
raw_pdf_elements = extract_pdf_elements(fpath, fname)

# Get text, tables
texts, tables = categorize_elements(raw_pdf_elements)

# Optional: Enforce a specific token size for texts
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=4000, chunk_overlap=0
)
joined_texts = " ".join(texts)
texts_4k_token = text_splitter.split_text(joined_texts)
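
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window chunker. It is an illustration only: it counts characters rather than tiktoken tokens and does not reproduce the separator-based merging that `CharacterTextSplitter` performs. `chunk_text` is a hypothetical name.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of chunk_size characters, each overlapping the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


no_overlap = chunk_text("abcdefghij", chunk_size=4)
with_overlap = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

With `chunk_overlap=0`, as in the cell above, every character lands in exactly one chunk; a positive overlap repeats the tail of each chunk at the start of the next.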

## 2.3 RAG Text Summarizers
Uses multi-vector-retriever to index image (and / or text, table) summaries,\
but retrieve raw images (along with raw texts or tables).\
Sourced from LangChain Cookbook (_https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb_)

**Text and Table Summary Function**\
Uses GPT-4 to produce table and text summaries used to retrieve raw tables and raw chunks of text.

In [None]:
# Generate summaries of text elements
def generate_text_summaries(texts, tables, summarize_texts=False):
    """
    Summarize text elements
    texts: List of str
    tables: List of str
    summarize_texts: Bool to summarize texts
    """

    # Prompt
    prompt_text = """You are an assistant tasked with summarizing tables and text for retrieval. \
    These summaries will be embedded and used to retrieve the raw text or table elements. \
    Give a concise summary of the table or text that is well optimized for retrieval. Table or text: {element} """
    prompt = ChatPromptTemplate.from_template(prompt_text)

    # Text summary chain
    model = ChatOpenAI(temperature=0, model="gpt-4")
    summarize_chain = {"element": lambda x: x} | prompt | model | StrOutputParser()

    # Initialize empty summaries
    text_summaries = []
    table_summaries = []

    # Apply to text if texts are provided and summarization is requested
    if texts and summarize_texts:
        text_summaries = summarize_chain.batch(texts, {"max_concurrency": 5})
    elif texts:
        text_summaries = texts

    # Apply to tables if tables are provided
    if tables:
        table_summaries = summarize_chain.batch(tables, {"max_concurrency": 5})

    return text_summaries, table_summaries


# Get text, table summaries
text_summaries, table_summaries = generate_text_summaries(
    texts_4k_token, tables, summarize_texts=True
)
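
The `max_concurrency` setting caps how many summaries are requested at the same time. The sketch below shows the same idea with a thread pool from the standard library. It is not how LangChain implements `batch`; `fake_summarize` is a hypothetical stand-in for the LLM call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_summarize(element):
    """Hypothetical stand-in for an LLM call: keep the first eight words."""
    return " ".join(element.split()[:8])


def batch_summarize(elements, summarize=fake_summarize, max_concurrency=5):
    """Summarize elements with at most max_concurrency calls in flight, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(summarize, elements))


summaries = batch_summarize(["a b c", "d e f g h i j k l"])
```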

**Image Summary Function**\
Uses GPT-4V to produce the image summaries. See API documentation here: https://platform.openai.com/docs/guides/vision

In [None]:
# import base64


# def encode_image(image_path):
#     """Getting the base64 string"""
#     with open(image_path, "rb") as image_file:
#         return base64.b64encode(image_file.read()).decode("utf-8")


# def image_summarize(img_base64, prompt):
#     """Make image summary"""
#     chat = ChatOpenAI(model="gpt-4-vision-preview", max_tokens=1024)

#     msg = chat.invoke(
#         [
#             HumanMessage(
#                 content=[
#                     {"type": "text", "text": prompt},
#                     {
#                         "type": "image_url",
#                         "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
#                     },
#                 ]
#             )
#         ]
#     )
#     return msg.content


# def generate_img_summaries(path):
#     """
#     Generate summaries and base64 encoded strings for images
#     path: Path to list of .jpg files extracted by Unstructured
#     """

#     # Store base64 encoded images
#     img_base64_list = []

#     # Store image summaries
#     image_summaries = []

#     # Prompt
#     prompt = """You are an assistant tasked with summarizing images for retrieval. \
#     These summaries will be embedded and used to retrieve the raw image. \
#     Give a concise summary of the image that is well optimized for retrieval."""

#     # Apply to images
#     for img_file in sorted(os.listdir(path)):
#         if img_file.endswith(".jpg"):
#             img_path = os.path.join(path, img_file)
#             base64_image = encode_image(img_path)
#             img_base64_list.append(base64_image)
#             image_summaries.append(image_summarize(base64_image, prompt))

#     return img_base64_list, image_summaries


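# # Image summaries
# img_base64_list, image_summaries = generate_img_summaries(fpath)

# Image summarization is disabled above, so define empty placeholders;
# the retriever cell references both names.
img_base64_list, image_summaries = [], []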

## 2.4 RAG VectorStore and Retriever
Sourced from LangChain Cookbook (_https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb_)

**Content Storage Function**\
Stores raw texts, tables, and images in the docstore; Stores texts, table and image summaries in the vectorstore for efficient semantic retrieval.

In [None]:
def create_multi_vector_retriever(
    vectorstore, text_summaries, texts, table_summaries, tables, image_summaries, images
):
    """
    Create retriever that indexes summaries, but returns raw images or texts
    """

    # Initialize the storage layer
    store = InMemoryStore()
    id_key = "doc_id"

    # Create the multi-vector retriever
    retriever = MultiVectorRetriever(
        vectorstore=vectorstore,
        docstore=store,
        id_key=id_key,
    )

    # Helper function to add documents to the vectorstore and docstore
    def add_documents(retriever, doc_summaries, doc_contents):
        doc_ids = [str(uuid.uuid4()) for _ in doc_contents]
        summary_docs = [
            Document(page_content=s, metadata={id_key: doc_ids[i]})
            for i, s in enumerate(doc_summaries)
        ]
        retriever.vectorstore.add_documents(summary_docs)
        retriever.docstore.mset(list(zip(doc_ids, doc_contents)))

    # Add texts, tables, and images
    # Check that text_summaries is not empty before adding
    if text_summaries:
        add_documents(retriever, text_summaries, texts)
    # Check that table_summaries is not empty before adding
    if table_summaries:
        add_documents(retriever, table_summaries, tables)
    # Check that image_summaries is not empty before adding
    if image_summaries:
        add_documents(retriever, image_summaries, images)

    return retriever


# The vectorstore to use to index the summaries
vectorstore = Chroma(
    collection_name="mm_rag_cj_blog", embedding_function=OpenAIEmbeddings()
)

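# Create retriever
# text_summaries were generated from texts_4k_token, so pass the same list as
# the raw texts to keep summaries and contents aligned one-to-one.
retriever_multi_vector_img = create_multi_vector_retriever(
    vectorstore,
    text_summaries,
    texts_4k_token,
    table_summaries,
    tables,
    image_summaries,
    img_base64_list,
)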

**Content Retrieval Functions**\
Bins the retrieved doc(s) into the correct parts of the GPT-4V prompt template.

In [None]:

def plt_img_base64(img_base64):
    """Display base64 encoded string as image"""
    # Create an HTML img tag with the base64 string as the source
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    # Display the image by rendering the HTML
    display(HTML(image_html))


def looks_like_base64(sb):
    """Check if the string looks like base64"""
    return re.match("^[A-Za-z0-9+/]+[=]{0,2}$", sb) is not None


def is_image_data(b64data):
    """
    Check if the base64 data is an image by looking at the start of the data
    """
    image_signatures = {
        b"\xff\xd8\xff": "jpg",
        b"\x89\x50\x4e\x47\x0d\x0a\x1a\x0a": "png",
        b"\x47\x49\x46\x38": "gif",
        b"\x52\x49\x46\x46": "webp",
    }
    try:
        header = base64.b64decode(b64data)[:8]  # Decode and get the first 8 bytes
        for sig, format in image_signatures.items():
            if header.startswith(sig):
                return True
        return False
    except Exception:
        return False


def resize_base64_image(base64_string, size=(128, 128)):
    """
    Resize an image encoded as a Base64 string
    """
    # Decode the Base64 string
    img_data = base64.b64decode(base64_string)
    img = Image.open(io.BytesIO(img_data))

    # Resize the image
    resized_img = img.resize(size, Image.LANCZOS)

    # Save the resized image to a bytes buffer
    buffered = io.BytesIO()
    resized_img.save(buffered, format=img.format)

    # Encode the resized image to Base64
    return base64.b64encode(buffered.getvalue()).decode("utf-8")


def split_image_text_types(docs):
    """
    Split base64-encoded images and texts
    """
    b64_images = []
    texts = []
    for doc in docs:
        # Check if the document is of type Document and extract page_content if so
        if isinstance(doc, Document):
            doc = doc.page_content
        if looks_like_base64(doc) and is_image_data(doc):
            doc = resize_base64_image(doc, size=(1300, 600))
            b64_images.append(doc)
        else:
            texts.append(doc)
    return {"images": b64_images, "texts": texts}


def img_prompt_func(data_dict):
    """
    Join the context into a single string
    """
    formatted_texts = "\n".join(data_dict["context"]["texts"])
    messages = []

    # Adding image(s) to the messages if present
    if data_dict["context"]["images"]:
        for image in data_dict["context"]["images"]:
            image_message = {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{image}"},
            }
            messages.append(image_message)

    # Adding the text for analysis
    text_message = {
        "type": "text",
        "text": (
            "You are a financial analyst tasked with providing investment advice.\n"
            "You will be given a mix of text, tables, and image(s) usually of charts or graphs.\n"
            "Use this information to provide investment advice related to the user question. \n"
            f"User-provided question: {data_dict['question']}\n\n"
            "Text and / or tables:\n"
            f"{formatted_texts}"
        ),
    }
    messages.append(text_message)
    return [HumanMessage(content=messages)]


def multi_modal_rag_chain(retriever):
    """
    Multi-modal RAG chain
    """

    # Multi-modal LLM
    model = ChatOpenAI(temperature=0, model="gpt-4-vision-preview", max_tokens=1024)

    # RAG pipeline
    chain = (
        {
            "context": retriever | RunnableLambda(split_image_text_types),
            "question": RunnablePassthrough(),
        }
        | RunnableLambda(img_prompt_func)
        | model
        | StrOutputParser()
    )

    return chain


# Create RAG chain
chain_multimodal_rag = multi_modal_rag_chain(retriever_multi_vector_img)
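
The signature check in `is_image_data` can be exercised without a real image file. The sketch below is a stricter variant written for illustration, with the hypothetical name `sniff_image_format`: it returns the detected format and, for WebP, also checks bytes 8 to 11, since the `RIFF` prefix alone is shared with other container formats such as WAV.

```python
import base64


def sniff_image_format(b64data):
    """Return 'jpg', 'png', 'gif' or 'webp' for base64 image data, else None."""
    try:
        header = base64.b64decode(b64data, validate=True)[:12]
    except (ValueError, TypeError):  # binascii.Error is a ValueError subclass
        return None
    if header.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header.startswith(b"GIF8"):
        return "gif"
    # RIFF is a generic container; WebP files carry 'WEBP' at offset 8
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None


# Fabricated headers (not real images) are enough to test the detection logic
fake_png = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16).decode()
fake_webp = base64.b64encode(b"RIFF" + b"\x00" * 4 + b"WEBP" + b"\x00" * 8).decode()
fake_wav = base64.b64encode(b"RIFF" + b"\x00" * 4 + b"WAVE" + b"\x00" * 8).decode()
```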

## X.X Using RAG Capability

**Retrieval Example**\
Examines retrieval to check that images are relevant to query.

In [None]:
# Check retrieval
query = "Give me company names that are interesting investments based on EV / NTM and NTM rev growth. Consider EV / NTM multiples vs historical?"
docs = retriever_multi_vector_img.invoke(query, limit=6)

# We get 4 docs
len(docs)

# Check retrieval
query = "What are the EV / NTM and NTM rev growth for MongoDB, Cloudflare, and Datadog?"
docs = retriever_multi_vector_img.invoke(query, limit=6)

# Checks number of documents returned
len(docs)

# # Returns relevant images and image summary
# plt_img_base64(docs[0])
# image_summaries[3]

# Runs RAG and test ability to synthesize an answer to question.
chain_multimodal_rag.invoke(query)
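
To connect the user interface from section 1.0 to the RAG chain, the four user inputs need to be turned into a question. A minimal sketch of one way this might look, assuming a hypothetical helper `build_rag_query`; the wording of the question is illustrative and not taken from the notebook.

```python
def build_rag_query(county, crop_list, selected_seasons, year):
    """Compose a natural-language question from the interface inputs."""
    crops = ", ".join(crop_list)
    seasons = ", ".join(selected_seasons)
    return (
        f"For a farm in {county} County, North Carolina, recommend a crop planning "
        f"and protection plan for {crops} for the {seasons} planting season(s) "
        f"of {int(year)}."  # int() drops a trailing .0 from number-field values
    )


example_query = build_rag_query("Wake", ["Corn", "Soybeans"], ["Spring"], 2025.0)
```

The resulting string could then be passed to `chain_multimodal_rag.invoke(...)` in place of the hard-coded `query`.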