Built 2 Agents Using LangGraph For PO Parsing: 
1. Agent 1 : Extracts the following fields from the PO description:  Grade , shape, , Ground, Finish type
2. Agent 2 : Extracts the following fields from the PO description: Parse the feilds  Diameter, Thickness, Length, Width


In [None]:
%pip install -U langchain_community langgraph
# langchain_anthropic langchain_experimental matplotlib

### Langgraph 

In [1]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain_core.messages import BaseMessage, HumanMessage
from langgraph.graph import StateGraph, END

import re
from fractions import Fraction

import os
from openai import AzureOpenAI
import PyPDF2
import json
import pandas as pd
import csv
import json
import re
import time
from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI



In [2]:
# Load environment variables from .env file
load_dotenv()

# Get values from environment
model_name = os.getenv("MODEL_NAME")
deployment = os.getenv("DEPLOYMENT")


# Create the AzureOpenAI client
client = AzureChatOpenAI(
    azure_deployment = os.getenv("DEPLOYMENT"),
    api_version = os.getenv("API_VERSION"),
    model_name = os.getenv("MODEL_NAME"),
    azure_endpoint = os.getenv("AZURE_ENDPOINT"),
    api_key = os.getenv("AZURE_SUBSCRIPTION_KEY"),
)


In [None]:
# Shared prompt templates
agent1_prompt = PromptTemplate.from_template("""
You are an expert in processing Purchase Orders in a Steel Bar manufacturing company. 
Your task is to extract specific fields from the Purchase Order (PO) Description provided by the customer. 
Carefully read the PO description text and extract the following fields with the given rules:
 
Fields to Extract:

1. Grade: The Material grade of the product. It can be in the format "12L14" or "1215" or "C12L14". 
2. Shape: The Shape of the product.
3. Ground: If the product should be ground. Set to "Yes" if specified in the description, otherwise set to "No".
4. Finish Type: The type of finish.

Rules:
- The Shape can be only be one of these: FL(Flat), Hex(Hexagon), OCT(Octagaon), RD(Round), SQ(Square), SS(Stainless).
- Finish types can be only be one of these: CF Bar, HR Bar, HR Coil, Stainless.
- If the description is empty then *Do Not* extract any fields.
                                             
Ensure to apply the rules.
Ensure to return blank if you cannot identify an attribute. Do not infer or assume any values not stated in the PO.           

PO Description: {description}

Example: 
Input : "9/16" HEX 12L14 CF Steel Bar"
Output: "Grade": "12L14", "Shape": "Hex", "Ground": "No", "Finish Type": "CF Bar"
""")


In [4]:
agent1_prompt

PromptTemplate(input_variables=['description'], input_types={}, partial_variables={}, template='\nYou are an expert in processing Purchase Orders in a Steel Bar manufacturing company. \nYour task is to extract specific fields from the Purchase Order (PO) Description provided by the customer. \nCarefully read the PO description text and extract the following fields with the given rules:\n\nFields to Extract:\n\n1. Grade: The Material grade of the product. It can be in the format "12L14" or "1215" or "C12L14". \n2. Shape: The Shape of the product.\n3. Ground: If the product should be ground. Set to "Yes" if specified in the description, otherwise set to "No".\n4. Finish Type: The type of finish.\n\nRules:\n- The Shape can be only be one of these: FL(Flat), Hex(Hexagon), OCT(Octagaon), RD(Round), SQ(Square), SS(Stainless).\n- Finish types can be only be one of these: CF Bar, HR Bar, HR Coil, Stainless.\n- If the description is Blank then *Do Not* extract any fields.\n\nEnsure to apply the

In [None]:
agent2_prompt = PromptTemplate.from_template("""
You are an intelligent document parser working for a steel manufacturing company. 
Your task is to extract specific fields from the Purchase Order (PO) Description provided by the customer. 
Carefully read the PO description text and extract the following fields with the given rules:

Fields to Extract:
 
1. Diameter: The Diameter of the product.
2. Thickness: The Thickness of the product.
3. Length: The length of the product.
4. Width: The width of the product.

Rules:

- Diameter is applicable for specific shapes only: Hex(Hexagon), OCT(Octagaon), RD(Round), SQ(Square).
- Thickness is applicable only for FL(Flat) and SS(Stainless) shapes.
- The description will contain either diameter or thickness value.
- Length is generally specified in 'feet' or 'inches'.
- If length is specified in the description, return the length in 'inches' only.
- If the length is not specified, default to 144 inches.
- Width is applicable only for FL(Flat) and SS(Stainless) shapes.

If a value contains fractions (like 1/2, 1 1/2, 2-1/8",1-5/16), convert them to decimals.
Ensure to apply the rules.
Ensure to return blank if you cannot identify an attribute. Do not infer or assume any values not stated in the PO.

PO Description: {description}

Example: 
    Input: "1-3/8" Round 1215 Steel"
    Output: "Diameter": "1.375","Thickness": "","Length": "144","Width":""
""")

In [6]:
agent2_prompt

PromptTemplate(input_variables=['description'], input_types={}, partial_variables={}, template='\nYou are an intelligent document parser working for a steel manufacturing company. \nYour task is to extract specific fields from the Purchase Order (PO) Description provided by the customer. \nCarefully read the PO description text and extract the following fields with the given rules:\n\nFields to Extract:\n\n1. Diameter: The Diameter of the product.\n2. Thickness: The Thickness of the product.\n3. Length: The length of the product.\n4. Width: The width of the product.\n\nRules:\n\n- Diameter is applicable for specific shapes only: Hex(Hexagon), OCT(Octagaon), RD(Round), SQ(Square).\n- Thickness is applicable only for FL(Flat) and SS(Stainless) shapes.\n- The description will contain either diameter or thickness value.\n- Length is generally specified in \'feet\' or \'inches\'.\n- If length is specified in the description, return the length in \'inches\' only.\n- If the length is not spec

In [7]:
# Node 1: Agent 1 - Extract Material Info
def agent1_node(state):
    description = state["input"]
    prompt = agent1_prompt.format(description=description)
    response = client.invoke([HumanMessage(content=prompt)])
    return {"agent1_output": response.content, "input": description}

In [8]:
# Node 2: Agent 2 - Extract Dimensions and Convert Fractions
def agent2_node(state):
    description = state["input"]
    prompt = agent2_prompt.format(description=description)
    response = client.invoke([HumanMessage(content=prompt)])
    return {"agent1_output": state["agent1_output"], "agent2_output": response.content}



In [9]:
# Build LangGraph

from typing import TypedDict, Optional

class AgentState(TypedDict):
    input: str
    agent1_output: Optional[str]
    agent2_output: Optional[dict]

builder = StateGraph(state_schema=AgentState)

# builder = StateGraph()

builder.add_node("agent1", agent1_node)
builder.add_node("agent2", agent2_node)

builder.set_entry_point("agent1")
builder.add_edge("agent1", "agent2")
builder.add_edge("agent2", END)

graph = builder.compile()


In [11]:
#  Example
description = """1-7/16"" 1215 CF Steel RD
color code bar ends gold
Certifications Required
21778-FS35 6-44B"""

result = graph.invoke({"input": description})
print("Agent 1 Output:\n", result["agent1_output"])
print("Agent 2 Output:\n", result["agent2_output"])

Agent 1 Output:
 {
  "Grade": "1215",
  "Shape": "RD",
  "Ground": "No",
  "Finish Type": "CF Bar"
}
Agent 2 Output:
 {
    "Diameter": "1.4375",
    "Thickness": "",
    "Length": "144",
    "Width": ""
}


In [12]:
# This O is Octagon ?
# Example run
description = "1-3/16 O 1215 1-3/16 O 1215 sourced/melted steel"

result = graph.invoke({"input": description})
print("Agent 1 Output:\n", result["agent1_output"])
print("Agent 2 Output:\n", result["agent2_output"])

Agent 1 Output:
 {
  "Grade": "1215",
  "Shape": "",
  "Ground": "No",
  "Finish Type": ""
}
Agent 2 Output:
 {
    "Diameter": "1.188",
    "Thickness": "",
    "Length": "144",
    "Width": ""
}


In [13]:
description = "3/4'' RD. 1018 12 ft bar of 3/4'' RD. 1018"

result = graph.invoke({"input": description})
print("Agent 1 Output:\n", result["agent1_output"])
print("Agent 2 Output:\n", result["agent2_output"])

Agent 1 Output:
 {
  "Grade": "1018",
  "Shape": "Round",
  "Ground": "No",
  "Finish Type": ""
}
Agent 2 Output:
 {
  "Diameter": "0.75",
  "Thickness": "",
  "Length": "144",
  "Width": ""
}


In [14]:
description =  "STEEL 0.6250 HEX-SHARP 1215 12' BAR"

result = graph.invoke({"input": description})
print("Agent 1 Output:\n", result["agent1_output"])
print("Agent 2 Output:\n", result["agent2_output"])

Agent 1 Output:
 {
  "Grade": "1215",
  "Shape": "Hex",
  "Ground": "No",
  "Finish Type": "Bar"
}
Agent 2 Output:
 {
    "Diameter": "0.6250",
    "Thickness": "",
    "Length": "144",
    "Width": ""
}


In [15]:
description =  ""

result = graph.invoke({"input": description})
print("Agent 1 Output:\n", result["agent1_output"])
print("Agent 2 Output:\n", result["agent2_output"])

Agent 1 Output:
 {
  "Grade": "12L14",
  "Shape": "Hex",
  "Ground": "No",
  "Finish Type": "CF Bar"
}
Agent 2 Output:
 {
  "Diameter": "1.375",
  "Thickness": "",
  "Length": "144",
  "Width": ""
}


### Reading Iteam Description CSV File

In [17]:
df_line_items = pd.read_csv("PO_Line Items.csv")
print("The number of rows and columns : ", df_line_items.shape)

The number of rows and columns :  (79, 15)


In [18]:
df_line_items.columns

Index(['Hdr No', 'PurchaseOrderNumber', 'LineNumber', 'ReleaseNumber',
       'ItemNumber', 'ItemDescription', 'QuantityOrdered', 'UnitOfMeasure',
       'UnitPrice', 'LineAmount', 'ExpectedDeliveryDate', 'Release Qty',
       'Request Date', 'Promise Date', 'Comments'],
      dtype='object')

In [19]:
selected_columns = ["ItemDescription"]
df_selected_new = df_line_items[selected_columns]

In [20]:
df_selected_new

Unnamed: 0,ItemDescription
0,15/16 HEX C12L14\n\nShape - HEX\nGrade - C12L1...
1,"Steel 1215 Round .632"" Cold Finished\n\n"
2,"Steel 1144 Round .663"" Cold Finished"
3,"Steel 1215 Round 1-3/8"" Cold Finished"
4,STEEL 1.0000 HEX-SHARP 12L14 12'\nBAR
...,...
74,M13750SRC15 1-3/8 ROUND STEEL C1215
75,M13125SRC15 1-5/16 RD C1215
76,"M22500SRL14 2-1/4"" STEEL RD 12L14"
77,"M10000SRL14 1"" STEEL RD 12L14"


In [21]:
import pandas as pd
from tqdm import tqdm  
tqdm.pandas()

# Define a function to process each description using the LangGraph
def parse_description(description):
    try:
        result = graph.invoke({"input": str(description)})
        combined_result = {
            "Grade": "", "Shape": "", "Ground": "", "Finish Type": "",
            "Diameter": "", "Thickness": "", "Length": "", "Width": ""
        }

        # Load and merge outputs
        if result.get("agent1_output"):
            try:
                agent1_data = eval(result["agent1_output"])    
                combined_result.update(agent1_data)
            except:
                pass

        if result.get("agent2_output"):
            try:
                agent2_data = eval(result["agent2_output"])
                combined_result.update(agent2_data)
            except:
                pass

        return combined_result

    except Exception as e:
        return {"error": str(e)}

# Apply the function to each row in the dataframe
df_selected_new["ParsedOutput_new"] = df_selected_new["ItemDescription"].progress_apply(parse_description)


100%|██████████| 79/79 [04:40<00:00,  3.55s/it]
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_selected_new["ParsedOutput_new"] = df_selected_new["ItemDescription"].progress_apply(parse_description)


In [22]:
df_selected_new

Unnamed: 0,ItemDescription,ParsedOutput_new
0,15/16 HEX C12L14\n\nShape - HEX\nGrade - C12L1...,"{'Grade': 'C12L14', 'Shape': 'Hex', 'Ground': ..."
1,"Steel 1215 Round .632"" Cold Finished\n\n","{'Grade': '1215', 'Shape': 'Round', 'Ground': ..."
2,"Steel 1144 Round .663"" Cold Finished","{'Grade': '1144', 'Shape': 'Round', 'Ground': ..."
3,"Steel 1215 Round 1-3/8"" Cold Finished","{'Grade': '1215', 'Shape': 'Round', 'Ground': ..."
4,STEEL 1.0000 HEX-SHARP 12L14 12'\nBAR,"{'Grade': '12L14', 'Shape': 'Hex', 'Ground': '..."
...,...,...
74,M13750SRC15 1-3/8 ROUND STEEL C1215,"{'Grade': 'C1215', 'Shape': 'Round', 'Ground':..."
75,M13125SRC15 1-5/16 RD C1215,"{'Grade': 'C1215', 'Shape': 'RD', 'Ground': 'N..."
76,"M22500SRL14 2-1/4"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'RD', 'Ground': 'N..."
77,"M10000SRL14 1"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'Round', 'Ground':..."


In [23]:
parsed_df_new = pd.json_normalize(df_selected_new["ParsedOutput_new"])
df_selected_new = pd.concat([df_selected_new, parsed_df_new], axis=1)


In [24]:
df_selected_new

Unnamed: 0,ItemDescription,ParsedOutput_new,Grade,Shape,Ground,Finish Type,Diameter,Thickness,Length,Width
0,15/16 HEX C12L14\n\nShape - HEX\nGrade - C12L1...,"{'Grade': 'C12L14', 'Shape': 'Hex', 'Ground': ...",C12L14,Hex,No,,0.9375,,144,
1,"Steel 1215 Round .632"" Cold Finished\n\n","{'Grade': '1215', 'Shape': 'Round', 'Ground': ...",1215,Round,No,CF Bar,0.632,,144,
2,"Steel 1144 Round .663"" Cold Finished","{'Grade': '1144', 'Shape': 'Round', 'Ground': ...",1144,Round,No,CF Bar,0.663,,144,
3,"Steel 1215 Round 1-3/8"" Cold Finished","{'Grade': '1215', 'Shape': 'Round', 'Ground': ...",1215,Round,No,CF Bar,1.375,,144,
4,STEEL 1.0000 HEX-SHARP 12L14 12'\nBAR,"{'Grade': '12L14', 'Shape': 'Hex', 'Ground': '...",12L14,Hex,No,,1.0000,,144,
...,...,...,...,...,...,...,...,...,...,...
74,M13750SRC15 1-3/8 ROUND STEEL C1215,"{'Grade': 'C1215', 'Shape': 'Round', 'Ground':...",C1215,Round,No,,1.375,,144,
75,M13125SRC15 1-5/16 RD C1215,"{'Grade': 'C1215', 'Shape': 'RD', 'Ground': 'N...",C1215,RD,No,,1.3125,,144,
76,"M22500SRL14 2-1/4"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'RD', 'Ground': 'N...",12L14,RD,No,,2.25,,144,
77,"M10000SRL14 1"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'Round', 'Ground':...",12L14,Round,No,,1,,144,


In [None]:
df_line_items[['Hdr No', 'PurchaseOrderNumber', 'LineNumber', 'ReleaseNumber',
       'ItemNumber']]

In [25]:
df_line_items.shape

(79, 15)

In [26]:
df_selected_new.shape

(79, 10)

In [27]:
# Select the required columns from df_line_items
df_line_items_selected = df_line_items[['Hdr No', 'PurchaseOrderNumber', 'LineNumber', 'ReleaseNumber', 'ItemNumber']]

# Concatenate df_line_items_selected with df_selected_new
df_combined = pd.concat([df_line_items_selected.reset_index(drop=True), df_selected_new.reset_index(drop=True)], axis=1)

# Display the combined dataframe
df_combined

Unnamed: 0,Hdr No,PurchaseOrderNumber,LineNumber,ReleaseNumber,ItemNumber,ItemDescription,ParsedOutput_new,Grade,Shape,Ground,Finish Type,Diameter,Thickness,Length,Width
0,1,18301480,1.0,,JC BB BB 18301480,15/16 HEX C12L14\n\nShape - HEX\nGrade - C12L1...,"{'Grade': 'C12L14', 'Shape': 'Hex', 'Ground': ...",C12L14,Hex,No,,0.9375,,144,
1,2,140660,1.0,,ST1215XXRD006320CF,"Steel 1215 Round .632"" Cold Finished\n\n","{'Grade': '1215', 'Shape': 'Round', 'Ground': ...",1215,Round,No,CF Bar,0.632,,144,
2,3,140661,1.0,,ST1144XXRD006630CF,"Steel 1144 Round .663"" Cold Finished","{'Grade': '1144', 'Shape': 'Round', 'Ground': ...",1144,Round,No,CF Bar,0.663,,144,
3,4,140674,1.0,,ST1215XXRD013750CF,"Steel 1215 Round 1-3/8"" Cold Finished","{'Grade': '1215', 'Shape': 'Round', 'Ground': ...",1215,Round,No,CF Bar,1.375,,144,
4,5,35032-IL,1.0,,ST10000HS12L14-NA-12,STEEL 1.0000 HEX-SHARP 12L14 12'\nBAR,"{'Grade': '12L14', 'Shape': 'Hex', 'Ground': '...",12L14,Hex,No,,1.0000,,144,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
74,27,PT25430,1.0,,,M13750SRC15 1-3/8 ROUND STEEL C1215,"{'Grade': 'C1215', 'Shape': 'Round', 'Ground':...",C1215,Round,No,,1.375,,144,
75,27,PT25430,2.0,,,M13125SRC15 1-5/16 RD C1215,"{'Grade': 'C1215', 'Shape': 'RD', 'Ground': 'N...",C1215,RD,No,,1.3125,,144,
76,27,PT25430,3.0,,,"M22500SRL14 2-1/4"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'RD', 'Ground': 'N...",12L14,RD,No,,2.25,,144,
77,27,PT25430,4.0,,,"M10000SRL14 1"" STEEL RD 12L14","{'Grade': '12L14', 'Shape': 'Round', 'Ground':...",12L14,Round,No,,1,,144,


In [28]:
df_combined['Shape'].value_counts()

Shape
Round    36
Hex      21
         15
RD        7
Name: count, dtype: int64

In [29]:
df_combined['Diameter'].value_counts()

Diameter
          15
1          7
1.25       5
1.375      5
0.75       5
1.1875     4
1.3125     4
0.8125     3
0.9375     2
1.4375     2
1.625      2
1.0625     2
1.125      2
0.5625     2
1.0000     2
0.625      1
1.5        1
2          1
2.250      1
2.125      1
0.875      1
2.375      1
0.663      1
0.500      1
0.632      1
0.375      1
1.313      1
0.624      1
0.6875     1
0.4375     1
0.6250     1
2.25       1
Name: count, dtype: int64

In [None]:
df_combined.to_csv("Version2_PO_Description_Parsed.csv", index=False)