## Testing the prompt engineering pipeline

The aim is to create an interface that takes a deficiency as input, passes it to the model, and returns a standardized output.
- Test a zero-shot prompter with a sample dataset.
- Use Azure OpenAI GPT-3.5 Turbo.
- Return the output as a `list[dict[str, str]]` for parsing results.

In [1]:
import os
import sys
from pathlib import Path

import dotenv
from langchain_openai import AzureChatOpenAI

sys.path.append("src")
import psclabeler as psc

dotenv.load_dotenv()

True

In [2]:
NEW_INSPECTION = Path("./data/New Inspection Report.pdf")
SAMPLE_INSPECTION = Path("./data/Sample Inspection Report.pdf")

In [3]:
report_str = psc.data_query.data_ingest.parse_pdf_to_string(SAMPLE_INSPECTION)
report_dict = psc.data_query.data_ingest.split_report_to_chunk(report_str)

In [4]:
report_dict

{1: 'Deficiency : Fire extinguisher for rescue boat rusted seriously.  \nRoot Cause: Human Factors  \nNOT APPLICABLE  \nVessel Factors  \nInappropriate storage Fire extinguisher was not protected from weather.  \nManagement Factors  \nNOT APPLICABLE  \nOther Factors  \nOthers Inclement weather conditions.  \nCorrective action: Fire extinguisher replaced with a new extinguisher. The extinguisher is kept covered \nfor protection against weather.  \nPreventive action: Brieifing of entire ship staff carried out by Superintendent as to checks of rescue boat \nequipment. Lessons learned shared with all the vessels in Fleet.',
 2: 'Deficiency : The company name on the DOC is not the same as on the CSR. The interim SMC and interim \nSecurity certificate have different company names to the DOC.  \nRoot Cause:  Company stated in CSR doc not same as the DOC. Master without fail to cross check trading \ncertificate.  \nCorrective action:  Master w/o fail to cross check all trading certificates and

### Set up Model

In [5]:
model = AzureChatOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    deployment_name=os.getenv("AZURE_MODEL"),
)
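The model setup reads three environment variables, and `os.getenv` silently returns `None` when one is absent. A minimal sketch of a pre-flight check, assuming the variable names used in the cell above (the helper `missing_env_vars` is hypothetical and not part of `psclabeler`):

```python
import os
from collections.abc import Mapping

# Names taken from the AzureChatOpenAI call above.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_MODEL")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` before building the model makes a missing `.env` entry visible early, rather than as an authentication error on the first request.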

In [8]:
def chat_labeler(query: str):
    risk_guideline = psc.model.prompt.RISK_ASSESSMENT
    sys_prompt = psc.model.prompt.PSC_INSPECTOR
    initial_prompt = psc.model.prompt.ZERO_SHOT_PROMPT.format(
        risk_assessment=risk_guideline, deficiency_query=query
    )
    response = model.invoke(input=[sys_prompt, initial_prompt], temperature=0)

    return response
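`chat_labeler` fills a template with two placeholders, `risk_assessment` and `deficiency_query`. The real text of `psc.model.prompt.ZERO_SHOT_PROMPT` is not shown in this notebook, so the template below is a hypothetical stand-in that only illustrates how the placeholders are filled:

```python
# Hypothetical stand-in for psc.model.prompt.ZERO_SHOT_PROMPT.
# Only the two placeholder names come from the notebook; the wording is assumed.
EXAMPLE_ZERO_SHOT_PROMPT = (
    "Risk guideline:\n{risk_assessment}\n\n"
    "Assess the deficiency below and answer in exactly this layout:\n"
    "Deficiency: <description>\n\n"
    "Reason: <reasoning>\n\n"
    "Classification: <one-word risk level>\n\n"
    "Deficiency report:\n{deficiency_query}"
)


def build_prompt(risk_guideline: str, query: str) -> str:
    """Fill the template the same way chat_labeler does, using str.format."""
    return EXAMPLE_ZERO_SHOT_PROMPT.format(
        risk_assessment=risk_guideline, deficiency_query=query
    )
```

Because `str.format` treats braces as placeholders, a template that needs literal braces (for example a JSON layout) would have to escape them as `{{` and `}}`.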

In [9]:
for v in report_dict.values():
    response = chat_labeler(v)
    print(response.content)
    print()

Deficiency: Fire extinguisher for rescue boat rusted seriously.

Reason: The fire extinguisher was not protected from weather, which is an inappropriate storage practice. 

Classification: Medium

Deficiency: The company name on the DOC is not the same as on the CSR. The interim SMC and interim Security certificate have different company names to the DOC.

Reason:
- This deficiency exposes a weakness in the organization's processes.
- It does not pose an immediate threat to human life or cause an accident, but it has the potential to cause medium to large economic and reputational harm.

Classification: Medium

Deficiency: Two different loadline certificate showed at the same time by master.

Reason:
This deficiency exposes a weakness in the organization's processes as it indicates a lack of proper record-keeping and attention to detail. It could also potentially cause confusion and errors in decision-making related to the vessel's load capacity.

Classification: Medium

Deficiency: Ty

### Findings
With zero-shot prompting, we instruct the model to follow a fixed output format of key-value pairs for further analysis.
The model now outputs the following fields:
```python
"deficiency": "description of it"
"reason": "the model reasoning before assessing a risk"
"classification" : "a one word classification"
```
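Since the format is requested through the prompt rather than guaranteed, it can help to check each parsed record against the three fields listed above. The following is a sketch only; `find_format_problems` is a hypothetical helper and not part of `psclabeler`:

```python
EXPECTED_KEYS = ("deficiency", "reason", "classification")


def find_format_problems(record: dict[str, str]) -> list[str]:
    """List the ways a parsed record deviates from the expected three-field format."""
    problems = []
    # Every expected field must be present and non-empty.
    for key in EXPECTED_KEYS:
        if not record.get(key, "").strip():
            problems.append(f"missing or empty field: {key}")
    # No extra fields should appear.
    for key in record:
        if key not in EXPECTED_KEYS:
            problems.append(f"unexpected field: {key}")
    # The classification is expected to be a single word.
    if len(record.get("classification", "").split()) > 1:
        problems.append("classification is not a single word")
    return problems
```

An empty list means the record matches the expected shape; anything else can be logged for error analysis.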

### Test modularized code

In [27]:
inspector = psc.model.labeler.ZeroShotLLMPSCInspector()
results = []
for v in report_dict.values():
    response = inspector.rate_risk(v)
    results.append(response.content)
    print(response.content)
    print()

Deficiency: Fire extinguisher for rescue boat rusted seriously.

Reason: The fire extinguisher was not protected from weather, which is an inappropriate storage practice. 

Classification: Medium

Deficiency: The company name on the DOC is not the same as on the CSR. The interim SMC and interim Security certificate have different company names to the DOC.

Reason:
- The deficiency exposes a weakness in the organization's processes.
- It is not significant enough to threaten human life or cause an accident.

Classification: Medium

Deficiency: Two different loadline certificate showed at the same time by master.

Reason:
This deficiency exposes a weakness in the organization's processes as it indicates a lack of proper record-keeping and attention to detail. It could also potentially cause confusion and errors in decision-making related to the vessel's load capacity.

Classification: Medium

Deficiency: Type of Ship in SE, SR and IOPP certificates not identified.

Reason: The deficiency

### Parsing the output into storable format
`results` is a list of strings. The aim is to convert each string into a dictionary with the keys `deficiency`, `reason` and `classification`.

In [8]:
results

['Deficiency: Fire extinguisher for rescue boat rusted seriously.\n\nReason: The fire extinguisher was not protected from weather, which is an inappropriate storage practice. \n\nClassification: Medium',
 "Deficiency: The company name on the DOC is not the same as on the CSR. The interim SMC and interim Security certificate have different company names to the DOC.\n\nReason:\nThis deficiency exposes a weakness in the organization's processes as there is a lack of cross-checking and verification of trading certificates. It does not pose an immediate threat to human life or cause an accident, but it has the potential to cause medium to large economic and reputational harm.\n\nClassification: Medium",
 "Deficiency: Two different loadline certificate showed at the same time by master.\n\nReason:\nThis deficiency exposes a weakness in the organization's processes as it indicates a lack of proper record-keeping and attention to detail. It could also potentially cause confusion and errors in 

In [6]:
def parse_single_deficiency_response_to_dict(response: list[str]) -> dict[str, str]:
    """Convert a response already split into its 3 "Key: value" items into a dictionary."""
    # Split on the first colon only, so colons inside a value are preserved.
    split_k_v = [item.split(":", 1) for item in response]
    return {key.strip().lower(): value.strip() for key, value in split_k_v}

In [28]:
parse_single_deficiency_response_to_dict([d.split("\n\n") for d in results][0])

{'deficiency': 'Fire extinguisher for rescue boat rusted seriously.',
 'reason': 'The fire extinguisher was not protected from weather, which is an inappropriate storage practice.',
 'classification': 'Medium'}
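The helper above depends on the response being split on blank lines first, so a reason written as several paragraphs would produce extra items. A sketch of an alternative that locates the fields by their labels instead, assuming each label starts its own line (`parse_response` is a hypothetical name, not part of `psclabeler`):

```python
import re

# A field starts at a line beginning with one of the three labels and
# runs until the next labelled line or the end of the text.
FIELD_PATTERN = re.compile(
    r"^(Deficiency|Reason|Classification)\s*:\s*(.*?)"
    r"(?=^(?:Deficiency|Reason|Classification)\s*:|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_response(text: str) -> dict[str, str]:
    """Extract the labelled fields from one model response into a dictionary."""
    return {key.lower(): value.strip() for key, value in FIELD_PATTERN.findall(text)}
```

This keeps colons and blank lines inside a value intact, at the cost of misreading any value line that itself begins with one of the labels followed by a colon.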

In [7]:
inspector = psc.model.labeler.ZeroShotLLMPSCInspector()
results = []
for v in report_dict.values():
    response = inspector.rate_risk(v)

    split_response = response.content.split("\n\n")

    parse_response = parse_single_deficiency_response_to_dict(split_response)
    results.append(parse_response)
    print(response.content)
    print()

Deficiency: Fire extinguisher for rescue boat rusted seriously.

Reason: The fire extinguisher was not protected from weather, which is an inappropriate storage practice.

Classification: Medium

Deficiency: The company name on the DOC is not the same as on the CSR. The interim SMC and interim Security certificate have different company names to the DOC.

Reason:
This deficiency exposes a weakness in the organization's processes as there is a lack of cross-checking and verification of trading certificates. It does not pose an immediate threat to human life or cause an accident or fire.

Classification: Medium

Deficiency: Two different loadline certificate showed at the same time by master.

Reason:
This deficiency exposes a weakness in the organization's processes as it indicates a lack of proper record-keeping and attention to detail. It could also potentially lead to confusion and errors in decision-making regarding the vessel's load capacity.

Classification: Medium.

Deficiency: T

### Final result for error analysis of model's output
For interpretability, the results are returned together with the model's reasoning.

In [8]:
results

[{'deficiency': 'Fire extinguisher for rescue boat rusted seriously.',
  'reason': 'The fire extinguisher was not protected from weather, which is an inappropriate storage practice.',
  'classification': 'Medium'},
 {'deficiency': 'The company name on the DOC is not the same as on the CSR. The interim SMC and interim Security certificate have different company names to the DOC.',
  'reason': "This deficiency exposes a weakness in the organization's processes as there is a lack of cross-checking and verification of trading certificates. It does not pose an immediate threat to human life or cause an accident or fire.",
  'classification': 'Medium'},
 {'deficiency': 'Two different loadline certificate showed at the same time by master.',
  'reason': "This deficiency exposes a weakness in the organization's processes as it indicates a lack of proper record-keeping and attention to detail. It could also potentially lead to confusion and errors in decision-making regarding the vessel's load 

In [31]:
import pandas as pd

pd.DataFrame(results)

| | deficiency | reason | classification |
|---|---|---|---|
| 0 | Fire extinguisher for rescue boat rusted serio... | The fire extinguisher was not protected from w... | Medium |
| 1 | The company name on the DOC is not the same as... | - The deficiency exposes a weakness in the org... | Medium |
| 2 | Two different loadline certificate showed at t... | This deficiency exposes a weakness in the orga... | Medium |
| 3 | Type of Ship in SE, SR and IOPP certificates n... | The deficiency was caused by a lack of skill a... | Medium. |
| 4 | Third mate forgot to record fire and abandon d... | The third mate was not familiar with the speci... | Low |
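Row 3 carries the label `Medium.` with a trailing period, which would be counted as a separate class from `Medium` in any grouping or error analysis. A minimal sketch of label normalization, using the five labels shown above (`normalize_label` is a hypothetical helper):

```python
from collections import Counter


def normalize_label(label: str) -> str:
    """Trim whitespace and trailing periods, then standardize capitalization."""
    return label.strip().rstrip(".").capitalize()


# Classification column as it appears in the table above.
labels = ["Medium", "Medium", "Medium", "Medium.", "Low"]
counts = Counter(normalize_label(label) for label in labels)
```

With pandas, the same function could be applied to the column via `df["classification"].map(normalize_label)` before counting.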
