# Document Generation
In this notebook we generate several sample documents (Word `.docx` files) and write them to a Databricks volume for parsing. The documents are synthetic samples of corrective actions that need to be distilled from the earlier output of a different system.

## Setup Block
We're going to create a number of cells that handle the setup and maintenance of the data and files for the project. Although this can be done anywhere, I find that keeping the configs in a dedicated section at the top of the notebook makes them easy to manage. What matters here is having a consistent system. Since serverless pipelines don't support `%run` commands (they're declarative), it's best to decouple functions in other ways.
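
As one way to decouple configuration from the cells that use it, the sketch below groups the catalog, schema, volume, and table names into a small dataclass that could live in an importable module. `ProjectConfig` and its member names are hypothetical and not part of the original notebook; the `data` volume name and the path layout mirror the settings defined later in this section, and `my_catalog` / `my_schema` are placeholder values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectConfig:
    """Hypothetical container for the notebook's catalog, schema, and endpoint settings."""
    catalog: str
    db: str
    fm_endpoint: str = "databricks-llama-4-maverick"

    @property
    def vol_root(self) -> str:
        # Volume path in the /Volumes/<catalog>/<schema>/<volume> layout used by this notebook
        return f"/Volumes/{self.catalog}/{self.db}/data"

    @property
    def doc_out_dir(self) -> str:
        # Folder inside the volume where generated documents are written
        return f"{self.vol_root}/docs"

    def table(self, name: str) -> str:
        # Fully qualified three-level table name
        return f"{self.catalog}.{self.db}.{name}"


# Placeholder values, for illustration only
cfg = ProjectConfig(catalog="my_catalog", db="my_schema")
print(cfg.doc_out_dir)
print(cfg.table("scenarios"))
```

Passing a single `cfg` object into helper functions avoids relying on notebook-level globals, which matters if the helpers later move into a separate file.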

In [0]:
%pip install python-docx mlflow  --upgrade --pre

dbutils.library.restartPython()

In [0]:
#Python utilities
import os, json, uuid, requests, datetime, re
from docx import Document
from pathlib import Path

#Spark utilities
from pyspark.sql import functions as F, types as T

#Databricks utilities
from dbruntime.databricks_repl_context import get_context

In [0]:
#Set the variables for the PAT in the Databricks Secrets store
secret_scope_name = "general"
secret_key_name = "genie_access"

#Inject the workspace URL into the environment for later use
workspace_url = get_context().workspaceUrl
assert workspace_url, "Could not determine the workspace URL from the notebook context"
os.environ["DB_MODEL_SERVING_HOST_URL"] = "https://" + workspace_url

#Inject the databricks personal access token for use
os.environ["DATABRICKS_GENIE_PAT"] = dbutils.secrets.get(
    scope=secret_scope_name, key=secret_key_name
)
assert os.environ["DATABRICKS_GENIE_PAT"] is not None, (
    "The DATABRICKS_GENIE_PAT was not properly set to the PAT secret"
)

In [0]:
#Keeping this separate makes it easy to find; it's the most likely to need frequent updates
catalog = "ademianczuk"
db = "suncor_ehs"

In [0]:
#Set the operating environment details
DATABRICKS_HOST = os.environ.get("DB_MODEL_SERVING_HOST_URL")
DATABRICKS_TOKEN = os.environ.get("DATABRICKS_GENIE_PAT")
FM_ENDPOINT = "databricks-llama-4-maverick"  # your foundation model endpoint name

#Set the storage details
VOL_ROOT = f"/Volumes/{catalog}/{db}/data"
DOC_OUT_DIR = f"{VOL_ROOT}/docs"
Path(DOC_OUT_DIR).mkdir(parents=True, exist_ok=True)

#Set the table details
SCENARIO_TABLE = f"{catalog}.{db}.scenarios"
DOCS_TABLE = f"{catalog}.{db}.docs"

## Helper Functions
The two helper functions create the contents of each document and write it out to a storage volume. We may want to adjust the format of the Word file to include tables, as the real output likely will. The point is to be able to reuse or extend this logic later. If a sample document is available, modelling the format on it might be useful.
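
If tables are added to the Word output, one possible approach is sketched below. It assumes the model returns pipe-delimited markdown tables, which the current prompt does not guarantee. `parse_markdown_table` and `add_table` are hypothetical helpers, not part of the notebook; `add_table` expects a python-docx `Document` created by the caller and uses only `add_table`, `add_row`, and the cell `text` attribute.

```python
def parse_markdown_table(lines):
    """Split pipe-delimited markdown table lines into a header and data rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the separator row, e.g. |---|:---:|
        if cells and all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    if not rows:
        return [], []
    return rows[0], rows[1:]


def add_table(doc, header, rows):
    """Append a table to a python-docx Document supplied by the caller."""
    table = doc.add_table(rows=1, cols=len(header))
    for cell, text in zip(table.rows[0].cells, header):
        cell.text = text
    for row in rows:
        cells = table.add_row().cells
        for cell, text in zip(cells, row):
            cell.text = text
    return table


# Illustrative input only; the column names are made up for this example
sample = [
    "| Action | Skill level | Cost band |",
    "|---|---|---|",
    "| Flush coolant loop | Tech | $ |",
    "| Replace pump seal | Senior | $$ |",
]
header, rows = parse_markdown_table(sample)
print(header)
print(rows)
```

Inside `write_docx`, the parsed result could then be passed along as `add_table(doc, header, rows)` before the document is saved.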

### Decoupling Helper Functionality
This might be worth porting over to a class file depending on how often it needs to be used.
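
A minimal sketch of what such a class might look like is shown here. `ChatClient` is a hypothetical name; it takes the host, token, and endpoint through its constructor instead of reading notebook globals, and it separates request building and response parsing from the HTTP call so those parts can be tested without network access. The URL path and the OpenAI-style `choices` shape follow what the helper cell in this notebook already assumes.

```python
import json


class ChatClient:
    """Hypothetical wrapper that receives host, token, and endpoint explicitly."""

    def __init__(self, host, token, endpoint):
        self.host = host.rstrip("/")
        self.token = token
        self.endpoint = endpoint

    @property
    def url(self):
        # Invocation URL for the configured serving endpoint
        return f"{self.host}/serving-endpoints/{self.endpoint}/invocations"

    def build_request(self, messages, temperature=0.2, max_tokens=1200):
        """Return (headers, body) ready to pass to an HTTP client such as requests.post."""
        headers = {
            "Authorization": f"Bearer {self.token}",
            "Content-Type": "application/json",
        }
        body = json.dumps(
            {"messages": messages, "temperature": temperature, "max_tokens": max_tokens}
        )
        return headers, body

    @staticmethod
    def extract_content(resp):
        """Pull the first choice's message text from an OpenAI-style response dict."""
        if isinstance(resp, dict):
            choices = resp.get("choices") or []
            if choices:
                content = (choices[0].get("message") or {}).get("content")
                if content:
                    return content
        # Fall back to the raw response so the caller can inspect it
        return str(resp)


# Placeholder host, token, and endpoint, for illustration only
client = ChatClient("https://example.cloud.databricks.com", "dummy-token", "my-endpoint")
print(client.url)
print(ChatClient.extract_content({"choices": [{"message": {"content": "hello"}}]}))
```

The actual POST would then be a single `requests.post(client.url, headers=headers, data=body, timeout=120)` call followed by `raise_for_status()`, as in the function-based version.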

In [0]:
#These operations are wrapped in functions to make reuse easier. They rely on globally assigned variables; if we decouple this, those variables will need to be added to the function signatures or a class constructor.

import traceback
import tempfile
import shutil

def call_chat(messages, temperature=0.2, max_tokens=1200):
    """
    Calls Databricks Foundation Model endpoint (chat-style).
    API schema per Foundation Model REST API docs.
    """
    # Build the URL from the configured host and endpoint rather than hard-coding a workspace
    url = f"{DATABRICKS_HOST}/serving-endpoints/{FM_ENDPOINT}/invocations"
    headers = {"Authorization": f"Bearer {DATABRICKS_TOKEN}", "Content-Type": "application/json"}
    payload = {
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    r = requests.post(url, headers=headers, data=json.dumps(payload), timeout=120)
    r.raise_for_status()
    resp = r.json()
    
    # Databricks FM APIs mirror OpenAI-like schema; adjust if your endpoint returns a different shape.
    # Try common fields first:
    content = None
    if isinstance(resp, dict):
        # 'choices' structure
        choices = resp.get("choices")
        if choices and len(choices) > 0:
            msg = choices[0].get("message") or {}
            content = msg.get("content")
    
    return content or str(resp)

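def write_docx(title, actions_md, out_path):
    """Render the generated text as a .docx file and copy it to out_path."""
    doc = Document()
    doc.add_heading(title, level=1)

    tmp_path = None
    try:
        for line in actions_md.splitlines():
            stripped = line.strip()
            if stripped.startswith("- "):
                # Render markdown bullets with the default template's bullet style
                doc.add_paragraph(stripped[2:], style="List Bullet")
            else:
                doc.add_paragraph(line)
        # Save to a local temp file first, then copy the finished file to the destination
        with tempfile.NamedTemporaryFile(suffix=".docx", delete=False) as tmp:
            tmp_path = tmp.name
        doc.save(tmp_path)
        shutil.copy(tmp_path, out_path)

    except Exception as e:
        print(f"An error occurred writing the docx file out: {e}")
        traceback.print_exc()

    finally:
        # Remove the temp file so repeated runs do not accumulate files
        if tmp_path and os.path.exists(tmp_path):
            os.remove(tmp_path)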

In [0]:
# Option A: manually seed 10–15 "circumstances"
manual_scenarios = [
    {"scenario_id": str(uuid.uuid4()), "title": "CNC mill overheating at low spindle speeds",
     "context": {"machine":"CNC mill","symptoms":["process temp rising","reduced coolant flow"],"env":"ambient 32°C",
                 "likely_modes":["Heat Dissipation Failure (HDF)"]}},
    {"scenario_id": str(uuid.uuid4()), "title": "Assembly press showing increased vibration",
     "context": {"machine":"hydraulic press","symptoms":["vibration 2x baseline","oil temp high"],"env":"normal",
                 "likely_modes":["Component Wear","Hydraulic cavitation"]}},
]

# spark.createDataFrame(
#     [(s["scenario_id"], s["title"], json.dumps(s["context"]), "manual") for s in manual_scenarios],
#     schema="scenario_id string, title string, context_json string, source string"
# ).write.mode("overwrite").saveAsTable(SCENARIO_TABLE)

# Option B: (optional) build synthetic scenarios from the AI4I 2020 CSV, if you've uploaded it to the volume
ai4i_df = spark.read.csv(f"{VOL_ROOT}/ai4i2020.csv", header=True, inferSchema=True)
# Create a few diverse conditions as scenarios:
synth = (ai4i_df
         .withColumn("scenario_id", F.expr("uuid()"))
         .withColumn("title", F.concat(F.lit("AI4I condition: setting="), F.col("Type")))
         .withColumn("context_json", F.to_json(F.struct(*ai4i_df.columns)))
         .withColumn("source",F.lit("ai4i_sample"))
         .limit(20))
synth.write.mode("overwrite").saveAsTable(SCENARIO_TABLE)

In [0]:
import tempfile
import shutil

gen_system = """You are a maintenance reliability engineer.
Produce a detailed, **actionable** corrective action plan for the given machinery circumstance.
Write steps that a qualified technician can execute, with parts, tools, checks, and acceptance criteria. 
Return:
- A 2–3 paragraph context summary.
- A prioritized list of corrective actions (10–20 items), each with:
  - Rationale
  - Skill level (Apprentice/Tech/Senior)
  - Est. downtime saved (hours)
  - Est. cost band ($, $$, $$$)
  - Est. Risk of further damage incurred by faulty action
  - Safety notes
Finish with a short 'Verification & Re-start Procedure' checklist.
Use neutral, professional tone.

This is an example of a real-world document to use as a single-shot of what I need:

Incident #: E-2024-000021886 
Document Name: Investigation E-2024-000021886 level 2 power outage investigation results
Relationship
Terms and Definitions Present in the Investigation Report
1. Causal Reasoning / Investigative Inquiry / Causal Analysis
•	Presence: The report reconstructs the sequence of events for each incident (Line Galloping, HIPPS Activation, Furnace Delays), identifying physical, human, and system-level causes.
•	Excerpts:
o	“Sag on Power lines/not enough tension for conditions found on Nov 27th… For Line Galloping to occur a combination of high Sag (low tension) and weather condition (wind + ice build up ) takes place together.”
o	“Design of control logic did not anticipate this scenario (no flow and valve remains open).”
o	“No PM for Equipment specific to Louver assembly.”
•	Explanation: These statements focus on what was present and the actual conditions, not on what was missing or who was at fault.
2. Physical Causes
•	Presence: Mechanical and environmental factors are described.
•	Excerpts:
o	“Sag on Power lines/not enough tension for conditions found on Nov 27th.”
o	“Ice build up in furnace box due to snuffing steam passing valve.”
•	Explanation: These are direct references to physical mechanisms (tension, ice, pressure).
3. System / System Failure / Influencing System
•	Presence: Organizational and procedural gaps are identified.
•	Excerpts:
o	“Design reviews: New extension power lines do not review existing line for tension calculation given a higher load is now present.”
o	“Projects team does not meet sparing strategy, the current spares is identified by reliability group.”
o	“No PM for Equipment specific to Louver assembly.”
•	Explanation: These statements highlight system-level influences and failures, such as design review processes and maintenance strategies.
4. Human Factor / Human Behaviour Causes
•	Presence: Human actions and decisions are referenced.
•	Excerpts:
o	“93 CRO dealing with multiple issues (FWKO trips) to re establish emulsion.”
o	“Late communication between panel operators across boundaries.”
•	Explanation: These statements describe human behaviors and decisions that contributed to the events.
5. Cause / Causal Factor / Critical or Main Cause
•	Presence: The report identifies specific causes and causal factors for each event.
•	Excerpts:
o	“Root Cause: Design specification of power lines in the region need improvement.”
o	“Root Cause: Design of control logic did not anticipate this scenario.”
o	“Root Cause: No PM for Equipment specific to Louver assembly.”
•	Explanation: These are explicit identifications of causes and causal factors.
6. Work-as-Imagined (WAI) / Work-as-Done (WAD)
•	Presence: The difference between planned and actual work is implied.
•	Excerpts:
o	“Design meets CTS-0560 and 0561, as well as AEUC and CSA 22.3 standards which consider weather conditions to avoid galloping… Although it adheres to the standard the specification used were the minimum requirement and more relax than previous power lines.”
o	“Control logic is set as up as per original commissioning of 93/94… No changes to logic found that can result in this event.”
•	Explanation: These statements compare documented standards (WAI) to actual outcomes (WAD).
7. Conditions
•	Presence: Necessary conditions for events are described.
•	Excerpts:
o	“For Line Galloping to occur a combination of high Sag (low tension) and weather condition (wind + ice build up ) takes place together.”
•	Explanation: These statements identify the conditions required for the incident to occur.
8. Event
•	Presence: The report describes occurrences neutrally.
•	Excerpts:
o	“Line Galloping or swing is a result of wind condition and sag of power lines, significant swing occurs where lines in the same post make contact and result in blown out fuse/power outage.”
•	Explanation: These are neutral descriptions of events.
9. Learning
•	Presence: The report includes recommendations and learnings.
•	Excerpts:
o	“Consider a review of Design specification for new lines.”
o	“Consider a repeat back form of communication to ensure both panels agree on start up sequence after a significant event.”
•	Explanation: These are learnings and recommendations for future improvement.
10. Negative Reasoning / Counterfactuals / Blame / Accountability / Hindsight
•	Presence: Some statements focus on what was missing or not done, or imply blame.
•	Excerpts:
o	“Interim Corrective actions not effective.”
o	“Notification was entered and temporary measures set up. While furnace is in operation snuffing steam into burner box was not a concern. Shutdown (Hours) during freezing conditions is what led to further delays.”
•	Explanation: These statements focus on what was not done or what failed, rather than on what was present.

Flagging
Table: Instances of Negative Reasoning, Counterfactuals, and Logical Errors
Identification of Negative Reasoning / Counterfactual 	Identification Reasoning 	Definitions Present 	Original Statement 
Negative Reasoning 	Focuses on what was missing, not what was present; assigns fault to interim actions 	Negative Reasoning, Counterfactual, Hindsight 	“Interim Corrective actions not effective.” 
Negative Reasoning + Counterfactual 	Focuses on what was not done (notification process), implies blame for not capturing risk 	Negative Reasoning, Counterfactual, Blame, Accountability 	“Notification was entered and temporary measures set up. While furnace is in operation snuffing steam into burner box was not a concern. Shutdown (Hours) during freezing conditions is what led to further delays.” 
Negative Reasoning 	Focuses on lack of PM for louver, implies what was missing rather than what was present 	Negative Reasoning, Counterfactual 	“No PM for Equipment specific to Louver assembly.” 
Counterfactual 	Implies that if communication had been better, the event would not have occurred 	Counterfactual, Hindsight 	“Communication between panels can break down during high activities event as observed on Nov 27.” 
Negative Reasoning 	Focuses on minimum requirements being too relaxed, implies what was missing 	Negative Reasoning, Counterfactual 	“Design of this line was more relaxed than previous designs adhering to the minimum requirement on the standard.” 
Negative Reasoning 	Focuses on what was not anticipated in control logic, rather than what was present 	Negative Reasoning, Counterfactual 	“Design of control logic did not anticipate this scenario (no flow and valve remains open).” 

Corrective Actions
Table 1: All Actions in the Report
Tally 	Cause 	Systemic Improvements 	Action 	OEMS Process 	OEMS Justification 	Related OEMS Process 	Sound Corrective Action? 	Action Justification 
1 	Design specification of power lines in the region need improvement 	Improve power line design standards 	Review of Design specification for new Power lines and communicate to future projects. 	Develop & Manage Projects 	“Deliver Site projects following the ADEM framework… accountability for design standards and project execution.” 	Manage Equipment Strategy, Manage Threats to Availability 	Yes 	Action is barrier-focused, addresses system-level design flaw, and is SMARTER. 
2 	Sag on Power lines/not enough tension for conditions 	Improve power line reliability 	Addition of line spreaders or re-tension at risk lines. 	Manage Equipment Strategy 	“Defines optimal equipment care tasks… risk/consequence-based approach to deliver business results.” 	Manage Maintenance, Manage Threats to Availability 	Yes 	Action is barrier-focused, eliminates risk by engineering control, and is SMARTER. 
3 	Sag on Power lines/not enough tension for conditions 	Improve power line reliability 	Apply PM program to confirm tension of lines and adjust accordingly 	Manage Maintenance 	“Maintain assets… scope identification, prioritization, planning, scheduling, work execution, performance tracking and continuous improvement.” 	Manage Equipment Strategy 	Yes 	Action is barrier-focused, links to causal factor, and is SMARTER. 
4 	Design of control logic did not anticipate scenario 	Improve control logic reliability 	Evaluate and change logic to prevent FWKO low level scenario from occurring. i.e. not rely on CRO intervention. 	Manage Operational Risk 	“Systematically identify significant operational hazards, minimize associated risks to an acceptable level, and enhance safety and reliability by recognizing and verifying the effectiveness of critical controls.” 	Management of Change, Manage Maintenance 	Yes 	Action is barrier-focused, addresses system-level logic flaw, and is SMARTER. 
5 	Late communication between panel operators 	Improve communication protocol 	Review and create communication protocol between panels, Consider a repeat back form of communication to ensure both panels agree on start up sequence after a significant event 	Conduct of Safe Operations 	“Provide a consistent approach for safe and reliable operations… Maintain Situational Awareness, Deliver Planned Commitments, Manage Alarm and Limits, Conduct Proactive Monitoring, and Manage Abnormal Situations.” 	Manage Operational Risk, Manage Incidents 	Yes 	Action is barrier-focused, addresses system-level communication gap, and is SMARTER. 
6 	No PM for Equipment specific to Louver assembly 	Improve equipment strategy for fired heaters 	Evaluate equipment strategy for the Fired Heaters that includes PMs for the louvers specifically 	Manage Equipment Strategy 	“Defines optimal equipment care tasks… risk/consequence-based approach to deliver business results.” 	Manage Maintenance 	Yes 	Action is barrier-focused, addresses system-level PM gap, and is SMARTER. 
7 	Snuffing steam valve passing 	Improve winterization and notification process 	Isolate snuffing steam to furnace 	Manage Maintenance 	“Maintain assets… scope identification, prioritization, planning, scheduling, work execution, performance tracking and continuous improvement.” 	Manage Equipment Strategy 	Yes 	Action is barrier-focused, eliminates hazard, and is SMARTER. 
8 	Snuffing steam valve passing 	Improve winterization and notification process 	Update notification process to capture additional risk due to change of weather/seasons. Suggest standard text template update. 	Manage Operational Risk 	“Systematically identify significant operational hazards, minimize associated risks to an acceptable level, and enhance safety and reliability by recognizing and verifying the effectiveness of critical controls.” 	Management of Change, Manage Maintenance 	Yes 	Action is barrier-focused, addresses system-level notification gap, and is SMARTER. 
Systemic Improvements Grouping
•	Improve power line design standards: Actions 1
•	Improve power line reliability: Actions 2, 3
•	Improve control logic reliability: Action 4
•	Improve communication protocol: Action 5
•	Improve equipment strategy for fired heaters: Action 6
•	Improve winterization and notification process: Actions 7, 8

Table 2: Key Actions (Systemic Improvements Grouped)
Key Action (Systemic Improvement) 	Main Cause (Umbrella Cause) 	OEMS Process 	OEMS Justification 	Action 	Reasoning 
Improve power line design standards 	Power line design standards did not account for increased load and weather variability, leading to galloping and outages. 	Develop & Manage Projects 	“Deliver Site projects following the ADEM framework… accountability for design standards and project execution.” 	Review of Design specification for new Power lines and communicate to future projects. 	Addresses the root cause by ensuring future projects use robust design standards that account for actual operating conditions and loads. 
Improve power line reliability 	Power line tension and sag were not adequately managed, increasing risk of galloping and outages. 	Manage Equipment Strategy 	“Defines optimal equipment care tasks… risk/consequence-based approach to deliver business results.” 	Addition of line spreaders or re-tension at risk lines. 	Directly reduces risk of galloping by engineering controls that address tension and sag. 
Improve power line reliability 	Power line tension and sag were not adequately managed, increasing risk of galloping and outages. 	Manage Maintenance 	“Maintain assets… scope identification, prioritization, planning, scheduling, work execution, performance tracking and continuous improvement.” 	Apply PM program to confirm tension of lines and adjust accordingly 	Ensures ongoing reliability by embedding tension checks into maintenance, preventing recurrence. 
Improve control logic reliability 	Control logic did not anticipate scenarios where vessel could fill with gas, leading to HIPPS activation. 	Manage Operational Risk 	“Systematically identify significant operational hazards, minimize associated risks to an acceptable level, and enhance safety and reliability by recognizing and verifying the effectiveness of critical controls.” 	Evaluate and change logic to prevent FWKO low level scenario from occurring. i.e. not rely on CRO intervention. 	Changes logic to prevent recurrence of scenario, removing reliance on operator intervention. 
Improve communication protocol 	Lack of clear communication between panels during high activity events led to uncoordinated actions and risk. 	Conduct of Safe Operations 	“Provide a consistent approach for safe and reliable operations… Maintain Situational Awareness, Deliver Planned Commitments, Manage Alarm and Limits, Conduct Proactive Monitoring, and Manage Abnormal Situations.” 	Review and create communication protocol between panels, Consider a repeat back form of communication to ensure both panels agree on start up sequence after a significant event 	Establishes robust communication protocol to prevent miscoordination during critical events. 
Improve equipment strategy for fired heaters 	Lack of PM for louver assemblies led to repeated reliability issues and delays. 	Manage Equipment Strategy 	“Defines optimal equipment care tasks… risk/consequence-based approach to deliver business results.” 	Evaluate equipment strategy for the Fired Heaters that includes PMs for the louvers specifically 	Embeds louver-specific PMs into equipment strategy, addressing recurring reliability issues. 
Improve winterization and notification process 	Inadequate winterization and notification processes allowed snuffing steam valve passing to cause delays during shutdowns. 	Manage Maintenance 	“Maintain assets… scope identification, prioritization, planning, scheduling, work execution, performance tracking and continuous improvement.” 	Isolate snuffing steam to furnace 	Eliminates hazard by isolating steam, preventing recurrence during shutdowns. 
Improve winterization and notification process 	Inadequate winterization and notification processes allowed snuffing steam valve passing to cause delays during shutdowns. 	Manage Operational Risk 	“Systematically identify significant operational hazards, minimize associated risks to an acceptable level, and enhance safety and reliability by recognizing and verifying the effectiveness of critical controls.” 	Update notification process to capture additional risk due to change of weather/seasons. Suggest standard text template update. 	Ensures risks due to seasonal changes are captured and managed proactively in notification process. 

Table 3: Missing Causes and Actions (Not Already Listed)
Tally 	Cause 	Action 	OEMS Process 	OEMS Justification 	Related OEMS Process 	Action Justification 
1 	Lack of predictive analytics for weather-related risks to power lines 	Implement a predictive analytics system to monitor weather and line conditions, triggering proactive maintenance or operational adjustments 	Manage Threats to Availability 	“Identify, prioritize, and address threats while capitalizing on opportunities to achieve the objectives outlined in the Strategic Asset Management Plan and Business plan.” 	Manage Equipment Strategy, Manage Maintenance 	Action is barrier-focused, uses technology to anticipate and prevent weather-related outages, and is SMARTER. 
2 	Inadequate cross-functional review of control logic changes 	Establish a formal cross-functional review process for all control logic changes, including operations, engineering, and maintenance 	Management of Change 	“Ensures that all health and safety, environmental, regulatory, reputational, and financial impacts have been rigorously and consistently identified, evaluated, and mitigated as part of making a change.” 	Manage Operational Risk, Manage Maintenance 	Action is barrier-focused, ensures robust review and learning, and is SMARTER. 
3 	Lack of real-time feedback for panel operator communication 	Install real-time communication monitoring and feedback system for panel operators during high activity events 	Conduct of Safe Operations 	“Provide a consistent approach for safe and reliable operations… Maintain Situational Awareness, Deliver Planned Commitments, Manage Alarm and Limits, Conduct Proactive Monitoring, and Manage Abnormal Situations.” 	Manage Incidents, Manage Operational Risk 	Action is barrier-focused, provides immediate feedback and learning, and is SMARTER. 
4 	No formal process for capturing lessons learned from repeat equipment failures 	Create a lessons learned database for repeat equipment failures, with mandatory review before maintenance planning 	Manage Incidents 	“Enables the reporting, investigating, and subsequent management of incidents in order to ensure a culture of thorough and transparent incident management that supports continuous improvement and sharing of learnings.” 	Manage Equipment Strategy, Manage Maintenance 	Action is barrier-focused, embeds learning into maintenance planning, and is SMARTER. 

End of Analysis.

"""

scenarios_df = spark.table(SCENARIO_TABLE).limit(15).collect()
outputs = []
for row in scenarios_df:
    ctx = json.loads(row.context_json)
    user_prompt = f"""Circumstance Title: {row.title}
Context JSON:\n{json.dumps(ctx, indent=4)}"""

    content = call_chat([
        {"role":"system","content": gen_system},
        {"role":"user","content": user_prompt}
    ], temperature=0.2, max_tokens=5000)

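    # Keep only letters and digits so characters such as ':' and '=' in titles don't end up in the file name
    safe_title = re.sub(r"[^a-z0-9]+", "_", row.title.lower()).strip("_")[:60]
    file_name = f"{safe_title}_{row.scenario_id[:8]}.docx"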
    file_path = f"{DOC_OUT_DIR}/{file_name}"
    write_docx(row.title, content, file_path)

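    # datetime.utcnow() is deprecated as of Python 3.12; use a timezone-aware UTC timestamp instead
    created_utc = datetime.datetime.now(datetime.timezone.utc).isoformat()
    outputs.append((row.scenario_id, row.title, file_path, created_utc, content))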

spark.createDataFrame(outputs, "scenario_id string, title string, docx_path string, created_utc string, raw_text string")\
     .write.mode("overwrite").saveAsTable(DOCS_TABLE)

print(f"Generated {len(outputs)} documents → {DOC_OUT_DIR}")

In [0]:
sum_system = """You are an expert maintenance prioritization analyst.
Given a corrective-action document, extract each action and score it using this rubric:
1) RiskReduction (0-5), 2) DowntimeAvoided (0-5), 3) CostEffectiveness (0-5), 4) TimeToImplement (0-5, invert score so faster=5), 5) Repeatability (0-5).
Compute ImpactScore = 0.35*RiskReduction + 0.25*DowntimeAvoided + 0.20*CostEffectiveness + 0.10*TimeToImplement + 0.10*Repeatability.
Return JSON with fields: actions: [{title, justification, scores:{...}, ImpactScore}], plus a brief summary."""

docs = spark.table(DOCS_TABLE).collect()
rank_rows = []
for d in docs:
    summary = call_chat([
        {"role":"system","content": sum_system},
        {"role":"user","content": d.raw_text[:120000]}  # keep under context window
    ], temperature=0.5, max_tokens=5000)

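    # Light validation: take the outermost {...} span and keep it only if it parses as JSON
    m = re.search(r"\{.*\}", summary, flags=re.S)
    json_blob = m.group(0) if m else "{}"
    try:
        json.loads(json_blob)
    except json.JSONDecodeError:
        json_blob = "{}"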

    rank_rows.append((d.scenario_id, d.title, d.docx_path, json_blob, summary))

schema = "scenario_id string, title string, docx_path string, ranking_json string, summary string"
spark.createDataFrame(rank_rows, schema).write.mode("overwrite").saveAsTable(f"{catalog}.{db}.rankings")

print(f"Ranking complete. Inspect table {catalog}.{db}.rankings for top actions per doc.")
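
Because the ImpactScore arithmetic is delegated to the model, it may be worth recomputing it locally before trusting the ranking. The sketch below applies the weights from the rubric above to each action and sorts the result. `impact_score` and `rank_actions` are hypothetical helpers, and the assumption that each action carries a `scores` object keyed by the rubric names is mine; the prompt only specifies `scores:{...}`, so the model may use different keys.

```python
import json

# Weights copied from the rubric in the system prompt
WEIGHTS = {
    "RiskReduction": 0.35,
    "DowntimeAvoided": 0.25,
    "CostEffectiveness": 0.20,
    "TimeToImplement": 0.10,
    "Repeatability": 0.10,
}


def impact_score(scores):
    """Weighted sum of the five rubric scores; missing scores count as 0."""
    return round(sum(w * float(scores.get(k, 0)) for k, w in WEIGHTS.items()), 4)


def rank_actions(ranking_json):
    """Parse a ranking JSON string, recompute ImpactScore, and sort descending."""
    try:
        actions = json.loads(ranking_json).get("actions", [])
    except (json.JSONDecodeError, AttributeError):
        # Not valid JSON, or the top level is not an object
        return []
    for action in actions:
        action["ImpactScore"] = impact_score(action.get("scores", {}))
    return sorted(actions, key=lambda a: a["ImpactScore"], reverse=True)


# Made-up actions and scores, for illustration only
sample = json.dumps({"actions": [
    {"title": "Flush coolant loop",
     "scores": {"RiskReduction": 3, "DowntimeAvoided": 4, "CostEffectiveness": 5,
                "TimeToImplement": 5, "Repeatability": 4}},
    {"title": "Replace spindle bearings",
     "scores": {"RiskReduction": 5, "DowntimeAvoided": 5, "CostEffectiveness": 2,
                "TimeToImplement": 1, "Repeatability": 3}},
]})
ranked = rank_actions(sample)
for a in ranked:
    print(a["title"], a["ImpactScore"])
```

For the first sample action the weighted sum is 0.35*3 + 0.25*4 + 0.20*5 + 0.10*5 + 0.10*4 = 3.95, and for the second it is 3.80, so the cheaper, faster action ranks first despite its lower risk reduction.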

In [0]:
print(rank_rows)

In [0]:
content = call_chat([
    {"role":"system","content": "You are a general purpose assistant. You will answer truthfully and completely regarding the source and origin of where your data comes from and what is available."},
    {"role":"user","content": "Where are you getting the remaining context for the AI4I 2020 data? When I look at the dataset I have it only consists of maybe 14 columns with no descriptors. How are you generating the context, corrective and actions based on this data?"}
], temperature=0.2, max_tokens=5000)

print(content)