Quick tool to extract all distinct foods from the Chinese pesticide residue document. The result is used to build the list of selectable foods for the chipRAG frontend.

In [2]:
# PROMPT AND QUERY DEFINITION

## Query
query = """
    SELECT t.text
    FROM chinese_pesticide_residues as t
"""

## Prompt
prompt = """
You will receive a section about a pesticide, followed by one or more tables listing foods, possibly grouped under food categories. These tables have the headings "Food/Category Name"
and "Maximum Residue Limit". Some entries have no Maximum Residue Limit value next to them. These are usually food categories, or a food whose name spans
multiple rows in the original document.
Your job is to extract all categories and foods from the section and put them into a Python list like this: ["Category", "Food", "Category", "Food", "Food", "Category"].
There is no need to treat categories and foods differently. Just put all of them into one list.
Add everything you find in the section to the list. Do not add, alter or remove any values.
Never return anything but the list.
Here is the section to process:
{text}
"""

In [3]:
# IMPORT AND SETUP

import openai
from config.load_config import settings
from chiprag.postgres_utils import establish_connection, get_data

# set up the OpenAI API client
openai_client = openai.OpenAI(
    base_url=settings.kipitz_base_url,
    api_key=settings.kipitz_api_token,
)

# connect to database
conn, cur = establish_connection()

In [None]:
# GET, CLEAN AND SAVE DATA
# TODO: complete this, kipitz is currently offline!
raw_data = get_data(query)
# the section text is the first column of each row
data = [row[0].strip() for row in raw_data]

# try the prompt on the first section only
test = data[0]

fprompt = prompt.format(text=test)
completion = openai_client.chat.completions.create(
    model=settings.kipitz_model,
    messages=[{"role": settings.kipitz_role, "content": fprompt}],
)

answer = completion.choices[0].message.content
answer

['4.1 2,4-滴丁酸 (2,4-DB)\n\n4.1.1 Major purpose of use: herbicide.\n4.1.2 ADI: 0.02 mg/kg bw.\n4.1.3 Residue definition: sum of 2,4-DB, free 2,4-DB and conjugated 2,4-DB, expressed as 2,4-DB.\n4.1.4 Maximum residue limit: Shall comply with provisions in the Table 1.\nTable 1\nFood Category/Name Maximum residue limit, mg/kg\nCondiments\nMint 0.2*\nSpearmint 0.2*\nPepper 0.2*\n *The MRL is the temporary limit.',
 '4.2 2,4-滴丁酯 (2,4-D butylate)\n4.2.1 Major purpose of use: herbicide.\n4.2.2 ADI: 0.01 mg/kg bw.\n4.2.3 Residue definition: 2,4-D butylate\n4.2.4 Maximum residue limit: Shall comply with provisions in the Table 2.\nTable 2\nFood Category/Name Maximum residue limit, mg/kg\nCereal\nWheat 0.05\nCorn 0.05\nOil seed and oil\nSoybean 0.05\nSugar crops\nSugarcane 0.05\n\n4.2.5 Testing method: cereals are tested following methods provided in GB/T 5009.165, GB/T 5009.175;\nOil seed and oil are tested referring to methods provided in GB/T 5009.165; Sugar crops are tested\nreferring to metho