This notebook uses an LLM to generate text descriptions of visualizations from their specs, images, and AltGosling output.

In [1]:
from openai import OpenAI
from PIL import Image
import base64
import os
import json
from dotenv import load_dotenv

load_dotenv()

True

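An OpenAI API key is required. It is read from the `API_KEY` environment variable, which `load_dotenv()` can populate from a `.env` file.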

In [2]:
API_KEY = os.environ['API_KEY']
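
If `API_KEY` is missing, the lookup above fails with a bare `KeyError`. The following is a minimal sketch of a more descriptive check; the helper name `require_env` is hypothetical and not part of this notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error.

    `require_env` is a hypothetical helper, shown only for illustration.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "add it to your .env file or export it in your shell."
        )
    return value
```

With this helper, the cell above could read `API_KEY = require_env("API_KEY")`.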

In [3]:
data_repo = "../data/unified"
lhs_path = "./cfg_rules_lhs.csv"

with open(lhs_path, "r") as f:
    lhs = f.read()

specs_dir = os.path.join(data_repo, "specs")
imgs_dir = os.path.join(data_repo, "imgs")
alt_dir = os.path.join(data_repo, "alt")

In [4]:
def get_files(name):
    """Load the spec, base64-encoded image, AltGosling alt text, and processed spec for `name`."""
    with open(os.path.join(specs_dir, f"{name}.json"), 'r') as f:
        spec = f.read()
    
    with open(os.path.join(imgs_dir, f"{name}.png"), "rb") as f:
        img = f.read()
    img_base64 = base64.b64encode(img).decode("utf-8")

    with open(os.path.join(alt_dir, "altgosling", f"{name}.txt"), "r") as f:
        alt = f.read()

    with open(os.path.join(alt_dir, "processedspec", f"{name}.json"), "r") as f:
        processed_spec = f.read()

    return {"name": name, "spec": spec, "img": img_base64, "alt": alt, "processed_spec": processed_spec}

In [5]:
# Load the few-shot examples: a mapping from example name to its desired description.
with open("few_shot_learning_examples.json", "r") as f:
    fewshot = json.load(f)

fewshots = []
for name, description in fewshot.items():
    obj = get_files(name)
    obj["description"] = description
    fewshots.append(obj)

In [6]:
prompt = """
I want to generate a text description of a visualization. 
The visualization is a json specification. 
I already have created an automatic alt text.
I also have classified the attributes of the specification.
I will use the text description in a multimodal search engine. 
The description should be as informative as possible.
"""
model = "gpt-4o"

In [7]:
# The few-shot version below is disabled: including the examples results in too many tokens.

# prompt = f"""
# I want to generate a text description of a visualization. 
# The visualization is a json specification. 
# I already have created an automatic alt text.
# I also have classified the attributes of the specification.
# I will use the text description in a multimodal search engine. 
# The description should be as informative as possible.

# Here are some examples of visualizations, their specs, images, alt texts, processed specs, and the description, which is the desired output.
# {fewshots}
# """
# model = "gpt-4o"
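
One possible contributor to the size of the disabled few-shot prompt is that interpolating `{fewshots}` embeds each example's base64-encoded image as plain text. This is an assumption; token counts were not measured here. The sketch below renders the examples as compact text that omits the `img` field and truncates long values. The function name `fewshot_text`, the default `fields`, and the `max_chars` limit are all hypothetical choices.

```python
def fewshot_text(examples, fields=("alt", "description"), max_chars=2000):
    """Render few-shot examples as compact text.

    Only the listed `fields` are included, so bulky entries such as the
    base64 image under 'img' are left out. Values longer than `max_chars`
    characters are truncated. All names and defaults here are illustrative.
    """
    blocks = []
    for ex in examples:
        lines = [f"example: {ex['name']}"]
        for field in fields:
            value = str(ex.get(field, ""))
            if len(value) > max_chars:
                value = value[:max_chars] + " [truncated]"
            lines.append(f"{field}: {value}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

The result of `fewshot_text(fewshots)` could then be interpolated into the prompt in place of `{fewshots}`. Whether this brings the request under the model's limit depends on the size of the specs and alt texts.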

In [8]:
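def send_prompt(prompt, model, files):
    """Send the prompt plus one example's spec, alt text, attribute rules, and image; return the reply text."""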
    client = OpenAI(
        api_key=API_KEY
    )

    response = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "text", "text": f"spec: {files['spec']}"},
                    {"type": "text", "text": f"processed spec: {files['processed_spec']}"},
                    {"type": "text", "text": f"altgosling alt: {files['alt']}"},
                    {"type": "text", "text": f"attribute classification: {lhs}"},
                    {"type": "image_url",
                     "image_url": {
                         "url": f"data:image/png;base64,{files['img']}",
                     }}
                ],
            }
        ],
        model=model,
        max_tokens=300,
    )
    return response.choices[0].message.content


In [None]:
# specs = os.listdir(specs_dir)

# for example in specs[0:1]:
#     name_i = example.split(".json")[0]
#     files = get_files(name_i)

#     response = send_prompt(prompt, model, files)
#     print(response)

The visualization is a circular bar chart depicting genomic data based on the hg38 assembly. It features a static layout with a center radius of 0.3 and circular alignment. The chart comprises pink bars representing quantitative expression values plotted against genomic positions on the x-axis. The chart has no interactions enabled, emphasizing purely visual data representation. The spacing is set to 1, and all tracks share the same alignment, optimizing the view for easy comparison of data peaks across different chromosomal positions. The outer radius measures approximately 171.5 with an inner radius of 51.45, while the track size is about 343x343 with a stacking arrangement.


In [None]:
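specs = os.listdir(specs_dir)

# Make sure the output directory exists before writing to it.
out_dir = os.path.join(alt_dir, "altgosling-llm")
os.makedirs(out_dir, exist_ok=True)

for example in specs[0:10]:
    name_i = example.split(".json")[0]
    files = get_files(name_i)

    response = send_prompt(prompt, model, files)
    with open(os.path.join(out_dir, f"{name_i}.txt"), "w") as f:
        f.write(response)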
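
Because every iteration of the loop above makes a paid API call, it can help to skip specs that already have a generated description when re-running the cell. The following is a minimal sketch under that assumption; `pending_names` is a hypothetical helper and not part of this notebook.

```python
import os


def pending_names(specs_dir, out_dir):
    """Return spec names (without '.json') that have no '<name>.txt' in `out_dir` yet.

    Hypothetical helper: it assumes one '<name>.json' spec per visualization
    and one '<name>.txt' output per generated description.
    """
    os.makedirs(out_dir, exist_ok=True)
    done = {
        os.path.splitext(f)[0]
        for f in os.listdir(out_dir)
        if f.endswith(".txt")
    }
    names = [
        os.path.splitext(f)[0]
        for f in sorted(os.listdir(specs_dir))
        if f.endswith(".json")
    ]
    return [n for n in names if n not in done]
```

The loop could then iterate over `pending_names(specs_dir, os.path.join(alt_dir, "altgosling-llm"))` instead of `specs[0:10]`. Note that this sketch sorts the file names, so the order differs from the raw `os.listdir` order.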

In [14]:
specs[0:10]

['EX_SPEC_CIRCULR_RANGE_sw_0_7_s_0_7_cc_0.json',
 'EX_SPEC_GREMLIN_sw_1_2_s_0_7_cc_1.json',
 'gray_heatmap_sw_1_0_s_1_0_oc_sw_0_7_s_1_0_cc_0.json',
 'TEXT_sw_0_7_s_1_0_cc_0.json',
 'PBCA-DE-2009e5e7-1796-445b-8677-46b3804fe0bf.json',
 'POINT_sw_1_2_s_1_2_cc_2.json',
 'AREA_sw_1_2_s_1_2_oc.json',
 'responsive-circular_p_0_sw_1_0_s_1_0_oc.json',
 'combination-point-area_p_0_sw_1_2_s_0_7_cc_0.json',
 'EX_SPEC_CIRCOS_sw_0_7_s_0_7_cc_2.json']