## Claude.ai

This is just some messing around with the `example_hists` files.

### Docs:

* in theory, there is documentation here: https://docs.anthropic.com/en/home, 
* but in reality I just asked "Hey Claude.  How do I use your API through Python to upload an image and ask you questions about it?" followed by "Is there anyway to set the context?  For example, in chatgpt you have the ""role": "system", "content"" part of your message where you would say things like "You are a helpful assistant in charge of automating a process".  Or does one just incorporate that into the "question" part of the inputs?" and starting building from there

#### Key Points (cp-claude)

* API Key: Get your API key from the [Anthropic Console](https://console.anthropic.com/) and set it as an environment variable or pass it directly
* Supported formats: JPEG, PNG, GIF, and WebP images
* Model: Use `claude-sonnet-4-20250514` for Claude Sonnet 4
* Size limits: Images should be under 5MB and no larger than 8000x8000 pixels



First, install anthropic api (also, see .yml file for the environment for this project)

In [9]:
#!pip install anthropic

Where are things stored/going to be stored?

In [10]:
dir_api = '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/LLM_outputs/claude_api/' #store API results for example-hists

key_file = '/Users/jnaiman/.claudeai/key.txt'

jsons_dir = '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/jsons/' # directory where jsons created with figure are stored
imgs_dir = '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/imgs/' # where images are stored

# for saving temp images for reading in
tmp_dir = '/Users/jnaiman/Downloads/tmp/'

img_format = 'jpeg'

In [11]:
import anthropic
import base64
from PIL import Image
import numpy as np
import json
import re
import pickle
import os
from glob import glob

# debug
from importlib import reload
from anthropic import RateLimitError


from sys import path
path.append('../')
import utils.llm_utils
reload(utils.llm_utils)
from utils.llm_utils import parse_qa, load_image, get_img_json_pair, parse_for_errors

import time

In [12]:
# setup
with open(key_file,'r') as f:
    api_key = f.read()

client = anthropic.Anthropic(
  api_key=api_key.strip(),  # this is also the default, it can be omitted
)

In [13]:
jsons_to_parse = glob(jsons_dir + '/*.json')
jsons_to_parse[:3]

['/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/jsons/nclust_3_trial9.json',
 '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/jsons/nclust_5_trial3.json',
 '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/jsons/nclust_2_trial0.json']

In [None]:
def send_to_claude(question_list, client, image_path, encoded_image,
                    model="claude-sonnet-4-20250514",
                    max_tokens=1000, temperature=0.1,
                    media_type = 'image/png',
                    tmp_dir = '/Users/jnaiman/Downloads/tmp/',
                    test_run = True, fac=1.0, 
                    verbose=True,
                    system_prompt = None, 
                    max_retries = 10, sleep_time=1):
    """
    Sends the question to claude and collects response.clear

    system_prompt : if None, then defaults to the overall system prompt generated with the questions.
    """
    if system_prompt is None:
        system_prompt = question_list['persona']

    #print('system prompt:', system_prompt)

    iFac = 1.0
    success = False
    #['persona', 'context','question', 'format']
    attempt = 0
    while not success and attempt < max_retries:
        try:
            question = question_list['context'] + " " + question_list['question'] + " " + question_list['format']
            # lowercase the first word, just in case
            question = question.lstrip() # no whitespace
            question = question[0].lower() + question[1:]
            if verbose: print('   on question:',question)
            # Prepare the API request
            prompt = f"I am going to show you an image. Here is the image: [Image: {encoded_image}]. Now, {question}"
            prompt_save = f"I am going to show you an image. Here is the image: [Image: <ENCODED IMAGE>]. Now, {question}"
            ##question_list['prompt'] = prompt
            
            if not test_run:
                # Send the request to the Claude api
                response = client.messages.create(
                    model = model,
                    max_tokens=max_tokens,
                    system = system_prompt,
                    temperature=temperature,
                    messages=[
                        {
                            "role": "user",
                            "content": [
                                {
                                    "type": "image",
                                    "source": {
                                        "type": "base64",
                                        "media_type": media_type,
                                        "data": encoded_image,
                                    },
                                },
                                {
                                    "type": "text",
                                    "text": prompt
                                }
                            ],
                        }
                        # {"role": "system", "content": question_list['persona']},
                        # {"role":"user", "content": [
                        #     {
                        #     "type": "text",
                        #     "text": prompt
                        #     },
                        #     {
                        #     "type": "image_url",
                        #     "image_url": {
                        #         "url": f"data:image/jpeg;base64,{encoded_image}"
                        #     }
                        #     }
                        # ]
                        # }
                    ]
                )
                success = True
            else:
                success = True
        except RateLimitError as e:
            if attempt < max_retries - 1:
                # Exponential backoff with jitter
                wait_time = sleep_time*(2 ** (attempt)) + np.random.uniform(0, 1)
                print(f"      Rate limit hit. Waiting {wait_time:.2f} seconds before retry {attempt + 1}")
                time.sleep(wait_time)
                attempt += 1
            else:
                print(f"Max retries exceeded. Error: {e}")
                raise
        except Exception as e:
            print(e)
            new_fac = fac/iFac
            print('      new fac = ', new_fac)
            encoded_image = load_image(image_path,fac=new_fac, tmp_dir=tmp_dir)
            iFac += 1
    
    print(response)
    if not test_run:
        # Get the response from the API
        answer = response.content[0].text #response.choices[0].message.content
        question_list['raw answer'] = answer
        # also calculate usage
        usage = response.usage
        question_list['usage'] = usage
        if verbose:
            print(f"Input tokens: {usage.input_tokens}")
            print(f"Output tokens: {usage.output_tokens}")
            print(f"Total tokens: {usage.input_tokens + usage.output_tokens}")
        # format answer
        answer_format = answer.split('```json"')[-1].split('\n')[0].replace('\n', '')
        #answer.replace("```json\n",'').replace("\n```",'')
        try:
            question_list['Response'] = json.loads(answer_format)
        except:
            question_list['Response'] = answer_format
            question_list['Error'] = 'JSON formatting'
        question_list['Response String'] = answer_format
        success = True
    else:
        question_list['Response'] = 'TEST RUN'
        question_list['Response String'] = 'TEST RUN'

    return question_list, prompt_save, system_prompt

In [15]:
iMax = 2
verbose = False
test_run = False # run w/o actually pinging openai
restart = False
model ="claude-sonnet-4-20250514"
max_tokens=500
# set system_prompt to None to default to what is in question list
system_prompt = """You are a helpful assistant that responds only in valid JSON format. Do not include any explanations, reasoning, or text outside of the JSON response."""
#system_prompt = """You must respond with only valid JSON. Start your response immediately with { and end with }. Do not write any text before or after the JSON."""
temperature=0.1
fac = 1.0
sleep = 60 # seconds


for ijson,json_path in enumerate(jsons_to_parse):
    if ijson >= iMax:
        continue

    print('on', ijson, 'of', iMax)

    # get image and base json
    img_path = imgs_dir + json_path.split('/')[-1].removesuffix('.json') + '.' + img_format
    encoded_image, img_format_media, base_json, err = get_img_json_pair(img_path, json_path, dir_api, 
                                                      fac=fac, restart=restart,
                                                      tmp_dir=tmp_dir)
    if err:
        continue

    ###### create QA ########
    qa = []
    
    for k,v in base_json['VQA']['Level 1']['Figure-level questions'].items():
        out = {'Q':v['Q'], 'A':v['A'], 'Level':'Level 1', 'type':'Figure-level questions', 'Response':""}
        qa.append(out)
    
    # what kinds?
    types = ['(words + list)', '(words)']
    
    # get uniques
    level_parse = 'Level 1'
    plot_level = 'Plot-level questions'
    qa = parse_qa(level_parse, plot_level, qa, base_json['VQA'], types)
    
    level_parse = 'Level 2'
    plot_level = 'Plot-level questions'
    qa = parse_qa(level_parse, plot_level, qa, base_json['VQA'], types)
    
    level_parse = 'Level 3'
    plot_level = 'Plot-level questions'
    qa = parse_qa(level_parse, plot_level, qa, base_json['VQA'], types)

    responses = []
    for question_list in qa:
        response, prompt, system_prompt_out = send_to_claude(question_list, client, img_path, encoded_image,
                    model = model, max_tokens=max_tokens, media_type='image/' + img_format_media,
                    test_run = test_run, system_prompt=system_prompt, temperature=temperature, 
                    sleep_time=sleep)
        responses.append(response)
        question_list['prompt'] = prompt
        question_list['system prompt'] = system_prompt_out
        #import sys; sys.exit() # just do one
        time.sleep(sleep)
    time.sleep(sleep)

    # parse for errors
    qa = parse_for_errors(qa, llm='claude')

    # dump to file
    if not test_run:
        with open(dir_api + json_path.split('/')[-1].removesuffix('.json')+ '.pickle', 'wb') as ff:
            pickle.dump([qa, model], ff)
    else:
        print('Would store at:', dir_api + json_path.split('/')[-1].removesuffix('.json')+ '.pickle')
    #import sys; sys.exit()

on 0 of 2
have file already: /Users/jnaiman/LLM_VQA_JCDL2025/example_hists/LLM_outputs/claude_api/nclust_3_trial9.pickle
on 1 of 2
system prompt: You are a helpful assistant that responds only in valid JSON format. Do not include any explanations, reasoning, or text outside of the JSON response.
   on question: how many bars are there in the specified figure panel? Please format the output as a json as {"nbars":""} for this figure panel, where the "nbars" value should be an integer.
Message(id='msg_011KicAmf3xWhdwkm3vWp4x3', content=[TextBlock(citations=None, text='I need to analyze the bar chart in the image to count the number of bars.\n\nLooking at the figure, I can see a bar chart with bars of different heights. Let me count each individual bar from left to right:\n\n1. First bar (leftmost)\n2. Second bar\n3. Third bar\n4. Fourth bar\n5. Fifth bar\n6. Sixth bar\n7. Seventh bar\n8. Eighth bar\n9. Ninth bar\n10. Tenth bar\n11. Eleventh bar\n12. Twelfth bar\n13. Thirteenth bar\n14. Fo

## Look at data

Check out one, if you wanna:

In [16]:
pickles = glob(dir_api + '*.pickle')
#pickles = glob('/Users/jnaiman/Downloads/tmp/JCDL2025/example_hists/claude_api/*pickle')
pickles[:5]

['/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/LLM_outputs/claude_api/nclust_5_trial3.pickle',
 '/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/LLM_outputs/claude_api/nclust_3_trial9.pickle']

In [17]:
ifile = 0
with open(pickles[ifile], 'rb') as f:
    qa_in = pickle.load(f)[0]

In [19]:
qa_in[0]

{'Q': 'You are a helpful assistant that can analyze images.  How many bars are there in the specified figure panel? Please format the output as a json as {"nbars":""} for this figure panel, where the "nbars" value should be an integer.',
 'A': 50,
 'Level': 'Level 1',
 'type': 'Plot-level questions',
 'Response': 'I need to analyze the bar chart in the image to count the number of bars.',
 'persona': 'You are a helpful assistant that can analyze images.',
 'context': '',
 'question': 'How many bars are there in the specified figure panel?',
 'format': 'Please format the output as a json as {"nbars":""} for this figure panel, where the "nbars" value should be an integer.',
 'plot number': 'plot0',
 'raw answer': 'I need to analyze the bar chart in the image to count the number of bars.\n\nLooking at the figure, I can see a bar chart with bars of different heights. Let me count each individual bar from left to right:\n\n1. First bar (leftmost)\n2. Second bar\n3. Third bar\n4. Fourth bar\

Claude outputs reasoning, so we have to do a bit of cleaning from the responses:

In [18]:
print(pickles[ifile])
print('*********')
for qa_pairs in qa_in:
    print('Prompt:', qa_pairs['prompt'])
    print('  Real A:', qa_pairs['A'])
    #response = qa_pairs['Response'].split('```json')
    response_claude = qa_pairs['raw answer']
    if '```json' in response_claude: # ideal
        response_claude = response_claude.split('```json')[-1].split('```')[0].replace('\n','')
        response_claude = json.loads(response_claude)
    elif '{"' in response_claude: # less ideal
        response_claude = '{"' + response_claude.split('{"')[-1].replace('\n','')
        response_claude = json.loads(response_claude)
    print('Claude A:', response_claude)
    print('')

/Users/jnaiman/LLM_VQA_JCDL2025/example_hists/LLM_outputs/claude_api/nclust_5_trial3.pickle
*********
Prompt: I am going to show you an image. Here is the image: [Image: <ENCODED IMAGE>]. Now, how many bars are there in the specified figure panel? Please format the output as a json as {"nbars":""} for this figure panel, where the "nbars" value should be an integer.
  Real A: 50
Claude A: {'nbars': '20'}

Prompt: I am going to show you an image. Here is the image: [Image: <ENCODED IMAGE>]. Now, what are the maximum data values in this figure panel?  Please format the output as a json as {"maximum x":""} for this figure panel, where the "maximum" value should be a float, calculated from the data values used to create the plot.
  Real A: 0.531999058363311
Claude A: {'maximum x': '0.5'}

Prompt: I am going to show you an image. Here is the image: [Image: <ENCODED IMAGE>]. Now, what are the mean data values in this figure panel?  Please format the output as a json as {"mean x":""} for this 