# SceneXplain: Accessibility Assistant

In this notebook we will use Jina AI's [SceneXplain](https://scenex.jina.ai) to cycle through a list of images and for each one:

- Generate an alt-tag
- Assess whether it is color-blind safe
- Assess whether it offers enough contrast for visually impaired users
- Offer advice on how to improve its accessibility

We will cycle through several different lists of images:

- "Standard" photographs
- Infographics and similar
- Color-blind unsafe images
- Low-contrast images

## Basic setup

Functions to generate image lists, build the request payload for SceneXplain, send it, and pretty-print the output.

In [9]:
# python-dotenv is optional: it is imported inside a try/except in the next cell
import os
import json
import base64
import http.client
from pprint import pprint
from glob import glob

In [2]:
try:
    from dotenv import load_dotenv
    load_dotenv()
    print("Environment variables loaded from .env")
except ImportError:
    os.environ['OPENAI_API_KEY'] = "<your OpenAI key>"
    os.environ['SCENEX_SECRET'] = "<your SceneXplain key>"

Environment variables loaded from .env


In [129]:
def generate_image_list(subfolder, max_count=100, root_folder='./images'):
    file_path = os.path.join(root_folder, subfolder)
    filetypes = ['jpg', 'jpeg', 'png']
    image_files = []
    
    for filetype in filetypes:
        image_files.extend(glob(f'{file_path}/*.{filetype}'))

    return image_files[:max_count]
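
The `glob` patterns above are case-sensitive on most Linux filesystems (so `photo.JPG` would be skipped), and the order of the returned list depends on the filesystem. As a minimal sketch, assuming we want a reproducible order, a variant built on `pathlib` could look like the following. The name `list_images_sorted` and the `IMAGE_SUFFIXES` constant are hypothetical and not part of the original notebook.

```python
from pathlib import Path

# Hypothetical constant: the same three extensions the notebook looks for
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def list_images_sorted(subfolder, max_count=100, root_folder="./images"):
    """Return up to max_count image paths from root_folder/subfolder, sorted by path."""
    folder = Path(root_folder) / subfolder
    if not folder.is_dir():
        return []
    files = [
        str(p) for p in folder.iterdir()
        # Compare the lower-cased suffix so .JPG and .jpg are treated alike
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(files)[:max_count]
```

Sorting before truncating means that `max_count` always selects the same files from run to run, which makes the outputs further down easier to compare.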

In [115]:
# SceneX setup and functions

SCENEX_SECRET = os.getenv('SCENEX_SECRET')
features = ["json"]

scenex_headers = {
    "x-api-key": f"token {SCENEX_SECRET}",
    "content-type": "application/json",
}

ALGO = "GPT4V"

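def image_to_data_uri(file_path):
    # Guess the MIME type from the extension so JPEGs are not labelled as PNG
    import mimetypes
    mime_type = mimetypes.guess_type(file_path)[0] or "image/png"
    with open(file_path, "rb") as image_file:
        encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded_image}"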
        
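def generate_scenex_data(image_files, json_schema=None, question=None, features=None):
    # Build one request row per image; a None default avoids a shared mutable list
    data = {'data': []}

    for file in image_files:
        cid = os.path.basename(file)  # file name only, independent of the path separator
        row = {
            "image": image_to_data_uri(file),
            "features": features or [],
            "algorithm": ALGO,
            "cid": cid
        }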

        if question:
            row["question"] = question

        if json_schema:
            row["json_schema"] = json_schema

        data['data'].append(row)

    return data

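def process_scenex(data):
    # POST the whole batch to the describe endpoint and return the per-image results
    connection = http.client.HTTPSConnection("api.scenex.jina.ai")
    try:
        connection.request("POST", "/v1/describe", json.dumps(data), scenex_headers)
        response = connection.getresponse()
        response_data = response.read().decode("utf-8")
    finally:
        # Close the connection even if the request or the read fails
        connection.close()

    return json.loads(response_data)['result']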
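
`process_scenex` assumes the request succeeded and that the body is JSON with a `result` key. If the key is missing, the failure surfaces as a bare `KeyError` with no hint about what the server sent. A minimal sketch of a more defensive parsing step is shown below. The helper name `parse_scenex_body` is hypothetical, and treating any 2xx status as success is an assumption rather than documented API behaviour.

```python
import json

def parse_scenex_body(status, body):
    """Return the 'result' list from a response body, or raise with some context."""
    # Assumption: any 2xx status means the request itself was accepted
    if not 200 <= status < 300:
        raise RuntimeError(f"Request failed with HTTP {status}: {body[:200]}")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Response was not valid JSON: {body[:200]}") from exc
    if not isinstance(payload, dict) or "result" not in payload:
        raise RuntimeError(f"Response has no 'result' key: {body[:200]}")
    return payload["result"]
```

Inside `process_scenex`, the final line could then become `return parse_scenex_body(response.status, response_data)`, since `http.client` responses expose the status code as `response.status`.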

In [136]:
def print_output(image_files, sx_response):
    for file, sx_record in zip(image_files, sx_response):
        try:
            sx_answer = json.loads(sx_record['i18n']['en'])
            print(file)
        
            pprint(sx_answer)    
        except (KeyError, TypeError, json.JSONDecodeError):  # e.g. error records without 'i18n'
            print(f"{file} failed")
            pprint(sx_record)
            
        print("-"*10)

## Schema definition

The schema specifies the fields we want SceneXplain to return and the JSON format of the output.

In [108]:
schema = {
  "type": "object",
  "properties": {
    "alt_tag": {
      "type": "string",
      "description": "the most concise description possible of the image’s purpose. If the image is purely decorative (e.g. part of the website's design, not content), leave empty. Do not include text like 'this image contains' or 'image depicts'"
    },
    # "caption_transcript": {
    #   "type": "string",
    #   "description": "if the image conveys significant information (like an infographic), provide a caption or transcript. Otherwise leave empty"
    # },
    "color_blind_safe": {
      "type": "boolean",
      "description": "Can a color-blind person tell what is in this image?"
    },
    "good_contrast": {
        "type": "boolean",
        "description": "Given the image's color contrast, could a visually impaired person easily understand what is in the image?"
    },
    "a11y_recs": {
        "type": "array",
        "description": "A list of recommendations to improve the accessibility of this image"
    }
  }
}
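
Because the answer is produced by a model, it is worth checking that each returned record actually has the fields and types the schema asks for before relying on it. The following is a minimal sketch using only the standard library. `check_answer`, `JSON_TO_PYTHON`, and the trimmed-down `example_schema` are hypothetical names introduced for illustration, and the check only covers top-level fields.

```python
# Map JSON Schema type names to the Python types that json.loads produces
JSON_TO_PYTHON = {"string": str, "boolean": bool, "array": list, "object": dict}

def check_answer(answer, schema):
    """Return a list of problems; an empty list means the top-level fields fit."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        expected = JSON_TO_PYTHON.get(spec.get("type"))
        if name not in answer:
            problems.append(f"missing field: {name}")
        elif expected is not None and not isinstance(answer[name], expected):
            actual = type(answer[name]).__name__
            problems.append(f"{name}: expected {spec['type']}, got {actual}")
    return problems

# Trimmed-down stand-in for the notebook's schema, used only in this example
example_schema = {
    "type": "object",
    "properties": {
        "alt_tag": {"type": "string"},
        "color_blind_safe": {"type": "boolean"},
        "good_contrast": {"type": "boolean"},
        "a11y_recs": {"type": "array"},
    },
}
```

In the notebook, `check_answer(sx_answer, schema)` could be called inside `print_output` to flag records that parse as JSON but do not match the requested shape.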


## Putting SceneXplain to the A11y test

### Standard photos

Photos taken from pexels.com.

In [121]:
image_files = generate_image_list('photos')

In [122]:
scenex_data = generate_scenex_data(image_files, json_schema=json.dumps(schema), features=features)

In [123]:
scenex_response = process_scenex(scenex_data)

In [124]:
print_output(image_files, scenex_response)

./images/photos/free-photo-of-leaves-on-the-branch.jpeg
{'a11y_recs': ['Use higher contrast between the leaves and the background for '
               'better definition.',
               'Add descriptive text for the image that explains the seasonal '
               'changes represented by the leaf colors.'],
 'alt_tag': 'Close-up of deciduous tree leaves transitioning from green to '
            'yellow and brown, amidst ferns',
 'color_blind_safe': False,
 'good_contrast': True}
----------
./images/photos/free-photo-of-a-bowl-of-granola-with-fruit-and-nuts-on-a-wooden-cutting-board.jpeg
{'a11y_recs': [],
 'alt_tag': 'Bowl of granola with pomegranate seeds and strawberry slices on a '
            'bamboo board',
 'color_blind_safe': True,
 'good_contrast': True}
----------
./images/photos/free-photo-of-pose-woman-dress-in-the-desert-gold-light-curly-hair.jpeg
{'a11y_recs': ['Increase contrast between subject and background',
               'Provide detailed description for screen rea

### Color blindness tests

Several [Ishihara tests](https://en.wikipedia.org/wiki/Ishihara_test) scraped from the web.

In [130]:
image_files = generate_image_list('color-blind')

In [131]:
scenex_data = generate_scenex_data(image_files, json_schema=json.dumps(schema), features=features)

In [138]:
scenex_response = process_scenex(scenex_data)

In [139]:
print_output(image_files, scenex_response)

./images/color-blind/ColorBlindness_share.jpg failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/images.jpg failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/pexels-photo-1400172.jpeg failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/pexels-photo-531910.jpeg failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/pexels-photo-5529540.png failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/Ishihara_9.svg.png failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
./images/color-blind/ishihara-color-plate-test.png failed
{'error': {'message': 'Internal error', 'status': 'INTERNAL'}}
----------
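
Every image in this run came back as an error record rather than an answer. When a batch mixes successes and failures, it helps to separate the two so the failed files can be inspected or resubmitted. The sketch below assumes the two record shapes seen in this notebook's outputs: an `error` dictionary on failure, and a JSON string under `i18n` then `en` on success. The helper name `split_results` is hypothetical.

```python
import json

def split_results(image_files, sx_response):
    """Split a response into {file: parsed answer} and {file: failure reason}."""
    succeeded, failed = {}, {}
    for file, record in zip(image_files, sx_response):
        if not isinstance(record, dict):
            failed[file] = f"unexpected record: {record!r}"
            continue
        if "error" in record:
            error = record["error"]
            # Error records in the output above carry a 'message' field
            if isinstance(error, dict):
                failed[file] = error.get("message", "unknown error")
            else:
                failed[file] = str(error)
            continue
        try:
            succeeded[file] = json.loads(record["i18n"]["en"])
        except (KeyError, TypeError, json.JSONDecodeError) as exc:
            failed[file] = f"unparseable answer: {exc!r}"
    return succeeded, failed
```

The keys of `failed` can be passed back through `generate_scenex_data` and `process_scenex` to retry just those images, for example one at a time.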


### Low contrast tests

Mostly edited, lower-contrast versions of the photos from pexels.com.

In [116]:
image_files = generate_image_list('low-contrast', 3)

In [117]:
image_files

['./images/low-contrast/free-photo-of-a-bowl-of-granola-with-fruit-and-nuts-on-a-wooden-cutting-board.jpeg',
 './images/low-contrast/free-photo-of-pose-woman-dress-in-the-desert-gold-light-curly-hair.jpeg',
 './images/low-contrast/free-photo-of-holida-christmas-party-drinks-ornaments.jpeg']

In [118]:
scenex_data = generate_scenex_data(image_files, json_schema=json.dumps(schema), features=features)

In [119]:
scenex_response = process_scenex(scenex_data)

In [120]:
print_output(image_files, scenex_response)

./images/low-contrast/free-photo-of-a-bowl-of-granola-with-fruit-and-nuts-on-a-wooden-cutting-board.jpeg
{'a11y_recs': ['Increase overall image contrast to help visually-impaired '
               'individuals better distinguish the details',
               'Add a clear label or caption to describe the content for those '
               'who cannot see the image'],
 'alt_tag': 'Bowl of granola with pomegranate seeds and sliced strawberries',
 'color_blind_safe': True,
 'good_contrast': False}
----------
./images/low-contrast/free-photo-of-pose-woman-dress-in-the-desert-gold-light-curly-hair.jpeg
{'a11y_recs': ['Increase image contrast to make the features more '
               'distinguishable',
               'Provide descriptive alternative text that conveys contextual '
               'information'],
 'alt_tag': 'Person with curly hair partially obscuring their face, wearing a '
            'beige coat',
 'color_blind_safe': True,
 'good_contrast': False}
----------
./images/low-cont