In this quickstart, you'll analyze a remotely stored image to extract visual features using the Computer Vision REST API. With the Analyze Image method, you can extract visual features based on image content.
You can also run this quickstart step by step using a Jupyter Notebook on MyBinder.
- An Azure subscription - Create one for free
- Python and the following packages:
  - requests
  - matplotlib
  - pillow
- Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
- You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. The code later in this quickstart reads both values from environment variables.
- You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
- Create environment variables for the key and endpoint URL, named COMPUTER_VISION_KEY and COMPUTER_VISION_ENDPOINT, respectively.
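As a quick sanity check before running the sample, the following sketch reports which of the prerequisites above are still missing. The helper name check_prerequisites is hypothetical and not part of the quickstart. It assumes the environment variable names listed above and that the pillow package is imported under the module name PIL.

```python
import importlib.util
import os

# Import names of the required packages (pillow is imported as PIL).
REQUIRED_MODULES = ["requests", "matplotlib", "PIL"]
# Environment variable names used by this quickstart.
REQUIRED_ENV_VARS = ["COMPUTER_VISION_KEY", "COMPUTER_VISION_ENDPOINT"]


def check_prerequisites(environ=os.environ):
    """Return (missing_modules, missing_env_vars) as two lists of names."""
    missing_modules = [
        name for name in REQUIRED_MODULES
        if importlib.util.find_spec(name) is None
    ]
    # Treat unset and empty variables the same way.
    missing_env_vars = [
        name for name in REQUIRED_ENV_VARS if not environ.get(name)
    ]
    return missing_modules, missing_env_vars


if __name__ == "__main__":
    modules, env_vars = check_prerequisites()
    print("Missing packages:", modules or "none")
    print("Missing environment variables:", env_vars or "none")
```

If either list is non-empty, install the missing packages or set the missing variables, then restart your shell or IDE so the changes take effect.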
To create and run the sample, do the following steps:
- Copy the following code into a text editor.
- Optionally, replace the value of image_url with the URL of a different image that you want to analyze.
- Save the code as a file with a .py extension. For example, analyze-image.py.
- Open a command prompt window.
- At the prompt, use the python command to run the sample. For example, python analyze-image.py.
import requests
# If you are using a Jupyter Notebook, uncomment the following line.
# %matplotlib inline
import matplotlib.pyplot as plt
import json
import os
import sys
from PIL import Image
from io import BytesIO
# Add your Computer Vision key and endpoint to your environment variables.
if 'COMPUTER_VISION_KEY' in os.environ:
    subscription_key = os.environ['COMPUTER_VISION_KEY']
else:
    print("\nSet the COMPUTER_VISION_KEY environment variable.\n**Restart your shell or IDE for changes to take effect.**")
    sys.exit()
if 'COMPUTER_VISION_ENDPOINT' in os.environ:
    endpoint = os.environ['COMPUTER_VISION_ENDPOINT']
else:
    print("\nSet the COMPUTER_VISION_ENDPOINT environment variable.\n**Restart your shell or IDE for changes to take effect.**")
    sys.exit()
# This concatenation assumes that the endpoint ends with a trailing slash.
analyze_url = endpoint + "vision/v3.1/analyze"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/12/" + \
    "Broadway_and_Times_Square_by_night.jpg/450px-Broadway_and_Times_Square_by_night.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params = {'visualFeatures': 'Categories,Description,Color'}
data = {'url': image_url}
response = requests.post(analyze_url, headers=headers,
                         params=params, json=data)
response.raise_for_status()
# The 'analysis' object contains various fields that describe the image. The most
# relevant caption for the image is obtained from the 'description' property.
analysis = response.json()
print(json.dumps(response.json()))
image_caption = analysis["description"]["captions"][0]["text"].capitalize()
# Display the image and overlay it with the caption.
image = Image.open(BytesIO(requests.get(image_url).content))
plt.imshow(image)
plt.axis("off")
_ = plt.title(image_caption, size="x-large", y=-0.1)
plt.show()
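The sample builds the request URL by appending the path directly to the endpoint, so it depends on the endpoint value ending with a slash. As a hedged sketch, the request assembly can be factored into a small function that tolerates either form. The function name build_analyze_request and the example.invalid URLs are illustrative assumptions. The header name, query parameter, and path are taken from the sample above.

```python
def build_analyze_request(endpoint, subscription_key, image_url,
                          visual_features=("Categories", "Description", "Color")):
    """Assemble the pieces of an Analyze Image call without sending it."""
    # Accept endpoints with or without a trailing slash.
    analyze_url = endpoint.rstrip("/") + "/vision/v3.1/analyze"
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    # The sample passes the requested features as a comma-separated string.
    params = {"visualFeatures": ",".join(visual_features)}
    data = {"url": image_url}
    return analyze_url, headers, params, data


if __name__ == "__main__":
    # Placeholder values for illustration only.
    url, headers, params, data = build_analyze_request(
        "https://example.invalid", "<your-key>",
        "https://example.invalid/image.jpg")
    print(url)
    print(params)
```

The returned values map directly onto the call in the sample: requests.post(url, headers=headers, params=params, json=data). Passing a timeout argument to that call is a reasonable addition so that the script does not wait indefinitely on a stalled connection.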
A successful response is returned in JSON. The sample prints the response in the command prompt window, similar to the following example:
{
  "categories": [
    {
      "name": "outdoor_",
      "score": 0.00390625,
      "detail": {
        "landmarks": []
      }
    },
    {
      "name": "outdoor_street",
      "score": 0.33984375,
      "detail": {
        "landmarks": []
      }
    }
  ],
  "description": {
    "tags": [
      "building",
      "outdoor",
      "street",
      "city",
      "people",
      "busy",
      "table",
      "walking",
      "traffic",
      "filled",
      "large",
      "many",
      "group",
      "night",
      "light",
      "crowded",
      "bunch",
      "standing",
      "man",
      "sign",
      "crowd",
      "umbrella",
      "riding",
      "tall",
      "woman",
      "bus"
    ],
    "captions": [
      {
        "text": "a group of people on a city street at night",
        "confidence": 0.9122243847383961
      }
    ]
  },
  "color": {
    "dominantColorForeground": "Brown",
    "dominantColorBackground": "Brown",
    "dominantColors": [
      "Brown"
    ],
    "accentColor": "B54316",
    "isBwImg": false
  },
  "requestId": "c11894eb-de3e-451b-9257-7c8b168073d1",
  "metadata": {
    "height": 600,
    "width": 450,
    "format": "Jpeg"
  }
}
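The sample script reads the caption with analysis["description"]["captions"][0], which raises an error if a response contains no captions. The sketch below shows one way to pull a few fields out of a response like the one above while tolerating missing sections. The function name summarize_analysis is hypothetical, and SAMPLE_RESPONSE is a trimmed copy of the example response rather than live service output.

```python
import json

# Trimmed copy of the example response; only the fields used here.
SAMPLE_RESPONSE = json.loads("""
{
  "categories": [
    {"name": "outdoor_", "score": 0.00390625},
    {"name": "outdoor_street", "score": 0.33984375}
  ],
  "description": {
    "captions": [
      {"text": "a group of people on a city street at night",
       "confidence": 0.9122243847383961}
    ]
  },
  "color": {"accentColor": "B54316", "isBwImg": false}
}
""")


def summarize_analysis(analysis):
    """Pick the highest-scoring category and caption, if any are present."""
    categories = analysis.get("categories", [])
    captions = analysis.get("description", {}).get("captions", [])
    # default=None avoids an exception when a list is empty.
    top_category = max(categories, key=lambda c: c["score"], default=None)
    top_caption = max(captions, key=lambda c: c["confidence"], default=None)
    return {
        "category": top_category["name"] if top_category else None,
        "caption": top_caption["text"] if top_caption else None,
        "caption_confidence": top_caption["confidence"] if top_caption else None,
        "accent_color": analysis.get("color", {}).get("accentColor"),
    }


if __name__ == "__main__":
    print(summarize_analysis(SAMPLE_RESPONSE))
```

For the example response, this selects the outdoor_street category because it has the higher score of the two categories returned.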
Next, explore a Python application that uses Computer Vision to perform optical character recognition (OCR), create smart-cropped thumbnails, and detect, categorize, tag, and describe visual features in images.
- To rapidly experiment with the Computer Vision API, try the Open API testing console.