Python: feature: support gpt-image-1 #12621

Merged

ymuichiro
Contributor

@ymuichiro ymuichiro commented Jun 29, 2025

This pull request was created in response to issue #12500 (comment). The current AzureTextToImage implementation only works with DALL-E 3. For gpt-image-1, the response format has changed from a URL to base64 only, which the current code does not support.

Additionally, gpt-image-1 introduces a new image editing feature that also needs to be supported. Because supporting both in the existing methods would require breaking changes, new methods have been added instead.

Minimal code that reproduces the problem

from semantic_kernel.connectors.ai.open_ai import AzureTextToImage

service = AzureTextToImage(
    service_id=service_id,
    deployment_name="image-1",
    endpoint=AZURE_OPENAI_IMAGE_ENDPOINT,
    api_key=AZURE_OPENAI_IMAGE_API_KEY,
)

settings = service.get_prompt_execution_settings_class()(service_id="image1")
settings.prompt = "sky"
settings.size = ImageSize(width=1024, height=1024)
settings.quality = "low"
r = await service.generate_image(settings=settings)

Example of use with newly added methods

# Generate several images from a text prompt
from semantic_kernel.connectors.ai.open_ai import AzureTextToImage

service = AzureTextToImage(
    service_id="image1",
    deployment_name="gpt-image-1",
    endpoint=AZURE_OPENAI_IMAGE_ENDPOINT,
    api_key=AZURE_OPENAI_IMAGE_API_KEY,
    api_version="2025-04-01-preview",
)
settings = service.get_prompt_execution_settings_class()(service_id="image1")
settings.n = 3
images_b64 = await service.generate_images("A cute cat wearing a whimsical striped hat", settings=settings)

# Edit existing images with a text prompt
from semantic_kernel.connectors.ai.open_ai import AzureTextToImage

service = AzureTextToImage(
    service_id="image1",
    deployment_name="gpt-image-1",
    endpoint=AZURE_OPENAI_IMAGE_ENDPOINT,
    api_key=AZURE_OPENAI_IMAGE_API_KEY,
    api_version="2025-04-01-preview",
)
file_paths = ["./new_images/img_1.png", "./new_images/img_2.png"]
settings = service.get_prompt_execution_settings_class()(service_id="image1")
settings.n = 2
results = await service.edit_image(
    prompt="Make the cat wear a wizard hat",
    image_paths=file_paths,
    settings=settings,
)
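
Since gpt-image-1 returns base64 data rather than URLs, callers will usually need to decode the result before saving or displaying it. The following is a minimal sketch, assuming the new methods return a list of base64-encoded strings (as the variable name `images_b64` suggests) and that the images are PNGs. `save_b64_images` and its file naming are hypothetical and not part of this PR.

```python
import base64
from pathlib import Path


def save_b64_images(images_b64: list[str], out_dir: str, prefix: str = "img") -> list[Path]:
    """Decode base64-encoded images and write them to out_dir (hypothetical helper)."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(images_b64, start=1):
        # Assumption: the payload is PNG data, so a .png suffix is used.
        path = target / f"{prefix}_{index}.png"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

With the example above, this could be called as `save_b64_images(images_b64, "./new_images")`, which would produce files named like the `img_1.png` and `img_2.png` used in the edit example.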

Problems Identified

  1. Assumption of URL-based responses. The current implementation assumes that the response includes an image URL, which is not the case for gpt-image-1. See:
    raise ServiceResponseException("Failed to generate image.")
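
One way to avoid the URL-only assumption described in item 1 is to accept either field of a response item. This is only a sketch and not the code merged in this PR: `ImageItem` is a hypothetical stand-in for a response item with `url` and `b64_json` fields, and `ValueError` is used in place of `ServiceResponseException` to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageItem:
    """Hypothetical stand-in for one item of an image generation response."""

    url: Optional[str] = None
    b64_json: Optional[str] = None


def image_payload(item: ImageItem) -> str:
    """Return the URL when present, otherwise the base64 payload."""
    if item.url:
        return item.url
    if item.b64_json:
        return item.b64_json
    # Neither field is populated, so there is nothing to return.
    raise ValueError("Failed to generate image.")
```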

@ymuichiro ymuichiro requested a review from a team as a code owner June 29, 2025 02:16
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Jun 29, 2025
@ymuichiro ymuichiro force-pushed the python/feature/support-gpt-image-1 branch from b8e5e58 to 8aa209b Compare June 29, 2025 02:40
@moonbox3
Contributor

Thanks for working on this, @ymuichiro. Are there unit tests we can add so we have coverage for the new code?

@ymuichiro ymuichiro force-pushed the python/feature/support-gpt-image-1 branch from 4ea4f29 to 8aa209b Compare June 30, 2025 09:36
@ymuichiro
Contributor Author

@moonbox3

I've added unit tests!
ymuichiro@5e0269d

I also made some minor adjustments to other parts of the code as I found issues in existing test code.

@ymuichiro
Contributor Author

@moonbox3

The test was failing, so I fixed it.

  1. An error was occurring on the AzureTextToImage side. This has been resolved.
  2. hasattr is expected to always return True for responses.usage, but I added a safety check just in case.
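
A defensive check like the one described in item 2 might look like the sketch below. It assumes only that the response object may or may not carry a `usage` attribute; `get_usage` is a hypothetical name, not the helper added in this PR.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_usage(response: Any) -> Optional[Any]:
    """Return response.usage if the attribute exists, else None."""
    # getattr with a default covers both a missing attribute and an explicit None.
    return getattr(response, "usage", None)


# Stand-in response objects for illustration only.
with_usage = SimpleNamespace(usage={"total_tokens": 5})
without_usage = SimpleNamespace()
```

Callers can then skip usage reporting when `get_usage(...)` returns `None` instead of raising `AttributeError`.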

@markwallace-microsoft
Member

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| connectors/ai/open_ai/prompt_execution_settings/open_ai_text_to_image_execution_settings.py | 45 | 2 | 95% | 54, 63 |
| connectors/ai/open_ai/services/open_ai_handler.py | 120 | 19 | 84% | 150–151, 156–159, 164, 172–173, 189–190, 202, 211–212, 225–229 |
| connectors/ai/open_ai/services/open_ai_text_to_image_base.py | 101 | 15 | 85% | 47, 51, 55, 63, 112, 114, 119, 128, 136–137, 139, 142, 206, 240, 244 |
| TOTAL | 26507 | 4529 | 82% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 3649 | 22 💤 | 0 ❌ | 0 🔥 | 1m 53s ⏱️ |

@moonbox3 moonbox3 requested a review from eavanvalkenburg July 3, 2025 08:43
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Jul 4, 2025
Merged via the queue into microsoft:main with commit 93a14d5 Jul 4, 2025
27 checks passed
@github-project-automation github-project-automation bot moved this to Sprint: Done in Semantic Kernel Jul 4, 2025
Labels
python Pull requests for the Python Semantic Kernel
Projects
Status: Sprint: Done

4 participants