PixTalk: Controlling Photorealistic Image Processing and Editing with Language (ICCV 2025)

Marcos V. Conde, Zihao Lu, Radu Timofte

Computer Vision Lab, University of Wuerzburg

- Demos upcoming

TL;DR: quickstart

Text-guided image generation and editing is emerging as a fundamental problem in computer vision. However, most approaches lack control, and the generated results fall far short of professional photography quality standards. In this work, we propose the first approach that introduces language and explicit control into the image processing and editing pipeline. PixTalk is a vision-language multi-task image processing model guided by text instructions. Our method can perform over 40 transformations (the most popular techniques in photography), delivering results comparable to professional photo editing software. Our model can process 12MP images on consumer GPUs in real time (under 1 second). As part of this effort, we propose a novel dataset and benchmark for new research on multi-modal image processing and editing.

Other relevant stuff

Repo updates

- HuggingFace demo using Gradio
- Full code release
- Dataset request form

(We are currently in Hawaii at ICCV 2025, getting feedback before the big release.)

Contacts

For any inquiries, contact Marcos V. Conde: marcos.conde[at]uni-wuerzburg.de

This work has been patented worldwide at the European Patent Office (EPO). If you would like to use this work in commercial applications, please contact us :) There are no limitations for open non-profit and academic research.

Citation

If you find our work interesting, or you use our ideas or dataset, please cite our work properly.

@InProceedings{Conde_2025_ICCV,
    author    = {Conde, Marcos V. and Lu, Zihao and Timofte, Radu},
    title     = {PixTalk: Controlling Photorealistic Image Processing and Editing with Language},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {19269-19279}
}

PixTalk is inspired by InstructIR (ECCV 2024).

@inproceedings{conde2024high,
  title={InstructIR: High-Quality Image Restoration Following Human Instructions},
  author={Conde, Marcos V and Geigle, Gregor and Timofte, Radu},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2024}
}
