A from-scratch SDXL (Stable Diffusion XL) text-to-image pipeline built entirely in C# using ONNX Runtime with DirectML GPU acceleration. No Python runtime needed at inference time.
## Features

- Full SDXL text-to-image generation pipeline in .NET
- GPU-accelerated inference via DirectML (works with NVIDIA, AMD, and Intel GPUs on Windows)
- BPE tokenization with dual CLIP text encoders
- Euler discrete noise scheduler
- Classifier-free guidance support
- Configurable resolution, steps, guidance scale, and seed
## Requirements

- .NET 10 SDK
- Windows 10/11 (DirectML requirement)
- GPU with DirectX 12 support (any modern NVIDIA, AMD, or Intel GPU)
- SDXL model exported to ONNX format (see Model Setup)
## Quick Start

```bash
# Build
dotnet build

# Generate an image
dotnet run --project Net-Image -- \
  --model ./sdxl-onnx \
  --prompt "a photo of an astronaut riding a horse on mars" \
  --output result.png
```

## Options

| Option | Short | Default | Description |
|---|---|---|---|
| `--model` | `-m` | (required) | Path to SDXL ONNX model directory |
| `--prompt` | `-p` | (required) | Text prompt for image generation |
| `--negative-prompt` | `-n` | `""` | Negative prompt |
| `--output` | `-o` | `output.png` | Output file path |
| `--steps` | `-s` | `30` | Number of denoising steps |
| `--guidance-scale` | `-g` | `7.5` | Classifier-free guidance scale |
| `--seed` | | `42` | Random seed for reproducibility |
| `--width` | | `1024` | Output width in pixels |
| `--height` | | `1024` | Output height in pixels |
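
For reference, these options map naturally onto a small settings type. A minimal sketch (the record and its name are hypothetical, not taken from the repo; defaults match the table above):

```csharp
// Hypothetical settings record mirroring the CLI option table.
public sealed record GenerationOptions(
    string ModelDir,                   // --model (required)
    string Prompt,                     // --prompt (required)
    string NegativePrompt = "",        // --negative-prompt
    string OutputPath = "output.png",  // --output
    int Steps = 30,                    // --steps
    float GuidanceScale = 7.5f,        // --guidance-scale
    int Seed = 42,                     // --seed
    int Width = 1024,                  // --width
    int Height = 1024);                // --height
```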
## Model Setup

The pipeline requires an SDXL model converted to ONNX format. You'll need Python for the one-time conversion:

```bash
pip install optimum[onnxruntime]
optimum-cli export onnx --model stabilityai/stable-diffusion-xl-base-1.0 --task text-to-image ./sdxl-onnx
```

This produces the following directory structure:

```
sdxl-onnx/
├── text_encoder/model.onnx
├── text_encoder_2/model.onnx
├── tokenizer/vocab.json, merges.txt
├── tokenizer_2/vocab.json, merges.txt
├── unet/model.onnx
└── vae_decoder/model.onnx
```
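
Each `model.onnx` above is loaded into its own ONNX Runtime `InferenceSession`. A minimal sketch of DirectML-backed session creation, assuming the `Microsoft.ML.OnnxRuntime.DirectML` package (the helper name is hypothetical; only the directory layout comes from the tree above):

```csharp
using System.IO;
using Microsoft.ML.OnnxRuntime;

static class SessionFactory
{
    // Sketch: one DirectML-backed session per exported ONNX model.
    public static InferenceSession CreateDmlSession(string modelDir, string subfolder)
    {
        var options = new SessionOptions();
        options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
        options.EnableMemoryPattern = false;    // memory patterns are unsupported with DML
        options.AppendExecutionProvider_DML(0); // device 0 = first DirectX 12 adapter
        return new InferenceSession(Path.Combine(modelDir, subfolder, "model.onnx"), options);
    }
}

// Usage:
// var unet = SessionFactory.CreateDmlSession("./sdxl-onnx", "unet");
// var vae  = SessionFactory.CreateDmlSession("./sdxl-onnx", "vae_decoder");
```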
## Architecture

```
Program.cs (CLI)
      |
      v
StableDiffusionPipeline        orchestrates the full generation loop
      |
      |-- ClipTokenizer (x2)          tokenizes prompt text via BPE
      |-- TextEncoder (x2)            token IDs -> text embeddings (ONNX Runtime)
      |-- EulerDiscreteScheduler      manages noise schedule and timesteps
      |-- UNet                        predicts noise at each step (ONNX Runtime)
      +-- VaeDecoder                  latent -> RGB pixels (ONNX Runtime)
      |
      v
ImageSharp -> output PNG
```
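
In code, the diagram corresponds to a pipeline class that owns both tokenizer/encoder pairs plus the scheduler, U-Net, and VAE sessions. A hypothetical skeleton, with member layout assumed rather than lifted from the repo (`ClipTokenizer` and `EulerDiscreteScheduler` are the repo types named in the diagram):

```csharp
using System;
using Microsoft.ML.OnnxRuntime;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;

// Hypothetical skeleton of the orchestrator from the diagram above.
public sealed class StableDiffusionPipeline : IDisposable
{
    private readonly ClipTokenizer _tokenizer1, _tokenizer2;        // BPE, 77-token context each
    private readonly InferenceSession _textEncoder1, _textEncoder2; // dual CLIP text encoders
    private readonly EulerDiscreteScheduler _scheduler;             // noise schedule + timesteps
    private readonly InferenceSession _unet, _vaeDecoder;

    public Image<Rgba32> Generate(string prompt, string negativePrompt,
                                  int steps, float guidanceScale, int seed,
                                  int width, int height)
    {
        // 1. tokenize + encode both prompts, 2. init latents, 3. denoise loop,
        // 4. VAE decode, 5. build the ImageSharp image (see "How It Works" below).
        throw new NotImplementedException();
    }

    public void Dispose()
    {
        _textEncoder1.Dispose(); _textEncoder2.Dispose();
        _unet.Dispose(); _vaeDecoder.Dispose();
    }
}
```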
## How It Works

- Tokenize the prompt with two CLIP tokenizers (77 tokens max each)
- Encode with two text encoders, concatenate the hidden states (768 + 1280 = 2048 dimensions), and extract the pooled output from encoder 2
- Generate random latent noise (1x4x128x128 for 1024x1024 output)
- Denoise loop, sketched in code after this list. For each scheduler timestep:
  - Duplicate the latent for classifier-free guidance (unconditional + conditional)
  - Run the U-Net with text embeddings, timestep, and micro-conditioning
  - Apply guidance: `noise_pred = uncond + scale * (cond - uncond)`
  - Step the scheduler to update the latent
- Decode the final latent with the VAE decoder
- Save as PNG
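
The loop's numeric core is small. Below is a minimal sketch of the noise-init, guidance, and scheduler steps, assuming epsilon prediction and the standard Euler discrete update; `runUnet` stands in for the batched ONNX U-Net call, and the sigma schedule comes from the scheduler (all names here are illustrative, not the repo's):

```csharp
using System;

static class DenoiseCore
{
    // Sketch of the denoise loop: seeded Gaussian noise, classifier-free
    // guidance, and the Euler discrete update for epsilon prediction.
    // runUnet(latent, sigma, conditional) wraps the ONNX U-Net call,
    // including the scheduler's input scaling latent / sqrt(sigma^2 + 1).
    public static float[] Run(
        float[] sigmas,        // scheduler sigma schedule, length steps + 1, last entry 0
        float guidanceScale,
        int seed, int width, int height,
        Func<float[], float, bool, float[]> runUnet)
    {
        var rng = new Random(seed);
        var latent = new float[4 * (height / 8) * (width / 8)]; // 4x128x128 at 1024x1024
        for (int i = 0; i < latent.Length; i++)
        {
            // Box-Muller: two uniforms -> one standard normal sample,
            // then scale by the initial sigma as the scheduler expects.
            double u1 = 1.0 - rng.NextDouble(), u2 = rng.NextDouble();
            latent[i] = (float)(Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2))
                        * sigmas[0];
        }

        for (int t = 0; t < sigmas.Length - 1; t++)
        {
            float[] uncond = runUnet(latent, sigmas[t], false); // negative-prompt branch
            float[] cond   = runUnet(latent, sigmas[t], true);  // prompt branch

            for (int i = 0; i < latent.Length; i++)
            {
                // Classifier-free guidance
                float noisePred = uncond[i] + guidanceScale * (cond[i] - uncond[i]);
                // Euler step: with epsilon prediction the derivative is the noise estimate
                latent[i] += (sigmas[t + 1] - sigmas[t]) * noisePred;
            }
        }
        return latent; // scale by the VAE factor before decoding (0.13025 for SDXL)
    }
}
```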
## Tips

- A higher guidance scale (8-12) makes the model follow the prompt more closely
- More steps (40-50) produce finer detail at the cost of longer generation time
- Different seeds produce completely different compositions
- SDXL works best at 1024x1024 resolution
## License

MIT
