## Open notebook in:
| Colab |
|:------|
| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nicolepcx/transformers-the-definitive-guide/blob/master/CH05/ch05_LTX.ipynb) |

# About this Notebook

This notebook demonstrates **image-conditioned video generation** with the **LTX-Video** family of models developed by Lightricks, which also supports plain **text-to-video** generation. These models use efficient transformer-based architectures and can be run with **quantized (8-bit) weights** to reduce GPU memory use. The notebook imports both the standard `LTXPipeline` and `LTXConditionPipeline`; the generation cell uses `LTXConditionPipeline`.

### Steps Included:

1. **Conditional Video Generation (Image + Text)**:
   The notebook uses `LTXConditionPipeline` for **image-conditioned video synthesis**. An image of a woman walking on a tree-lined street is downloaded and used as the visual anchor for the first frame. Together with a descriptive prompt, it guides motion and scene synthesis. A **negative prompt** suppresses visual artifacts such as blurriness or jitter.

2. **Fine-Grained Control Parameters**:

   * **Condition strength** (`strength` on `LTXVideoCondition`) controls how much the input image influences the generation.
   * **Image conditioning noise scale** (`image_cond_noise_scale`) introduces stochasticity while preserving structural alignment.
   * **Guidance scale** (`guidance_scale`) adjusts how strongly the model follows the text prompt.
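
The three controls above map onto keyword arguments used in the generation cell of this notebook (`strength`, `image_cond_noise_scale`, `guidance_scale`). As a minimal sketch, they can be grouped in one small object so that a sweep over settings stays readable. The `ControlSettings` class, its method names, and the value ranges it checks are hypothetical and chosen for illustration; they are not part of diffusers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlSettings:
    """Hypothetical container for the three control parameters."""

    condition_strength: float = 1.0        # passed as LTXVideoCondition(strength=...)
    image_cond_noise_scale: float = 0.15   # passed as pipe(image_cond_noise_scale=...)
    guidance_scale: float = 5.5            # passed as pipe(guidance_scale=...)

    def __post_init__(self) -> None:
        # Assumed ranges for this sketch, not limits enforced by the library.
        if not 0.0 <= self.condition_strength <= 1.0:
            raise ValueError("condition_strength is assumed to lie in [0, 1]")
        if self.image_cond_noise_scale < 0.0:
            raise ValueError("image_cond_noise_scale is assumed to be non-negative")

    def condition_kwargs(self) -> dict:
        """Keyword arguments for the image condition (anchored at frame 0)."""
        return {"frame_index": 0, "strength": self.condition_strength}

    def pipeline_kwargs(self) -> dict:
        """Keyword arguments for the pipeline call."""
        return {
            "guidance_scale": self.guidance_scale,
            "image_cond_noise_scale": self.image_cond_noise_scale,
        }


# Sweep the condition strength while keeping the other two controls fixed.
sweep = [ControlSettings(condition_strength=s) for s in (0.5, 0.75, 1.0)]
for settings in sweep:
    print(settings.condition_kwargs(), settings.pipeline_kwargs())
```

In the generation cell, these dictionaries would be unpacked as `LTXVideoCondition(image=image, **settings.condition_kwargs())` and `pipe(..., **settings.pipeline_kwargs())`.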



# Installs

In [1]:
!pip install diffusers==0.34.0 gdown==5.2.0 -qqq

In [2]:
!pip install -U bitsandbytes==0.46.0 -qqq

# Imports

In [3]:
import gdown
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, LTXVideoTransformer3DModel, LTXPipeline
from diffusers.utils import export_to_video, load_video, load_image
from transformers import BitsAndBytesConfig, T5EncoderModel
from diffusers.pipelines.ltx.pipeline_ltx_condition import LTXConditionPipeline, LTXVideoCondition

# Get image for conditional video generation

In [4]:

file_id = "1UMBjsG2AqyGAtdaKCepjw97XOeWZqLVp"
gdown.download(id=file_id, output="image.jpg", quiet=False)


Downloading...
From: https://drive.google.com/uc?id=1UMBjsG2AqyGAtdaKCepjw97XOeWZqLVp
To: /content/image.jpg
100%|██████████| 4.66M/4.66M [00:00<00:00, 265MB/s]


'image.jpg'
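
The generation cell in this notebook requests a 704×512 video with 48 frames at 24 FPS. The LTX-Video model card describes resolutions divisible by 32 and frame counts of the form 8k + 1, so it can help to check requested values before starting a long generation. The helpers below are a hedged sketch: `snap_dim` and `snap_num_frames` are hypothetical names, and the constants 32 and 8 are assumptions taken from that guidance, which may not hold for every checkpoint.

```python
def snap_dim(size: int, multiple: int = 32) -> int:
    """Round a width or height down to the nearest multiple (assumed: 32)."""
    if size < multiple:
        raise ValueError(f"size must be at least {multiple}")
    return (size // multiple) * multiple


def snap_num_frames(num_frames: int, temporal_ratio: int = 8) -> int:
    """Return the smallest value >= num_frames of the form temporal_ratio * k + 1."""
    if num_frames < 1:
        raise ValueError("num_frames must be positive")
    k = -(-(num_frames - 1) // temporal_ratio)  # ceiling division
    return temporal_ratio * k + 1


def duration_seconds(num_frames: int, fps: int) -> float:
    """Playback duration of a clip with the given frame count and frame rate."""
    return num_frames / fps


width, height = snap_dim(704), snap_dim(512)   # both are already multiples of 32
frames = snap_num_frames(48)                   # next value of the form 8k + 1
print(width, height, frames, round(duration_seconds(frames, 24), 2))
```

Rounding the frame count up rather than down keeps the clip at least as long as requested; rounding down would be an equally reasonable choice when memory is tight.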

# Load model and generate video

In [5]:
# Load pipeline
pipe = LTXConditionPipeline.from_pretrained(
    "Lightricks/LTX-Video-0.9.7-distilled",
    torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Load input image
image = load_image("image.jpg")

# Set conditioning
condition1 = LTXVideoCondition(
    image=image,
    frame_index=0,
    strength=1.0  # Full conditioning
)

# Set prompts
prompt = "A woman walking on a street with trees left and right. Above her, the vibrant green leaves shimmer in the dappled sunlight, their color almost fluorescent against the darker trunks. The air is alive with energy, sunlight flickering through the canopy, painting shifting patterns across the trees."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

# Set reproducible generator
generator = torch.Generator("cuda").manual_seed(42)

# Run generation
video = pipe(
    conditions=[condition1],
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=512,
    num_frames=48,  # about 2 seconds at 24 FPS; the LTX-Video model card describes frame counts of the form 8k + 1 (e.g. 49)
    frame_rate=24,
    num_inference_steps=40,
    guidance_scale=5.5,
    image_cond_noise_scale=0.15,
    generator=generator
).frames[0]

# Export video
export_to_video(video, "walking.mp4", fps=24)


'walking.mp4'
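
The imports in this notebook include `BitsAndBytesConfig` from both diffusers and transformers, but the generation cell loads the pipeline in bfloat16 without quantization. The following is a minimal sketch of how 8-bit loading could look, modeled on the quantization pattern in the diffusers documentation. It assumes the checkpoint exposes `text_encoder` and `transformer` subfolders, that bitsandbytes and a CUDA GPU are available, and that `device_map="balanced"` is suitable for placement; it has not been verified against this specific checkpoint. The function name `load_ltx_condition_pipeline_8bit` is hypothetical.

```python
MODEL_ID = "Lightricks/LTX-Video-0.9.7-distilled"  # same checkpoint as the cell above


def load_ltx_condition_pipeline_8bit(model_id: str = MODEL_ID):
    """Sketch: load the T5 text encoder and the video transformer in 8-bit."""
    # Heavy imports live inside the function so that defining it needs no GPU.
    import torch
    from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
    from diffusers import LTXVideoTransformer3DModel
    from diffusers.pipelines.ltx.pipeline_ltx_condition import LTXConditionPipeline
    from transformers import BitsAndBytesConfig, T5EncoderModel

    # Assumption: the repository has a "text_encoder" subfolder.
    text_encoder = T5EncoderModel.from_pretrained(
        model_id,
        subfolder="text_encoder",
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        torch_dtype=torch.float16,
    )
    # Assumption: the repository has a "transformer" subfolder.
    transformer = LTXVideoTransformer3DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=DiffusersBitsAndBytesConfig(load_in_8bit=True),
        torch_dtype=torch.float16,
    )
    # The remaining components (VAE, scheduler, tokenizer) load unquantized.
    return LTXConditionPipeline.from_pretrained(
        model_id,
        text_encoder=text_encoder,
        transformer=transformer,
        torch_dtype=torch.float16,
        device_map="balanced",
    )
```

Calling `pipe = load_ltx_condition_pipeline_8bit()` would stand in for the `from_pretrained(...)` and `pipe.to("cuda")` lines of the generation cell; the rest of that cell would stay the same. Output quality and speed may differ from the bfloat16 run.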