# **Text to Video**

## **0. Install Dependencies**

In [1]:
! pip install torch



In [2]:
! pip install diffusers transformers accelerate
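
Before moving on, it can help to confirm that the packages installed above are visible to the running kernel. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is our own and not part of any of these packages.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the status of every package this notebook installs
for name, version in installed_versions(
    ["torch", "diffusers", "transformers", "accelerate"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

If any entry prints `not installed`, re-run the install cells (and restart the runtime if Colab asks for it).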



## **1. Implementation**
Generating video clips requires a GPU. In Colab, enable one as follows:

Menu Bar --> Runtime --> Change runtime type --> T4 GPU
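
After switching the runtime, it is worth checking that the GPU is actually visible to PyTorch. The snippet below is a small sketch; `describe_accelerator` is a hypothetical helper name, and it falls back to a plain message if `torch` has not been installed yet.

```python
import importlib.util


def describe_accelerator():
    """Return a short description of the device PyTorch would use."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        # Index 0 is the first (and on Colab, usually the only) GPU
        return f"GPU: {torch.cuda.get_device_name(0)}"
    return "no GPU visible; running on CPU"


print(describe_accelerator())
```

If this reports that no GPU is visible, the runtime type has probably not been changed or the session needs to be reconnected.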

We will use a diffusion model (damo-vilab/text-to-video-ms-1.7b) from Hugging Face.
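
To see why this notebook loads the model in half precision, a rough back-of-the-envelope estimate of the weight memory is useful. The sketch below assumes the "1.7b" in the model name means roughly 1.7 billion parameters; it counts weights only, so activations and intermediate buffers come on top of these figures.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: parameter count read off the "1.7b" in the model name
ASSUMED_PARAMS = 1.7e9

print(f"float32: {weight_memory_gib(ASSUMED_PARAMS, 4):.1f} GiB")
print(f"float16: {weight_memory_gib(ASSUMED_PARAMS, 2):.1f} GiB")
```

Under this assumption, half precision halves the weight footprint, which leaves more headroom on a single T4.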


In [3]:
import torch
import numpy as np
from PIL import Image
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video
import gc

# Load the pipeline with reduced memory usage
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",
    torch_dtype=torch.float16,
    variant="fp16",
    low_cpu_mem_usage=True  # Important for reducing RAM
)

# Swap in a faster scheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# With a GPU, let model CPU offload manage device placement; an explicit
# pipe.to("cuda") beforehand is redundant. Offload needs an accelerator,
# so fall back to plain CPU execution otherwise.
if torch.cuda.is_available():
    pipe.enable_model_cpu_offload()
else:
    pipe.to("cpu")  # float16 on CPU may be slow or unsupported

# Generate video frames from prompt
prompt = "Spiderman is surfing"
output = pipe(prompt, num_inference_steps=20)  # Fewer steps mainly shorten generation time

frames_array = output.frames  # It should be a numpy ndarray of shape (1, 16, 256, 256, 3)

# Check if frames_array is a numpy ndarray with expected shape
if isinstance(frames_array, np.ndarray) and frames_array.ndim == 5:
    frames_array = frames_array[0]  # Extract the frames from the first batch (shape: 16, 256, 256, 3)
else:
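    # getattr avoids an AttributeError when frames_array has no .shape (e.g. a list)
    raise ValueError(
        f"Unexpected format for frames: {type(frames_array)}, "
        f"shape {getattr(frames_array, 'shape', None)}"
    )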

# Convert float32 frames to RGB uint8 images
rgb_frames = []
for frame in frames_array:
    frame = (frame * 255).clip(0, 255).astype(np.uint8)
    rgb_frames.append(Image.fromarray(frame))

# Save to video file
video_path = export_to_video(rgb_frames)
print("✅ Video saved at:", video_path)

# Clean up memory to avoid crashes
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


✅ Video saved at: /tmp/tmptdg09whk.mp4
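
The shape check in the cell above accepts only a 5-D NumPy array, yet the exact return type of `output.frames` may differ between diffusers releases (an assumption worth verifying against the installed version). A minimal sketch of a more tolerant converter follows; `to_uint8_frames` is a hypothetical helper, not a diffusers function.

```python
import numpy as np


def to_uint8_frames(frames):
    """Return a list of (H, W, C) uint8 arrays from 4-D or 5-D frame data."""
    arr = np.asarray(frames)
    if arr.ndim == 5:
        # (batch, frames, H, W, C): keep only the first batch entry
        arr = arr[0]
    if arr.ndim != 4:
        raise ValueError(f"expected 4 or 5 dimensions, got shape {arr.shape}")
    if np.issubdtype(arr.dtype, np.floating):
        # Floats are assumed to lie in [0, 1]; scale them to [0, 255]
        arr = (arr * 255).round().clip(0, 255)
    return [frame for frame in arr.astype(np.uint8)]


# Stand-in data: one batch of two all-white 4x4 RGB frames
demo = np.ones((1, 2, 4, 4, 3), dtype=np.float32)
frames = to_uint8_frames(demo)
print(len(frames), frames[0].shape, frames[0].dtype)
```

The returned arrays can be wrapped with `Image.fromarray` exactly as in the loop above before calling `export_to_video`.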
