<a href="https://colab.research.google.com/github/reshmi56/ML-Projects/blob/main/Text_to_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
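# Step 2: Install Required Libraries
# MoviePy is pinned below 2.0 because the code below imports from moviepy.editor,
# which exists only in the 1.x releases.
!pip install spacy transformers diffusers torch torchvision torchaudio "moviepy<2" gtts
!python -m spacy download en_core_web_sm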

# Step 3: Text Analysis and Understanding
import spacy

# Load NLP model
nlp = spacy.load("en_core_web_sm")

def extract_entities_actions(text):
    """Return (named entities with their labels, lemmas of all verbs) found in text."""
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    actions = [token.lemma_ for token in doc if token.pos_ == 'VERB']
    return entities, actions

text = "A cat sits on the roof while the sun sets in the background."
entities, actions = extract_entities_actions(text)
print("Entities:", entities)
print("Actions:", actions)

# Step 4: Scene Generation
from diffusers import StableDiffusionPipeline  # Import from diffusers instead of transformers
import torch
from PIL import Image
import numpy as np

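# Load the Stable Diffusion pipeline (weights are downloaded on first use)
model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda" if torch.cuda.is_available() else "cpu"
# Half precision roughly halves GPU memory use; CPU inference needs float32
dtype = torch.float16 if device == "cuda" else torch.float32
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
pipe = pipe.to(device)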

def generate_image_from_text(description):
    image = pipe(description).images[0]
    return image

description = "A cat sitting on a roof during sunset"
image = generate_image_from_text(description)
# image.show() has no viewer to open in Colab, so render the image inline instead
display(image)

# Step 5: Animation and Motion
from moviepy.editor import ImageSequenceClip

def create_video_from_frames(frames, fps=24):
    frames = [np.array(frame) for frame in frames]  # Convert PIL images to NumPy arrays
    clip = ImageSequenceClip(frames, fps=fps)
    clip.write_videofile("output_video.mp4", codec="libx264")

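# Placeholder frames: 10 independent samples of the same prompt, with no continuity between them.
# At 24 fps they yield a clip shorter than half a second; replace with real frame generation.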
frames = [generate_image_from_text(description) for _ in range(10)]
create_video_from_frames(frames)

# Step 6: Audio and Speech Generation
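from gtts import gTTS
from IPython.display import Audio, display

def generate_speech_from_text(text, filename="output_audio.mp3"):
    """Synthesize text with Google Text-to-Speech and save it as an MP3 file."""
    tts = gTTS(text)
    tts.save(filename)

text = "A cat sits on the roof while the sun sets in the background."
generate_speech_from_text(text)
# Play the narration inline; os.system("output_audio.mp3") would try to run the file as a shell command
display(Audio(filename="output_audio.mp3"))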

# Step 7: Integration and Rendering
from moviepy.editor import VideoFileClip, AudioFileClip

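def combine_video_and_audio(video_path, audio_path, output_path="final_output.mp4"):
    """Attach the narration track to the silent video and write the result."""
    video = VideoFileClip(video_path)
    audio = AudioFileClip(audio_path)
    final_clip = video.set_audio(audio)
    final_clip.write_videofile(output_path, codec="libx264")
    # Release the ffmpeg reader processes held by the clips
    video.close()
    audio.close()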

combine_video_and_audio("output_video.mp4", "output_audio.mp3")

Collecting gtts
  Downloading gTTS-2.5.2-py3-none-any.whl.metadata (4.1 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch)
  Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
  Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
  Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu1

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


  0%|          | 0/50 [00:00<?, ?it/s]

Moviepy - Building video output_video.mp4.
Moviepy - Writing video output_video.mp4





Moviepy - Done !
Moviepy - video ready output_video.mp4
Moviepy - Building video final_output.mp4.
MoviePy - Writing audio in final_outputTEMP_MPY_wvf_snd.mp3




MoviePy - Done.
Moviepy - Writing video final_output.mp4






Moviepy - Done !
Moviepy - video ready final_output.mp4
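
Step 5 writes 10 frames at 24 fps, which lasts well under half a second, while the narration from Step 6 is likely to run for a few seconds. The sketch below shows one way to reconcile the two, assuming each still is simply held on screen for an equal share of the narration. `frames_per_still` and `hold_frames` are hypothetical helpers, not part of the notebook or of MoviePy, and the 4-second audio length is an illustrative value rather than a measurement.

```python
import math

def frames_per_still(n_stills, audio_seconds, fps=24):
    """Return how many output frames each still must occupy so that
    n_stills images cover audio_seconds of narration at the given fps."""
    if n_stills <= 0 or audio_seconds <= 0 or fps <= 0:
        raise ValueError("n_stills, audio_seconds and fps must be positive")
    total_frames = math.ceil(audio_seconds * fps)
    # Round up so the video is never shorter than the audio
    return math.ceil(total_frames / n_stills)

def hold_frames(stills, repeats):
    """Repeat every still `repeats` times, preserving the original order."""
    return [still for still in stills for _ in range(repeats)]

# Illustrative numbers: 10 stills, a 4-second narration (assumed), 24 fps
repeats = frames_per_still(n_stills=10, audio_seconds=4.0, fps=24)
held = hold_frames(list(range(10)), repeats)  # integers stand in for images
print(repeats, len(held), len(held) / 24)
```

The list returned by `hold_frames` could then be passed to `create_video_from_frames` in place of the 10 raw stills, so the clip length tracks the audio length.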

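
The frames assembled in Step 5 are sampled independently, so consecutive frames share no motion. One simple way to soften the cuts is a linear crossfade between consecutive keyframes. The sketch below is an illustration under stated assumptions, not the notebook's method: `crossfade` and `with_transitions` are hypothetical helpers, and short lists of channel values stand in for image arrays so the example runs without any dependencies. The same arithmetic applies elementwise to NumPy arrays.

```python
def crossfade(frame_a, frame_b, steps):
    """Return `steps` intermediate frames that blend frame_a into frame_b.

    Frames are equal-length sequences of 0-255 channel values.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of values")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    blended = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)  # blend weight, strictly between 0 and 1
        blended.append([round((1 - t) * a + t * b)
                        for a, b in zip(frame_a, frame_b)])
    return blended

def with_transitions(keyframes, steps):
    """Interleave crossfade frames between consecutive keyframes."""
    if not keyframes:
        return []
    sequence = [list(keyframes[0])]
    for prev, nxt in zip(keyframes, keyframes[1:]):
        sequence.extend(crossfade(prev, nxt, steps))
        sequence.append(list(nxt))
    return sequence

# Two tiny "frames" of three channel values each
dark, bright = [0, 0, 0], [200, 100, 40]
print(with_transitions([dark, bright], steps=3))
```

A crossfade only dissolves one image into the next. It does not produce real object motion, which would need a video-aware model or explicit frame-to-frame conditioning.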