<a href="https://colab.research.google.com/github/bleakcim/videoGenerator/blob/main/MusicVideo_Complete.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🎵 Music Video Generator - Automatic Music Video Creation

### Create complete videos for your songs using AI!

**What this notebook does:**
1. 🎵 Analyzes your song (intro, verse, chorus, bridge, outro)
2. 🖼️ Generates a unique image for each section
3. 🎬 Animates each image into a short video clip
4. 🎞️ Assembles the clips in sync with the music
5. 🎥 Exports a video ready for YouTube!

---

## ⚠️ IMPORTANT: Enable the GPU!

1. Go to **Runtime → Change runtime type**
2. Select **GPU** as the Hardware accelerator
3. Click **Save**

---

## 📦 Step 1: Install Dependencies

This takes about 3-5 minutes.

In [1]:
print("📦 Installing dependencies...\n")
print("⚠️ KNOWN BUG: diffusers 0.35.1 + transformers 4.57.0 are INCOMPATIBLE!")
print("   Workaround: install earlier stable versions that work together.\n")

# Uninstall the conflicting versions
!pip uninstall -y transformers diffusers accelerate -q

# Install TESTED stable versions (without the offload_state_dict bug)
# Reference: https://github.com/huggingface/diffusers/issues/12483
!pip install -q transformers==4.44.2 diffusers==0.30.2 accelerate==0.34.2
!pip install -q safetensors
# moviepy is kept below 2.0 because Step 3 imports moviepy.editor, which 2.x removed
!pip install -q librosa soundfile "moviepy<2" imageio imageio-ffmpeg scipy
!pip install -q gradio

print("\n✅ Installation complete!")
print("📋 Installed versions (STABLE, WITHOUT THE BUG):")
print("   • transformers: 4.44.2")
print("   • diffusers: 0.30.2")
print("   • accelerate: 0.34.2")
print("\n⚠️ REQUIRED: restart the runtime NOW!")
print("   Menu: Runtime → Restart runtime")
print("   (Without a restart, the kernel keeps using the buggy versions!)")

📦 Installing dependencies...

✅ Installation complete!

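To confirm that pip really installed the pinned versions, a check along the following lines can help. This is a minimal sketch: `PINS` mirrors the versions from the install cell, while `check_pins` and its `get_version` parameter are hypothetical names introduced here. It reads the installed package metadata, not modules already imported by the kernel, so the runtime restart is still required.

```python
from importlib import metadata

# Pins copied from the install cell above
PINS = {"transformers": "4.44.2", "diffusers": "0.30.2", "accelerate": "0.34.2"}

def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each package that misses its pin."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches

# Report every package whose installed version differs from the pin
for name, (expected, found) in check_pins(PINS).items():
    print(f"{name}: expected {expected}, found {found}")
```

An empty result means all three packages match their pins.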

## 🔍 Step 2: Check the GPU

In [2]:
import torch

if torch.cuda.is_available():
    print("✅ GPU available!")
    print(f"🎮 GPU: {torch.cuda.get_device_name(0)}")
    print(f"💾 VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("❌ GPU NOT available!")
    print("⚠️ Go to Runtime → Change runtime type → select GPU")

✅ GPU available!
🎮 GPU: Tesla T4
💾 VRAM: 15.83 GB


## 🎬 Step 3: Music Video Generator Code

In [3]:
import os
import json
import numpy as np
from datetime import datetime
from PIL import Image
import torch
from diffusers import StableVideoDiffusionPipeline, StableDiffusionXLPipeline
from diffusers.utils import export_to_video
import librosa
from moviepy.editor import VideoFileClip, AudioFileClip, concatenate_videoclips

class MusicVideoGenerator:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.dtype = torch.float16 if torch.cuda.is_available() else torch.float32
        self.video_pipeline = None
        self.image_pipeline = None
        print(f"💻 Device: {self.device.upper()}")

    def load_models(self):
        """Load the image and video models."""
        try:
            # Check the installed versions
            import diffusers
            import transformers
            print("🔍 Checking installed versions...")
            print(f"   • transformers: {transformers.__version__}")
            print(f"   • diffusers: {diffusers.__version__}")

            # Refuse to continue with the known-bad version pair
            if diffusers.__version__ == "0.35.1" and transformers.__version__.startswith("4.57"):
                print("\n❌ ERROR: versions with a KNOWN BUG detected!")
                print("   Bug: https://github.com/huggingface/diffusers/issues/12483")
                print("\n⚠️ You MUST go back and run the installation cell!")
                print("   Then restart: Runtime → Restart runtime")
                return False

            # Warn about very recent versions (they may have other bugs)
            diff_version = tuple(map(int, diffusers.__version__.split('.')[:2]))
            if diff_version >= (0, 35):
                print("\n⚠️ WARNING: you are using a very recent diffusers (it may have bugs)")
                print("   Recommended: diffusers 0.30.2 + transformers 4.44.2")
                print("   Continuing anyway...\n")
            else:
                print("✅ Versions OK!\n")

            print("⏳ Loading models (5-10 min the first time)...\n")

            # Image generation model (SDXL)
            print("📥 1/2: Downloading the image generation model (SDXL)...")
            self.image_pipeline = StableDiffusionXLPipeline.from_pretrained(
                "stabilityai/stable-diffusion-xl-base-1.0",
                torch_dtype=self.dtype,
                use_safetensors=True
            )

            if self.device == "cuda":
                # CPU offload manages device placement itself, so skip .to("cuda")
                self.image_pipeline.enable_model_cpu_offload()
                self.image_pipeline.enable_vae_slicing()
            else:
                self.image_pipeline.to(self.device)

            print("✅ Image model loaded!\n")

            # Video generation model (SVD)
            print("📥 2/2: Downloading the video generation model (SVD)...")
            self.video_pipeline = StableVideoDiffusionPipeline.from_pretrained(
                "stabilityai/stable-video-diffusion-img2vid-xt",
                torch_dtype=self.dtype
            )

            if self.device == "cuda":
                self.video_pipeline.enable_model_cpu_offload()
            else:
                self.video_pipeline.to(self.device)

            # Not every pipeline exposes VAE slicing, so enable it only if present
            if hasattr(self.video_pipeline, "enable_vae_slicing"):
                self.video_pipeline.enable_vae_slicing()

            print("✅ Video model loaded!")
            print("\n🎉 All models are ready to use!\n")
            return True

        except Exception as e:
            print(f"❌ Error loading models: {str(e)}")
            import traceback
            traceback.print_exc()
            return False

    def analyze_music(self, audio_path):
        """Analyze the track and split it into sections."""
        print(f"\n🎵 Analyzing track: {audio_path}...")

        # Load the audio
        y, sr = librosa.load(audio_path)
        total_duration = float(librosa.get_duration(y=y, sr=sr))

        # Detect beats and tempo
        tempo, beats = librosa.beat.beat_track(y=y, sr=sr)

        # tempo may come back as an array; reduce it to a plain float
        tempo = float(np.atleast_1d(tempo)[0])

        # Detect section changes
        boundaries = librosa.segment.agglomerative(
            librosa.feature.mfcc(y=y, sr=sr),
            k=8  # number of sections
        )
        boundary_times = librosa.frames_to_time(boundaries, sr=sr)
        # The boundaries are section starts only; append the track end so the
        # last section is not dropped
        boundary_times = np.append(boundary_times, total_duration)

        # Frame-wise energy (RMS)
        rms = librosa.feature.rms(y=y)[0]

        # Build the section list
        sections = []
        for i in range(len(boundary_times) - 1):
            start_time = boundary_times[i]
            end_time = boundary_times[i + 1]
            duration = end_time - start_time

            # Mean energy (RMS frames use the default hop length of 512 samples)
            start_frame = int(start_time * sr / 512)
            end_frame = int(end_time * sr / 512)
            avg_energy = np.mean(rms[start_frame:end_frame])

            section_type = self._guess_section_type(i, len(boundary_times), avg_energy)

            sections.append({
                'index': i,
                'start': float(start_time),
                'end': float(end_time),
                'duration': float(duration),
                'energy': float(avg_energy),
                'type': section_type
            })

            print(f"  {i+1}. {section_type:12s} | {start_time:6.1f}s - {end_time:6.1f}s | Duration: {duration:5.1f}s | Energy: {avg_energy:.3f}")

        print(f"\n📊 Total: {len(sections)} sections | {total_duration:.1f}s | {tempo:.0f} BPM\n")

        return {
            'sections': sections,
            'tempo': tempo,
            'total_duration': total_duration
        }

    def _guess_section_type(self, index, total_sections, energy):
        """Guess the section type from its position and mean energy.

        total_sections is the number of boundaries, so the last section
        has index total_sections - 2.
        """
        if index == 0:
            return "Intro"
        elif index == total_sections - 2:
            return "Outro"
        elif energy > 0.15:
            return "Chorus"
        elif energy > 0.10:
            return "Bridge"
        else:
            return "Verse"

    def generate_scene_prompts(self, sections, character, theme, style):
        """Build a visual prompt for each section."""
        print("📝 Generating the visual script...\n")

        base_style = f"{style} style, cinematic, high quality, 4k, detailed"
        char = character if character else "person"

        templates = {
            "Intro": [
                f"{char} in dramatic opening scene, {theme}, mysterious atmosphere, {base_style}",
                f"Close-up of {char}, {theme}, cinematic lighting, {base_style}",
                f"Silhouette of {char}, {theme}, atmospheric, {base_style}"
            ],
            "Verse": [
                f"{char} in storytelling scene, {theme}, narrative moment, {base_style}",
                f"{char} expressing emotion, {theme}, artistic shot, {base_style}",
                f"Medium shot of {char}, {theme}, cinematic composition, {base_style}"
            ],
            "Chorus": [
                f"{char} in energetic scene, {theme}, dynamic action, vibrant colors, {base_style}",
                f"{char} dramatic performance, {theme}, high energy, intense, {base_style}",
                f"Wide shot of {char}, {theme}, spectacular, {base_style}"
            ],
            "Bridge": [
                f"{char} contemplative moment, {theme}, atmospheric, {base_style}",
                f"Artistic shot of {char}, {theme}, creative angle, {base_style}",
                f"{char} transition scene, {theme}, smooth, {base_style}"
            ],
            "Outro": [
                f"{char} closing scene, {theme}, resolution, fading light, {base_style}",
                f"Final shot of {char}, {theme}, memorable ending, {base_style}",
                f"{char} walking away, {theme}, cinematic finale, {base_style}"
            ]
        }

        prompts = []
        for i, section in enumerate(sections):
            section_type = section['type']
            section_templates = templates.get(section_type, templates["Verse"])
            # Rotate through the templates so neighboring sections differ
            prompt = section_templates[i % len(section_templates)]

            # Adjust the mood by energy
            if section['energy'] > 0.15:
                prompt += ", vibrant, energetic, bold colors"
            elif section['energy'] < 0.08:
                prompt += ", calm, peaceful, soft lighting"

            prompts.append({
                'section_index': i,
                'section_type': section_type,
                'prompt': prompt,
                'negative_prompt': "ugly, distorted, low quality, blurry, watermark, text, bad anatomy"
            })

            print(f"  {i+1}. {section_type:12s}: {prompt[:80]}...")

        print()
        return prompts

    def generate_image(self, prompt, negative_prompt, seed=-1):
        """Generate one image with SDXL (seed=-1 means a random seed)."""
        if self.image_pipeline is None:
            raise RuntimeError("⚠️ Image model not loaded! Run load_models() first.")

        generator = None
        if seed != -1:
            generator = torch.Generator(device=self.device).manual_seed(seed)

        image = self.image_pipeline(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=30,
            guidance_scale=7.5,
            generator=generator
        ).images[0]

        return image

    def generate_video_from_image(self, image, duration, motion_intensity=127):
        """Animate an image into a short clip with SVD."""
        if self.video_pipeline is None:
            raise RuntimeError("⚠️ Video model not loaded! Run load_models() first.")

        image = image.resize((1024, 576))
        # 7 frames per second of section, clamped to 14-50 frames
        num_frames = max(14, min(50, int(duration * 7)))

        frames = self.video_pipeline(
            image,
            height=576,
            width=1024,
            num_frames=num_frames,
            num_inference_steps=25,
            fps=7,
            motion_bucket_id=motion_intensity,
            decode_chunk_size=4
        ).frames[0]

        return frames

    def create_music_video(
        self,
        audio_path,
        character_description,
        theme,
        style="cinematic",
        seed=-1
    ):
        """Create the complete music video."""

        # Make sure the models were loaded
        if self.image_pipeline is None or self.video_pipeline is None:
            print("❌ ERROR: the models have not been loaded!")
            print("⚠️ Run the 'Load the AI Models' cell first!")
            return None

        print("\n" + "="*60)
        print("🎬 STARTING MUSIC VIDEO GENERATION")
        print("="*60)

        output_dir = "/content/outputs/music_videos"
        os.makedirs(output_dir, exist_ok=True)
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

        # 1. Analyze the track
        music_analysis = self.analyze_music(audio_path)
        sections = music_analysis['sections']

        # 2. Generate the prompts
        scene_prompts = self.generate_scene_prompts(
            sections, character_description, theme, style
        )

        # 3. Generate images and videos
        print("🎬 Generating scenes...\n")
        video_clips = []

        for i, (section, scene) in enumerate(zip(sections, scene_prompts)):
            print(f"[{i+1}/{len(sections)}] 🎨 Generating image for {section['type']}...")

            # Generate the image (seed=-1 means a random seed)
            image = self.generate_image(
                scene['prompt'],
                scene['negative_prompt'],
                seed
            )

            # Save the image
            image_path = f"{output_dir}/scene_{timestamp}_{i:02d}.png"
            image.save(image_path)
            print(f"         ✅ Image saved: {image_path}")

            # Generate the video; higher energy means more motion (clamped to 80-200)
            print(f"         🎞️  Generating video ({section['duration']:.1f}s)...")
            motion_intensity = int(100 + (section['energy'] * 500))
            motion_intensity = min(200, max(80, motion_intensity))

            frames = self.generate_video_from_image(
                image, section['duration'], motion_intensity
            )

            # Save the temporary clip
            temp_video = f"{output_dir}/temp_{timestamp}_{i:02d}.mp4"
            export_to_video(frames, temp_video, fps=7)

            # Match the clip length to the section: loop short clips, trim long ones
            clip = VideoFileClip(temp_video)
            if clip.duration < section['duration']:
                clip = clip.loop(duration=section['duration'])
            else:
                clip = clip.subclip(0, section['duration'])

            video_clips.append(clip)
            print(f"         ✅ Video generated!\n")

        # 4. Assemble the final video
        print("🎞️  Assembling the final video...")
        final_video = concatenate_videoclips(video_clips, method="compose")

        # 5. Add the audio
        print("🔊 Adding audio...")
        audio = AudioFileClip(audio_path)

        # Trim whichever stream is longer so both end together
        if final_video.duration > audio.duration:
            final_video = final_video.subclip(0, audio.duration)
        elif final_video.duration < audio.duration:
            audio = audio.subclip(0, final_video.duration)

        final_video = final_video.set_audio(audio)

        # 6. Save
        output_path = f"{output_dir}/music_video_{timestamp}.mp4"
        print(f"💾 Saving the final video to {output_path}...")

        final_video.write_videofile(
            output_path,
            codec='libx264',
            audio_codec='aac',
            fps=24,
            preset='medium',
            threads=4
        )

        # Clean up
        for clip in video_clips:
            clip.close()
        final_video.close()
        audio.close()

        print("\n" + "="*60)
        print("🎉 MUSIC VIDEO CREATED SUCCESSFULLY!")
        print("="*60)
        print(f"\n📁 File: {output_path}")
        print(f"⏱️  Duration: {music_analysis['total_duration']:.1f}s")
        print(f"🎬 Scenes: {len(sections)}")
        print(f"🎵 BPM: {music_analysis['tempo']:.0f}")
        print()

        return output_path

# Create the generator instance
generator = MusicVideoGenerator()

print("✅ MusicVideoGenerator class loaded!")




💻 Device: CUDA
✅ MusicVideoGenerator class loaded!

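The labeling rule in `_guess_section_type` can be tried on its own, without loading any model. The sketch below restates it as a plain function with English labels. `label_section` and the sample energies are hypothetical; the thresholds 0.15 and 0.10 are the ones used in the class. Unlike the method, this version takes the number of sections rather than the number of boundaries.

```python
def label_section(index, n_sections, energy):
    """Label a section by position first, then by mean RMS energy."""
    if index == 0:
        return "Intro"
    if index == n_sections - 1:
        return "Outro"
    if energy > 0.15:
        return "Chorus"
    if energy > 0.10:
        return "Bridge"
    return "Verse"

# Made-up mean energies for six sections
energies = [0.05, 0.09, 0.18, 0.12, 0.20, 0.07]
labels = [label_section(i, len(energies), e) for i, e in enumerate(energies)]
print(labels)
```

The first and last sections are labeled by position regardless of energy. The thresholds are absolute RMS values, so a quietly mastered track may never cross the 0.15 mark; adjusting them per track is a reasonable thing to experiment with.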

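Two small formulas in the class decide how each clip looks: the frame count requested from SVD and the `motion_bucket_id` derived from energy. The sketch below isolates that arithmetic so it can be checked quickly. The function names are hypothetical; the constants (7 fps, 14 to 50 frames, `100 + energy * 500` clamped to 80-200) are copied from `generate_video_from_image` and `create_music_video`.

```python
def frames_for(duration_s, fps=7, lo=14, hi=50):
    """Frames requested from SVD: fps * duration, clamped to [lo, hi]."""
    return max(lo, min(hi, int(duration_s * fps)))

def motion_for(energy, lo=80, hi=200):
    """motion_bucket_id derived from mean RMS energy, clamped to [lo, hi]."""
    return min(hi, max(lo, int(100 + energy * 500)))

def loops_needed(duration_s, fps=7):
    """How many times the generated clip must play to fill the section."""
    clip_s = frames_for(duration_s, fps) / fps
    return max(1.0, duration_s / clip_s)

# (section duration in seconds, mean energy) pairs chosen for illustration
for duration, energy in [(1.0, 0.0), (5.0, 0.125), (22.5, 0.5)]:
    print(duration, frames_for(duration), motion_for(energy),
          round(loops_needed(duration), 2))
```

Because the frame count is capped at 50, a generated clip lasts at most about 7.1 seconds at 7 fps. Longer sections are filled by looping the same clip, which is what `clip.loop(duration=...)` does in the class.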
## 🔄 Step 4: Load the AI Models

**⚠️ IMPORTANT:** This downloads ~14 GB of models on the first run (5-10 min)

In [4]:
# Load the models
success = generator.load_models()

if success:
    print("🚀 Everything is ready to generate videos!")
else:
    print("❌ Error loading models. Check the errors above.")


⏳ Loading models (this can take 5-10 min the first time)...

📥 1/2: Downloading the image generation model (SDXL)...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.




`torch_dtype` is deprecated! Use `dtype` instead!


❌ Error loading models: CLIPTextModelWithProjection.__init__() got an unexpected keyword argument 'offload_state_dict'
❌ Error loading models. Check the errors above.


Traceback (most recent call last):
  File "/tmp/ipython-input-1348765211.py", line 27, in load_models
    self.image_pipeline = StableDiffusionXLPipeline.from_pretrained(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1025, in from_pretrained
    loaded_sub_model = load_sub_model(
                       ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/pipeline_loading_utils.py", line 849, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py", line 277, i

## 🎵 Step 5: Upload Your Song

Upload your audio file (MP3, WAV, etc.)

In [None]:
from google.colab import files

print("📤 Upload your song:")
uploaded = files.upload()

# Take the name of the first uploaded file
audio_file = list(uploaded.keys())[0]
print(f"\n✅ File uploaded: {audio_file}")

📤 Upload your song:


## 🎨 Step 6: Configure the Video

In [None]:
# ===== CONFIGURE HERE =====

# Description of the main character
# Examples:
# - "young woman with long black hair, wearing elegant dress"
# - "muscular man with tattoos, wearing leather jacket"
# - "anime girl with blue hair and magical powers"
CHARACTER = "beautiful person with flowing hair"

# Theme/setting of the video
# Examples:
# - "cyberpunk city with neon lights"
# - "enchanted forest with mystical atmosphere"
# - "futuristic space station"
# - "urban streets at sunset"
THEME = "urban landscape at golden hour"

# Visual style
# Options: cinematic, anime, photorealistic, oil painting, digital art, watercolor, 3D render
STYLE = "cinematic"

# Seed (use -1 for random, or a fixed number for reproducible results)
SEED = -1

print("✅ Configuration set:")
print(f"  👤 Character: {CHARACTER}")
print(f"  🌍 Theme: {THEME}")
print(f"  🎨 Style: {STYLE}")
print(f"  🎲 Seed: {SEED}")
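Because `SEED = -1` means "random", a run that turns out well cannot be repeated unless the seed that was actually used is known. One option, sketched below, is to draw the random seed yourself and print it. `resolve_seed` is a hypothetical helper, not part of the notebook, and the upper bound of `2**32 - 1` is an arbitrary choice for this sketch.

```python
import random

def resolve_seed(seed, max_seed=2**32 - 1):
    """Return `seed` unchanged, or draw a fresh one when it is -1."""
    if seed == -1:
        return random.randint(0, max_seed)
    return seed

SEED = resolve_seed(-1)
print(f"Using seed {SEED}")  # note this value to reproduce the run later
```

Passing the resolved value as `seed=SEED` makes every run take the fixed-seed path in `generate_image`.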

## 🚀 Step 7: GENERATE THE MUSIC VIDEO!

**⏱️ Estimated time:**
- 3-minute track on a T4 GPU: ~15-20 minutes
- Each scene takes ~2-3 minutes

**☕ Grab a coffee while the AI works!**

In [None]:
import time

start_time = time.time()

# GENERATE THE MUSIC VIDEO!
output_video_path = generator.create_music_video(
    audio_path=audio_file,
    character_description=CHARACTER,
    theme=THEME,
    style=STYLE,
    seed=SEED
)

elapsed_time = time.time() - start_time
print(f"\n⏱️  Total time: {elapsed_time/60:.1f} minutes")
if output_video_path:
    print("\n🎉 Your music video is ready!")
    print(f"📁 Location: {output_video_path}")
else:
    # create_music_video returns None when the models are not loaded
    print("\n❌ No video was generated. Check the messages above.")

## 🎥 Step 8: Preview the Video

In [None]:
from IPython.display import Video

# Show the video (embed=True inlines the local file so the browser can play it)
Video(output_video_path, width=800, embed=True)

## 📥 Step 9: Download the Video

In [None]:
from google.colab import files

# Download the final video
print("📥 Downloading video...")
files.download(output_video_path)

print("✅ Download started! The file will be saved to your Downloads folder.")

## 📦 (Optional) Download All Generated Images

In [None]:
import glob
import os
from google.colab import files

# Find all scene images
images = glob.glob("/content/outputs/music_videos/scene_*.png")

print(f"📸 Found {len(images)} images\n")

# Download each image
for img in sorted(images):
    print(f"📥 {os.path.basename(img)}")
    files.download(img)

print("\n✅ All images have been downloaded!")

---

## 💡 Tips and Examples

### 🎤 Example Configurations

#### 1. Energetic Pop Video:
```python
CHARACTER = "young woman with colorful hair, wearing trendy outfit"
THEME = "vibrant city lights, neon signs, urban nightlife"
STYLE = "cinematic"
```

#### 2. Anime/J-Pop Video:
```python
CHARACTER = "anime girl with pink hair and magical girl outfit"
THEME = "fantasy world with floating islands and cherry blossoms"
STYLE = "anime"
```

#### 3. Rock/Metal Video:
```python
CHARACTER = "rock musician with long hair and leather jacket"
THEME = "industrial wasteland, apocalyptic atmosphere, dramatic sky"
STYLE = "cinematic"
```

#### 4. Electronic/EDM Video:
```python
CHARACTER = "silhouette of DJ with glowing elements"
THEME = "futuristic club, laser lights, holographic effects, neon colors"
STYLE = "digital art"
```

#### 5. Lo-fi/Chill Video:
```python
CHARACTER = "person studying at desk with headphones"
THEME = "cozy room, rainy window, warm lighting, peaceful atmosphere"
STYLE = "watercolor"
```

### 🎨 Character Tips:
- Be **specific**: hair color, clothing, accessories
- Keep the description **consistent** for the whole video
- Add distinctive features

### 🌍 Theme Tips:
- Combine location + atmosphere + lighting
- Use visual adjectives (neon, mystical, dramatic)
- Think about the aesthetic that matches your music
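The "location + atmosphere + lighting" recipe can be made explicit with a tiny helper. `build_theme` is a hypothetical function; the notebook only needs the resulting string in `THEME`.

```python
def build_theme(location, atmosphere, lighting):
    """Join the non-empty parts into a comma-separated theme string."""
    parts = [p.strip() for p in (location, atmosphere, lighting) if p and p.strip()]
    return ", ".join(parts)

THEME = build_theme("cozy room with a rainy window", "peaceful atmosphere", "warm lighting")
print(THEME)
```

Keeping the three parts separate makes it easy to vary one of them, for example the lighting, while the rest of the scene stays the same.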

### ⚡ Performance:
- **T4 GPU (Colab)**: ~2-3 min per scene
- **3-minute track**: ~8 scenes = 15-20 min total
- **5-minute track**: 25-30 min total (the scene count is set by `k` in `analyze_music`, so a longer track only yields ~12 scenes if you raise `k`)
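The totals above are rough multiples of the per-scene time. A quick way to redo the arithmetic for a different scene count is sketched below; `estimate_minutes` is a hypothetical helper, and the 2-3 minute range per scene is the notebook's own T4 figure, not a measurement made here.

```python
def estimate_minutes(n_scenes, per_scene_min=(2.0, 3.0)):
    """Return (low, high) total minutes for n_scenes at the given per-scene range."""
    lo, hi = per_scene_min
    return n_scenes * lo, n_scenes * hi

for n in (8, 12):
    lo, hi = estimate_minutes(n)
    print(f"{n} scenes: {lo:.0f}-{hi:.0f} min")
```

The upper bounds come out a little above the totals quoted in the list, so both sets of numbers are best read as ballpark figures.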

### 🎬 For YouTube:
The video is exported with:
- **Resolution**: 1024x576 (16:9)
- **FPS**: 24
- **Codec**: H.264
- Ready to upload as-is!
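The export settings can be sanity-checked with a few lines of arithmetic. The sketch below confirms the aspect ratio and shows how the 7 fps clips relate to the 24 fps export. The constants are taken from the notebook code; the variable names are illustrative.

```python
from fractions import Fraction

WIDTH, HEIGHT = 1024, 576        # clip size used by the notebook
SOURCE_FPS, OUTPUT_FPS = 7, 24   # SVD clip rate vs. final export rate

aspect = Fraction(WIDTH, HEIGHT)  # Fraction reduces to lowest terms
print(f"aspect ratio: {aspect.numerator}:{aspect.denominator}")
print(f"output frames per generated frame: {OUTPUT_FPS / SOURCE_FPS:.2f}")
```

Since the clips are generated at 7 fps, exporting at 24 fps repeats frames rather than adding new motion, so the result moves like a 7 fps animation inside a 24 fps container.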

---

## 🆘 Problems?

### ❌ "Out of memory":
- Make sure the GPU is enabled
- Restart the runtime: Runtime → Restart runtime
- Track too long? Try a shorter one first
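Before restarting the runtime, it can be worth trying to release cached GPU memory between runs. This is a minimal sketch, assuming PyTorch is installed as in this notebook; `free_gpu_memory` is a hypothetical helper. It can only release memory that nothing references any more, so delete large objects (or set them to `None`) first.

```python
import gc

def free_gpu_memory():
    """Run garbage collection and, if CUDA is available, empty PyTorch's cache.

    Returns True when the CUDA cache was emptied, False otherwise.
    """
    gc.collect()
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed in this environment
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False

print("CUDA cache emptied:", free_gpu_memory())
```

If memory errors persist after this, restarting the runtime as described above remains the more reliable fix.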

### ⏰ "Session timed out":
- Free Colab sessions are limited to about 12 hours
- Save the video before the session ends

### 🖼️ Images did not turn out well:
- Improve the CHARACTER description
- Be more specific in THEME
- Try a different SEED
- Try another STYLE

---

**🎉 Built with ❤️ using Stable Diffusion XL + Stable Video Diffusion**