<a href="https://colab.research.google.com/github/25-2-GenerativeAI/ebook-to-bgm/blob/main/genai_llm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import transformers

In [2]:
from typing import List, Tuple
import torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline, AutoModelForCausalLM

In [3]:
model_name = "j-hartmann/emotion-english-roberta-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
sentiment_analyzer = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

def predict_emotion(text: str) -> str:
    result = sentiment_analyzer(text)

    return result[0]["label"]

def sliding_window(text: str, window_size: int, stride: int = 25) -> List[str]:
    tokens = text.split()
    windows = []
    for i in range(0, len(tokens), stride):
        window = " ".join(tokens[i:i + window_size])
        windows.append(window)
    return windows

def merge_by_emotions(windows: List[str]) -> List[Tuple[str, str]]:
    merged = []
    prev_emotion = None
    buffer = []

    for w in windows:
        emotion = predict_emotion(w)
        if prev_emotion is None:
            prev_emotion = emotion
            buffer.append(w)
        elif emotion == prev_emotion:
            buffer.append(w)
        else:
            merged.append((" ".join(buffer), prev_emotion))
            buffer = [w]
            prev_emotion = emotion

    if buffer:
        merged.append((" " .join(buffer), prev_emotion))
        buffer = []
    return merged

if __name__ == "__main__":
    text = "Today I was happy. However becasue of the gloomy weather I was depressed. But I met my friend and I laughed again."

    windows = sliding_window(text, window_size=5, stride=5)
    segments = merge_by_emotions(windows)

    for seg, emo in segments:
        print(f"[{emo}] {seg}")

tokenizer_config.json:   0%|          | 0.00/328 [00:00<?, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

config.json: 0.00B [00:00, ?B/s]

pytorch_model.bin:   0%|          | 0.00/1.42G [00:00<?, ?B/s]

Device set to use cuda:0


model.safetensors:   0%|          | 0.00/1.42G [00:00<?, ?B/s]

[joy] Today I was happy. However
[sadness] becasue of the gloomy weather I was depressed. But I
[joy] met my friend and I laughed again.


**mistral-7b**

In [6]:
prompt_model_name = "aws-prototyping/MegaBeam-Mistral-7B-512k"
prompt_tokenizer = AutoTokenizer.from_pretrained(prompt_model_name)
prompt_model = AutoModelForCausalLM.from_pretrained(prompt_model_name)

def generate_prompt(emotion: str, segment: str) -> str:

    instruction = f"You are text generator. Given the original test '{segment}' and the emotion '{emotion}', write a detailed prompt for this text's background music generation. In the prompt, describe adaquate genre, bpm, instrument, and mood"
    inputs = prompt_tokenizer(instruction, return_tensors="pt").to(prompt_model.device)

    with torch.no_grad():
        outputs = prompt_model.generate(
            **inputs,
            max_new_tokens=80,
            temperature=0.8,
            top_p=0.9
        )

    result = prompt_tokenizer.decode(outputs[0], skip_special_tokens=True)
    return result

for seg, emo in segments:
    music_prompt = generate_prompt(emo)
    print(music_prompt)

tokenizer_config.json: 0.00B [00:00, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/680 [00:00<?, ?B/s]

model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/4.54G [00:00<?, ?B/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/4.94G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]

The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Given the emotion 'joy', write a detailed prompt for background music generation. Describe instruments, mood, and atmosphere.

Background Music Prompt: Joyful Symphony

Instruments: The background music for this joyful scene should feature a lively and uplifting orchestra. The main instruments to include are the violins, cellos, and trumpets. The violins should play a fast, joyful melody with a bright and cheerful tone. The cellos should provide a


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Given the emotion 'sadness', write a detailed prompt for background music generation. Describe instruments, mood, and atmosphere.

Background Music Prompt: Sadness

Instruments: The background music for sadness should feature a mix of strings and piano, with a touch of soft percussion. The strings should be played with a melancholic tone, using a slow tempo to evoke a sense of sorrow and longing. The piano should be used sparingly, with notes that are simple
Given the emotion 'joy', write a detailed prompt for background music generation. Describe instruments, mood, and atmosphere.

Background Music Prompt: Joyful Symphony

Instruments: The background music for this joyful scene should feature a lively and uplifting orchestra. The main instruments to include are the violins, cellos, and trumpets. The violins should play a fast, joyful melody with a bright and cheerful tone. The cellos should provide a


**llama 3-7b**

In [None]:
import transformers
import torch

model_id = "meta-llama/Llama-3.1-8B"

pipeline = transformers.pipeline(
    "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
)

pipeline("Hey how are you doing today?")

In [14]:
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

def generate_prompt(emotion: str, segment: str) -> str:
  messages = [
    {"role": "system", "content": f"You are a text generator! Given the original text '{segment}' and emotion '{emotion}', write a detailed prompt for this text's background music generation. This prompt should contain genre, bpm, instruments, and mood that describes this text and its emotion"}
  ]

  terminators = [
      pipeline.tokenizer.eos_token_id,
      pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
  ]

  outputs = pipeline(
      messages,
      max_new_tokens=256,
      eos_token_id=terminators,
      do_sample=True,
      temperature=0.6,
      top_p=0.9,
  )

  return outputs[0]["generated_text"][-1]


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

Device set to use cuda:0


In [16]:
for seg, emo in segments:
    music_prompt = generate_prompt(emo, seg)
    print(music_prompt)

Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


{'role': 'assistant', 'content': 'Here is a detailed prompt for generating background music for the given text and emotion:\n\n**Text:** "Today I was happy. However"\n\n**Emotion:** Joy\n\n**Background Music Generation Prompt:**\n\nCreate a lively and uplifting background music piece that captures the joyful emotion expressed in the text. The music should be a blend of energetic and playful elements, evoking a sense of happiness and optimism.\n\n**Genre:** Indie-Folk/Pop with a hint of Upbeat Acoustic\n\n**BPM:** 120-125 beats per minute (medium-fast tempo)\n\n**Instruments:**\n\n* Acoustic Guitar: Main melody instrument, providing a bouncy and upbeat foundation\n* Ukulele: Adding a bright and cheerful touch, often playing rhythmic patterns and chord progressions\n* Piano: Providing a subtle yet effective harmonic accompaniment, occasionally taking center stage with melodic phrases\n* Light Percussion: Adding a playful and carefree feel with instruments such as tambourine, shaker, and 

Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


{'role': 'assistant', 'content': 'Based on the provided text and emotion, I\'ve created a detailed prompt for background music generation:\n\n**Prompt:**\n\nCreate a melancholic and introspective background music piece that complements the text "Because of the gloomy weather I was depressed. But I..." and its underlying emotion of sadness.\n\n**Genre:** Ambient/Instrumental/Contemporary Classical\n\n**BPM:** 60-70 (a moderate, steady pace that evokes a sense of contemplation)\n\n**Instruments:**\n\n* Piano: The primary melodic instrument, playing a simple, yet emotive melody that mirrors the narrator\'s sadness.\n* Acoustic Guitar: Adding a subtle, gentle texture to the piano, providing a sense of intimacy and vulnerability.\n* Strings (Violins, Violas, Cellos): A minimalist string section, playing soft, legato lines that enhance the emotional impact of the piano and guitar.\n* Woodwinds (Flute, Clarinet): Adding a touch of melancholy and longing, with soft, mournful phrases that compl