# Forcing 2 AIs to Sing Still Alive From Portal

By: [Jaden McElvey](https://twitter.com/JadenMcelvey)

Last Updated: August 2020

This notebook is intended to accompany my video on the 2009 puzzle platformer Portal. You can find the video explaining why I did this [here](https://youtu.be/QYBkCBUvu6w) and you can find my completed song [here](https://youtu.be/QvWH579lAPI).

---

Let's start off with the important stuff. I'm no expert on AI or machine learning or anything like that. I am but a lowly CS undergrad, so I got a lot of help from the internet and open source projects. If you're interested in seeing the work that helped me complete this project you should check the [References](#References) section at the end of this document. Throughout this notebook I'll use [$^{footnotes}$](#Footnotes) that link to the bottom of the notebook and explain how smart people choosing to share information made this possible.

## What Happened Here
If you're reading this, it's because you wanted more in-depth insight into what [this](https://) is and why/how it was made.
### What?
That video linked up there is a recreation of a song from a video game called Portal. In the game an artificial intelligence named GLaDOS sings you a song. I decided to teach a machine learning model to recreate this song. This did not go well.
### Why?
I make videos about why I think video games are art. My YouTube channel is [here](https://www.youtube.com/channel/UCD9pjdMgEQ0YfKzNXcFZm9g/). Recently I decided to make a video about the beloved video game Portal, so I replayed the game and started working on a concept for a YouTube video. The final video can be found [here](https://). I decided to make this video about music created by artificial intelligence. I then had the great (read: time consuming) idea of remaking the iconic song *Still Alive* (which is sung in game by an AI) by training some machine learning models on the original music.
### How?
**WARNING INCREDIBLY NERDY INFO BELOW FEEL FREE TO SKIP THIS SECTION BECAUSE IT ISN'T CRUCIAL TO UNDERSTANDING THIS NOTEBOOK**

The following is a list of tools used in this notebook that together will produce the final song.

* [OpenAI's](https://openai.com/) [GPT-2](https://openai.com/blog/better-language-models/)**:** This is the machine learning model that will be used to generate the lyrics to the song
* [Google's Magenta](https://magenta.tensorflow.org/)**:** This is a model trained on music that will be used to create a semi-original composition based on the music from *Still Alive*.
* [gTTS](https://pypi.org/project/gTTS/)**:** This Python library will allow us to convert text to speech for the final song
* [Max Woolf's](https://minimaxir.com/) [aitextgen](https://github.com/minimaxir/aitextgen)**:** This is a tool created by Max Woolf that makes it easy for us to generate text with GPT-2
* [FFmpeg](https://ffmpeg.org/)**:** This is a utility that will let us do a bit of audio processing later
* [pydub](https://github.com/jiaaro/pydub)**:** This is a Python library that allows us to combine and overlay audio segments (obviously super useful for making a song).


# Creating the Lyrics
As previously stated we'll be using aitextgen to generate our lyrics. If you want more information on aitextgen see Max Woolf's notebook [aitextgen — Train a GPT-2 Text-Generating Model w/ GPU](https://colab.research.google.com/drive/15qBZx5y9rdaQSyWpsreMDnTiZ5IlN0zD?usp=sharing).

## Getting Training Data
Ok, so we're trying to teach GPT-2 how to make new lyrics that are similar to the lyrics from *Still Alive* from Portal. The first step here is to get all the lyrics to *Still Alive* so we can teach our model what the lyrics to this song are like.

In [None]:
import requests
import os
# Create a folder named "lyrics" to contain all the vocal audio segments and lyric training data
lyricFilePath = "/content/lyrics"
if not os.path.exists(lyricFilePath):
  try:
      os.mkdir(lyricFilePath)
  except OSError:
      print ("Creation of the directory %s failed" % lyricFilePath)
  else:
      print ("Successfully created the directory %s " % lyricFilePath)

# Get the text from my github repo that has our training data
url = "https://raw.githubusercontent.com/JadenMcElvey/AIs-Sing-Still-Alive-From-Portal/master/Songs.txt"
songText = requests.get(url)

# Create a new file called Songs.txt in the lyrics folder and write the text to it
filePath = os.path.join(lyricFilePath, "Songs.txt")
with open(filePath, "w", encoding="utf-8") as file:
  file.write(songText.text)
shuffled = False

If you look at the file we just created, "Songs.txt", you'll notice that it has additional text that is not part of *Still Alive*. This additional text is from two other times that GLaDOS has sung in video games: *Want You Gone* in Portal 2 and *You Wouldn't Know* in LEGO Dimensions. This data is added because it gives our training data a little extra variety while still being relevant. Unfortunately, keeping all of this data in its original order would push the model to reproduce the lyrics exactly as written, so we need even more variety in our data. We get it by shuffling all of the paragraphs and appending the shuffled copy to the end of the file.[$^{1}$](#Footnotes)

In [None]:
import random

if not shuffled:
  shuffled = True
  # Read and shuffle the paragraphs from Songs.txt
  with open(filePath, mode='r', encoding='utf-8') as f:
          data = f.read()

          # Split on \n\n
          paragraphs = data.split('\n\n')

          # Shuffle splits
          random.shuffle(paragraphs)

  # Append the paragraphs to the end of Songs.txt
  with open(filePath,  mode='a', encoding='utf-8') as output:
      output.write('\n\n')
      for paragraph in paragraphs:
          output.write(paragraph)

          # Add the line break
          output.write('\n\n')
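
If you want to see what the shuffle-and-append step does without touching any files, here is a small sketch of the same idea on an in-memory string. The function name `append_shuffled_paragraphs`, the `seed` argument, and the sample text are all made up for illustration, and unlike the cell above this version skips empty paragraphs.

```python
import random


def append_shuffled_paragraphs(text, seed=None):
    """Return the text followed by a shuffled copy of its paragraphs.

    Paragraphs are separated by blank lines, as in Songs.txt.
    """
    # Split on blank lines and drop anything that is only whitespace
    paragraphs = [p for p in text.split("\n\n") if p.strip()]

    # Shuffle a copy so the original order is kept for the first half
    shuffled = paragraphs[:]
    random.Random(seed).shuffle(shuffled)

    return "\n\n".join(paragraphs + shuffled) + "\n\n"


# Hypothetical three-paragraph "song"
sample = "verse one\nline two\n\nverse two\n\nverse three"
print(append_shuffled_paragraphs(sample, seed=0))
```

The first half of the output keeps the original order and the second half is the same paragraphs in a random order, so the model sees every paragraph twice but in different contexts.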

## Finetuning the Model
Now that we've got our data prepared we're ready to actually do some machine learning. First we're going to import/install everything we need to finetune a machine learning model. This next block of code also downloads the 124M GPT-2 model.[$^{2}$](#Footnotes)

In [None]:
# Freeze versions of dependencies for now
!pip install transformers==2.9.1

!pip install -q aitextgen

import logging
logging.basicConfig(
        format="%(asctime)s — %(levelname)s — %(name)s — %(message)s",
        datefmt="%m/%d/%Y %H:%M:%S",
        level=logging.INFO
    )

from aitextgen import aitextgen

ai = aitextgen(tf_gpt2="124M", to_gpu=True)

Next we train the model on the songs and wait. This may take 5-10 minutes to run. The model is trained for relatively few steps because training it for more steps pretty much guarantees that it will overfit. Every one hundred steps you can expect a sample of the text that the model has learned to generate. These samples should become more and more like *Still Alive* as the model trains.

In [None]:
# Train the model on Songs.txt
ai.train(filePath,
         line_by_line=False,
         from_cache=False,
         num_steps=500,
         generate_every=100,
         save_every=100,
         save_gdrive=False,
         learning_rate=.00001,
         batch_size=1, 
         )

Now that the ai knows what we want it to do we can ask it to write song lyrics for us. The following code will generate 5 samples of text at a time. They are being prompted with the phrase "This was a triumph" so that the ai knows to make something similar to *Still Alive*. Each sample should be approximately song length. For best results you should run the code below multiple times and pick your favorite lines. I did this when making my [video](https://).[$^{3}$](#Footnotes)

**Important Note: GPT-2 is a general language model trained on text from the internet. It is likely that the model will say things you don't like that have nothing to do with Portal songs. This is the nature of the model and can't really be helped.**

In [None]:
# Generate 5 "songs"
ai.generate(n=5,
            batch_size=5,
            prompt="This was a triumph.",
            max_length=350,
            temperature=0.8)

## Singing the Lyrics
Copy and paste the lyrics generated above into the string below. The first line of your lyrics should be on the second line of the cell, directly after the opening triple quotes.

**Important Note: It's best if your lyrics are split up into paragraphs of no more than 20 lines. This improves the audio quality of the vocal track.**

In [None]:
lyrics = """
This was a triumph.
I'm making a note here:
HUGE SUCCESS.
It's hard to overstate
My satisfaction.
Aperture Science
Aperture Science
We don't need anyone now
We find you. you get the point, you just leave
You just keep on trying
Till you run out of cake.

This was a triumph.
I'm making a note here It's your problem,
It shows on your wrist It displays a certain amount of willpower
When you're done you'll be gone
And the Science you've got
Is working.
For the people who are
Still alive.

Well done, Yar, you're on your own.
Now we're only a few things.
Now we're making a new goal
We'll reach our goal of reaching your body.
So you're being given a substance that will make you sick
Now we're making a new goal
We'll reach your goal of reaching your body.
Now we're releasing you.
Now we're releasing you. you're dead.
Let's go ahead and buy you a drink
Now we're doing live science.
We're testing your body. You'll be given a substance that will make you sick
Think of all the things you've got
For the better,
Some will make you wish you had stayed,
Others will make you wish you'd stayed,
Still alive.

We'll be releasing on time.
So I'm GLaD. I got burned.
Think of all the things we learned
For the people who are
HUGE thanks for the great jobs we're doing
This was a triumph.
I'm GLaD. I'm GLaD. I got burned.
I don't need anyone now.
We just keep on trying
Til we run out of cake.
And the Science gets done.
And you make a neat gun.
For the people who are
powerful.
And the people who are
Still alive.

This was a triumph.
We will gladly take less of you.
And cause your data to A) be Here
And B) be at your Aperture Science Goodnight.
As such, this design A) contains
Is 'Cosmonautically' Allowed.
It was tested and released.
It worked.
It worked as planned.
"""

Now that we've got our beautiful, totally coherent ai generated lyrics, we just need to convert them to singing. The following code saves an mp3 file for each paragraph, named #-0lyric.mp3, in the lyrics folder we created earlier. You can download and listen to these if you want, but they still need some more audio processing before they sound like GLaDOS.

In [None]:
!pip install gTTS
from gtts import gTTS

# Split the lyrics into paragraphs, ignoring the blank lines at the start and end of the string
lyricParagraphs = lyrics.strip().split("\n\n")

# Use text to speech to create an audio file of every paragraph
for i in range(len(lyricParagraphs)):
  checkPath = "/content/lyrics/{}-0lyric.mp3".format(i)
  if os.path.exists(checkPath):
    os.remove(checkPath)

  tts = gTTS(lyricParagraphs[i])
  tts.save(lyricFilePath + "/{}-0lyric.mp3".format(i))
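
The earlier note about keeping paragraphs to 20 lines or fewer can be enforced in code before the text-to-speech step. This is only a sketch: `split_long_paragraphs` and its `max_lines` parameter are hypothetical names, and the cut points are purely mechanical, so a paragraph may get split in a place that sounds odd.

```python
def split_long_paragraphs(lyrics, max_lines=20):
    """Split blank-line-separated lyrics so no paragraph exceeds max_lines."""
    result = []
    for paragraph in lyrics.strip().split("\n\n"):
        # Ignore stray empty lines inside a paragraph
        lines = [line for line in paragraph.split("\n") if line.strip()]

        # Cut the paragraph into consecutive chunks of at most max_lines
        for start in range(0, len(lines), max_lines):
            result.append("\n".join(lines[start:start + max_lines]))
    return result


# Hypothetical lyrics: one 45-line paragraph and one 3-line paragraph
demo_lyrics = "\n".join(f"line {n}" for n in range(45)) + "\n\nla\nla\nla"
for chunk in split_long_paragraphs(demo_lyrics):
    print(len(chunk.split("\n")), "lines")
```

The returned list could stand in for `lyricParagraphs` in the cell above.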

So now that we've got each paragraph we need to make it sound like a robot. This is going to sound weird but the first step in making a robot voice is to create a second copy of the audio that is sped up by 0.1%. The following code creates a sped up version of each paragraph's audio file named #-1lyric.mp3.

In [None]:
# Get all of the original text to speech audio files
lyricFiles = os.listdir("/content/lyrics")
lyricFiles = [x for x in lyricFiles if x[2] == "0"]

# Create a slightly sped up version of each audio file
for inputFileName in lyricFiles:
  outputFileName = "lyrics/" + inputFileName[:2] + "1" + inputFileName[3:]
  inputFileName = "lyrics/" + inputFileName

  checkPath = "/content/" + outputFileName
  if os.path.exists(checkPath):
    os.remove(checkPath)

  !ffmpeg -hide_banner -loglevel panic -i {inputFileName} -filter:a "atempo=1.001" {outputFileName}
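
A 0.1% speed-up sounds like nothing, so here is a little arithmetic sketch of what it does once the two copies are layered. Because the second copy plays at `atempo=1.001`, it drifts further ahead of the original the longer the paragraph runs, which is presumably where the phasey, doubled sound comes from. The helper name `copy_lead_ms` is made up for this example.

```python
def copy_lead_ms(position_s, tempo=1.001):
    """Milliseconds by which the sped-up copy runs ahead of the original.

    position_s is how far into the original audio we are, in seconds.
    Audio that sits at position_s in the original is reached after
    position_s / tempo seconds in the sped-up copy.
    """
    return (position_s - position_s / tempo) * 1000.0


for position in (1, 10, 30):
    print(f"{position:>2} s in: copy leads by {copy_lead_ms(position):.2f} ms")
```

The lead grows by roughly one millisecond per second of audio, so short paragraphs stay tightly doubled while long ones start to sound like an echo. That may be part of why shorter paragraphs give a cleaner vocal track.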

Next we just need to combine the original audio with the slightly sped up audio to create the final robot vocal sounds. The following code creates a robot sounding version of each paragraph's audio file named #-2lyric.mp3.

In [None]:
!pip install pydub
from pydub import AudioSegment

# Create lists of all of the audio files
lyricFiles = os.listdir("/content/lyrics")
lyricFiles0 = ["/content/lyrics/" + x for x in lyricFiles if x[2] == "0"]
lyricFiles1 = ["/content/lyrics/" + x for x in lyricFiles if x[2] == "1"]
lyricFiles0.sort()
lyricFiles1.sort()

# Combine the audio files together to make the audio sound more robot like
for i in range(len(lyricFiles0)):
  audio0 = AudioSegment.from_file(lyricFiles0[i])
  audio1 = AudioSegment.from_file(lyricFiles1[i])
  audio2 = audio0.overlay(audio1)

  outputFileName = "/content/lyrics/{}-2lyric.mp3".format(i)
  audio2.export(outputFileName, format="mp3")
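
One thing to watch: the cells in this section pick files by checking the third character of the file name and then sort the names as text. That works for paragraphs 0 through 9, but with ten or more paragraphs the index takes two characters and `10-...` sorts before `2-...`. Below is a rough sketch of a sturdier way to pick and order the files. The pattern and the `files_for_stage` helper are my own invention for illustration, not something from the libraries used here.

```python
import re

# Matches names like "3-0lyric.mp3"; the optional dot also accepts "3-2.lyric.mp3"
LYRIC_NAME = re.compile(r"^(\d+)-(\d)\.?lyric\.mp3$")


def files_for_stage(file_names, stage):
    """Return the names for one processing stage, ordered by paragraph number."""
    matches = []
    for name in file_names:
        found = LYRIC_NAME.match(name)
        if found and int(found.group(2)) == stage:
            # Keep the numeric paragraph index so sorting is numeric, not textual
            matches.append((int(found.group(1)), name))
    return [name for _, name in sorted(matches)]


# Hypothetical directory listing
names = ["10-0lyric.mp3", "2-0lyric.mp3", "2-1lyric.mp3", "Songs.txt", "0-0lyric.mp3"]
print(files_for_stage(names, 0))
```

In the notebook you would pass `os.listdir("/content/lyrics")` as `file_names` and prepend the folder path to each result.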

Lastly we combine all of the audio files into one final complete vocal track.

In [None]:
# Make a list of all of the robot like audio files
lyricFiles = os.listdir("/content/lyrics")
lyricFiles2 = ["/content/lyrics/" + x for x in lyricFiles if x[2] == "2"]
lyricFiles2.sort()

# Combine the audio files together into one complete vocal track called vocalTrack.wav
vocalTrack = AudioSegment.empty()
for file in lyricFiles2:
  vocalTrack += AudioSegment.from_file(file)
  vocalTrack += AudioSegment.silent(duration=4000) 

vocalTrack.export("/content/vocalTrack.wav", format="wav")

We did it! We've got a GLaDOS-like voice singing our ai generated lyrics! Now on to generating the accompaniment!

# Creating the Musical Accompaniment
Now that we've got our vocal track complete we need to get some music to accompany our lovely robotic singing. We're going to generate the music using Google Magenta. Magenta is a project that lets us use machine learning to generate new melodies. Our melody is going to be based off of a midi file of *Still Alive*. First let's download and install Magenta. If you want more information on using Magenta you should check out [Hello Magenta](https://colab.research.google.com/drive/1TiqYnRPWrgm_odG_6wdILEvK38L1Bqhx#scrollTo=7Y0VkNafNKLP).[$^{4}$](#Footnotes)

## Setting up Magenta
The following code may take a few minutes to run.

In [None]:
#@test {"output": "ignore"}
print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -qU pyfluidsynth pretty_midi

!pip install -qU magenta

# Hack to allow python to pick up the newly-installed fluidsynth lib. 
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library

print('Importing libraries and defining some helper functions...')
from google.colab import files

import magenta
import note_seq
import tensorflow

print('🎉 Done!')
print(magenta.__version__)
print(tensorflow.__version__)

Magenta is actually the name of a large project containing many different machine learning models that are capable of different music related tasks. For our purposes we'll use the basic MelodyRNN model to extrapolate our musical accompaniment. The code below initializes our model.

In [None]:
# Create a folder called music to store the music files
musicFilePath = "/content/music"
if not os.path.exists(musicFilePath):
  try:
      os.mkdir(musicFilePath)
  except OSError:
      print ("Creation of the directory %s failed" % musicFilePath)
  else:
      print ("Successfully created the directory %s " % musicFilePath)

print('Downloading model bundle. This will take less than a minute...')
note_seq.notebook_utils.download_bundle('basic_rnn.mag', '/content/music/')

# Import dependencies.
from magenta.models.melody_rnn import melody_rnn_sequence_generator
from magenta.models.shared import sequence_generator_bundle
from note_seq.protobuf import generator_pb2
from note_seq.protobuf import music_pb2

# Initialize the model.
print("Initializing Melody RNN...")
bundle = sequence_generator_bundle.read_bundle_file('/content/music/basic_rnn.mag')
generator_map = melody_rnn_sequence_generator.get_generator_map()
melody_rnn = generator_map['basic_rnn'](checkpoint=None, bundle=bundle)
melody_rnn.initialize()

print('🎉 Done!')

## Generating Music
The model still needs data to extrapolate our new melody from. For this purpose we'll use a midi version of *Still Alive*. The next code block downloads the song and converts it into a form that can be used by Magenta.

In [None]:
from pathlib import Path
import requests

# Download a midi file of Still Alive from my github
url = "https://github.com/JadenMcElvey/AIs-Sing-Still-Alive-From-Portal/blob/master/Still_Alive.mid?raw=true"
stillAliveMidi = requests.get(url)

# Save the file in the music folder
filePath = "music/Still_Alive.mid"
file = open(filePath, "wb")
file.write(stillAliveMidi.content)
file.close()

# Create a note sequence from the Still Alive midi and set the tempo to 120qpm
stillAliveMidi = Path("/content/music/Still_Alive.mid").read_bytes()
stillAliveSeq = note_seq.midi_to_note_sequence(stillAliveMidi)
del stillAliveSeq.tempos[:]
stillAliveSeq.tempos.add(qpm=120)

Now that we've got our source music it's time to generate some new music. Below we calculate how long the new music should be, generate the music, and save our music as music/musicTrack.mid.

In [None]:
import math

# Use the Still Alive note sequence as the primer for the model and set the temperature to 1.0
input_sequence = stillAliveSeq
temperature = 1.0

# Set the start time to begin on the next step after the last note ends.
last_end_time = (max(n.end_time for n in input_sequence.notes)
                  if input_sequence.notes else 0)
qpm = input_sequence.tempos[0].qpm
seconds_per_step = 60.0 / qpm / melody_rnn.steps_per_quarter

# Calculate the number of steps needed to generate music that matches the length of the vocal track
seq_steps = (math.ceil(stillAliveSeq.total_time)) / seconds_per_step
vocal_steps = math.ceil(len(vocalTrack) / 1000) / seconds_per_step
num_steps = seq_steps + vocal_steps
total_seconds = num_steps * seconds_per_step

# Set generator options
generator_options = generator_pb2.GeneratorOptions()
generator_options.args['temperature'].float_value = temperature
generate_section = generator_options.generate_sections.add(
  start_time=last_end_time + seconds_per_step,
  end_time=total_seconds)

# Ask the model to continue the sequence.
sequence = melody_rnn.generate(input_sequence, generator_options)

# Trim the note sequence to only include the ai generated music
sequence = note_seq.extract_subsequence(sequence, stillAliveSeq.total_time, sequence.total_time)

# Play and save the ai generated music as musicTrack.mid in the music folder
note_seq.plot_sequence(sequence)
note_seq.play_sequence(sequence, synth=note_seq.fluidsynth)
note_seq.sequence_proto_to_midi_file(sequence, "music/musicTrack.mid")
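
The step math in the cell above is easier to follow with numbers plugged in. Here is a standalone sketch of the same arithmetic. It assumes 4 steps per quarter note (the cell reads the real value from `melody_rnn.steps_per_quarter`), and the two durations are invented for the example.

```python
import math


def seconds_per_step(qpm, steps_per_quarter):
    """Length of one model step in seconds."""
    return 60.0 / qpm / steps_per_quarter


def total_generation_seconds(primer_seconds, vocal_seconds,
                             qpm=120.0, steps_per_quarter=4):
    """End time for generation: the primer plus enough music to cover the vocals."""
    step = seconds_per_step(qpm, steps_per_quarter)
    primer_steps = math.ceil(primer_seconds) / step
    vocal_steps = math.ceil(vocal_seconds) / step
    return (primer_steps + vocal_steps) * step


# At 120 qpm a quarter note lasts 0.5 s, so a quarter of that is one step
print(seconds_per_step(120.0, 4))

# Hypothetical lengths: a 175.3 s primer and a 200 s vocal track
print(total_generation_seconds(175.3, 200.0))
```

The primer length is included in the total because the model continues the primer, and the primer part is trimmed away afterwards so only the new music remains.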

And just like that our ai has generated original music. You can listen to it above if you're interested. Next we convert the music into a wav file for creating the final mix.

In [None]:
# Install fluidsynth (-y answers the confirmation prompt) and use it to convert musicTrack to a wav file
!apt install -y fluidsynth
!cp /usr/share/sounds/sf2/FluidR3_GM.sf2 music/font.sf2
!fluidsynth -ni music/font.sf2 music/musicTrack.mid -F musicTrack.wav -r 32000

# The Final Song
We've got everything we need to put together our ai generated song, complete with vocals and accompaniment. Below we overlay the vocals over the music and apply a fade to the end of the song.

In [None]:
# Open the vocal and music tracks
vocals = AudioSegment.from_wav("vocalTrack.wav")
music = AudioSegment.from_wav("musicTrack.wav")

# Increase the music volume and make the music fade out at the end
musicAmplified = music.apply_gain(8)
musicWithFade = musicAmplified.fade_out(4000)

# Overlay the vocals over the music and save the final song as AI_Still_Alive.wav
finalSong = musicWithFade.overlay(vocals)
finalSong.export("AI_Still_Alive.wav", format="wav")
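
In case the `8` in `apply_gain(8)` looks arbitrary: pydub measures gain in decibels, so this is a boost of 8 dB rather than "8 times louder". The sketch below converts decibels to an amplitude multiplier using the standard 20 * log10 relationship. The helper name is made up for this example.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to the factor applied to the signal amplitude."""
    return 10 ** (gain_db / 20)


for gain in (0, 6, 8):
    print(f"{gain} dB -> amplitude x {db_to_amplitude_ratio(gain):.2f}")
```

If the music drowns out the vocals in your mix, lowering the gain by a few dB is the first thing I would try.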

You made it! Run the code below and press play to hear your amazing new ai generated song. I'm sure your song has beautiful, coherent, meaningful lyrics and that your musical composition is a masterpiece.

In [None]:
# Listen to the final audio!
from IPython.display import Audio
Audio("AI_Still_Alive.wav")

# Troubleshooting
This notebook is best completed from start to finish. If you encounter an error, double check that you haven't skipped any of the code blocks. If your issue persists I would recommend resetting the notebook. Reset the notebook at the top by clicking **Runtime > Factory Reset Runtime > Yes**. This will delete any lyrics you have saved, so copy them somewhere else if you don't want to lose them. Good luck!
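
One quick way to spot a skipped code block is to check which of the files the notebook writes are missing. This is a rough sketch: the list of paths is collected from the cells above, and `missing_outputs` is a hypothetical helper. The demo runs against a temporary folder so it works anywhere. In Colab you would call `missing_outputs("/content")` instead.

```python
import os
import tempfile

# Files the notebook writes, relative to the working directory, in the order they are created
EXPECTED_OUTPUTS = [
    "lyrics/Songs.txt",
    "vocalTrack.wav",
    "music/Still_Alive.mid",
    "music/musicTrack.mid",
    "musicTrack.wav",
    "AI_Still_Alive.wav",
]


def missing_outputs(base_dir="/content"):
    """Return the expected files that do not exist yet under base_dir."""
    return [path for path in EXPECTED_OUTPUTS
            if not os.path.exists(os.path.join(base_dir, path))]


# Demo: pretend only the training data has been downloaded so far
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "lyrics"))
    open(os.path.join(demo_dir, "lyrics", "Songs.txt"), "w").close()
    print(missing_outputs(demo_dir))
```

The first missing file in the list points at the section that still needs to be run.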

# References
I'm not an expert on AI or machine learning or anything really, so I got help from these places.

1. [How To Make Custom AI-Generated Text With GPT-2](https://minimaxir.com/2019/09/howto-gpt2/)
2. [aitextgen github](https://github.com/minimaxir/aitextgen)
3. [aitextgen — Train a GPT-2 Text-Generating Model w/ GPU Colab Notebook](https://colab.research.google.com/drive/15qBZx5y9rdaQSyWpsreMDnTiZ5IlN0zD?usp=sharing)
4. [How to build and deploy a lyrics generation model — framework agnostic](https://towardsdatascience.com/how-to-build-and-deploy-a-lyrics-generation-model-framework-agnostic-589f3026fd53)
5. [Hello Magenta](https://colab.research.google.com/notebooks/magenta/hello_magenta/hello_magenta.ipynb#scrollTo=dPkdg9jTjkTd)
6. [Magenta Github](https://github.com/magenta)



# Footnotes
1. Originally I was shuffling each line instead of shuffling each paragraph. This was, in fact, a bad idea, and the model produced results that weren't organized into paragraphs anymore. [This](https://towardsdatascience.com/how-to-build-and-deploy-a-lyrics-generation-model-framework-agnostic-589f3026fd53) project gave me the idea of shuffling based on paragraphs. It also gave me the idea of shuffling with a script instead of shuffling manually. I know how to write code, I just forgot that solving my problems with code was better than solving stuff by hand. I also used a script from this article to shuffle the paragraphs.
2. This section on generating lyrics is almost identical to Woolf's original notebook. Some parameters have been tweaked and unnecessary bits of code have been removed, but otherwise it's the same content. You really should give the [original notebook](https://colab.research.google.com/drive/15qBZx5y9rdaQSyWpsreMDnTiZ5IlN0zD?usp=sharing) a look.
3. In the interest of full transparency it should be noted that the lyrics from my [video](https://) are curated. I spent approximately 2 hours generating different lyrics and chose the best paragraphs to create my song. This is how I got my lyrics to mirror the structure of the original song. It is important to remember that given the model configuration in this notebook it is **not** possible (or at least highly improbable) to generate lyrics that closely match the structure of the original song. The lyrics for my video are heavily curated.
4. If you want a real tutorial on Magenta you should at least read through [Hello Magenta](https://colab.research.google.com/drive/1TiqYnRPWrgm_odG_6wdILEvK38L1Bqhx#scrollTo=7Y0VkNafNKLP). This is where I learned to use Magenta. Most of the code from this section is from that notebook.

# License
MIT License

Copyright (c) 2020 Jaden McElvey

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.