Rohan397/one_word_stories
My project is to build a program that can help with improv theater. As a student with a passion for theater, I wanted to use machine learning to help performers improve their technical skills. One important and hard-to-train skill is the ability to improvise - usually this requires working with someone else who will throw curve-balls at you and force you to make things up on the spot. However, practicing with another person can be hard for a number of reasons, including differences in improv experience and confidence, and logistical problems like scheduling. A program that can act as an improv partner would therefore really help an actor overcome these barriers while working on their improvisation skills.
A common improvisation game to help actors think on their feet is called ‘One Word Stories’ - a game in which each player can only say one word at a time, so the players collaboratively build a story together. It improves an individual’s ability to listen and really tune in to what others are saying. It also helps individuals practice forming creative responses and creating narrative twists on the spot, without much time to think things through (which is all of improv!). I wanted to create a program that can play One Word Stories with a user. Especially for university students with challenging schedules, this program would let them improve their improvisation skills in their own time and at their own pace.


This project is relevant to existing works in multiple contexts - there are several programs that use machine learning to generate stories. These include what we used in class - Cyborg Writer (https://cyborg.tenso.rs/), which uses a neural text synthesizer, and the project by Hay Kranen (http://projects.haykranen.nl/markov/demo/), which uses Markov chains. These projects are built around story generation, which is definitely relevant to my project, but they are not built with theatrical uses in mind, which is where the projects differ.
There are also cases of individuals using machine learning and AI in theatre. Piotr Mirowski is a researcher at DeepMind who consistently tries to integrate robots into his improv theater performances. In his words, he looks into “the use of AI for artistic human and machine-based co-creation.” In fact, he often performs in London with his robot as an improv partner. One of his aims is to pass an ‘improv Turing test’, so to speak: he asks the audience to distinguish between segments of the performance where the robot’s speech was controlled by a human behind the scenes and segments where it was actually controlled by the AI algorithms he developed. So far he hasn’t passed that Turing test - audiences are able to tell when the robot is being controlled by a human and when it isn’t. While his work focuses mainly on integrating robots with human performers in improv, my project focuses on using machine intelligence to improve human performances in improv. The idea of co-creating theater remains constant; only the context in which theater is being created is slightly different in my case - it is more of a rehearsal application than an on-stage application.

Since my project at its heart is about text generation using machine learning, I decided to use Markov chains. In some of the sample reading done for class we looked at using Markov chains to generate the next character in a sequence of characters, to generate drum beats given a corpus of sounds, and to generate a new piece of art from existing pieces. The versatility of Markov chains for generating new art from corpora of different forms was what initially inspired me to use them for my story-generating project. On a more practical level, One Word Stories is a ‘state’-based game in which both players create a story by adding one new word, or ‘state’, at a time, and as each state is added the story makes more and more sense. This is exactly the structure Markov models describe, so they seemed like a perfect fit.
I went through a number of approaches to the data I modeled my Markov chains on. For my first approach I modeled my Markov chains on the words in my sample corpus. This involved processing the corpus up front and building n-grams that mapped each sequence of n consecutive words to the word that followed it. Unfortunately this approach was not very effective. The problem was that it made the program extremely dependent on the corpus - if a user entered a word that didn’t appear in the corpus, the program couldn’t generate a response and the game would crash. Similarly, if the user and the program developed a sequence of words that didn’t appear in the corpus, the game would also crash. This also made it very hard to set the value of n high. I wanted a high n because the more previous words are looked at, the more sense the computer-generated word makes! However, the higher the value of n, the lower the chances of that exact sequence of words appearing in the corpus and, again, the higher the chances of the game crashing.
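A minimal sketch of what this first, word-based approach looks like (illustrative, not my exact code - the function names here are made up for this example):

    import random
    import nltk

    def build_word_ngrams(corpus_text, n=2):
        # Map each sequence of n consecutive words to the words that
        # followed it somewhere in the corpus.
        words = nltk.word_tokenize(corpus_text)
        ngrams = {}
        for i in range(len(words) - n):
            key = tuple(words[i:i + n])
            ngrams.setdefault(key, []).append(words[i + n])
        return ngrams

    def next_word(ngrams, story, n=2):
        # Raises KeyError if the last n words of the story never appeared
        # in the corpus - exactly the fragility described above.
        key = tuple(story[-n:])
        return random.choice(ngrams[key])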


Since modelling Markov chains on words made the program very volatile and made it hard to develop an original and wacky story (which is the point of both improv comedy and this improv game), I decided to look at a different aspect of the corpus. My second approach involved looking at the sentence structure of the corpus instead of the actual words in it. Each word in a sentence has a specific part-of-speech (POS) tag which indicates whether it is a verb, noun, preposition, etc. So instead of looking at the actual words in the corpus, I looked at the POS tags of the words and built an n-gram model mapping n consecutive POS tags to the next POS tag in the sequence. At the same time I built a dictionary mapping each POS tag to all the words that carried that tag. Using this approach, the program would use Markov chains to figure out the POS tag of the next word to be generated and then randomly pull a word with that tag out of the dictionary.
This approach solved a lot of the problems of the word-based approach. The program was now a lot more durable and less dependent on the corpus: when a word appeared that was not in the corpus, the program could still figure out its POS tag and generate the next word from that. In addition, I could raise the value of n, because sequences of POS tags were far more likely to appear in the corpus than sequences of exact words. Despite these advantages, this approach also had disadvantages. The words generated by the program made far less sense than in my first model, because many words share the same tag but don’t fit a sentence in the same way. A good example is ‘dog’ vs. ‘dogs’ - both are tagged as nouns, but if the previous sequence is ‘there was one’ then only ‘dog’ makes sense; ‘dogs’ throws things out of order. The singular-vs.-plural problem is just one example of the irrelevant words this model generated.
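A sketch of this second approach, again with illustrative names rather than my exact code:

    import random
    import nltk

    def build_tag_models(corpus_text, n=3):
        # Build (a) n-grams over POS tags and (b) a dictionary mapping
        # each tag to all the corpus words that carry that tag.
        tagged = nltk.pos_tag(nltk.word_tokenize(corpus_text))
        tags = [tag for (word, tag) in tagged]
        tag_ngrams = {}
        words_by_tag = {}
        for i, (word, tag) in enumerate(tagged):
            words_by_tag.setdefault(tag, []).append(word)
            if i + n < len(tags):
                key = tuple(tags[i:i + n])
                tag_ngrams.setdefault(key, []).append(tags[i + n])
        return tag_ngrams, words_by_tag

    def next_word_by_tag(tag_ngrams, words_by_tag, story, n=3):
        # Predict the next POS tag from the last n tags of the story,
        # then pull a random word that carries that tag.
        story_tags = [tag for (word, tag) in nltk.pos_tag(story)]
        next_tag = random.choice(tag_ngrams[tuple(story_tags[-n:])])
        return random.choice(words_by_tag[next_tag])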

My final implementation mixed both of these views of the corpus. I used a try-except structure: first try to generate the next word of the story using n-grams formed purely from the words in the story; if that works, great, but if there is no n-gram mapping that sequence of words to a next word, catch the exception and fall back to the underlying POS tags of the sequence to generate the next word. This approach capitalized on the word-based model’s ability to generate relevant material and covered up its volatility by using the much more stable model as a back-up whenever the first model failed. I was reasonably happy with the effectiveness of this hybrid approach of analyzing the two different aspects of the corpus.
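Putting the two together, the hybrid logic is roughly this (a sketch reusing the illustrative helpers, and the random import, from the two snippets above):

    def generate_next_word(story, word_ngrams, tag_ngrams, words_by_tag,
                           n_words=2, n_tags=3):
        try:
            # First try the word-level model; a KeyError means this exact
            # sequence of words never appeared in the corpus.
            return random.choice(word_ngrams[tuple(story[-n_words:])])
        except KeyError:
            # Fall back to the sturdier POS-tag model.
            return next_word_by_tag(tag_ngrams, words_by_tag, story, n=n_tags)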
In terms of datasets, or corpora, I tried a variety. I started off with stand-up comedy scripts from comedians like Kevin Hart. These corpora were not ideal, however, because the flow of speech in stand-up comedy is much more conversation-like than story-like. As a result, the text generated after analyzing these corpora was not very relevant and was hard to work with. Because stand-up comedy wasn’t very effective, I switched to using stories as corpora. Some of the stories I used were ‘The Wind in the Willows’ and ‘The Adventures of Sherlock Holmes’. These were far more effective - because they were more conventional stories, the flow of speech in them was more conducive to collaborative storytelling. Because these corpora were written in the late 19th and early 20th centuries, though, their style is slightly older, and tokens like “ ’s ” would occasionally come up. This was slightly strange, but at some level it also added to the challenge of coming up with a response word, which was fun. I also tested a simpler story, Cinderella, as the corpus. This was the most effective for me because the English used was modern and simple, while the flow of writing was still conventionally that of a story. As a result this became my favourite corpus type, because the program would usually generate words that I knew and that mostly made sense in the given context.

The project was coded in Python 2.7 - a decision based on the language’s relative ease to learn and pick up, as well as its many helpful features and libraries. Because the application isn’t web-based, I decided this was more suitable than using JavaScript. I made use of several Python libraries for some of the core functionality of my program. These are listed and discussed below:
1) NLTK - the NLTK, or Natural Language Toolkit, library for Python was very important in my project. I used its ‘word_tokenize’ method to convert my corpora into lists of words. I then used its ‘pos_tag’ method to analyze the list of words and generate a new list of tuples, each made up of a word and its POS tag. This was incredibly important, as it allowed me to process the corpora in terms of their structures and not just their words. As I outlined before, this was necessary in order to build a durable system, and that ability to recognize each word’s POS tag came from the NLTK library. (A combined usage sketch of all three libraries follows this list.)
2) pyttsx3 - pyttsx3 is a Python library that can convert text to speech. It, along with Google Text-to-Speech (gTTS), was one of the most recommended text-to-speech libraries available for Python. I needed it so that the program could actually ‘speak’ the word it generated and not just print it on the screen: the whole point of the One Word Stories game is to say the story with your collaborator, so this was a necessary feature to include. I chose pyttsx3 over gTTS because gTTS requires converting text into mp3 files and then playing those files, whereas pyttsx3 has a ‘say’ method that plays a word out loud without first converting it into an mp3 file. This seemed more efficient to me.
3) SpeechRecognition - SpeechRecognition is a Python library that can convert speech to text. I needed this so the user could speak to the computer: the library let me convert the user’s speech into text that the program could then process to formulate a response word. There are a number of ‘recognizers’ available in this library; I chose Google’s recognizer because it seemed fairly accurate when I tested it.
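Here is a combined sketch of how these three libraries fit together (simplified, not verbatim from improv_helper.py):

    import nltk
    import pyttsx3
    import speech_recognition as sr

    engine = pyttsx3.init()        # text-to-speech engine
    recognizer = sr.Recognizer()

    def hear_word():
        # Listen on the microphone and return the user's word as text,
        # using Google's recognizer.
        with sr.Microphone() as source:
            audio = recognizer.listen(source)
        return recognizer.recognize_google(audio)

    def say_word(word):
        # Speak the generated word out loud - no intermediate mp3 file.
        engine.say(word)
        engine.runAndWait()

    # Tokenizing and POS-tagging text, as described in 1) above.
    tokens = nltk.word_tokenize("There was one dog in the house")
    tagged = nltk.pos_tag(tokens)  # e.g. [('There', 'EX'), ('was', 'VBD'), ...]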


The third-party sources I used for my corpora are as follows:
1) Cinderella - https://www.worldoftales.com/Cinderella.html
2) The Wind in the Willows - https://www.gutenberg.org/files/27805/27805-h/27805-h.htm
3) The Adventures of Sherlock Holmes - https://www.gutenberg.org/files/1661/1661-h/1661-h.htm
Please note that numbers 2 and 3 were retrieved from Project Gutenberg, which offers free, copyright-free ebooks for users in the US. I am not sure what the copyright and use laws are like for these books in the UK.
The Third Party Python Libraries I used are as follows:
1) NLTK - https://www.nltk.org/
2) pyttsx3 - https://pypi.org/project/pyttsx3/
3) SpeechRecognition - https://pypi.org/project/SpeechRecognition/
ALL THE CODE FOR THIS PROJECT IN THE FOLLOWING FILES WAS DEVELOPED BY ME:
1) improv_helper.py
2) OneWordStories.py

While all the code was developed by me, I did use the following tutorials to help me understand how to work with Markov chains and use certain libraries:
1) Implementing Markov Chain tutorial: https://www.youtube.com/watch?v=eGFJ8vugIWA&t=1369s
2) Using speech to text conversion with SpeechRecognition: https://medium.com/@rahulvaish/speech-to-text-python-77b510f06de
3) Using text to speech conversion with pyttsx3: https://pypi.org/project/pyttsx3/
Instructions for compiling and running the project:

The main project is stored in improv_helper.py. To run the project you will need a computer with a working microphone and speakers (necessary for the audio interface between person and program). You will also need Python 2.7 installed on your machine.
To set up the requirements you will need to install the nltk, SpeechRecognition, and pyttsx3 libraries. Note that SpeechRecognition requires you to have PyAudio installed as well. It is recommended that you install these into a virtual environment. Please see the documentation linked in the Third Party Python Libraries list above for installation guides.
Once these are set up you can launch the program from the terminal by typing python improv_helper.py. Do not use python3, as I found the libraries I was using to be incompatible with that version.
To modify the corpus being used, simply go into improv_helper.py’s main() function (line 103) and change the file path to the corpus you want. To add another corpus, add a new text file to the corpora folder; then you can use it via the aforementioned steps.
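For illustration, the change would look something like this (a hypothetical sketch - the actual main() differs, and the file path here is just an example):

    def main():
        # Point this path at whichever corpus file you want to use,
        # e.g. a new file you added to the corpora folder.
        with open('corpora/cinderella.txt') as f:
            corpus_text = f.read()
        return corpus_text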
