# Pincelate tutorial and cookbook

By [Allison Parrish](http://www.decontextualize.com/)

This notebook shows you how to use [Pincelate](https://pincelate.readthedocs.io/) and how to do some interesting things with it.

Pincelate is a Python library that provides a simple interface for a machine learning model that can sound out English words and spell English words based on how they sound. "Sounding out" here means converting letters ("orthography") to sounds ("phonemes"), and "spelling" means converting sounds to letters (phonemes to orthography). The model is trained on the [CMU Pronouncing Dictionary](http://www.speech.cs.cmu.edu/cgi-bin/cmudict), which means it generally sounds words out as though speaking "standard" North American English, and spells words according to "standard" North American English rules (at least as far as the model itself is accurate).

## Preliminaries

This section loads the required modules, the language model, and Pincelate itself. To run these experiments, you'll need to install Pincelate. Type the following at a command prompt:

    pip install tensorflow  # or tensorflow-gpu
    pip install pincelate
    
(Installing Pincelate will also install Pronouncing, which we'll use at various points in the experiments below.)

Other libraries you'll need for this notebook: `numpy` and `scipy`. If you're using Anaconda, you already have these libraries. If not, install them like so:

    pip install numpy scipy
    
Importing numpy and Pronouncing:

In [1]:
import numpy as np
import pronouncing as pr

Now import Pincelate and instantiate a Pincelate object. (This will load the pre-trained model provided with the package.)

In [2]:
from pincelate import Pincelate

Using TensorFlow backend.


In [3]:
pin = Pincelate()

Later in the notebook, I'm going to use some of Jupyter Notebook's interactive features, so I'll import the libraries here:

In [4]:
import ipywidgets as widgets
from IPython.display import display
from ipywidgets import interact, interactive_output, Layout, HBox, VBox

## Sounding out and spelling

The CMU Pronouncing Dictionary provides a database of tens of thousands of English words along with their pronunciations. I made a Python library called [Pronouncing](https://github.com/aparrish/pronouncingpy) to make it easier to look up words in the dictionary. Here's how it works. To get the pronunciation of a word:

In [5]:
pr.phones_for_word("alphabet")[0]

'AE1 L F AH0 B EH2 T'

The CMU Pronouncing Dictionary provides pronunciations as a list of phonemes in a phonetic transcription scheme called [Arpabet](https://en.wikipedia.org/wiki/ARPABET), in which each unique sound in English is given a different symbol.
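
To make the transcription format concrete, here's a minimal sketch that splits a pronunciation string into phonemes and separates the stress digit that the dictionary attaches to vowels (0 for no stress, 1 for primary stress, 2 for secondary stress). The helper `split_stress` is a name invented for this illustration, not part of Pronouncing.

```python
def split_stress(phone):
    """Split a symbol like 'AE1' into ('AE', 1); consonants get None."""
    if phone[-1].isdigit():
        return phone[:-1], int(phone[-1])
    return phone, None

# The pronunciation of "alphabet" shown above.
pron = "AE1 L F AH0 B EH2 T"
parsed = [split_stress(p) for p in pron.split()]

# Only vowels carry a stress digit, so counting them counts syllables.
syllables = sum(1 for _, stress in parsed if stress is not None)

print(parsed)
print(syllables)
```

Since every vowel carries exactly one stress digit, the number of digits doubles as a rough syllable count.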

If you want to find words that have a particular pronunciation, you can look them up in the CMU Pronouncing Dictionary like so:

In [6]:
pr.search("^F L AW1 ER0$")

['flour', 'flower']
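
Because pronunciations are plain strings of space-separated symbols, a search like this amounts to matching a regular expression against each one. The sketch below imitates that idea on a tiny hand-written word list; `toy_entries` and `toy_search` are hypothetical stand-ins and make no claim about how Pronouncing is implemented internally.

```python
import re

# A tiny hand-written stand-in for the dictionary: (word, pronunciation) pairs.
toy_entries = [
    ("flour", "F L AW1 ER0"),
    ("flower", "F L AW1 ER0"),
    ("floor", "F L AO1 R"),
    ("tower", "T AW1 ER0"),
]

def toy_search(pattern, entries):
    """Return the words whose pronunciation string matches the regex."""
    regex = re.compile(pattern)
    return [word for word, phones in entries if regex.search(phones)]

exact = toy_search(r"^F L AW1 ER0$", toy_entries)  # the whole pronunciation
rhymes = toy_search(r"AW1 ER0$", toy_entries)      # anything ending this way

print(exact)
print(rhymes)
```

Dropping the `^` anchor, as in the second call, is a quick way to find words that share an ending.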

That all seems pretty straightforward! The problem arises when you want the pronunciation of a word that *isn't* in the CMU Pronouncing Dictionary. In that case `phones_for_word()` returns an empty list, so asking for its first item raises an error:

In [7]:
pr.phones_for_word("mimsy")[0]

IndexError: list index out of range

Likewise, if you've just invented a new word and have a pronunciation in mind, the CMU Pronouncing Dictionary won't be able to help you spell it:

In [8]:
pr.search("^B L AH1 R F$")

[]

This is where Pincelate comes in handy. Pincelate's machine learning model can provide phonemes for words that aren't in the CMU Pronouncing Dictionary, and produce plausible spellings of arbitrary sequences of phonemes. To sound out a word, use the `.soundout()` method:

In [9]:
pin.soundout("mimsy")

['M', 'IH1', 'M', 'S', 'IY0']

... and to produce a plausible spelling for a word whose sounds you just made up, use the `.spell()` method, passing it a list of Arpabet phonemes:

In [10]:
pin.spell(['B', 'L', 'AH1', 'R', 'F'])

'blurf'

It's important to note that Pincelate's `.soundout()` method will *only* work with letters that appear in the CMU Pronouncing Dictionary's vocabulary. (You need to use lowercase letters only.) So the following will throw an error:

In [11]:
pin.soundout("étui")

KeyError: 'é'

### Example: phoneme frequency analysis

Using Pincelate's model, we can do phonetic analysis on texts, even texts that contain words that aren't in the CMU Pronouncing Dictionary. For example, let's find out what the most common phonemes are in Lewis Carroll's "Jabberwocky," whose text I've included in this repository as `jabberwocky.txt`. Here's the full text:

In [12]:
text = open("jabberwocky.txt").read()
print(text)

'Twas brillig, and the slithy toves
      Did gyre and gimble in the wabe:
All mimsy were the borogoves,
      And the mome raths outgrabe.

"Beware the Jabberwock, my son!
      The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
      The frumious Bandersnatch!"

He took his vorpal sword in hand;
      Long time the manxome foe he sought---
So rested he by the Tumtum tree
      And stood awhile in thought.

And, as in uffish thought he stood,
      The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
      And burbled as it came!

One, two! One, two! And through and through
      The vorpal blade went snicker-snack!
He left it dead, and with its head
      He went galumphing back.

"And hast thou slain the Jabberwock?
      Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!"
      He chortled in his joy.

'Twas brillig, and the slithy toves
      Did gyre and gimble in the wabe:
All mimsy were the borogoves,
      And the m

First, parse the text into words and convert them to lower case:

In [13]:
import re
words = [item.lower() for item in re.findall(r"\b(\w+)\b", text)]
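
It's worth seeing what this regular expression does with the poem's punctuation. In the small check below (the sample string is stitched together from two lines of the poem), the leading apostrophe is dropped and the hyphenated word comes out as two separate words.

```python
import re

sample = "'Twas brillig, and the vorpal blade went snicker-snack!"

# Same pattern as above: runs of word characters between word boundaries.
sample_words = [item.lower() for item in re.findall(r"\b(\w+)\b", sample)]

print(sample_words)
```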

Here's a random sample of the words just to ensure that we've got what we wanted:

In [14]:
import random
random.sample(words, 10)

['the', 'claws', 'that', 'the', 'the', 'toves', 'foe', 'the', 'did', 'gyre']

Now, we'll use `.soundout()` to get a list of phonemes for each item, and feed them to a `Counter()` object:

In [15]:
from collections import Counter
phoneme_count = Counter()
for word in words:
    phoneme_count.update(pin.soundout(word))

And now print out the most common phonemes:

In [16]:
phoneme_count.most_common(12)

[('D', 37),
 ('AE1', 33),
 ('T', 32),
 ('N', 32),
 ('B', 28),
 ('IY1', 28),
 ('M', 26),
 ('R', 25),
 ('IH1', 25),
 ('L', 24),
 ('S', 22),
 ('W', 20)]

So a lot of alveolar sounds (`D`, `T`, `N`) and vowels at the front of the mouth (`AE1`, `IY1`, `IH1`). To know if this distribution is unusual, we could compare it with the most common phonemes in all of the CMU Pronouncing Dictionary:

In [17]:
cmu_phoneme_count = Counter()
for word, phones in pr.pronunciations:
    cmu_phoneme_count.update(phones.split())
cmu_phoneme_count.most_common(12)

[('AH0', 63136),
 ('N', 61235),
 ('S', 50435),
 ('L', 49964),
 ('T', 49075),
 ('R', 46469),
 ('K', 43079),
 ('D', 32559),
 ('IH0', 30200),
 ('M', 29741),
 ('Z', 28217),
 ('ER0', 23954)]

We could do a more formal analysis and make claims about how Lewis Carroll's *Jabberwocky* differs significantly from typical English from a phonetic standpoint, but just from a quick look we can see that "Jabberwocky" is heavy on the `AE1`s (i.e., the vowel sound in "hand") and `B`s.
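
As a first step toward that more formal comparison, one option is to turn raw counts into relative frequencies and look at the ratio between the two. This is only a sketch: the counts below are small made-up numbers so the example runs on its own, and `relative_freqs` and `overrepresented` are names invented for illustration.

```python
from collections import Counter

def relative_freqs(counter):
    """Convert raw counts to proportions that sum to 1."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}

def overrepresented(sample, reference):
    """Ratio of each item's share in `sample` to its share in `reference`."""
    sample_freqs = relative_freqs(sample)
    reference_freqs = relative_freqs(reference)
    return {key: sample_freqs[key] / reference_freqs[key]
            for key in sample_freqs if key in reference_freqs}

# Made-up toy counts, NOT the real numbers from the poem or the dictionary.
toy_text = Counter({"AE1": 30, "B": 30, "N": 40})
toy_reference = Counter({"AE1": 100, "B": 100, "N": 800})

ratios = overrepresented(toy_text, toy_reference)
for phoneme, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(phoneme, round(ratio, 2))
```

A ratio above 1 means the phoneme takes up a larger share of the text than of the reference. With the real data you'd pass in `phoneme_count` and `cmu_phoneme_count`.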

### Example: Spelling words from random phonemes

Having just counted up all of the phonemes in the CMU Pronouncing Dictionary, we can now invent somewhat plausible neologisms by drawing phonemes at random according to their frequency and gluing them together. ("Neologism" is a fancy word for "made-up word.") The following code normalizes the phoneme frequencies so we can use them in numpy's `np.random.choice` function:

In [18]:
all_phonemes = list(cmu_phoneme_count.keys())
phoneme_frequencies = np.array(list(cmu_phoneme_count.values()), dtype=np.float32)
phoneme_frequencies /= phoneme_frequencies.sum()

And then this function will return a random neologism, created from phonemes drawn at random based on their frequency in English words:

In [19]:
def neologism_phonemes():
    return [np.random.choice(all_phonemes, p=phoneme_frequencies)
            for item in range(random.randrange(3,10))]
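
If you'd rather not normalize by hand, Python's standard library can do the weighted draw directly: `random.choices` accepts raw counts as weights. Here's a sketch of the same idea using a few of the counts printed above (only a subset, so the results won't follow the full distribution). The function name `neologism_phonemes_stdlib` is hypothetical.

```python
import random

# A handful of (phoneme, count) pairs copied from the output above.
top_counts = {"AH0": 63136, "N": 61235, "S": 50435, "L": 49964,
              "T": 49075, "R": 46469, "K": 43079, "D": 32559}

def neologism_phonemes_stdlib(counts):
    """Draw 3 to 9 phonemes, each weighted by its raw count."""
    length = random.randrange(3, 10)
    return random.choices(list(counts.keys()),
                          weights=list(counts.values()),
                          k=length)

print(neologism_phonemes_stdlib(top_counts))
```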

Here's a handful, just to get a taste:

In [20]:
for i in range(5):
    print(neologism_phonemes())

['P', 'R', 'N', 'K', 'N', 'AO1']
['P', 'N', 'OW2', 'N', 'N']
['IY1', 'R', 'AE1', 'T', 'L', 'K', 'AA1', 'UW2']
['Z', 'Z', 'AH0', 'AH0', 'AH0', 'S', 'IH0', 'IY0', 'AH1']
['L', 'AA1', 'IH0', 'S', 'IH1', 'R', 'AH0', 'UW1']


That's all well and good! Try sounding out some of these on your own (consult the [Arpabet](https://en.wikipedia.org/wiki/ARPABET) table to find the English sound corresponding to each symbol).

But how do you *spell* these neologisms? Why, with Pincelate's `.spell()` method of course:

In [21]:
pin.spell(neologism_phonemes())

'hmow'

Here's a for loop that generates neologisms and prints them along with their spellings:

In [22]:
for i in range(12):
    phonemes = neologism_phonemes()
    print(pin.spell(phonemes), phonemes)

hado ['EH1', 'AO1', 'D']
embervadera ['IY0', 'M', 'B', 'V', 'ER0', 'AH0', 'D', 'ER0', 'AE1']
tareltpent ['T', 'AA1', 'R', 'EH1', 'L', 'T', 'P', 'T', 'N']
tlinley ['T', 'N', 'N', 'L', 'IY0']
urrair ['ER1', 'EH1', 'R']
arpgleck ['AH0', 'R', 'P', 'G', 'AH0', 'L', 'K']
dubster ['D', 'HH', 'B', 'AH0', 'S', 'R', 'T', 'IH0']
klunglawough ['K', 'T', 'N', 'K', 'L', 'W', 'AH0', 'UW2', 'AO1']
tterdeeker ['T', 'S', 'D', 'ER0', 'Y', 'K', 'R']
snon ['N', 'S', 'AA2', 'N']
mfur ['M', 'F', 'ER0']
pae ['P', 'AH0', 'EH2']


## Phoneme features

The examples above use the phoneme as the basic unit of English phonetics. But each phoneme itself has characteristics, and many phonemes have characteristics in common. For example, the phoneme `/B/` has the following characteristics:

* *bilabial*: you put your lips together when you say it
* *stop*: airflow from the lungs is completely obstructed
* *voiced*: your vocal cords are vibrating while you say it

The phoneme `/P/` shares two out of three of these characteristics (it's *bilabial* and a *stop*, but is not voiced). The phoneme `/AE/`, on the other hand, shares *none* of these characteristics. Instead, it has these characteristics:

* *vowel*: your mouth doesn't stop or occlude airflow when making this sound
* *low*: your tongue is low in the mouth
* *front*: your tongue is advanced forward in the mouth
* *unrounded*: your lips are not rounded

These characteristics of phonemes are traditionally called "features." You can look up the features for particular phonemes using the `phone_feature_map` variable in Pincelate's `featurephone` module:

In [23]:
from pincelate.featurephone import phone_feature_map

For example, to get the features for the vowel `/UW/` (vowel sound in "toot"):

In [24]:
phone_feature_map['UW']

('hgh', 'bck', 'rnd', 'vwl')

The features are referred to here with short three-letter abbreviations. Here's a full list:

* `alv`: alveolar
* `apr`: approximant
* `bck`: back
* `blb`: bilabial
* `cnt`: central
* `dnt`: dental
* `fnt`: front
* `frc`: fricative
* `glt`: glottal
* `hgh`: high
* `lat`: lateral
* `lbd`: labiodental
* `lbv`: labiovelar
* `lmd`: low-mid
* `low`: low
* `mid`: mid
* `nas`: nasal
* `pal`: palatal
* `pla`: palato-alveolar
* `rnd`: rounded
* `rzd`: rhoticized
* `smh`: semi-high
* `stp`: stop
* `umd`: upper-mid
* `unr`: unrounded
* `vcd`: voiced
* `vel`: velar
* `vls`: voiceless
* `vwl`: vowel

Additionally, there are two special phoneme features:

* `beg`: beginning of word
* `end`: end of word

... which are found at the beginnings and endings of words.
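
Using these abbreviations, the comparison between `/B/`, `/P/` and `/AE/` from the start of this section can be written with Python sets. The small table below is typed in by hand from the descriptions above rather than read from Pincelate, so treat it as an illustration only.

```python
# Hand-typed features for three phonemes, following the descriptions above.
toy_features = {
    "B": {"blb", "stp", "vcd"},
    "P": {"blb", "stp", "vls"},
    "AE": {"vwl", "low", "fnt", "unr"},
}

# Set intersection gives the features two phonemes have in common.
shared_b_p = toy_features["B"] & toy_features["P"]
shared_b_ae = toy_features["B"] & toy_features["AE"]

print(sorted(shared_b_p))
print(sorted(shared_b_ae))
```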

Internally, Pincelate's model operates on these *phoneme features*, instead of directly on whole phonemes. This allows the model to capture and predict underlying similarities between phonemes.

Pincelate's `.phonemefeatures()` method works a lot like `.soundout()`, except instead of returning a list of phonemes, it returns a [numpy](https://numpy.org/) array of *phoneme feature probabilities*. This array has one row for each predicted phoneme, and one column for each phoneme feature, holding the probability (between 0 and 1) that the feature is a component of that phoneme. To illustrate, here I get the feature array for the word `cat`:

In [25]:
cat_feats = pin.phonemefeatures("cat")

This array has the following shape:

In [26]:
cat_feats.shape

(5, 32)

... which tells us that there are five predicted phonemes. (The `32` is the total number of possible features.) The word `cat`, of course, has only three phonemes (`/K AE T/`)—the extra two are the special "beginning of the word" and "end of the word" phonemes at the beginning and end, respectively.

### Examining predicted phoneme features

Let's look at the feature probabilities for the first phoneme (after the special "beginning of the word" token at index 0):

In [27]:
cat_feats[1]

array([1.09146349e-03, 2.64442605e-07, 1.21863934e-07, 4.83202034e-10,
       9.05446296e-09, 1.22701867e-05, 3.87583965e-09, 2.57467754e-08,
       3.10348078e-05, 1.12015674e-04, 1.80950082e-07, 1.23362767e-07,
       2.53906479e-10, 7.39023420e-09, 7.97239996e-10, 2.34866529e-05,
       1.97903224e-04, 2.42027836e-05, 1.02661081e-10, 3.20427802e-07,
       2.47776796e-07, 2.94954656e-08, 8.40330561e-09, 3.23472051e-07,
       9.99906063e-01, 6.39497084e-05, 1.69826153e-09, 2.14709959e-04,
       1.71634947e-05, 9.99885082e-01, 9.99977589e-01, 1.45659375e-04])

You can look up the index in this array associated with a particular phoneme feature using Pincelate's `.featureidx()` method:

In [28]:
cat_feats[1][pin.featureidx('vel')]

0.999885082244873

This tells us that the `vel` (velar) feature for this phoneme is predicted with almost 100% probability, which makes sense, since the phoneme we'd anticipate, `/K/`, is a voiceless velar stop.
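
One way to read a whole row at once is to keep only the features whose probability clears some threshold. The sketch below does that for a shortened, made-up row; the column order and the numbers are hypothetical, since the real column positions come from Pincelate's `.featureidx()`.

```python
# Hypothetical column order and probabilities, for illustration only.
feature_names = ["blb", "stp", "vcd", "vel", "vls", "vwl"]
row = [0.0001, 0.9999, 0.0002, 0.9998, 0.9999, 0.0001]

def active_features(probabilities, names, threshold=0.5):
    """Names of the features whose probability exceeds the threshold."""
    return [name for name, p in zip(names, probabilities) if p > threshold]

print(active_features(row, feature_names))
```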

### Example: Spelling from phoneme features

The Pincelate class has another method, `.spellfeatures()`, which works like `.spell()` except it takes an array of phoneme features (such as that returned from `.phonemefeatures()`) instead of a list of Arpabet phonemes. You can use this to re-spell phoneme feature arrays that you have manipulated. In the following cell, I get the phoneme feature probability array for the word `pug`, then overwrite the probability of the "voiced" feature for its first phoneme, then respell:

In [34]:
pug = pin.phonemefeatures("pug")
pug[1][pin.featureidx('vcd')] = 1
pin.spellfeatures(pug)

'bug'

Or, you can spell from completely random feature probabilities:

In [35]:
for i in range(12):
    print(pin.spellfeatures(np.random.uniform(0, 1, size=(12,32))))

fhuolesiiw
whoilliewsh
hwhultingiw
hhholtioshow
kkhollishowigh
kholthiechoog
khwallishiowe
klwolsioskoo
klolsinshoog
khullsiowivey
kwhalekishooe
khwolkiewschoo


... which (weirdly) seems like someone trying to imitate the sound of white noise.

Or, you might want to build up neologisms from scratch, specifying their phoneme features by hand. To do this, use Pincelate's `.vectorizefeatures()` method, passing it a list that contains one list of phoneme features for each phoneme. It returns a phoneme feature probability array that you can then send to `.spellfeatures()`.

In [36]:
bee = pin.vectorizefeatures([
    ['beg'], ['blb', 'stp', 'vcd'], ['hgh', 'fnt', 'vwl'], ['end']
])
pin.spellfeatures(bee)

'bie'
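
Conceptually, turning lists of feature names into an array means marking a 1 in the column for each named feature and a 0 everywhere else. The following is a simplified sketch of that encoding with a made-up, shortened feature list. It isn't Pincelate's implementation, and the real array has 32 columns in the model's own order; `toy_vectorize` is a hypothetical name.

```python
# Made-up, shortened feature vocabulary (the real model has its own order).
vocab = ["beg", "end", "blb", "stp", "vcd", "hgh", "fnt", "vwl"]
column = {name: i for i, name in enumerate(vocab)}

def toy_vectorize(feature_lists):
    """One row per phoneme, with 1.0 in the column of each named feature."""
    rows = []
    for features in feature_lists:
        row = [0.0] * len(vocab)
        for name in features:
            row[column[name]] = 1.0
        rows.append(row)
    return rows

bee_rows = toy_vectorize([
    ["beg"], ["blb", "stp", "vcd"], ["hgh", "fnt", "vwl"], ["end"]
])
for row in bee_rows:
    print(row)
```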

### Example: Resizing feature probability arrays

Once you have the phonetic feature probability arrays, you can treat them the same way you'd treat any other numpy array. One thing I like to do is use scipy's image manipulation functions to resample the phonetic feature arrays. This lets us use the same phonetic information to spell a shorter or longer word. In particular, `scipy.ndimage` has a handy [zoom](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.ndimage.interpolation.zoom.html) function that resamples an array and interpolates it. Normally you'd use this to resize an image, but nothing's stopping us from using it to resize our phonetic feature array.

First, import the function:

In [38]:
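# zoom is available directly from scipy.ndimage; the older
# scipy.ndimage.interpolation path is deprecated in recent SciPy releases
from scipy.ndimage import zoom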

Then get some phoneme feature probabilities:

In [39]:
feats = pin.phonemefeatures("alphabet")

Then resize with `zoom()`. The second parameter to `zoom()` is a tuple with the factor by which to scale the dimensions of the incoming array. We only want to scale along the first axis (i.e., the phonemes), keeping the second axis (i.e., the features) constant.

A shorter version of the word:

In [40]:
shorter = zoom(feats, (0.67, 1))
pin.spellfeatures(shorter)

'albe'

A longer version:

In [41]:
longer = zoom(feats, (2.0, 1))
pin.spellfeatures(longer)

'all-phabeate'
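
To see what resampling along the first axis means, here's a pure-Python sketch that stretches or shrinks a small array of rows using plain linear interpolation. Note that this is a simpler technique swapped in for clarity: scipy's `zoom()` uses spline interpolation by default, so its numbers will differ. `resample_rows` is a hypothetical helper, and the rows are made-up values.

```python
def resample_rows(rows, factor):
    """Resize a list of equal-length rows along the first axis only."""
    n_in = len(rows)
    n_out = max(1, round(n_in * factor))
    out = []
    for i in range(n_out):
        # Map the output index to a fractional position in the input.
        pos = i * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        t = pos - lo
        # Blend the two neighbouring rows, column by column.
        out.append([(1 - t) * a + t * b for a, b in zip(rows[lo], rows[hi])])
    return out

# Made-up array: 3 "phonemes" with 2 "features" each.
toy_feats = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

longer_rows = resample_rows(toy_feats, 2.0)
shorter_rows = resample_rows(toy_feats, 0.67)

print(len(longer_rows), len(shorter_rows))
```

The number of rows changes while the number of columns stays the same, which is what the `(factor, 1)` tuple asks `zoom()` to do.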

If you've downloaded this notebook and you're following along running the code, the following cell will create an interactive widget that lets you "stretch" and "shrink" the words that you type into the text box by dragging the slider.

In [42]:
import warnings
warnings.filterwarnings('ignore')
@interact(words="how to spell expressively", factor=(0.1, 4.0, 0.1))
def stretchy(words, factor=1.0):
    out = []
    for word in words.split():
        word = word.lower()
        vec = pin.phonemefeatures(word)
        if factor < 1.0:
            order = 3
        else:
            order = 0
        zoomed = zoom(vec, (factor, 1), order=order)
        out.append(pin.spellfeatures(zoomed))
    print(" ".join(out))

## Round-trip spelling manipulation

Pincelate actually consists of *two* models: one that knows how to sound out words based on how they're spelled, and another that knows how to spell words from sounds. Pincelate's `.manipulate()` method does a "round trip" re-spelling of a word, passing it through both models to arrive back at the original word. Try it out:

In [43]:
pin.manipulate("spelling")

'spelling'

On the surface, this isn't very interesting! You don't need Pincelate to tell you how to spell a word that you already know how to spell. But the `.manipulate()` method has a handful of parameters that allow you to mess around with the model's internal workings in fun and interesting ways. The first is the `temperature` parameter, which artificially increases or decreases the amount of randomness in the model's output probabilities.

### Spelling temperature

When the temperature is close to zero, the model will always pick the most likely spelling of the word at each step.

In [44]:
pin.manipulate("spelling", temperature=0.01)

'spelling'

As you increase the temperature to 1.0, the model starts picking values at random according to the underlying probabilities.

In [45]:
pin.manipulate("spelling", temperature=1.0)

'spelling'

At temperatures above 1.0, the model has a higher chance of picking from letters with lower probabilities, producing a more unlikely spelling:

In [46]:
pin.manipulate("spelling", temperature=1.5)

'spelliz'

At a high enough temperature, the model's spelling feels essentially random:

In [47]:
pin.manipulate("spelling", temperature=3.0)

'la-zwaslf'
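
A common way to implement temperature, and the one assumed for this sketch, is to raise each probability to the power `1 / temperature` and renormalize (equivalent to dividing the logits by the temperature before a softmax). Whether Pincelate does exactly this internally is an assumption here, so read `apply_temperature` as an illustration of the general idea, with a made-up distribution over three letters.

```python
def apply_temperature(probs, temperature):
    """Sharpen (temperature < 1) or flatten (temperature > 1) a distribution."""
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

# Made-up probabilities for three candidate letters.
probs = [0.7, 0.2, 0.1]

cold = apply_temperature(probs, 0.25)  # the likeliest option dominates
same = apply_temperature(probs, 1.0)   # unchanged
hot = apply_temperature(probs, 3.0)    # options become closer to equal

print([round(p, 3) for p in cold])
print([round(p, 3) for p in same])
print([round(p, 3) for p in hot])
```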

The following interactive widget lets you play with the `temperature` parameter:

In [163]:
@interact(s="getting hot in here", temp=(0.05, 2.5, 0.05))
def tempadjust(s, temp):
    return ' '.join([pin.manipulate(w.lower(), temperature=temp) for w in s.split()])

### Example: Manipulating letter frequencies

The `.manipulate()` method takes keyword arguments `letters` and `features`. The provided values are used to attenuate or emphasize the probability of the given letters and phonetic features at each step of the spelling and sounding out process. Specifically, the decoded probability of the given item (letter or feature) is raised to the power of `np.exp(n)`, where `n` is the provided value. A value of 0 has no effect; negative values increase the probability, and positive values decrease it. (A good range to try out is -10 to +10.)
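
The arithmetic described above can be sketched on a toy distribution. The description only says that the probability is raised to the power `np.exp(n)`; renormalizing afterwards so the values sum to 1 is an assumption of this sketch, and `reweight` is a hypothetical helper, not part of Pincelate.

```python
import math

def reweight(probs, adjustments):
    """Raise each probability to exp(n), then renormalize (an assumption)."""
    powered = {key: p ** math.exp(adjustments.get(key, 0.0))
               for key, p in probs.items()}
    total = sum(powered.values())
    return {key: value / total for key, value in powered.items()}

# Made-up probabilities for three candidate letters at one decoding step.
step_probs = {"e": 0.6, "i": 0.3, "a": 0.1}

suppressed = reweight(step_probs, {"e": 2})   # positive n: "e" becomes less likely
boosted = reweight(step_probs, {"a": -2})     # negative n: "a" becomes more likely
unchanged = reweight(step_probs, {"e": 0})    # n = 0: exponent is 1, no change

print({k: round(v, 3) for k, v in suppressed.items()})
print({k: round(v, 3) for k, v in boosted.items()})
```

Because probabilities are below 1, an exponent greater than 1 shrinks them and an exponent between 0 and 1 enlarges them, which is why positive values attenuate and negative values emphasize.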

For example, passing the following parameter for `letters` attenuates the probability that the model will decode the letter `e`. Instead, it finds another alternative:

In [23]:
pin.manipulate("spelling", letters={'e': 10})

'spilling'

You can include multiple key/value pairs. For example, to remove all vowels:

In [49]:
pin.manipulate("spelling", letters={'a': 10, 'e': 10, 'i': 10, 'o': 10, 'u': 10})

'sph-lyng'

### Example: Manipulating sounds

The `features` parameter does the same thing as `letters`, except it affects the probability of particular phonetic features being predicted at each step of the decoding. This lets you expressively add or remove phonetic features from a word. For example, adding in more nasalness:

In [50]:
pin.manipulate("spelling", features={'nas': -10})

'smnenging'

Adding some voice and frication and removing voicelessness:

In [76]:
pin.manipulate("spelling", features={'vcd': -5, 'frc': -5, 'vls': 10})

'sbezgizgs'

### Interactive manipulation tool

The code below builds an ipywidgets interface that lets you play around with the `.manipulate()` method:

In [77]:
import ipywidgets as widgets
from IPython.display import display
from ipywidgets import interact, interactive_output, Layout, HBox, VBox

In [78]:
def manipulate(instr="allison", temp=0.25, **kwargs):
    return ' '.join([
        pin.manipulate(
            w,
            letters={k: v*-1 for k, v in kwargs.items()
                  if k in pin.orth2phon.src_vocab_idx_map.keys()},
            features={k: v*-1 for k, v in kwargs.items()
                      if k in pin.orth2phon.target_vocab_idx_map.keys()},
            temperature=temp
        ) for w in instr.split()]
    )

In [79]:
orth_sliders = {}
phon_sliders = {}
for ch in pin.orth2phon.src_vocab_idx_map.keys():
    if ch in "'-.": continue
    orth_sliders[ch] = widgets.FloatSlider(description=ch,
                               continuous_update=False,
                               value=0,
                               min=-20,
                               max=20,
                               step=0.5,
                               layout=Layout(height="10px"))
for feat in pin.orth2phon.target_vocab_idx_map.keys():
    if feat in ("beg", "end", "cnt", "dnt"): continue
    phon_sliders[feat] = widgets.FloatSlider(description=feat,
                               continuous_update=False,
                               value=0,
                               min=-20,
                               max=20,
                               step=0.5,
                               layout=Layout(height="10px"))
instr = widgets.Text(description='input', value="spelling words with machine learning")
tempslider = widgets.FloatSlider(description='temp', continuous_update=False, value=0.3, min=0.01, max=5, step=0.05)
left_box = VBox(tuple(orth_sliders.values()) + (tempslider,))
right_box = VBox(tuple(phon_sliders.values()))
all_sliders = HBox([left_box, right_box])

out = interactive_output(lambda *args, **kwargs: print(manipulate(*args, **kwargs)),
                         dict(instr=instr, temp=tempslider, **orth_sliders, **phon_sliders))
out.layout.height = "100px"
display(VBox([all_sliders, instr]), out)

## Phoneme states

Pincelate gives you access to a particular intermediary value in the neural network: the hidden state of the phonetic encoder's RNN. This is a vector that holds what amounts to a "compressed" representation of the phonetic content of a given string of characters. You can get this vector for a given string using the `.phonemestate()` method, passing the string whose phoneme state you want to capture. This value can be used for any number of downstream tasks, such as calculating phonetic similarity between two words, or as an initial embedding value for training classifiers on phonetics.
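
For the similarity use case, cosine similarity is one common way to compare vectors like these. The sketch below uses short made-up vectors so it runs without the model; with Pincelate you would pass in the vectors returned by `.phonemestate()` instead. Picking cosine similarity is a choice made for this sketch, not something the library prescribes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional stand-ins for 256-dimensional phoneme states.
state_a = [0.9, 0.1, -0.3, 0.5]
state_b = [0.8, 0.2, -0.2, 0.4]    # nearly the same direction as state_a
state_c = [-0.7, 0.6, 0.4, -0.1]   # a very different direction

sim_ab = cosine_similarity(state_a, state_b)
sim_ac = cosine_similarity(state_a, state_c)

print(round(sim_ab, 3))
print(round(sim_ac, 3))
```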

We're going to use it for something that's a little bit more fun: generating nonsense words. Pincelate also supplies a method `.spellstate()` that takes the phoneme state and passes it to the orthographic decoder. So you can take an arbitrary value and spell a word from it!

### Example: Interpolation (blending words)

The first example of this I'll show is *interpolating* between two words. To do this, calculate the phoneme state for both words in a pair, then take the average of the two (essentially the midpoint of the line that connects the two vectors), then decode from that vector using `.spellstate()`. The code below demonstrates:

In [80]:
pairs = [('paper', 'plastic'),
         ('kitten', 'puppy'),
         ('birthday', 'anniversary'),
         ('artificial', 'intelligence'),
         ('allison', 'parrish'),
         ('moses', 'middletown'),
         ('day', 'night'),
         ('january', 'december')]

In [121]:
for start_s, end_s in pairs:
    start = pin.phonemestate(start_s)
    end = pin.phonemestate(end_s)
    mid = (start + end) / 2
    mid_s = pin.spellstate(mid)
    print(" → ".join([start_s, mid_s, end_s]))

paper → palper → plastic
kitten → cuptie → puppy
birthday → artherday → anniversary
artificial → interifical → intelligence
allison → aarishen → parrish
moses → midelsown's → middletown
day → night → night
january → denceure → december
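
Averaging is just the halfway point of a more general linear interpolation, so you can also step gradually from one state to the other. Here's a sketch with short made-up vectors; `lerp` is a hypothetical helper, and with the model you'd feed each intermediate vector to `.spellstate()`.

```python
def lerp(start_vec, end_vec, t):
    """Point a fraction t of the way from start_vec to end_vec."""
    return [(1 - t) * a + t * b for a, b in zip(start_vec, end_vec)]

# Made-up 3-dimensional stand-ins for phoneme states.
toy_start = [0.0, 2.0, -1.0]
toy_end = [1.0, 0.0, 3.0]

# Five evenly spaced steps; the middle one (t=0.5) is the plain average.
steps = [lerp(toy_start, toy_end, i / 4) for i in range(5)]
for vec in steps:
    print(vec)
```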


### Example: Phonetic resizing of texts

If you calculate the phoneme state for each word in a text, you essentially have a time series, sort of like samples in an audio file, or even like pixels in an image. Many of `scipy`'s image and audio manipulation functions work on 256-dim data as well as they work on 1d audio samples or 3d pixels, like the `zoom` function that we used above. In the cell below, I've made a small interactive widget for "zooming" in on a text—i.e., calculating the phoneme states for each word in a text, then resampling that array so it's longer, then decoding from the resampled array. In the process, you introduce interpolated data points, whose corresponding spellings seem to "smear" sound:

In [162]:
@interact(s="phonetic resizing is fun",
          factor=widgets.FloatSlider(
              # start above zero: a factor of 0 would resize the text to nothing
              min=0.1, max=4, step=0.1, value=1.0,
              continuous_update=False))
def resizer(s, factor=1.0):
    orig = np.array(
        [pin.phonemestate(tok.lower()) for tok in s.split()]
    )
    resized = zoom(orig, (factor, 1), order=4)
    return " ".join([pin.spellstate(vec) for vec in resized])

### Example: Phonetic variation

You can create random words by decoding from a random phoneme state:

In [144]:
for i in range(12):
    print(pin.spellstate(np.random.randn(256)))

am-zows
soche
tron
utsties
ixi
rair's
liufs
gille
x.
yoove
froos
beas's


However, you don't get a lot of variation in words generated this way, since the distribution of the RNN hidden state doesn't match any particular random distribution. So there are large parts of the "space" of possible words that will never be produced by picking numbers from (e.g.) a normal distribution, as above.

A better strategy is to pick a word, and then generate variations on it by adding random noise to its phoneme state:

In [139]:
s = "python"
state = pin.phonemestate(s)
for i in range(12):
    # add normal noise with the same number of dims as the state
    noisy = state + np.random.randn(state.shape[0])
    print(pin.spellstate(noisy))

t-thine
pyt
pytin
peyhehthon
pitheume
phohthen
pathane
ptthengens
phyhthen
peiturenen
piththen
petheram


In the following cell, I've written a little interactive widget that applies the same noise to each word in the input. You can adjust the multiplier on the noise to make the words more or less close to the original state, and pick the random seed to use when generating the noise.

In [160]:
@interact(s="how doth the little crocodile improve his shining tail",
          multiplier=widgets.FloatSlider(
              min=0, max=2.0, step=0.1, value=1.0, continuous_update=False),
          seed=widgets.IntSlider(min=0, max=10000, continuous_update=False))
def noisytext(s, multiplier, seed):
    # seed the generator so the same seed always produces the same noise
    np.random.seed(seed)
    noise = np.random.randn(256)*multiplier
    out = []
    for w in s.split():
        out.append(pin.spellstate(pin.phonemestate(w)+noise))
    print(" ".join(out))
