<a href="https://colab.research.google.com/github/Akilesh1989/generating-baby-names/blob/main/Generating_baby_names.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PURPOSE
To generate baby names all by yourself. This notebook(that is what this is called) is designed for someone with absolutely no programming experience. 
All you have to do is follow a simple instructions here to try this notebook for yourself. I hope you have some fun.

## Inputs
**Input 1:** First off, you need to provide some names. The program will look at these names and build a model using it. There are 49 names that I have already added. To add a name, scroll down and in the next cell you will see a list of names. Simply add a name at the end of the list just before the square bracket ends. 
For example, if you want to add the name "Brutus" to the list of names, you can do so like this,
```
'yazhini',
'Brutus',
]
```
If you are adding more than one name, you will have to add a comma after each name. For example, if you want to add 'Cassius', 'Sam' and 'Gopal' to the list, you can do so, like,
```
'yazhini',
'Cassius',
'Sam',
'Gopal',
]
```
Please note that every name you add should be enclosed with single quotes and the comma should be after the single quote. To remove a name, simply remove that line from the list. You can remove the entire list and replace it with your own list, but be sure to add the single quotes and commas.
The model performs decently if you give it 50 names, more the merrier.

**Input 2:** The second input you will need to provide is to tell the model if you want your name to start with a specific letter and specific second letter. For example, if you want the predicted names to start with "a" followed by "k", then set,
```
first_letter = 'a'
second_letter = 'k'
```
If you only want the first letter, then,
```
first_letter = 'a'
second_letter = None
```
If you do not want any conditions, then,
```
first_letter = None
second_letter = None
```
**Note:**: You **cannot** set only the second letter. You **CANNOT** do this, atleast for the time being.
```
first_letter = None
second_letter = 'k'
```

**Input 3:** The third input is the minimum length of the name. The default is 2.

**Input 4:** The fourth input will be the number of names you will want to see in the output. The default is 10. You can modify that by modifying the `num_names` in the cell below. The default has been set to 10. Sometimes, in the output, the number of names you see will be less than the number you provide here. This is because the algorithm cross checks the generated name with the names you provided in the first cell and shows you only the name that is unique to what the model generated.

**Input 5:** The fifth input is slightly techinical, but nothing serious. `epochs` are the number of times the model should iterate to train the model. Higher the epoch, the better the model will be trained. But too high an epoch the model might not perform well. Welcome to my world. An epoch of around 2000 was found to be doing well. Play with this parameter to see which gives you the best results.

### RUNNING THE PROGRAM
Once you are done entering the details, click on "Runtime" in the menu bar and click on "Restart and run all".
Scroll all the way to the bottom of this notebook to see your results.

Hope you find a name that you like. Good luck. :)

In [1]:
names = [
'adhi',
 'agira',
 'amirtha',
 'amutha',
 'ananthi',
 'anbarasi',
 'arulmozhi',
 'bagyam',
 'dharuna',
 'ezhil',
 'gnalam',
 'indumathi',
 'jeevitha',
 'kadal',
 'kalai',
 'kalaiselvi',
 'kamatchi',
 'kanimozhi',
 'kanmani',
 'kavitha',
 'kayalvizhi',
 'letchumi',
 'malarvizhi',
 'mani',
 'manimekalai',
 'manimozhi',
 'mathivathani',
 'mekala',
 'nagai',
 'navanithy',
 'nithura',
 'oviya',
 'praveena',
 'ranjitham',
 'sankavi',
 'saraniya',
 'tamilarasi',
 'tamilselvi',
 'thangam',
 'thenmozhi',
 'vennila',
 'vinitha',
 'yazhini',
]

In [2]:
# The characters should be enclosed in single quotes.
# If you do not want to assign a first or second letter, replace 'a' or 'k' with None
first_letter = 'a'
second_letter = 'r'
minimum_name_length = 3 # The default value is 2
num_names = 30 # change this to the desired number of names you want to see in the output.
epochs = 1000

# CODE ALERT
You would not want to expand this.

In [3]:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Bidirectional
from keras.callbacks import LambdaCallback

unique_names = list(set(names))
names = []
for name in unique_names:
  if name.endswith("."):
    names.append(name.lower())
  else:
    name = name + '.'
    names.append(name.lower())

unique_characters = []
for name in names:
  for character in name:
    if character not in unique_characters:
      unique_characters.append(character)

unique_characters = sorted(unique_characters)

# Convert from character to index
char_to_index = dict( (unique_characters[i], i) for i in range(0, len(unique_characters)))

# Convert from index to character
index_to_char = dict( (i, unique_characters[i]) for i in range(0, len(unique_characters)))

# maximum number of characters in Pokémon names
# this will be the number of time steps in the RNN
max_char = len(max(names, key=len))

# number of elements in the list of names, this is the number of training examples
m = len(names)

# number of potential characters, this is the length of the input of each of the RNN units
char_dim = len(char_to_index)
X = np.zeros((m, max_char, char_dim))
Y = np.zeros((m, max_char, char_dim))

for i in range(m):
    name = list(names[i])
    for j in range(len(name)):
        X[i, j, char_to_index[name[j]]] = 1
        if j < len(name)-1:
            Y[i, j, char_to_index[name[j+1]]] = 1


model = Sequential([
          LSTM(128, input_shape=(max_char, char_dim), return_sequences=True),
          # LSTM(128, input_shape=(max_char, char_dim), return_sequences=True),
          Dense(char_dim, activation='softmax')
])

model.compile(loss='categorical_crossentropy', optimizer='adam')

def make_name(model):
    name = []
    x = np.zeros((1, max_char, char_dim))
    end = False
    i = 0
    
    while end==False:
      predicted = model.predict(x)
      predicted_sliced = predicted[0,i]
      probs = list(predicted_sliced)
      probs = probs / np.sum(probs)
      index = np.random.choice(range(char_dim), p=probs)
      if i == max_char-2:
          character = '.'
          end = True
      else:
          character = index_to_char[index]
      name.append(character)
      x[0, i+1, index] = 1
      i += 1
      if character == '.':
          end = True
    
    name = [char for char in name if char != "."]
    # print("".join(name))
    return name

def generate_name_loop(epoch, _):
    if epoch % 25 == 0:
        print('Names generated after epoch %d:' % epoch)
        for i in range(5):
            make_name(model)
        
name_generator = LambdaCallback(on_epoch_end = generate_name_loop)
## Training the model
model.fit(X, Y, batch_size=64, epochs=epochs, verbose=0)

def make_name_with_starting_letter(model, starting_letter, second_letter):
    name = []
    x = np.zeros((1, max_char, char_dim))
    if first_letter:
      x[0, 0, char_to_index[starting_letter]] = 1
      first_character = index_to_char[char_to_index[starting_letter]]
      name.append(first_character)
    if second_letter:
      x[0, 1, char_to_index[second_letter]] = 1
      second_character = index_to_char[char_to_index[second_letter]]
      name.append(second_character)
    end = False
    i = 0
    
    while end==False:
      predicted = model.predict(x)
      predicted_sliced = predicted[0,i]
      probs = list(predicted_sliced)
      probs = probs / np.sum(probs)
      index = np.random.choice(range(char_dim), p=probs)
      if i == max_char-2:
          character = '.'
          end = True
      else:
          character = index_to_char[index]
      name.append(character)
      x[0, i+1, index] = 1
      i += 1
      if character == '.':
          end = True    
    name = [char for char in name if char != "."]
    return name

generated_names = []
while True:
  name = make_name_with_starting_letter(model, first_letter, second_letter)
  if len(name) < minimum_name_length:
    continue
  name = ''.join(name)
  name = name + "."
  if name not in names:
    generated_names.append(name)
  if len(generated_names) < num_names:
    continue
  else:
    break

unique_generated_names = list(set(generated_names))
for name in unique_generated_names:
  print(name.replace(".", ""))

arnulaozhi
arnanthi
arnuthuzhi
armankka
armultaa
arnanthe
ardali
armantha
armulthala
ardulm
armultama
arkalli
arduli
ardani
arganya
armanjeha
armuthaa
ardulmt
armuthama
arrulmozhi
armultoa
