### Text generation using TensorFlow

 - We are going to build a neural network that learns the patterns in a corpus of text. Given a new piece of text, called a seed, it predicts the words that come next.

##### Step 1: Turning sentences into input sequences

- Training a neural network on sequences requires, for each example, a feature (the words seen so far) and a label (the word that comes next).
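To make the feature/label idea concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper name `make_training_pairs` and the toy sentence are illustrative assumptions, not part of the notebook: every prefix of a sentence becomes a feature, and the word that follows it becomes the label.

```python
def make_training_pairs(words):
    """Return (feature, label) pairs: each prefix predicts the word after it."""
    pairs = []
    for i in range(1, len(words)):
        feature = words[:i]   # everything seen so far
        label = words[i]      # the word the network should predict
        pairs.append((feature, label))
    return pairs


sentence = "the walk was empty".split()
for feature, label in make_training_pairs(sentence):
    print(feature, "->", label)
```

A sentence of n words yields n - 1 training examples, which is why even a short corpus produces a usable number of samples.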

In [1]:
import tensorflow as tf
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Bidirectional, Dense
import numpy as np




In [5]:
tokenizer = Tokenizer()

data = " Ah, yes, note must be made of the first oddity of this dreadful May evening. \nThere was not a single person to be seen, not only by the stand,but also along the whole walk parallel to Malaya Bronnaya Street. \nAt that hour when it seemed no longer possible to breathe, when the sun, having scorched Moscow, was collapsing in a dry haze somewhere beyond Sadovoye Ring, no one came under the lindens, no one sat on a bench, the walk was empty. \n‘Give us seltzer,’ Berlioz asked. \n‘There is no seltzer,’ the woman in the stand said, and for some reason became offended. \n ‘Is there beer?’ Homeless inquired in a rasping voice. \n ‘Beer’ll be delivered towards evening,’ the woman replied. \n ‘Then what is there?’ asked Berlioz. \n ‘Apricot soda, only warm,’ said the woman. \n‘Well, let’s have it, let’s have it! ...’ \nThe soda produced an abundance of yellow foam, and the air began to smell of a barber-shop. \nHaving finished drinking, the writers immediately started to hiccup, paid, and sat down on a bench face to the pond and back to Bronnaya. \nHere the second oddity occurred, touching Berlioz alone. He suddenly stopped hiccupping, his heart gave a thump and dropped away some- where for an instant, then came back, but with a blunt needle lodged in it. \n Besides that, Berlioz was gripped by fear, groundless, yet so strong that he wanted to flee the Ponds at once without looking back. \nBerlioz looked around in anguish, not understanding what had frightened him. \nHe paled, wiped his forehead with a handkerchief, thought: \n‘What’s the matter with me? This has never happened before. \nMy heart’s acting up... I’m overworked... \nMaybe it’s time to send it all to the devil and go to Kislovodsk...’ \n And here the sweltering air thickened before him, and a transparent citizen of the strangest appearance wove himself out of it. \nA peaked jockey’scap on his little head, a short checkered jacket also made of air. 
\n...A citizen seven feet tall, but narrow in the shoulders, unbelievably thin, and, kindly note, with a jeering physiognomy. \nThe life of Berlioz had taken such a course that he was unaccustomed to extraordinary phenomena. \nTurning paler still, he goggled his eyes and thought in consternation: \n‘This can’t be!...’ \nBut, alas, it was, and the long, see-through citizen was swaying before him to the left and to the right without touching the ground. \nHere terror took such possession of Berlioz that he shut his eyes. \nWhen he opened them again, he saw that it was all over, the phantasm had dissolved, the checkered one had vanished, and with that the blunt needle had popped out of his heart. \n‘Pah, the devil!’ exclaimed the editor. ‘You know, Ivan, I nearly had heat stroke just now! \nThere was even something like a hallucination...’ \nHe attempted to smile, but alarm still jumped in his eyes and his hands trembled. \nHowever, he gradually calmed down, fanned himself with his handkerchief and, having said rather cheerfully: \n‘Well, and so...’ went on with the conversation interrupted by their soda-drinking. \nThis conversation, as was learned afterwards, was about Jesus Christ.The thing was that the editor had commissioned from the poet a long anti-religious poem for the next issue of his journal. \nIvan Nikolaevich had written this poem, and in a very short time, but unfortunately the editor was not at all satisfied with it. \nHomeless had portrayed the main character of his poem - that is, Jesus - in very dark colours, but nevertheless the whole poem, in the editor’s opinion, had to be written over again. \nAnd so the editor was now giving the poet something of a lecture on Jesus, with the aim of underscoring the poet’s essential error. 
\nIt is hard to say what precisely had let Ivan Nikolaevich down - the descriptive powers of his talent or a total unfamiliarity with the question he was writing about - but his Jesus came out, well, completely alive, the once-existing Jesus, though, true, a Jesus furnished with all negative features.\nNow, Berlioz wanted to prove to the poet that the main thing was not how Jesus was, good or bad, but that this same Jesus, as a person, simply never existed in the world, and all the stories about him were mere fiction, the most ordinary mythology. \nIt must be noted that the editor was a well-read man and in his conversation very skillfully pointed to ancient historians - for instance, the famous Philo of Alexandria  and the brilliantly educated Flavius Josephus 7 - who never said a word about the existence of Jesus. Displaying a solid erudition, Mikhail Alexandrovich also informed the poet, among other things, that the passage in the fifteenth book of Tacitus’s famous Annals, the forty-fourth chapter, where mention is made of the execution of Jesus, was nothing but a later spurious interpolation."

corpus = data.lower().split("\n")
tokenizer.fit_on_texts(corpus)
total_words = len(tokenizer.word_index)+1
print(total_words)

396
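The word index built by `fit_on_texts` can be approximated in plain Python. This is a simplified sketch, not the Keras implementation: the helper `build_word_index`, the regular expression used for splitting, and the toy corpus are all assumptions made for illustration. It shows the two ideas that matter here: more frequent words get smaller indices, and index 0 is left free, which is why `total_words` is the vocabulary size plus one.

```python
from collections import Counter
import re


def build_word_index(lines):
    """Map each word to an integer, most frequent word first, starting at 1."""
    counts = Counter()
    for line in lines:
        # Simplification: treat runs of letters and apostrophes as words.
        counts.update(re.findall(r"[a-z']+", line.lower()))
    # Index 0 is left unused so it can later stand for padding.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


toy_corpus = ["no one came under the lindens", "no one sat on a bench"]
word_index = build_word_index(toy_corpus)
total_words = len(word_index) + 1  # +1 accounts for the reserved index 0
print(word_index["no"], word_index["one"], total_words)
```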


##### Step 2: Splitting the sentences into smaller sequences

In [6]:
input_sequences = []
for line in corpus:
    token_list = tokenizer.texts_to_sequences([line])[0]
    for i in range(1,len(token_list)):
        n_gram_sequence = token_list[:i+1]
        input_sequences.append(n_gram_sequence)

print(input_sequences[:5])

[[108, 109], [108, 109, 58], [108, 109, 58, 59], [108, 109, 58, 59, 18], [108, 109, 58, 59, 18, 33]]


- Now that we have the input sequences, we pre-pad them with zeros so that they all have the same length.

In [7]:
# Find the length of the longest n-gram sequence
max_sequence_len = max(len(x) for x in input_sequences)

# Pre-pad with zeros so every sequence has the same length
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_sequence_len, padding='pre'))
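What pre-padding does can be sketched without Keras. The function `pre_pad` below is a hypothetical stand-in written for illustration, and the toy sequences reuse the first few outputs printed earlier. Zeros go on the left so that the real tokens, and in particular the last one, always sit at the right edge of the row.

```python
def pre_pad(sequences, maxlen, value=0):
    """Left-pad each sequence with `value`; keep only the last `maxlen` items."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]  # drop the oldest tokens if the sequence is too long
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


toy_sequences = [[108, 109], [108, 109, 58], [108, 109, 58, 59]]
longest = max(len(seq) for seq in toy_sequences)
for row in pre_pad(toy_sequences, longest):
    print(row)
```

Keeping the last token in a fixed position is what makes the next step simple: the label can be read from the final column of every row.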


##### Step 3: Splitting the input sequences into features and labels

In [8]:
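# The last token of each sequence is the label; everything before it is the feature
xs, labels = input_sequences[:, :-1], input_sequences[:, -1]

# One-hot encode the labels so they can be used with categorical cross-entropy
ys = tf.keras.utils.to_categorical(labels, num_classes=total_words)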
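The split and the one-hot encoding can be illustrated on a tiny padded matrix in plain Python. The helpers `split_features_labels` and `one_hot`, and the toy numbers, are assumptions for illustration; they mirror what the slicing and `to_categorical` do above, but are not the library code.

```python
def split_features_labels(padded):
    """Last token of each row is the label; the rest is the feature."""
    xs = [row[:-1] for row in padded]
    labels = [row[-1] for row in padded]
    return xs, labels


def one_hot(label, num_classes):
    """Return a list of zeros with a single 1 at position `label`."""
    vector = [0] * num_classes
    vector[label] = 1
    return vector


padded = [[0, 0, 4, 2], [0, 4, 2, 1], [4, 2, 1, 3]]
xs, labels = split_features_labels(padded)
ys = [one_hot(label, num_classes=5) for label in labels]
print(xs[0], labels[0], ys[0])
```

Each one-hot vector has one slot per vocabulary entry, which matches the softmax output layer used in the model.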

##### Step 4: Create a model

In [6]:
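model = Sequential()
# Map each word index to an 8-dimensional dense vector
model.add(Embedding(total_words, 8))
# Bidirectional LSTM; the number of units is set to the input length here
model.add(Bidirectional(LSTM(max_sequence_len - 1)))
# One output probability per word in the vocabulary
model.add(Dense(total_words, activation='softmax'))

# Compile the model (metrics are passed as a list)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train for 1000 epochs, since the dataset is small
history = model.fit(xs, ys, epochs=1000, verbose=1)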

Epoch 1/1000
Epoch 2/1000
...
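To get a feel for the size of this architecture, the trainable parameters can be counted by hand. The sketch below uses the standard formulas for Embedding, LSTM, and Dense layers; the function name `count_parameters` is hypothetical, and `units=20` is an assumed value, because `max_sequence_len` is never printed in the notebook.

```python
def count_parameters(vocab_size, embedding_dim, units):
    """Trainable parameters of Embedding -> Bidirectional(LSTM) -> Dense."""
    embedding = vocab_size * embedding_dim
    # Each LSTM direction has 4 weight blocks (three gates plus the cell
    # candidate), each with input weights, recurrent weights, and a bias.
    lstm_one_direction = 4 * (units * (embedding_dim + units) + units)
    bidirectional = 2 * lstm_one_direction
    # The two directions are concatenated, so Dense sees 2 * units inputs.
    dense = (2 * units) * vocab_size + vocab_size
    return {
        "embedding": embedding,
        "bidirectional_lstm": bidirectional,
        "dense": dense,
        "total": embedding + bidirectional + dense,
    }


# vocab_size=396 and embedding_dim=8 come from the notebook; units=20 is
# an assumed value for illustration only.
print(count_parameters(vocab_size=396, embedding_dim=8, units=20))
```

Under these assumptions most of the parameters sit in the final Dense layer, since it has one output per vocabulary word. Tying the LSTM size to `max_sequence_len - 1` is a choice made in this notebook; the unit count is a free hyperparameter.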

#### Predicting the next word

In [7]:
seed_text = 'it seemed no longer possible'

# Tokenize the seed text
token_list = tokenizer.texts_to_sequences([seed_text])[0]

# Pre-pad to the input length the model was trained on
token_list = pad_sequences([token_list], maxlen=max_sequence_len - 1, padding='pre')

# Pick the index of the most probable next word
predicted = np.argmax(model.predict(token_list), axis=-1)
print(predicted)

[5]


In [9]:
# Look up the word whose index matches the prediction
for word, index in tokenizer.word_index.items():
    if index == predicted[0]:
        print(word)
        break

to
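Scanning `word_index` on every prediction works, but the reverse mapping can also be built once and reused. The sketch below uses a toy vocabulary (only the pair `"to": 5` is taken from the output above; the other entries are made up for illustration). The Keras `Tokenizer` also keeps such a mapping in its `index_word` attribute, which could be used instead of the loop.

```python
# Toy vocabulary: only "to" -> 5 matches the notebook output, the rest is assumed.
word_index = {"the": 1, "a": 2, "of": 3, "and": 4, "to": 5}

# Invert once, then every lookup is a single dictionary access.
index_word = {index: word for word, index in word_index.items()}

predicted_index = 5
print(index_word.get(predicted_index, ""))  # empty string if the index is unknown
```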


In [10]:
# Training took a while, so it is better to save the model locally

model.save('/media/danlof/dan files/data_science_codes/project_3.2/text_gen_model.h5')



In [9]:
# Load the saved model
model = tf.keras.models.load_model('/media/danlof/dan files/data_science_codes/project_3.2/text_gen_model.h5')

In [12]:
# Compound predictions to generate text: append each predicted word to the seed

seed_text = 'it seemed no longer possible'
next_words = 10

for _ in range(next_words):
    token_list = tokenizer.texts_to_sequences([seed_text])[0]
    token_list = pad_sequences([token_list], maxlen=max_sequence_len - 1, padding='pre')
    predicted = model.predict(token_list, verbose=0)[0]
    predicted_index = np.argmax(predicted)
    output_word = ""

    # Reverse lookup: find the word for the predicted index
    for word, index in tokenizer.word_index.items():
        if index == predicted_index:
            output_word = word
            break
    seed_text += " " + output_word

print(seed_text)

it seemed no longer possible to breathe when the sun having scorched moscow was collapsing
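The generation loop is a greedy procedure: predict one word, append it, and feed the longer text back in. The sketch below isolates that logic in plain Python. The names `generate`, `toy_next_word`, and `toy_table` are hypothetical, and the lookup table is a stand-in for `model.predict` followed by `argmax`, not a trained model. The `window` argument mimics the fact that `pad_sequences` truncates from the front by default, so once the text is longer than the model input only the most recent tokens are used.

```python
def generate(seed_words, next_word_fn, num_words, window):
    """Greedy generation: repeatedly append the predicted next word."""
    words = list(seed_words)
    for _ in range(num_words):
        context = words[-window:]  # only the most recent words fit the input
        words.append(next_word_fn(context))
    return words


# Hypothetical stand-in for model.predict + argmax: a fixed lookup table
# keyed on the last word of the context.
toy_table = {"possible": "to", "to": "breathe", "breathe": "when", "when": "the"}


def toy_next_word(context):
    return toy_table.get(context[-1], "the")


result = generate("it seemed no longer possible".split(), toy_next_word, 4, window=3)
print(" ".join(result))
```

Because the model was trained on a very small corpus for many epochs, the real output above largely reproduces the training text. Greedy selection always takes the single most probable word, so the same seed always gives the same continuation.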
