
Sentence_AutoEncoder

Sentence AutoEncoder using Keras

Preprocessing Dataset

The text preprocessing filters can be edited in the following method of the TextProcessor class:

class TextProcessor:
    def get_word_list(self, sent):
        # Example implementation: edit the filters here
        # (e.g. lowercasing, stripping punctuation) before
        # splitting the sentence into tokens.
        sent = sent.lower()
        return sent.split()  # list of words
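
A quick usage check, assuming the example filters sketched above:

processor = TextProcessor()
processor.get_word_list("Hello World!")   # -> ['hello', 'world!']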

The dataset can be loaded in any of three ways:

  1. From a list of files

    tokenizer = Tokenizer()
    data = tokenizer.process_text(file_names=<list of filenames>)
  2. From a single large text

    tokenizer = Tokenizer()
    data = tokenizer.process_text(text=<text string>)
  3. Using a custom function

    You can write your own function that uses the Tokenizer object (see the sketch after this list).
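
A minimal sketch of option 3, relying only on the Tokenizer and the process_text(file_names=...) call shown above; load_from_directory and the *.txt glob pattern are hypothetical:

import glob

def load_from_directory(dir_path):
    # Collect plain-text files and feed them to the project's Tokenizer.
    tokenizer = Tokenizer()
    files = glob.glob(dir_path + "/*.txt")
    data = tokenizer.process_text(file_names=files)
    return data, tokenizer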

Make sure to reshape the data into the shape (batch_size, seq_length).
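
If data comes back as a list of integer-encoded sentences of varying length, one way to get that shape is to pad the sequences (a sketch; seq_length = 20 and the padding settings are assumptions):

from keras.preprocessing.sequence import pad_sequences

seq_length = 20                                        # assumed maximum sentence length
data = pad_sequences(data, maxlen=seq_length, padding="post")
# data is now an integer array of shape (batch_size, seq_length)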

You can also save the tokenizer as a pickle file so the same vocabulary can be reused later.
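
For example, with the standard pickle module (the file name tokenizer.pkl is an assumption):

import pickle

# Save the fitted tokenizer for later reuse
with open("tokenizer.pkl", "wb") as f:
    pickle.dump(tokenizer, f)

# Load it back when encoding new sentences later
with open("tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)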

AutoEncoder Model

To load the encoder model and generate latent representations:

import keras

encoder = keras.models.load_model("sentence_encoder.h5")
encoder.predict(<array of shape (batch_size, seq_length)>)
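
For instance, with a dummy batch of token ids (batch size 32, sequence length 20, and vocabulary size 1000 are assumptions; the latent dimension depends on how the autoencoder was built):

import numpy as np
import keras

encoder = keras.models.load_model("sentence_encoder.h5")
batch = np.random.randint(0, 1000, size=(32, 20))   # (batch_size, seq_length)
latent = encoder.predict(batch)
print(latent.shape)                                 # e.g. (32, latent_dim)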
