
Sentence_AutoEncoder

Sentence AutoEncoder using Keras

Preprocessing Dataset

The text preprocessing filters can be edited in the following method of the TextProcessor class:

class TextProcessor:
    def get_word_list(self, sent):
        # Example implementation: edit the filters here
        # (e.g. lowercasing, stripping punctuation) before
        # splitting the sentence into tokens.
        sent = sent.lower()
        return sent.split()  # list of words
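
A quick usage check, assuming the example filters sketched above:

processor = TextProcessor()
processor.get_word_list("Hello World!")   # -> ['hello', 'world!']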

The dataset can be loaded in any of three ways:

  1. From a list of files

    tokenizer = Tokenizer()
    data = tokenizer.process_text(file_names=<list of filenames>)
  2. From a single large text

    tokenizer = Tokenizer()
    data = tokenizer.process_text(text=<text string>)
  3. Using a custom function

    You can write your own function that uses the Tokenizer object (see the sketch after this list).
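
A minimal sketch of option 3, relying only on the Tokenizer and the process_text(file_names=...) call shown above; load_from_directory and the *.txt glob pattern are hypothetical:

import glob

def load_from_directory(dir_path):
    # Collect plain-text files and feed them to the project's Tokenizer.
    tokenizer = Tokenizer()
    files = glob.glob(dir_path + "/*.txt")
    data = tokenizer.process_text(file_names=files)
    return data, tokenizer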

Make sure to reshape the data into the shape (batch_size, seq_length).
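
If data comes back as a list of integer-encoded sentences of varying length, one way to get that shape is to pad the sequences (a sketch; seq_length = 20 and the padding settings are assumptions):

from keras.preprocessing.sequence import pad_sequences

seq_length = 20                                        # assumed maximum sentence length
data = pad_sequences(data, maxlen=seq_length, padding="post")
# data is now an integer array of shape (batch_size, seq_length)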

You can also save the tokenizer as a pickle file so the same vocabulary can be reused later.
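
For example, with the standard pickle module (the file name tokenizer.pkl is an assumption):

import pickle

# Save the fitted tokenizer for later reuse
with open("tokenizer.pkl", "wb") as f:
    pickle.dump(tokenizer, f)

# Load it back when encoding new sentences later
with open("tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)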

AutoEncoder Model

To load the encoder model and generate latent representations:

import keras

encoder = keras.models.load_model("sentence_encoder.h5")
encoder.predict(<array of shape (batch_size, seq_length)>)
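
For instance, with a dummy batch of token ids (batch size 32, sequence length 20, and vocabulary size 1000 are assumptions; the latent dimension depends on how the autoencoder was built):

import numpy as np
import keras

encoder = keras.models.load_model("sentence_encoder.h5")
batch = np.random.randint(0, 1000, size=(32, 20))   # (batch_size, seq_length)
latent = encoder.predict(batch)
print(latent.shape)                                 # e.g. (32, latent_dim)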
