jrspatel/word_embeddings

Text embeddings are essential for any project that relies on textual context, because computers operate on numbers rather than words. This project explores converting text into machine-readable vectors using techniques such as TF-IDF, one-hot encoding, GloVe, and BERT. Here, I implemented the Word2Vec model.
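
For orientation, the sketch below shows what training such a model looks like with the gensim library. This is purely illustrative: this repository implements the model itself rather than calling gensim, and the toy corpus and dimensions are made-up assumptions.

```python
# Illustrative only: training a skip-gram Word2Vec model with gensim.
from gensim.models import Word2Vec

# A made-up one-sentence corpus; real training needs far more text.
sentences = [["this", "is", "a", "word", "embeddings", "project"]]

# sg=1 selects the skip-gram architecture; window=2 matches the example below.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv["embeddings"])         # the learned 50-dimensional vector
print(model.wv.most_similar("word"))  # nearest neighbours in embedding space
```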

EXAMPLE : { Skip-gram model }

SENTENCE = "This is a word embeddings project"

words (after noise removal) -> {word, embeddings, project}
window size = 2

central (target) word = {embeddings}
   (target, context) pairs, taking up to 2 words on each side of the target in the raw sentence:
{embeddings, word}
{embeddings, project}
{embeddings, a}
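
These pairs can be generated mechanically by sliding the window over the sentence. Below is a minimal Python sketch; the function name skipgram_pairs is illustrative and not part of this repository.

```python
# A sketch of (target, context) pair generation for the example above.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        # context = up to `window` words on each side of the target word
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

tokens = "this is a word embeddings project".split()
print([p for p in skipgram_pairs(tokens) if p[0] == "embeddings"])
# [('embeddings', 'a'), ('embeddings', 'word'), ('embeddings', 'project')]
```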

STEPS FOR PREDICTING THE CONTEXT WORDS GIVEN THE TARGET :

1. extract frequent words and remove noise (stop words, punctuation)
2. perform one-hot encoding for these words (the vocabulary)
3. pass the pairs into a neural network model with 3 layers (a sketch follows this list)
        -> 1st layer - one-hot input layer
        -> 2nd layer - embedding layer
            {one-hot encodings -> dense vectors}
        -> 3rd layer - softmax layer
            {probability of each context word}
4. train the model
            {compute the loss and backpropagate}
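
The sketch below puts these steps together in a minimal from-scratch skip-gram network using numpy. The embedding dimension, learning rate, and epoch count are illustrative assumptions, and the one-hot multiplication is realized as a row lookup into the input weight matrix.

```python
import numpy as np

tokens = "this is a word embeddings project".split()
vocab = sorted(set(tokens))
word2idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 10                        # vocabulary size, embedding dim

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # embedding layer (one-hot -> dense)
W_out = rng.normal(scale=0.1, size=(D, V))   # projection to vocabulary scores

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# (target, context) index pairs with window size 2, as in the example above
pairs = [(word2idx[tokens[i]], word2idx[tokens[j]])
         for i in range(len(tokens))
         for j in range(max(0, i - 2), min(len(tokens), i + 3))
         if j != i]

lr = 0.1
for epoch in range(200):
    loss = 0.0
    for target, context in pairs:
        h = W_in[target]                     # one-hot @ W_in == row lookup
        probs = softmax(h @ W_out)           # softmax layer: P(context | target)
        loss -= np.log(probs[context])       # cross-entropy loss

        # backpropagation through the softmax and both weight matrices
        grad_scores = probs.copy()
        grad_scores[context] -= 1.0
        grad_h = W_out @ grad_scores
        W_out -= lr * np.outer(h, grad_scores)
        W_in[target] -= lr * grad_h

    if epoch % 50 == 0:
        print(f"epoch {epoch}: loss = {loss:.3f}")

# After training, each row of W_in is the dense embedding for one word.
```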
