CBOW & SkipGram implementation in TensorFlow for Arabic & English

Mahran-xo/CBOW-SKipGram

CBOW and Skip-Gram Word Embeddings Notebook

In this notebook I implement word embeddings for Arabic and English using the CBOW (Continuous Bag-of-Words) and Skip-Gram algorithms. Word embeddings represent words as dense vectors in a continuous vector space, so that words used in similar contexts end up with similar vectors. This lets the vectors capture semantic relationships between words.
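As a rough illustration of how the two algorithms differ: CBOW predicts a target word from its surrounding context words, while Skip-Gram predicts each context word from the center word. The sketch below builds both kinds of training examples from a token list. The function names `skipgram_pairs` and `cbow_examples` and the `window` parameter are hypothetical names chosen for this example and are not taken from the notebook.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs: one pair per neighbor inside the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples: the whole window predicts one word."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:  # a one-word text has no context to learn from
            examples.append((context, target))
    return examples


if __name__ == "__main__":
    sentence = ["the", "tree", "has", "grown"]
    print(skipgram_pairs(sentence, window=1))
    print(cbow_examples(sentence, window=1))
```

With the same window, Skip-Gram produces more (and smaller) training examples than CBOW, since every neighbor becomes its own pair.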

I demonstrate how to preprocess the text data, train the CBOW and Skip-Gram models on each language, and visualize the resulting embeddings with t-SNE (t-distributed Stochastic Neighbor Embedding).
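A minimal sketch of what a preprocessing step for both languages might look like, using only the standard library. This is an assumption about one reasonable approach, not the notebook's actual pipeline: it strips Arabic diacritics and the tatweel character, lowercases, splits on non-letters, and maps words to integer ids. The names `tokenize`, `build_vocab`, and `min_count` are hypothetical.

```python
import re
from collections import Counter

# Arabic short-vowel marks (U+064B to U+0652) and the tatweel character (U+0640).
_ARABIC_MARKS = re.compile("[\u064B-\u0652\u0640]")
# Runs of Unicode letters; digits, underscores, and punctuation act as separators.
_WORD = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text, drop Arabic diacritics, and return a list of words."""
    text = _ARABIC_MARKS.sub("", text)
    return _WORD.findall(text.lower())


def build_vocab(tokens, min_count=1):
    """Map each sufficiently frequent word to an id, most frequent first."""
    counts = Counter(tokens)
    kept = sorted(
        (word for word, count in counts.items() if count >= min_count),
        key=lambda word: (-counts[word], word),  # ties broken alphabetically
    )
    return {word: index for index, word in enumerate(kept)}


if __name__ == "__main__":
    tokens = tokenize("The tree has grown. The tree is BIGGER now!")
    vocab = build_vocab(tokens)
    print(tokens)
    print([vocab[t] for t in tokens])  # integer ids fed to the embedding layer
```

Removing diacritics before tokenizing matters for Arabic because the marks are not letters, so a letters-only pattern would otherwise split a vocalized word into fragments.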

Here's a sample visualization of the word embeddings generated from the Skip-Gram algorithm on the English corpus:

Fig. 1. Visualization of the word embeddings generated by the Skip-Gram algorithm on the English corpus. The words "bigger" and "grown" lie close to each other, which suggests the model treats them as similar.

Fig. 2. Visualization of the word embeddings generated by the Skip-Gram algorithm on the English corpus. The word "grown" lies close to the word "bigger" (shown in Fig. 1), indicating their similarity.
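Closeness on a t-SNE plot is only suggestive, because the projection to two dimensions does not preserve all distances. One way to double-check a pair such as "bigger" and "grown" is to compare the original vectors with cosine similarity. The sketch below assumes the embeddings are available as a dictionary from word to vector. The toy vectors are made up for illustration and are not values from the trained model, and `nearest` is a hypothetical helper name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, embeddings, k=3):
    """Return the k words most similar to `word`, best match first."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical 3-dimensional vectors, NOT taken from the notebook's model.
    toy = {
        "bigger": [0.9, 0.1, 0.0],
        "grown": [0.8, 0.2, 0.1],
        "river": [0.0, 0.1, 0.9],
    }
    print(nearest("bigger", toy, k=2))
```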
