# GloVe (Global Vectors for Word Representation)
- GloVe (Global Vectors for Word Representation) is an unsupervised learning algorithm for obtaining vector representations of words.
- Unlike Word2Vec, which learns by predicting words from their local context windows, GloVe builds a word-word co-occurrence matrix over the whole corpus and fits word vectors to it, which amounts to a weighted factorization of the log co-occurrence counts.
- Developed by Jeffrey Pennington, Richard Socher, and Christopher Manning at Stanford (2014), GloVe combines the strengths of local context-window methods and global matrix-factorization methods.
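
The fitting step can be written as the weighted least-squares objective from the GloVe paper. The symbols are not defined elsewhere in this notebook and are introduced here for illustration: $X_{ij}$ is the co-occurrence count of words $i$ and $j$, $w_i$ and $\tilde{w}_j$ are the word and context vectors, $b_i$ and $\tilde{b}_j$ are bias terms, $V$ is the vocabulary size, and $f$ is a weighting function that damps very frequent pairs (the paper reports $x_{\max} = 100$ and $\alpha = 3/4$).

```latex
J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,
\qquad
f(x) =
\begin{cases}
  (x / x_{\max})^{\alpha} & \text{if } x < x_{\max} \\
  1 & \text{otherwise}
\end{cases}
```

Only pairs with $X_{ij} > 0$ contribute in practice, since $f(0) = 0$.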

## Key Concepts of GloVe
- **Co-occurrence Matrix:** Records how often each pair of words appears together within a context window, counted across the entire corpus.
- **Word Vectors:** Learned so that the dot product of two word vectors (plus bias terms) approximates the logarithm of their co-occurrence count; the resulting vectors capture semantic relationships between words.
- **Global Statistics:** Training uses corpus-wide co-occurrence counts, so it exploits information that is hard to recover from individual local context windows alone.
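
To make the co-occurrence matrix concrete, here is a minimal pure-Python sketch of how one might be built. The function name `build_cooccurrence`, the symmetric window, and the `1 / distance` weighting are illustrative assumptions (the weighting follows the scheme described in the GloVe paper); this is not the code of any particular library.

```python
from collections import defaultdict


def build_cooccurrence(tokenized_docs, window=2):
    """Return (vocab, counts); counts[(i, j)] is the distance-weighted
    co-occurrence of word ids i and j over all documents."""
    vocab = {}                      # word -> integer id
    counts = defaultdict(float)     # sparse matrix stored as {(i, j): value}
    for tokens in tokenized_docs:
        ids = [vocab.setdefault(tok, len(vocab)) for tok in tokens]
        for pos, center in enumerate(ids):
            # Scan words to the right and record both directions,
            # which keeps the matrix symmetric.
            for offset in range(1, window + 1):
                if pos + offset >= len(ids):
                    break
                context = ids[pos + offset]
                weight = 1.0 / offset   # closer words count more (assumption)
                counts[(center, context)] += weight
                counts[(context, center)] += weight
    return vocab, dict(counts)


docs = [["cats", "and", "dogs", "are", "pets"], ["dogs", "are", "loyal"]]
vocab, counts = build_cooccurrence(docs, window=2)
print(counts[(vocab["dogs"], vocab["are"])])   # adjacent in both documents
print(counts[(vocab["cats"], vocab["dogs"])])  # two positions apart, once
```

Storing the matrix sparsely as a dictionary keeps memory proportional to the number of observed pairs rather than to the square of the vocabulary size.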

## Implementing GloVe in Python

In [7]:
!pip install glove-python-binary

- The original run used `pip install glove`, which failed on Windows with CPython 3.11: building the wheel (`python setup.py bdist_wheel`) exited with code 1, and the captured traceback is truncated.
- The `Corpus` and `Glove` classes imported below match the API of the glove-python project, which is published on PyPI as `glove_python` and `glove-python-binary`, not as `glove`. Whether a prebuilt `glove-python-binary` wheel exists for a given Python version is an assumption to check; an older interpreter may be needed.

In [6]:
import nltk
from glove import Corpus, Glove

# Sample documents
documents = [
    "Cats are beautiful animals.",
    "Dogs are loyal and friendly animals.",
    "Cats and dogs are popular pets.",
    "I love my dog.",
    "My cat is very playful."
]

# Tokenize the documents (newer NLTK releases look for the 'punkt_tab' resource)
nltk.download('punkt')
nltk.download('punkt_tab')
tokenized_docs = [nltk.word_tokenize(doc.lower()) for doc in documents]

# Create a corpus object
corpus = Corpus()

# Build the vocabulary and the co-occurrence matrix from the tokenized documents
corpus.fit(tokenized_docs, window=5)

# Create and train the GloVe model
glove = Glove(no_components=100, learning_rate=0.05)
glove.fit(corpus.matrix, epochs=20, no_threads=4, verbose=True)
glove.add_dictionary(corpus.dictionary)

# Access the vector for a specific word
cat_vector_glove = glove.word_vectors[glove.dictionary['cat']]
print("GloVe Vector for 'cat':\n", cat_vector_glove)

# Find the most similar words to 'cat'
similar_words_glove = glove.most_similar('cat')
print("Most similar words to 'cat' (GloVe):\n", similar_words_glove)


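The original run of this cell ended with the error below, because the `glove` module had not been installed successfully:

ModuleNotFoundError: No module named 'glove'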
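
Since the library may not install in every environment, the training step can also be sketched directly in NumPy. The block below is a minimal illustration of the GloVe objective (weighted squared error between `w_i . c_j + b_i + b_j` and `log X_ij`) optimized with AdaGrad. The function name `train_glove`, the toy counts, and the hyperparameter values are assumptions chosen for the example; it is not the implementation used by glove-python.

```python
import numpy as np


def train_glove(cooc, vocab_size, dim=4, epochs=200, lr=0.05,
                x_max=10.0, alpha=0.75, seed=0):
    """Fit GloVe-style vectors to a sparse dict {(i, j): count}.
    Returns (vectors, loss_history)."""
    rng = np.random.default_rng(seed)
    W = (rng.random((vocab_size, dim)) - 0.5) / dim   # word vectors
    C = (rng.random((vocab_size, dim)) - 0.5) / dim   # context vectors
    bw = np.zeros(vocab_size)                         # word biases
    bc = np.zeros(vocab_size)                         # context biases
    # AdaGrad accumulators of squared gradients, initialised to 1
    gW, gC = np.ones_like(W), np.ones_like(C)
    gbw, gbc = np.ones_like(bw), np.ones_like(bc)

    pairs = [(i, j, x) for (i, j), x in cooc.items() if x > 0]
    history = []
    for _ in range(epochs):
        total = 0.0
        for idx in rng.permutation(len(pairs)):
            i, j, x = pairs[idx]
            weight = min(1.0, (x / x_max) ** alpha)   # weighting function f(x)
            diff = W[i] @ C[j] + bw[i] + bc[j] - np.log(x)
            total += 0.5 * weight * diff ** 2
            grad = weight * diff
            grad_wi, grad_cj = grad * C[j], grad * W[i]
            # AdaGrad step: scale each update by the accumulated gradient norm
            W[i] -= lr * grad_wi / np.sqrt(gW[i])
            C[j] -= lr * grad_cj / np.sqrt(gC[j])
            bw[i] -= lr * grad / np.sqrt(gbw[i])
            bc[j] -= lr * grad / np.sqrt(gbc[j])
            gW[i] += grad_wi ** 2
            gC[j] += grad_cj ** 2
            gbw[i] += grad ** 2
            gbc[j] += grad ** 2
        history.append(total)
    # The paper sums word and context vectors to form the final embedding
    return W + C, history


# Toy symmetric counts for a 3-word vocabulary (made-up values)
toy_cooc = {(0, 1): 8.0, (1, 0): 8.0, (0, 2): 1.0,
            (2, 0): 1.0, (1, 2): 2.0, (2, 1): 2.0}
vectors, history = train_glove(toy_cooc, vocab_size=3)
print("first epoch loss:", history[0])
print("last epoch loss: ", history[-1])
```

On a corpus as small as the five sample sentences, vectors from this sketch or from the library are not meaningful; the example only shows the mechanics of the optimization.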