### GloVe


1. **Word Embeddings**:
    - GloVe (Global Vectors for Word Representation) is a word embedding method that learns a dense vector for each word from global word co-occurrence statistics. This notebook uses the pre-trained 100-dimensional vectors `glove-wiki-gigaword-100`. Words with related meanings tend to lie close together in this vector space.

2. **Sentence Representation**:
    - Each sentence is represented as the average of the GloVe vectors of the words it contains. The `average_vector` function splits the sentence on whitespace, drops words that are not in the GloVe vocabulary, and returns the mean of the remaining vectors, or a zero vector if no word is known. Averaging ignores word order.

3. **Cosine Similarity**:
    - The similarity between two sentence vectors is calculated using cosine similarity, the cosine of the angle between them. It ranges from -1 to 1: 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions. Vector length does not affect the score.

4. **Applications**:
    - This method is commonly used in natural language processing tasks such as text similarity, information retrieval, and clustering.
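
Steps 2 and 3 can be sketched without downloading the GloVe vectors. The snippet below is only an illustration: it assumes a hypothetical three-dimensional toy vocabulary (`toy_vectors`, with made-up values) in place of the real embeddings, and computes cosine similarity directly from its definition, $\cos(\theta) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$. The helper names `toy_average` and `cosine` are likewise assumptions for this sketch.

```python
import numpy as np

# Hypothetical toy embeddings with made-up values; the GloVe vectors
# used in this notebook have 100 dimensions.
toy_vectors = {
    "dogs": np.array([1.0, 0.0, 0.0]),
    "cats": np.array([0.0, 1.0, 0.0]),
    "are":  np.array([0.0, 0.0, 1.0]),
}

def toy_average(sentence, vectors):
    """Mean of the vectors of the known words; zero vector if none are known."""
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        return np.zeros_like(next(iter(vectors.values())))
    return np.mean(known, axis=0)

def cosine(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / norm) if norm else 0.0

a = toy_average("dogs are", toy_vectors)   # [0.5, 0.0, 0.5]
b = toy_average("cats are", toy_vectors)   # [0.0, 0.5, 0.5]
print(cosine(a, b))                        # approximately 0.5
```

In this toy setup `dogs` and `cats` are orthogonal, yet the shared word `are` alone gives the two sentences a similarity of about 0.5. Averaging lets common words contribute as much as content words.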


In [None]:
!pip install gensim

In [1]:
import gensim.downloader as api
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

In [2]:
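def average_vector(sentence, model):
    # Lowercase and strip punctuation before the vocabulary check: the
    # GloVe vocabulary is lowercase, so "Dogs" or "pets." would be dropped.
    tokens = [word.lower().strip('.,;:!?"\'()') for word in sentence.split()]
    words = [word for word in tokens if word in model]
    if not words:
        # No known words: fall back to a zero vector.
        return np.zeros(model.vector_size)
    return np.mean([model[word] for word in words], axis=0)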

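def glove_cosine(sent1, sent2, model=None):
    # Loading the vectors is slow; pass a preloaded model to avoid reloading on every call.
    if model is None:
        model = api.load('glove-wiki-gigaword-100')
    vec1 = average_vector(sent1, model)
    vec2 = average_vector(sent2, model)
    return cosine_similarity([vec1], [vec2])[0][0]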


In [3]:
sent1 = "Dogs are wonderful pets."
sent2 = "Cats are amazing companions."
score = glove_cosine(sent1, sent2)
print(f"GloVe Cosine Similarity: {score:.4f}")

GloVe Cosine Similarity: 0.9518
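
For retrieval-style uses, the same score can rank several candidate sentences against a query. The following is a minimal sketch, assuming the sentence vectors have already been computed and are non-zero; the small made-up arrays stand in for averaged GloVe vectors, and `rank_by_cosine` is a hypothetical helper, not part of gensim or scikit-learn.

```python
import numpy as np

def rank_by_cosine(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar.

    Assumes the query and every candidate are non-zero vectors.
    """
    norms = np.linalg.norm(candidate_vecs, axis=1) * np.linalg.norm(query_vec)
    scores = candidate_vecs @ query_vec / norms
    order = np.argsort(-scores)  # descending by score
    return [(int(i), float(scores[i])) for i in order]

query = np.array([1.0, 0.0])
candidates = np.array([
    [0.0, 1.0],   # orthogonal to the query
    [2.0, 0.0],   # same direction, different length
    [1.0, 1.0],   # 45 degrees away
])
ranking = rank_by_cosine(query, candidates)
print([i for i, _ in ranking])  # [1, 2, 0]
```

The second candidate ranks first even though it is twice as long as the query, because cosine similarity depends only on direction.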
