# Impact of Summary Length on Model Comparisons

Summary length (for example, 50, 100, or 200 words) changes the similarity scores each method produces, and therefore how the methods compare with one another.
Below, we analyze the effect on three approaches: **Cosine Similarity** over TF-IDF / bag-of-words vectors, **BERT embeddings**, and **Word2Vec**.

---

## 1. Cosine Similarity (TF-IDF / Bag-of-Words)
- **How it works:** Represents each text as a vector of word counts or TF-IDF weights and takes the cosine of the angle between the two vectors. It rewards shared vocabulary and does not capture context.
- **Effect of length:**
  - Short (<50 words): More precise but lacks context.
  - Moderate (50–100 words): Best balance of precision and context.
  - Long (100–200 words): May introduce noise and unrelated details.
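
The effect of extra, unrelated words can be seen in a minimal sketch. It uses raw bag-of-words counts rather than TF-IDF weights to stay dependency-free, and the function names and example sentences are invented for illustration.

```python
import math
import re
from collections import Counter


def bow_vector(text):
    """Lowercase the text and count each word (a plain bag-of-words vector)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Illustrative texts, not taken from any real dataset.
reference = "the model summarizes the report on solar energy costs"
focused = "the model summarizes solar energy costs"
padded = focused + " and the weather was nice during the company picnic last week"

score_focused = cosine_similarity(bow_vector(reference), bow_vector(focused))
score_padded = cosine_similarity(bow_vector(reference), bow_vector(padded))
print(f"focused: {score_focused:.3f}  padded: {score_padded:.3f}")
```

The padded summary scores lower than the focused one because the unrelated words enlarge its vector without adding overlap with the reference.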

---

## 2. BERT Embeddings (Sentence Transformers)
- **How it works:** Converts sentences into dense embeddings, capturing meaning rather than just word overlap.
- **Effect of length:**
  - Short (<50 words): Too little context, so similarity scores are less reliable.
  - Moderate (50–100 words): Best performance; balances meaning and detail.
  - Long (100–200 words): Risk of losing focus as irrelevant information dilutes the embedding.

---

## 3. Word2Vec (Word-Level Embeddings)
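- **How it works:** Maps each word to a dense vector that captures its meaning. A text is typically compared by combining its word vectors (for example, averaging them), which ignores word order and sentence structure.
- **Effect of length:**
  - Short (<50 words): Similarity scores are less stable because fewer word vectors contribute.
  - Moderate (50–100 words): Best performance for comparisons based on word meaning.
  - Long (100–200 words): More noise; redundant words affect results.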
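
A small sketch shows why word order is lost when word vectors are averaged. The three-dimensional vectors below are hypothetical values made up for this example; real Word2Vec vectors are learned from a corpus and are much larger.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "bites": [0.1, 0.8, 0.1],
    "man": [0.2, 0.1, 0.9],
}


def average_vector(text, vectors):
    """Mean of the vectors of all in-vocabulary words; None if no word is known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


a = average_vector("dog bites man", TOY_VECTORS)
b = average_vector("man bites dog", TOY_VECTORS)
print(f"{cosine(a, b):.3f}")  # same words, different order
```

Both sentences contain the same words, so their averaged vectors coincide and the similarity is at its maximum even though the meanings differ. With very short texts, each individual word also has a larger share of the average, which is one way to read the instability noted above.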

---

## Summary Table

| **Model**             | **Short (<50 words)**     | **Moderate (50–100 words)**         | **Long (100–200 words)**           |
|-----------------------|---------------------------|-------------------------------------|------------------------------------|
| **Cosine Similarity** | Precise but lacks context | Best balance of precision & context | Noise from extra words             |
| **BERT**              | Lacks full context        | Best performance, balances detail   | Loses focus, extra info adds noise |
| **Word2Vec**          | Less stable similarity    | Best word-based comparison          | Redundant words introduce noise    |

---

## Final Verdict
The **best length range is 50–100 words**:
- Provides enough context for **BERT**.  
- Keeps precision intact for **Cosine Similarity**.  
- Balances representation in **Word2Vec**.  

For your project, summaries of **50–100 words** should yield the most reliable and comparable results across all three models.
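
One way to apply this recommendation is to check word counts before running the comparison. The helper below is a sketch with hypothetical names. It counts whitespace-separated words, and it assumes that exactly 100 words counts as moderate and that anything longer counts as long, since the ranges above do not settle those boundaries.

```python
def length_bucket(summary):
    """Classify a summary by its whitespace-delimited word count."""
    n = len(summary.split())
    if n < 50:
        return "short"
    if n <= 100:  # assumption: 100 words falls in the moderate range
        return "moderate"
    return "long"  # assumption: also covers summaries over 200 words


def keep_moderate(summaries):
    """Return only the summaries in the recommended 50-100 word range."""
    return [s for s in summaries if length_bucket(s) == "moderate"]
```

Filtering or regenerating summaries outside the moderate bucket keeps all three models evaluated on inputs of similar length.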
