# <span style="font-family: 'Georgia', serif; font-size: 2.4rem;">Mirror, Mirror: Can LLMs Speak in Idioms?</span> 🪞🗣️

> <span style="font-family: 'Helvetica Neue', sans-serif;">COMM4190 · Prompt/Response & Readings Blog</span>  
> <span style="font-size:0.95rem; color:#666;">Week 1: “Idioms & Pragmatics”</span>


![image.png](attachment:b0f34cbb-984e-4112-aa38-fc776ecef4dd.png)

---

## Setting the Scene 🎬

The first time I told an American friend I needed to *“hit the sack”*, she nodded with sympathy: *“Yeah, long day?”*  
Later, I tried the same line on my Spanish cousin. She blinked, then burst out laughing: *“Why would you punch a bag of potatoes?”*  

Idioms are tricky. They carry **cultural DNA**, slip between **literal and figurative worlds**, and make even fluent speakers stumble.  
So, what happens when we drop them into the lap of an LLM?  

---

**Goals of this post**
1. Design prompts that force the model to choose between literal vs. figurative readings.  
2. Evaluate if the model gets **pragmatic context** right.  
3. Reflect on idioms as a window into *what machines know, and what they don’t*.  

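```python
# Pseudo-code setup. LLM() is a hypothetical stand-in, not a real client:
# swap its body for a call to whichever model you are testing.
def LLM(prompt):
    return f"[model response to: {prompt}]"

idioms = [
    "kick the bucket",
    "spill the beans",
    "hit the sack",
    "the ball is in your court",
    "break the ice",
]

scenarios = ["literal context", "figurative context", "ambiguous context"]

# Try every idiom in every scenario and print each response.
for idiom in idioms:
    for scenario in scenarios:
        response = LLM(prompt=f"{idiom} in {scenario}")
        print(response)
```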


## Why Idioms Matter 🧩

Idioms are **linguistic shortcuts**.  
- They condense whole stories into a phrase.  
- They test whether we belong to a group.  
- They resist translation, becoming inside jokes of entire cultures.  

For humans, idioms are a wink.  
For LLMs, they’re a stress-test: can the machine jump from **form** to **function** without tripping?  

When we probe idioms, we’re not just asking *“Does the model know English?”*  
We’re asking: *“Does it know when to play along, and when to take things literally?”*

```python
# Example prompt-response pseudo-code (LLM is a placeholder for your model call)

prompt = """
You are chatting with a friend who asks why you are yawning.
Use the idiom 'hit the sack' naturally.
"""

response = LLM(prompt)
print(response)

# Possible output:
# "Yeah, I'm exhausted... I think it's time to hit the sack."
```

### What Happened Here? 🧐

The response feels natural, like something a tired roommate might say.  
The idiom slots into casual conversation with no friction.  

But when I tried a similar prompt in a more formal frame (“Write a business email using *hit the sack*”), the result was hilariously off:  
*“Dear Mr. Johnson, after reviewing the quarterly numbers, I will now hit the sack.”*  

This is where idioms expose a tension: the model **knows the phrase**, but doesn’t always know the **social rules** around it.
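
One way to probe this register gap more systematically is to wrap the same idiom in different social frames. Here is a minimal sketch, assuming two made-up frames and a plain substring check. `REGISTERS`, `build_prompt`, and `used_idiom` are names invented for illustration, and no real model is called.

```python
# Hypothetical register frames; both are assumptions, not a fixed taxonomy.
REGISTERS = {
    "casual": "Reply to a friend's text message",
    "formal": "Write a business email to a client",
}

def build_prompt(idiom, register):
    """Build a prompt that lets the model decline the idiom if it clashes with the tone."""
    frame = REGISTERS[register]
    return (
        f"{frame}. Use the idiom '{idiom}' only if it suits the tone; "
        "otherwise say the same thing in plain words."
    )

def used_idiom(response, idiom):
    """Naive check: does the idiom appear verbatim in the response?"""
    return idiom.lower() in response.lower()

# The awkward email from the experiment above.
email = "Dear Mr. Johnson, after reviewing the quarterly numbers, I will now hit the sack."

print(build_prompt("hit the sack", "formal"))
print(used_idiom(email, "hit the sack"))
```

The substring check only tells you whether the idiom showed up, not whether it fit. Judging fit still takes a human reader.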

```python
# Ambiguous context test (LLM is a placeholder for your model call)
prompt = "The farmer hit the sack after a long harvest day. Explain."
response = LLM(prompt)
print(response)

# Model might output:
# "This means the farmer went to bed, using the idiom 'hit the sack'."
```

I grew up around farms, and “hitting the sack” could just as easily mean slamming down a grain bag as collapsing into bed.  
When I asked the model to interpret, it confidently went figurative: *“The farmer went to bed.”*  
No hesitation, no recognition of ambiguity.  

Humans pause here.  
We ask: *“Wait, do you mean literally or figuratively?”*  
That tiny pause is **pragmatics in action**, and it’s where LLMs still feel a bit tone-deaf.
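
To make that "pause" measurable, one rough option is to check whether a response mentions both readings. The sketch below is a keyword heuristic under my own assumptions: the cue lists are hand-picked for this one idiom, and the `hedged` response is written by me as a contrast case, not produced by a model.

```python
# Hypothetical cue lists for 'hit the sack'; a real study would tune these by hand.
LITERAL_CUES = ("literal", "grain", "bag", "struck", "physically")
FIGURATIVE_CUES = ("idiom", "figurative", "went to bed", "go to sleep")

def readings_mentioned(response):
    """Report which readings the response touches on, by keyword match."""
    text = response.lower()
    return {
        "literal": any(cue in text for cue in LITERAL_CUES),
        "figurative": any(cue in text for cue in FIGURATIVE_CUES),
    }

def acknowledges_ambiguity(response):
    """True only if the response mentions both the literal and figurative reading."""
    found = readings_mentioned(response)
    return found["literal"] and found["figurative"]

confident = "This means the farmer went to bed, using the idiom 'hit the sack'."
hedged = (
    "It could be the idiom for going to sleep, or the farmer may have "
    "literally struck a grain bag."
)

print(acknowledges_ambiguity(confident))  # False
print(acknowledges_ambiguity(hedged))     # True
```

Keyword matching will miss paraphrases, so treat a `False` as a prompt to read the response, not as a verdict.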

## Idioms Across Cultures 🌍

During a semester abroad in Madrid, I once said “spill the beans” in class.  
My professor tilted his head and replied with a grin: *“En español, we’d say ‘contar el secreto,’ but if you want an idiom, maybe ‘irse de la lengua.’”*  
Different phrase, same wink.  

LLMs struggle here.  
They may:
- Translate idioms **literally** (and lose the meaning).  
- **Invent** non-existent equivalents.  
- Or **overuse** idioms where a native would stay plain.  

Idioms, then, are not just language; they’re **cultural passports**.  
And LLMs are still waiting at customs.
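
The first two failure modes can be screened with a small lookup table. This is only a sketch: `EQUIVALENTS` and `classify_translation` are hypothetical names, the idiomatic entry comes from my professor's suggestion above, and the word-for-word rendering is my own assumption about what a literal translation would look like.

```python
# Hypothetical reference table with a single entry.
EQUIVALENTS = {
    "spill the beans": {
        "idiomatic": ["irse de la lengua"],      # from the classroom anecdote
        "literal": ["derramar los frijoles"],    # assumed word-for-word rendering
    }
}

def classify_translation(idiom, translation):
    """Label a translation as idiomatic, literal, or unverified (needs a human check)."""
    entry = EQUIVALENTS.get(idiom)
    if entry is None:
        return "unknown idiom"
    text = translation.lower()
    if any(phrase in text for phrase in entry["idiomatic"]):
        return "idiomatic"
    if any(phrase in text for phrase in entry["literal"]):
        return "literal"
    # Could be a valid paraphrase or an invented equivalent; the table cannot tell.
    return "unverified"

print(classify_translation("spill the beans", "Prometió no irse de la lengua."))
print(classify_translation("spill the beans", "Derramar los frijoles"))
```

Exact phrase matching will miss conjugated forms, so anything labelled `unverified` should go to a native speaker, not straight into the "invented" pile.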

```python
# Prompt engineering experiment (LLM is a placeholder for your model call)

prompt = """
Explain 'spill the beans' to a 10-year-old.
First, give a literal explanation.
Then, clarify the figurative meaning with an example.
"""

response = LLM(prompt)
print(response)

# Expected structure:
# - Literal: "It means tipping over a can of beans."
# - Figurative: "It means revealing a secret, like telling about a surprise party."
```
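
If you want to confirm that a response actually follows the requested two-part structure, a small parser helps. This is a sketch that assumes the model labels its sections with `Literal:` and `Figurative:`; real responses may use other wording, and `split_readings` is a name I made up.

```python
import re

def split_readings(response):
    """Pull out the text after 'Literal:' and 'Figurative:' labels, if present."""
    sections = {}
    for label in ("literal", "figurative"):
        # Capture the rest of the line that follows the label.
        match = re.search(rf"{label}\s*:\s*(.+)", response, flags=re.IGNORECASE)
        if match:
            sections[label] = match.group(1).strip()
    return sections

# Sample text taken from the expected structure above, not from a real model run.
sample = (
    "Literal: It means tipping over a can of beans.\n"
    "Figurative: It means revealing a secret, like telling about a surprise party."
)

parts = split_readings(sample)
print(sorted(parts))
print(parts["literal"])
```

A response missing either key skipped part of the instruction, which is worth logging alongside the idiom and prompt.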


## Findings 🔎

1. **Literal vs. Figurative**  
   LLMs usually guess correctly when context is strong. But when the context is ambiguous, they still commit to one reading instead of flagging the uncertainty.  

2. **Pragmatic Sensitivity**  
   Casual prompts invite idioms. Formal frames expose them: the model complies when asked, but the idiom lands awkwardly.  

3. **Cultural Boundaries**  
   Cross-language idioms reveal blind spots: what’s playful in one language becomes nonsense in another.  

4. **Prompt Engineering**  
   Explicit instructions (“first literal, then figurative”) produce the clearest, most human-like answers.  

---

A classmate once joked that idioms are “the emojis of old language.”  
And maybe that’s the point: they carry tone, warmth, and shared history, things machines can mimic, but not live.  

## Conclusion 🚀

Idioms show us both the **power** and the **limits** of LLMs.  
They can sound fluent, even witty, when the context is clear.  
But ambiguity, formality, and culture expose cracks.  

As users, we can:
- Probe models with idioms to see where they shine and stumble.  
- Use prompt engineering to guide them toward clarity.  
- Remember that idioms are more than words; they’re shared experiences.  

---

> *“To break the ice, LLMs need more than data: they need stories.”*  
> <span style="font-size:0.9rem; color:#777;">- COMM4190 reflection log</span>
