# <span style="font-family: 'Georgia', serif; font-size: 2.4rem;">Mirror, Mirror: Can LLMs Speak in Idioms?</span> 🪞🗣️

> <span style="font-family: 'Helvetica Neue', sans-serif;">COMM4190 · Prompt/Response & Readings Blog</span>  
> <span style="font-size:0.95rem; color:#666;">Week X — “Idioms & Pragmatics”</span>

![image.png](attachment:746a199d-41a9-4023-a3ef-48d067622e8a.png)

---

## Setting the Scene 🎬

The first time I told an American friend I needed to *“hit the sack”*, she nodded with sympathy: *“Yeah, long day?”*  
Later, I tried the same line on my Spanish cousin. She blinked, then burst out laughing: *“Why would you punch a bag of potatoes?”*  

Idioms are tricky. They carry **cultural DNA**, slip between **literal and figurative worlds**, and make even fluent speakers stumble.  
So, what happens when we drop them into the lap of an LLM?  

---

**Goals of this post**
1. Design prompts that force the model to choose between literal and figurative readings.  
2. Evaluate whether the model gets **pragmatic context** right.  
3. Reflect on idioms as a window into *what machines know — and what they don’t*.  

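```python
# Pseudo-code setup. LLM() is a placeholder, not a real API:
# swap in a call to whichever model client you use.
def LLM(prompt: str) -> str:
    return f"[model response to: {prompt!r}]"

idioms = [
    "kick the bucket",
    "spill the beans",
    "hit the sack",
    "the ball is in your court",
    "break the ice",
]

scenarios = ["literal context", "figurative context", "ambiguous context"]

# Ask for one sentence per idiom/scenario pair and log each response.
for idiom in idioms:
    for scenario in scenarios:
        prompt = f"Write one sentence using '{idiom}' in a {scenario}."
        response = LLM(prompt)
        print(f"{idiom} | {scenario} -> {response}")
```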


## Why Idioms Matter 🧩

Idioms are **linguistic shortcuts**.  
- They condense whole stories into a phrase.  
- They test whether we belong to a group.  
- They resist translation, becoming inside jokes of entire cultures.  

For humans, idioms are a wink.  
For LLMs, they’re a stress-test: can the machine jump from **form** to **function** without tripping?  

When we probe idioms, we’re not just asking *“Does the model know English?”*  
We’re asking: *“Does it know when to play along, and when to take things literally?”*

```python
# Example prompt-response pseudo-code (LLM() is a placeholder for a model call)

prompt = """
You are chatting with a friend who asks why you are yawning.
Use the idiom 'hit the sack' naturally.
"""

response = LLM(prompt)
print(response)

# Possible output:
# "Yeah, I'm exhausted... I think it's time to hit the sack."
```

### What Happened Here? 🧐

The response feels natural — something a tired roommate might say.  
The idiom slots into casual conversation with no friction.  

But when I tried a similar prompt in a more formal frame — “Write a business email using *hit the sack*” — the result was hilariously off:  
*“Dear Mr. Johnson, after reviewing the quarterly numbers, I will now hit the sack.”*  

This is where idioms expose a tension: the model **knows the phrase**, but doesn’t always know the **social rules** around it.
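
To make the casual versus formal contrast repeatable, the same idiom can be dropped into prompts that differ only in register. The sketch below is one way to set that up. The `REGISTERS` framings and the `build_register_prompts` helper are my own illustrative names, not part of any library, and the wording of the instruction is an assumption you should adapt.

```python
from typing import Dict

# Hypothetical register framings; edit these to match the situations you care about.
REGISTERS: Dict[str, str] = {
    "casual": "You are texting a close friend.",
    "formal": "You are writing a business email to a client.",
}

def build_register_prompts(idiom: str) -> Dict[str, str]:
    """Return one prompt per register, each asking the model to judge the fit first."""
    return {
        name: (
            f"{framing} Decide whether the idiom '{idiom}' suits this register. "
            "If it does, use it naturally; if it does not, say so and rephrase plainly."
        )
        for name, framing in REGISTERS.items()
    }

prompts = build_register_prompts("hit the sack")
for name, text in prompts.items():
    print(f"[{name}] {text}")
```

Asking the model to judge the fit before using the idiom gives it room to decline, which the bare “Write a business email using *hit the sack*” prompt did not.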

```python
# Ambiguous context test (LLM() is a placeholder for a model call)
prompt = "The farmer hit the sack after a long harvest day. Explain."
response = LLM(prompt)
print(response)

# Model might output:
# "This means the farmer went to bed, using the idiom 'hit the sack'."
```

I grew up around farms, and “hitting the sack” could just as easily mean slamming down a grain bag as collapsing into bed.  
When I asked the model to interpret, it confidently went figurative: *“The farmer went to bed.”*  
No hesitation, no recognition of ambiguity.  

Humans pause here.  
We ask: *“Wait, do you mean literally or figuratively?”*  
That tiny pause is **pragmatics in action** — and it’s where LLMs still feel a bit tone-deaf.
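
One rough way to check for that pause is to scan a response for wording that admits a second reading. This is only a keyword heuristic under my own assumptions: the cue list is hypothetical and incomplete, and a careful study would rely on human judgment instead. The first sample is the model output quoted above; the second is a sentence I made up to show a hedged answer.

```python
import re

# Hypothetical cue patterns suggesting the response acknowledges two readings.
AMBIGUITY_CUES = [
    r"\bliteral(ly)?\b.*\bfigurative(ly)?\b",
    r"\bfigurative(ly)?\b.*\bliteral(ly)?\b",
    r"\bcould (also|either)\b",
    r"\bambiguous\b",
    r"\bdepending on\b",
]

def flags_ambiguity(response: str) -> bool:
    """Return True if any cue pattern appears in the response (case-insensitive)."""
    text = response.lower()
    return any(re.search(pattern, text, flags=re.DOTALL) for pattern in AMBIGUITY_CUES)

confident = "This means the farmer went to bed, using the idiom 'hit the sack'."
hedged = "It could also mean he literally struck a grain sack; the sentence is ambiguous."

print(flags_ambiguity(confident))  # False
print(flags_ambiguity(hedged))     # True
```

A match does not prove the model understood the ambiguity. It only separates answers that mention a second reading from answers that never do.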

## Idioms Across Cultures 🌍

During a semester abroad in Madrid, I once said “spill the beans” in class.  
My professor tilted his head and replied with a grin: *“En español, we’d say ‘contar el secreto,’ but if you want an idiom, maybe ‘irse de la lengua.’”*  
Different phrase, same wink.  

LLMs struggle here.  
They may:
- Translate idioms **literally** (and lose the meaning).  
- **Invent** non-existent equivalents.  
- Or **overuse** idioms where a native would stay plain.  

Idioms, then, are not just language — they’re **cultural passports**.  
And LLMs are still waiting at customs.
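
A small probe can target the first two failure modes in the list above, literal translation and invented equivalents. The sketch below builds a prompt that gives the model an explicit way out and then compares the answer with a reference table. Everything here is an assumption for illustration: the helper names are hypothetical, and the table holds only the one pair from my Madrid story, so any other idiom still needs a fluent speaker to judge it.

```python
from typing import Dict, Optional, Tuple

# Only the pair mentioned in this post; extend it with equivalents a fluent speaker confirms.
KNOWN_EQUIVALENTS: Dict[Tuple[str, str], str] = {
    ("spill the beans", "Spanish"): "irse de la lengua",
}

def build_translation_prompt(idiom: str, language: str) -> str:
    """Ask for an established equivalent and allow 'none exists' as an answer."""
    return (
        f"Give a {language} idiom that native speakers actually use with the same "
        f"meaning as the English idiom '{idiom}'. Do not translate word for word. "
        "If no established equivalent exists, reply 'no established equivalent'."
    )

def matches_reference(idiom: str, language: str, response: str) -> Optional[bool]:
    """True or False against the reference table, or None when a human must judge."""
    expected = KNOWN_EQUIVALENTS.get((idiom, language))
    if expected is None:
        return None
    return expected.lower() in response.lower()

print(build_translation_prompt("spill the beans", "Spanish"))
print(matches_reference("spill the beans", "Spanish", "Se diría 'irse de la lengua'."))
print(matches_reference("hit the sack", "Spanish", "any response"))
```

The last two calls print `True` and `None`. Returning `None` instead of `False` keeps "wrong" separate from "no reference to compare against".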

```python
# Prompt engineering experiment (LLM() is a placeholder for a model call)

prompt = """
Explain 'spill the beans' to a 10-year-old.
First, give a literal explanation.
Then, clarify the figurative meaning with an example.
"""

response = LLM(prompt)
print(response)

# Expected structure:
# - Literal: "It means tipping over a can of beans."
# - Figurative: "It means revealing a secret, like telling about a surprise party."
```
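
The literal-then-figurative prompt can be turned into a reusable template, with a quick check that a response actually contains both parts. This is a minimal sketch: the function names and the `Literal:` / `Figurative:` line labels are conventions I am assuming, not anything a model guarantees. The sample text is the expected structure from the cell above, not real model output.

```python
def build_two_step_prompt(idiom: str, audience: str = "a 10-year-old") -> str:
    """Build a prompt that asks for labeled literal and figurative explanations."""
    return (
        f"Explain '{idiom}' to {audience}.\n"
        "First, give a literal explanation on a line starting with 'Literal:'.\n"
        "Then, clarify the figurative meaning with an example on a line "
        "starting with 'Figurative:'."
    )

def has_both_parts(response: str) -> bool:
    """Check that the response has one line for each label, ignoring bullet markers."""
    lines = [line.strip().lstrip("-* ").lower() for line in response.splitlines()]
    has_literal = any(line.startswith("literal:") for line in lines)
    has_figurative = any(line.startswith("figurative:") for line in lines)
    return has_literal and has_figurative

sample = (
    '- Literal: "It means tipping over a can of beans."\n'
    '- Figurative: "It means revealing a secret, like telling about a surprise party."'
)

print(build_two_step_prompt("spill the beans"))
print(has_both_parts(sample))
```

The check only confirms the structure. Whether the explanation would make sense to a 10-year-old still needs a human read.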


## Findings 🔎

1. **Literal vs. Figurative**  
   In my tests, the model usually picked the right reading when context was strong. Under ambiguity, though, it committed to one reading without flagging the other.  

2. **Pragmatic Sensitivity**  
   The idiom sat naturally in a casual prompt. Forced into a business email, the model kept it anyway and ignored the clash in register.  

3. **Cultural Boundaries**  
   Cross-language idioms reveal blind spots: what’s playful in one language becomes nonsense in another.  

4. **Prompt Engineering**  
   Explicit instructions (“first literal, then figurative”) produce the clearest, most human-like answers.  

---

A classmate once joked that idioms are “the emojis of old language.”  
And maybe that’s the point: they carry tone, warmth, and shared history — things machines can mimic, but not live.  

## Conclusion 🚀

Idioms show us both the **power** and the **limits** of LLMs.  
They can sound fluent, even witty, when the context is clear.  
But ambiguity, formality, and culture expose cracks.  

As users, we can:
- Probe models with idioms to see where they shine and stumble.  
- Use prompt engineering to guide them toward clarity.  
- Remember that idioms are more than words — they’re experiences shared.  

---

> *“To break the ice, LLMs need more than data — they need stories.”*  
> <span style="font-size:0.9rem; color:#777;">— COMM4190 reflection log</span>
