# LLMs for Text Transformation

In this notebook, we will explore how to use Large Language Models for text transformation tasks such as language translation, spelling and grammar checking, tone adjustment, and format conversion.

## Setup

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')  # pass the variable's name, not the key itself

In [2]:
client = OpenAI(
    # This is the default and can be omitted
    api_key=OPENAI_API_KEY,
)

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0): 
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature, 
    )
    return response.choices[0].message.content
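
`os.getenv` returns `None` when the variable is missing, and the failure then only surfaces at the first API call. A minimal sketch of an earlier check follows. It assumes the key is stored in `.env` under the name `OPENAI_API_KEY`; `require_env` is a hypothetical helper, not part of the `openai` or `dotenv` packages.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "add it to your .env file or export it in the shell."
        )
    return value
```

With this in place, the setup cell could read `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`, so a missing key fails immediately with a readable message.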

## Translation

ChatGPT is trained with sources in many languages. This gives the model the ability to do translation. Here are some examples of how to use this capability.

In [3]:
prompt = f"""
Translate the following English text to Spanish: \
```Hi, I would like to order a blender```
"""
response = get_completion(prompt)
print(response)

Hola, me gustaría ordenar una licuadora.


In [4]:
prompt = f"""
Tell me which language this is: 
```Combien coûte le lampadaire?```
"""
response = get_completion(prompt)
print(response)

This is French.


In [5]:
prompt = f"""
Translate the following  text to French and Spanish
and English pirate: \
```I want to order a basketball```
"""
response = get_completion(prompt)
print(response)

French: ```Je veux commander un ballon de basket```

Spanish: ```Quiero ordenar un balón de baloncesto```

English: ```I want to order a basketball```


In [6]:
prompt = f"""
Translate the following text to Spanish in both the \
formal and informal forms: 
'Would you like to order a pillow?'
"""
response = get_completion(prompt)
print(response)

Formal: ¿Le gustaría ordenar una almohada?
Informal: ¿Te gustaría ordenar una almohada?
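
In the multi-target example above, the "English pirate" target came back as plain English. One thing to try is listing each target on its own line and asking for labeled output. The sketch below only builds the prompt string; `build_translation_prompt` and the `<text>` tag delimiter are illustrative choices, and there is no guarantee the model will honor every target.

```python
def build_translation_prompt(text: str, targets: list[str]) -> str:
    """Build a prompt asking for one labeled translation per target language or style."""
    if not targets:
        raise ValueError("targets must contain at least one language")
    # One bullet per target so none of them gets lost in a long sentence.
    target_lines = "\n".join(f"- {t}" for t in targets)
    return (
        "Translate the text inside the <text> tags into each of these "
        "languages or styles. Label every translation with its target name.\n"
        f"{target_lines}\n"
        f"<text>{text}</text>"
    )


prompt = build_translation_prompt(
    "I want to order a basketball",
    ["French", "Spanish", "English pirate"],
)
print(prompt)
```

The resulting string can be passed to `get_completion(prompt)` exactly like the hand-written prompts.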


### Universal Translator
Imagine you are in charge of IT at a large multinational e-commerce company. Users are messaging you with IT issues in all their native languages. Your staff is from all over the world and speaks only their native languages. You need a universal translator!

In [7]:
user_messages = [
  "La performance du système est plus lente que d'habitude.",  # System performance is slower than normal
  "Mi monitor tiene píxeles que no se iluminan.",              # My monitor has pixels that are not lighting
  "Il mio mouse non funziona",                                 # My mouse is not working
  "Mój klawisz Ctrl jest zepsuty",                             # My keyboard has a broken control key
  "我的屏幕在闪烁"                                               # My screen is flashing
] 

In [8]:
for issue in user_messages:
    prompt = f"Tell me what language this is: ```{issue}```"
    lang = get_completion(prompt)
    print(f"Original message ({lang}): {issue}")

    prompt = f"""
    Translate the following  text to English \
    and Korean: ```{issue}```
    """
    response = get_completion(prompt)
    print(response, "\n")

Original message (French): La performance du système est plus lente que d'habitude.
English: "The system performance is slower than usual."

Korean: "시스템 성능이 평소보다 느립니다."

Original message (This is Spanish.): Mi monitor tiene píxeles que no se iluminan.
English: "My monitor has pixels that do not light up."
Korean: "내 모니터에는 빛나지 않는 픽셀이 있습니다."

Original message (Italian): Il mio mouse non funziona
English: My mouse is not working
Korean: 내 마우스가 작동하지 않습니다

Original message (This is Polish.): Mój klawisz Ctrl jest zepsuty
English: My Ctrl key is broken
Korean: 제 Ctrl 키가 고장 났어요

Original message (This is Chinese.): 我的屏幕在闪烁
English: My screen is flickering
Korean: 내 화면이 깜박거립니다
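
The language-detection replies above are inconsistent: some are a bare name ("French") and others a full sentence ("This is Spanish."). If the label feeds into later code, it may help to normalize it first. The following is a rough sketch that covers only the reply shapes seen in this run; `normalize_language_label` is a hypothetical helper.

```python
import re


def normalize_language_label(raw: str) -> str:
    """Reduce replies such as 'This is Spanish.' to the bare name 'Spanish'."""
    label = raw.strip()
    # Drop a leading "This is", "This language is" or "The language is", if present.
    label = re.sub(r"^(this|the)\s+(language\s+)?is\s+", "", label, flags=re.IGNORECASE)
    # Drop trailing punctuation and any surrounding quotes or backticks.
    return label.strip(" .!\"'`")


for reply in ["French", "This is Spanish.", "This is Polish."]:
    print(normalize_language_label(reply))
```

In the loop, `lang = normalize_language_label(get_completion(prompt))` would then give uniform labels. Asking the model to answer with a single word is another option, though the reply may still need cleaning.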



## Tone Transformation
Writing can vary based on the intended audience. ChatGPT can produce different tones.


In [9]:
prompt = f"""
Translate the following from slang to a business letter: 
'Dude, This is Joe, check out this spec on this standing lamp.'
"""
response = get_completion(prompt)
print(response)

Dear Sir/Madam,

I am writing to bring to your attention the specifications of the standing lamp. 

Sincerely,
Joe


## Format Conversion
ChatGPT can translate between formats. The prompt should describe the input and output formats.

In [10]:
data_json = { "restaurant employees" :[ 
    {"name":"Shyam", "email":"shyamjaiswal@gmail.com"},
    {"name":"Bob", "email":"bob32@gmail.com"},
    {"name":"Jai", "email":"jai87@gmail.com"}
]}

prompt = f"""
Translate the following python dictionary from JSON to an HTML \
table with column headers and title: {data_json}
"""
response = get_completion(prompt)
print(response)

<html>
<head>
  <title>Restaurant Employees</title>
</head>
<body>
  <table>
    <tr>
      <th>Name</th>
      <th>Email</th>
    </tr>
    <tr>
      <td>Shyam</td>
      <td>shyamjaiswal@gmail.com</td>
    </tr>
    <tr>
      <td>Bob</td>
      <td>bob32@gmail.com</td>
    </tr>
    <tr>
      <td>Jai</td>
      <td>jai87@gmail.com</td>
    </tr>
  </table>
</body>
</html>


In [11]:
from IPython.display import display, Markdown, Latex, HTML, JSON
display(HTML(response))

| Name | Email |
| --- | --- |
| Shyam | shyamjaiswal@gmail.com |
| Bob | bob32@gmail.com |
| Jai | jai87@gmail.com |
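
Note that interpolating a Python dict into an f-string produces Python `repr` text (single-quoted strings), which is not valid JSON even though the prompt calls it JSON. The model coped with that here, but serializing with `json.dumps` makes the input match its description. A small sketch follows, using a shortened copy of the sample data; the variable names are illustrative.

```python
import json

data = {"restaurant employees": [
    {"name": "Shyam", "email": "shyamjaiswal@gmail.com"},
    {"name": "Bob", "email": "bob32@gmail.com"},
]}

as_repr = f"{data}"                    # Python repr: single-quoted keys and strings
as_json = json.dumps(data, indent=2)   # valid JSON: double-quoted keys and strings

prompt = (
    "Translate the following JSON to an HTML table "
    f"with column headers and a title:\n{as_json}"
)
print(prompt)
```

Because `as_json` round-trips through `json.loads`, the same string can also be reused later to check the model's output against the source data.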


## Spellcheck/Grammar Check

Here are some examples of common grammar and spelling problems and the LLM's response. 

To signal to the LLM that you want it to proofread your text, you instruct the model to 'proofread' or 'proofread and correct'.

In [13]:
text = [ 
  "The girl with the black and white puppies have a ball.",  # The girl has a ball.
  "Yolanda has her notebook.", # ok
  "Its going to be a long day. Does the car need it’s oil changed?",  # Homonyms
  "Their goes my freedom. There going to bring they’re suitcases.",  # Homonyms
  "Your going to need you’re notebook.",  # Homonyms
  "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
  "This phrase is to cherck chatGPT for speling abilitty"  # spelling
]
for t in text:
    prompt = f"""Proofread and correct the following text
    and rewrite the corrected version. If you don't find
    any errors, just say "No errors found". Don't use
    any punctuation around the text:
    ```{t}```"""
    response = get_completion(prompt)
    print(response)

The girl with the black and white puppies has a ball.
No errors found
No errors found.
Their goes my freedom. There going to bring they’re suitcases.

No errors found.

Rewritten: 
Their goes my freedom. There going to bring their suitcases.
You're going to need your notebook.
No errors found.
No errors found
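
The prompt uses "No errors found" as a sentinel, and the replies above show it arriving both with and without a trailing period. Code that consumes the reply has to tell the sentinel apart from corrected text. Here is a minimal sketch, assuming the reply is either the sentinel or the corrected text on its own; `apply_proofread_reply` is a hypothetical helper and does not handle mixed replies like the fourth one above.

```python
def apply_proofread_reply(original: str, reply: str) -> str:
    """Return the corrected text, or the original if the model reported no errors."""
    cleaned = reply.strip()
    # Accept the sentinel with or without a trailing period, in any letter case.
    if cleaned.rstrip(".").strip().lower() == "no errors found":
        return original
    # Remove backticks or quotes the model may have echoed around the text.
    return cleaned.strip("`\"' \n")


print(apply_proofread_reply("Yolanda has her notebook.", "No errors found."))
print(apply_proofread_reply(
    "Your going to need you're notebook.",
    "You're going to need your notebook.",
))
```

A sentinel reply only means the model reported no errors, not that the text is correct, so the original still deserves a second look for important content.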


In [14]:
text = f"""
Got this for my daughter for her birthday cuz she keeps taking \
mine from my room.  Yes, adults also like pandas too.  She takes \
it everywhere with her, and it's super soft and cute.  One of the \
ears is a bit lower than the other, and I don't think that was \
designed to be asymmetrical. It's a bit small for what I paid for it \
though. I think there might be other options that are bigger for \
the same price.  It arrived a day earlier than expected, so I got \
to play with it myself before I gave it to my daughter.
"""
prompt = f"proofread and correct this review: ```{text}```"
response = get_completion(prompt)
print(response)

Got this for my daughter for her birthday because she keeps taking mine from my room. Yes, adults also like pandas too. She takes it everywhere with her, and it's super soft and cute. One of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's a bit small for what I paid for it though. I think there might be other options that are bigger for the same price. It arrived a day earlier than expected, so I got to play with it myself before I gave it to my daughter.


In [53]:
# ! pip3 install redlines

In [16]:
from redlines import Redlines

diff = Redlines(text,response)
display(Markdown(diff.output_markdown))

Got this for my daughter for her birthday <span style='color:red;font-weight:700;text-decoration:line-through;'>cuz </span><span style='color:green;font-weight:700;'>because </span>she keeps taking mine from my <span style='color:red;font-weight:700;text-decoration:line-through;'>room.  </span><span style='color:green;font-weight:700;'>room. </span>Yes, adults also like pandas <span style='color:red;font-weight:700;text-decoration:line-through;'>too.  </span><span style='color:green;font-weight:700;'>too. </span>She takes it everywhere with her, and it's super soft and <span style='color:red;font-weight:700;text-decoration:line-through;'>cute.  </span><span style='color:green;font-weight:700;'>cute. </span>One of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's a bit small for what I paid for it though. I think there might be other options that are bigger for the same <span style='color:red;font-weight:700;text-decoration:line-through;'>price.  </span><span style='color:green;font-weight:700;'>price. </span>It arrived a day earlier than expected, so I got to play with it myself before I gave it to my daughter.
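
If installing `redlines` is not an option, a plain-text word-level diff can be sketched with the standard library's `difflib`. This is an alternative technique, not what `Redlines` does internally, and `word_diff` is a hypothetical helper. Because it splits on whitespace, it ignores the double-space changes that show up in the output above.

```python
import difflib


def word_diff(before: str, after: str) -> list[str]:
    """Return word-level changes, marking deletions with '-' and insertions with '+'."""
    a, b = before.split(), after.split()
    changes = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes.append("- " + " ".join(a[i1:i2]))
        if tag in ("insert", "replace"):
            changes.append("+ " + " ".join(b[j1:j2]))
    return changes


print(word_diff(
    "Got this for my daughter cuz she keeps taking mine",
    "Got this for my daughter because she keeps taking mine",
))
# ['- cuz', '+ because']
```

Calling `word_diff(text, response)` on the review and its proofread version lists only the wording changes, which is convenient for a quick check.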

In [17]:
prompt = f"""
proofread and correct this review. Make it more compelling. 
Ensure it follows APA style guide and targets an advanced reader. 
Output in markdown format.
Text: ```{text}```
"""
response = get_completion(prompt)
display(Markdown(response))

I purchased this adorable panda plush toy for my daughter's birthday as she kept borrowing mine from my room. It's not just for kids - even adults can appreciate the charm of pandas. The plush is incredibly soft and cute, making it the perfect companion for my daughter wherever she goes. However, I did notice a slight asymmetry in the ears, which I believe was not intentional. Additionally, I found the size to be a bit smaller than expected given the price. I believe there are larger options available for the same cost. Despite this, the plush arrived a day earlier than anticipated, allowing me to enjoy it myself before gifting it to my daughter. Overall, while there are some minor flaws, the quality and cuteness of this panda plush make it a worthwhile purchase.

# Exercise
 - Complete prompts similar to the ones we did in class.
     - Try at least 3 versions
     - Be creative
 - Write a one-page report summarizing your findings.
     - Were there variations that didn't work well, e.g., where GPT hallucinated or was wrong?
 - What did you learn?

In [18]:
# Exercise 1: Movie Review Style Transformation
text = """
This movie was okay. The acting was decent and the special effects were fine. 
The story was a bit predictable but I didn't hate it. Might watch it again if there's nothing else on.
"""

prompt = f"""
Transform this casual movie review into the following styles:
1. Professional film critic (sophisticated vocabulary, technical analysis)
2. Excited teenager on social media (with emojis and slang)
3. Haiku poetry format
4. Shakespearean style

Text: ```{text}```
"""
response = get_completion(prompt)
print("Movie Review Transformations:")
print(response)

Movie Review Transformations:
1. Professional film critic:
The cinematic endeavor in question presents itself as passable, with performances that can be deemed adequate and visual effects that meet the standard. The narrative, while lacking in originality, does not evoke disdain. One may consider revisiting this work in the absence of alternative viewing options.

2. Excited teenager on social media:
OMG this movie was like, okay I guess? The actors were decent and the effects were fine I guess. The story was kinda predictable but not the worst. Might watch it again if I'm bored lol 😂🎬

3. Haiku:
Acting and effects,
Story somewhat predictable,
Might watch it again.

4. Shakespearean style:
Verily, this moving picture didst not stir great passions within mine breast. The performances were of middling quality, and the visual illusions did not astound. The tale, though lacking in novelty, did not provoke abhorrence. Perchance I shall view it again, should no other entertainment pres

# Analysis Report: Text Transformation Using LLMs

## 1. Translation Capabilities
### Strengths:
- Accurate translations between major languages (English, Spanish, French)
- Maintains context and meaning across translations
- Handles formal/informal tone distinctions well

### Limitations:
- May struggle with highly technical or domain-specific terms
- Cultural nuances might be lost in translation
- Some languages may have less accurate translations

## 2. Tone Transformation
### Successful Transformations:
- Business formal to casual conversational
- Technical to simple explanations
- Professional to social media style

### Challenges:
- Maintaining consistent tone throughout longer texts
- Risk of stereotypical or exaggerated language
- Balancing formality with readability

## 3. Format Conversion
### Effective Uses:
- JSON to HTML conversion
- Structured data to natural language
- Technical documentation to user-friendly guides

### Areas for Improvement:
- Complex nested structures
- Maintaining data integrity in transformations
- Handling edge cases and special characters

## 4. Grammar and Spelling Correction
### Strengths:
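- Fixes some common grammatical errors, such as subject-verb agreement ("puppies have" to "has") and "your"/"you're"
- Cleans up informal wording in longer text ("cuz" to "because")
- Leaves correct sentences unchanged

### Limitations:
- Missed several homophone errors in the run above ("its"/"it's", "their"/"there", "effects"/"affect") and the deliberately misspelled test sentence, reporting "No errors found"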
- Sometimes overcorrects informal writing
- Can be too rigid with style rules

## 5. Key Learnings
1. LLMs are highly versatile in text transformation tasks
2. The quality of output depends heavily on prompt clarity
3. Context is crucial for accurate transformations
4. Some tasks require multiple iterations for best results

## 6. Practical Applications
- Customer service communication
- Content localization
- Technical documentation
- Educational materials
- Marketing content adaptation

## 7. Best Practices Identified
1. Always provide clear context in prompts
2. Use specific examples when needed
3. Break complex transformations into smaller steps
4. Verify output for critical applications
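
For the format-conversion case, "verify output" can be made concrete by parsing the generated HTML and comparing its cells with the source records. The following is a sketch using only the standard library. `TableCellCollector` and `table_matches` are hypothetical names, and the check assumes a single table whose first row is the header.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect the text of every <th> and <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # list of rows, each a list of cell strings
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def table_matches(html: str, records: list[dict], fields: list[str]) -> bool:
    """True if the table's data rows equal `records` projected onto `fields`."""
    parser = TableCellCollector()
    parser.feed(html)
    data_rows = [[c.strip() for c in row] for row in parser.rows[1:]]  # skip header row
    expected = [[str(r[f]) for f in fields] for r in records]
    return data_rows == expected


sample_html = (
    "<table><tr><th>Name</th><th>Email</th></tr>"
    "<tr><td>Bob</td><td>bob32@gmail.com</td></tr></table>"
)
records = [{"name": "Bob", "email": "bob32@gmail.com"}]
print(table_matches(sample_html, records, ["name", "email"]))  # True
```

A check like this catches dropped rows or altered values, but not styling or layout problems, so it complements human review rather than replacing it.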

## 8. Future Considerations
- Need for domain-specific fine-tuning
- Importance of maintaining original meaning
- Balance between automation and human review
- Potential for new transformation applications