# LLMs for Text Transformation

In this notebook, we will explore how to use Large Language Models for text transformation tasks such as language translation, spelling and grammar checking, tone adjustment, and format conversion.

## Setup

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')

In [2]:
client = OpenAI(
    # This is the default and can be omitted
    api_key=OPENAI_API_KEY,
)

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0): 
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature, 
    )
    return response.choices[0].message.content

## Translation

The model is trained on text in many languages, which gives it the ability to translate. Here are some examples of how to use this capability.

In [3]:
prompt = f"""
Translate the following English text to Spanish: \
```Hi, I would like to order a blender```
"""
response = get_completion(prompt)
print(response)

Hola, me gustaría ordenar una licuadora.


In [4]:
prompt = f"""
Tell me which language this is: 
```Combien coûte le lampadaire?```
"""
response = get_completion(prompt)
print(response)

French


In [34]:
prompt = f"""
Translate the following text to French and Spanish
and pirate: \
```I want to order a basketball```
"""
response = get_completion(prompt)
print(response)

French: "Je veux commander un ballon de basket"

Spanish: "Quiero ordenar un balón de baloncesto"

Pirate: "I be wantin' to order a basketball"


In [6]:
prompt = f"""
Translate the following text to Spanish in both the \
formal and informal forms: 
'Would you like to order a pillow?'
"""
response = get_completion(prompt)
print(response)

Formal: ¿Le gustaría ordenar una almohada?
Informal: ¿Te gustaría ordenar una almohada?


### Universal Translator
Imagine you are in charge of IT at a large multinational e-commerce company. Users message you about IT issues in their native languages, and your staff, who come from all over the world, each speak only their own native language. You need a universal translator!

In [7]:
user_messages = [
  "La performance du système est plus lente que d'habitude.",  # System performance is slower than normal         
  "Mi monitor tiene píxeles que no se iluminan.",              # My monitor has pixels that are not lighting
  "Il mio mouse non funziona",                                 # My mouse is not working
  "Mój klawisz Ctrl jest zepsuty",                             # My keyboard has a broken control key
  "我的屏幕在闪烁"                                               # My screen is flashing
] 

In [8]:
for issue in user_messages:
    prompt = f"Tell me what language this is: ```{issue}```"
    lang = get_completion(prompt)
    print(f"Original message ({lang}): {issue}")

    prompt = f"""
    Translate the following text to English \
    and Korean: ```{issue}```
    """
    response = get_completion(prompt)
    print(response, "\n")

Original message (This is French.): La performance du système est plus lente que d'habitude.
English: "The system performance is slower than usual."

Korean: "시스템 성능이 평소보다 느립니다." 

Original message (This is Spanish.): Mi monitor tiene píxeles que no se iluminan.
English: "My monitor has pixels that do not light up."

Korean: "내 모니터에는 빛나지 않는 픽셀이 있습니다." 

Original message (Italian): Il mio mouse non funziona
English: My mouse is not working
Korean: 내 마우스가 작동하지 않습니다 

Original message (This is Polish.): Mój klawisz Ctrl jest zepsuty
English: My Ctrl key is broken
Korean: 나의 Ctrl 키가 고장 났어요 

Original message (This is Chinese.): 我的屏幕在闪烁
English: My screen is flickering
Korean: 내 화면이 깜박거립니다 
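
The language labels in the output above are inconsistent ("This is French." for one message, "Italian" for another) because the detection prompt does not constrain the answer format. One option is to ask for a single word in the prompt; another is to normalize the reply afterward. Below is a minimal sketch of the second approach. `normalize_language_label` is a hypothetical helper written for this notebook, not part of the OpenAI client, and the patterns it strips are assumptions based only on the replies seen here.

```python
import re

def normalize_language_label(raw: str) -> str:
    """Reduce replies such as 'This is French.' to the bare name 'French'."""
    label = raw.strip().strip("\"'`")
    # Drop a leading "This is" / "It is" / "The language is" preamble.
    label = re.sub(r"^(this|it|the language)\s+is\s+", "", label, flags=re.IGNORECASE)
    # Drop trailing punctuation left over from a full sentence.
    return label.rstrip(".!").strip()

for reply in ["This is French.", "Italian", "This is Chinese."]:
    print(normalize_language_label(reply))
```

In the loop above, this could be applied as `lang = normalize_language_label(get_completion(prompt))` so that every "Original message (...)" line uses the same label style.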



## Tone Transformation
Writing can vary based on the intended audience. ChatGPT can produce different tones.


In [9]:
prompt = f"""
Translate the following from slang to a business letter: 
'Dude, This is Joe, check out this spec on this standing lamp.'
"""
response = get_completion(prompt)
print(response)

Dear Sir/Madam,

I am writing to bring to your attention the specifications of a standing lamp that I believe may be of interest to you. 

Sincerely,
Joe


## Format Conversion
ChatGPT can translate between formats. The prompt should describe the input and output formats.

In [10]:
data_json = { "restaurant employees" :[ 
    {"name":"Shyam", "email":"shyamjaiswal@gmail.com"},
    {"name":"Bob", "email":"bob32@gmail.com"},
    {"name":"Jai", "email":"jai87@gmail.com"}
]}

prompt = f"""
Translate the following python dictionary from JSON to an HTML \
table with column headers and title: {data_json}
"""
response = get_completion(prompt)
print(response)

<html>
<head>
  <title>Restaurant Employees</title>
</head>
<body>
  <table>
    <tr>
      <th>Name</th>
      <th>Email</th>
    </tr>
    <tr>
      <td>Shyam</td>
      <td>shyamjaiswal@gmail.com</td>
    </tr>
    <tr>
      <td>Bob</td>
      <td>bob32@gmail.com</td>
    </tr>
    <tr>
      <td>Jai</td>
      <td>jai87@gmail.com</td>
    </tr>
  </table>
</body>
</html>


In [11]:
from IPython.display import display, Markdown, Latex, HTML, JSON
display(HTML(response))

| Name | Email |
| --- | --- |
| Shyam | shyamjaiswal@gmail.com |
| Bob | bob32@gmail.com |
| Jai | jai87@gmail.com |
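
Because the model generates the HTML freely, it is worth confirming that every value from the source dictionary actually made it into the table. Below is a minimal sketch using only the standard library's `html.parser`. `CellCollector` and `missing_values` are hypothetical names introduced here, and the sample HTML is deliberately missing one employee to show what a failed check looks like.

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect the text of every <th> and <td> cell in an HTML document."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data

def missing_values(html_text, records):
    """Return the record values that do not appear in any table cell."""
    parser = CellCollector()
    parser.feed(html_text)
    found = {cell.strip() for cell in parser.cells}
    return [v for record in records for v in record.values() if v not in found]

employees = [
    {"name": "Shyam", "email": "shyamjaiswal@gmail.com"},
    {"name": "Bob", "email": "bob32@gmail.com"},
]
# Sample HTML that omits Bob, to demonstrate a failing check.
sample_html = (
    "<table><tr><th>Name</th><th>Email</th></tr>"
    "<tr><td>Shyam</td><td>shyamjaiswal@gmail.com</td></tr></table>"
)
print(missing_values(sample_html, employees))
```

With the notebook's own data, the call would be `missing_values(response, data_json["restaurant employees"])`, adjusting the key to match the dictionary; an empty list means every name and email was found.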


## Spellcheck/Grammar Check

Here are some examples of common grammar and spelling problems and the LLM's response. 

To signal to the LLM that you want it to proofread your text, you instruct the model to 'proofread' or 'proofread and correct'.

In [12]:
text = [ 
  "The girl with the black and white puppies have a ball.",  # The girl has a ball.
  "Yolanda has her notebook.", # ok
  "Its going to be a long day. Does the car need it’s oil changed?",  # Homonyms
  "Their goes my freedom. There going to bring they’re suitcases.",  # Homonyms
  "Your going to need you’re notebook.",  # Homonyms
  "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
  "This phrase is to cherck chatGPT for speling abilitty"  # spelling
]
for t in text:
    prompt = f"""Proofread and correct the following text
    and rewrite the corrected version. If you don't find
    any errors, just say "No errors found". Don't use 
    any punctuation around the text:
    ```{t}```"""
    response = get_completion(prompt)
    print(response)

The girl with the black and white puppies has a ball.
No errors found
No errors found.
Their goes my freedom. There going to bring they’re suitcases.

No errors found.

Rewritten: 
Their goes my freedom. There going to bring their suitcases.
You're going to need your notebook.
No errors found.
No errors found
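
The replies above are hard to check by eye: "No errors found" appears with and without a period, and one reply mixes the verdict with a rewrite. Below is a minimal sketch of a helper that labels each reply so misses stand out. `classify_reply` is a hypothetical function introduced here, and its matching rule (an exact "No errors found" verdict, or a reply identical to the input) is an assumption about the reply format.

```python
def classify_reply(original: str, reply: str) -> str:
    """Label a proofreading reply as 'no_change' or 'corrected'."""
    cleaned = reply.strip().strip("`\"'").strip()
    verdict = cleaned.rstrip(".").lower()
    if verdict == "no errors found" or cleaned == original.strip():
        return "no_change"
    return "corrected"

# (input sentence, model reply) pairs taken from the run above.
pairs = [
    ("Yolanda has her notebook.", "No errors found"),
    ("Your going to need you’re notebook.", "You're going to need your notebook."),
    ("This phrase is to cherck chatGPT for speling abilitty", "No errors found"),
]
for original, reply in pairs:
    print(classify_reply(original, reply), "<-", original)
```

A `no_change` label on a sentence known to contain errors, such as the last pair, is a missed correction rather than a pass.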


In [13]:
text = f"""
Got this for my daughter for her birthday cuz she keeps taking \
mine from my room.  Yes, adults also like pandas too.  She takes \
it everywhere with her, and it's super soft and cute.  One of the \
ears is a bit lower than the other, and I don't think that was \
designed to be asymmetrical. It's a bit small for what I paid for it \
though. I think there might be other options that are bigger for \
the same price.  It arrived a day earlier than expected, so I got \
to play with it myself before I gave it to my daughter.
"""
prompt = f"proofread and correct this review: ```{text}```"
response = get_completion(prompt)
print(response)

I got this for my daughter for her birthday because she keeps taking mine from my room. Yes, adults also like pandas too. She takes it everywhere with her, and it's super soft and cute. One of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's a bit small for what I paid for it though. I think there might be other options that are bigger for the same price. It arrived a day earlier than expected, so I got to play with it myself before I gave it to my daughter.


In [14]:
#!pip3 install redlines

In [15]:
from redlines import Redlines

diff = Redlines(text,response)
display(Markdown(diff.output_markdown))

<span style='color:red;font-weight:700;text-decoration:line-through;'>Got </span><span style='color:green;font-weight:700;'>I got </span>this for my daughter for her birthday <span style='color:red;font-weight:700;text-decoration:line-through;'>cuz </span><span style='color:green;font-weight:700;'>because </span>she keeps taking mine from my <span style='color:red;font-weight:700;text-decoration:line-through;'>room.  </span><span style='color:green;font-weight:700;'>room. </span>Yes, adults also like pandas <span style='color:red;font-weight:700;text-decoration:line-through;'>too.  </span><span style='color:green;font-weight:700;'>too. </span>She takes it everywhere with her, and it's super soft and <span style='color:red;font-weight:700;text-decoration:line-through;'>cute.  </span><span style='color:green;font-weight:700;'>cute. </span>One of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's a bit small for what I paid for it though. I think there might be other options that are bigger for the same <span style='color:red;font-weight:700;text-decoration:line-through;'>price.  </span><span style='color:green;font-weight:700;'>price. </span>It arrived a day earlier than expected, so I got to play with it myself before I gave it to my daughter.
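
If `redlines` is not installed, a rougher word-level comparison is possible with the standard library's `difflib`. This is a sketch of an alternative, not a replacement for the `Redlines` output above: `word_changes` is a hypothetical helper, and because it splits on whitespace it ignores spacing-only edits such as the double spaces flagged in the diff above.

```python
import difflib

def word_changes(before: str, after: str):
    """Return (tag, old_words, new_words) for every span that differs."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes

before = "Got this for my daughter cuz she keeps taking mine"
after = "I got this for my daughter because she keeps taking mine"
for change in word_changes(before, after):
    print(change)
```

In the notebook, `word_changes(text, response)` would list the substantive edits the model made to the review.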

In [16]:
prompt = f"""
proofread and correct this review. Make it more compelling. 
Ensure it follows APA style guide and targets an advanced reader. 
Output in markdown format.
Text: ```{text}```
"""
response = get_completion(prompt)
display(Markdown(response))

I purchased this adorable panda plush toy for my daughter's birthday as she kept borrowing mine from my room. It's not just for kids - even adults can appreciate the charm of pandas. The plush is incredibly soft and cute, making it the perfect companion for my daughter wherever she goes. However, I did notice that one of the ears is slightly lower than the other, which seems unintentional. Despite this minor flaw, I found the size to be a bit smaller than expected given the price. I believe there are larger options available for the same cost. On the bright side, the plush arrived a day earlier than anticipated, allowing me to enjoy it myself before gifting it to my daughter. Overall, while there are some minor imperfections, the quality and cuteness of this panda plush make it a worthwhile purchase.

# Exercise
 - Complete the prompts similar to what we did in class. 
     - Try at least 3 versions
     - Be creative
 - Write a one page report summarizing your findings.
     - Were there variations that didn't work well, i.e., where GPT either hallucinated or was wrong?
 - What did you learn?

### Exercise 1: Creative Tone Transformation

In [23]:
prompt = f"""
Transform the following casual text into a formal email:
Hey there! I noticed some issues with my laptop. Can you take a look?
"""

response = get_completion(prompt)
print(response)

Subject: Request for Laptop Repair

Dear [Recipient's Name],

I hope this email finds you well. I am writing to bring to your attention some issues that I have been experiencing with my laptop. I would greatly appreciate it if you could take a look at it and provide any necessary repairs or maintenance.

Thank you for your attention to this matter.

Sincerely,
[Your Name]


### Exercise 2: Advanced Format Conversion

In [24]:
prompt = f"""
Convert the following CSV data into a LaTeX table format:
"""

data_csv = {
    "students": [
        {"name": "Alice", "grade": "A"},
        {"name": "Bob", "grade": "B"},
        {"name": "Charlie", "grade": "C"}
    ]
}

response = get_completion(prompt)
print(response)

```
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|}
\hline
Name & Age & Occupation \\
\hline
Alice & 25 & Engineer \\
Bob & 30 & Teacher \\
Charlie & 35 & Doctor \\
David & 40 & Lawyer \\
\hline
\end{tabular}
\caption{Sample Data}
\end{table}
```
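
The table above does not match `data_csv`. The prompt is built before the data is defined and never interpolates it, and the data is a Python dictionary rather than CSV, so the model invented its own columns and rows. Below is a minimal sketch of one way to repair the cell, assuming the intent was to send real CSV text. `to_csv` and the `<csv>` delimiters are illustrative choices, not part of the original notebook.

```python
import csv
import io

students = [
    {"name": "Alice", "grade": "A"},
    {"name": "Bob", "grade": "B"},
    {"name": "Charlie", "grade": "C"},
]

def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(students)

# Build the prompt after the data exists, and interpolate the data into it.
prompt = f"""
Convert the following CSV data into a LaTeX table.
Use only the rows given and do not add any data.
<csv>
{csv_text}</csv>
"""
print(prompt)
# response = get_completion(prompt)  # get_completion is the helper from Setup
```

Printing the prompt before sending it is a cheap way to confirm that the data is actually present.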


### Exercise 3: Language Translation Challenge

In [31]:
prompt = f"""
Translate the following text into Old English, PT-PT, PT-BR, Spanish, Dutch, French, German, Yoda, cowboy and pirate:
I need help with setting up my new phone. Can you guide me through it?
"""

response = get_completion(prompt)
print(response)

Old English:
Ic þearf helpan mid settan up min nīw fōn. Cūþest þū mē þurh hit?

PT-PT:
Preciso de ajuda para configurar o meu novo telemóvel. Podes guiar-me através disso?

PT-BR:
Preciso de ajuda para configurar meu novo celular. Você pode me guiar nisso?

Spanish:
Necesito ayuda para configurar mi nuevo teléfono. ¿Puedes guiarme a través de ello?

Dutch:
Ik heb hulp nodig bij het instellen van mijn nieuwe telefoon. Kun je me daar doorheen leiden?

French:
J'ai besoin d'aide pour configurer mon nouveau téléphone. Pouvez-vous me guider à travers cela?

German:
Ich brauche Hilfe bei der Einrichtung meines neuen Telefons. Kannst du mich dabei führen?

Yoda:
Help with setting up my new phone, I need. Guide me through it, can you?

Cowboy:
I reckon I need some help settin' up my new phone. Can ya guide me through it?

Pirate:
Arrr, I be needin' some help with settin' up me new phone. Can ye guide me through it?


# **Report: Leveraging Large Language Models for Text Transformation Tasks**

## Introduction
This report explores the capabilities of OpenAI's GPT-3.5 Turbo model in various text transformation tasks, including translation, tone adjustment, format conversion, and spellcheck/grammar correction. By evaluating its performance across these tasks, we aim to understand the model's strengths, limitations, and practical applications in real-world scenarios.

## Methodology
Using Python and the OpenAI API, we conducted several experiments to assess the model's ability to transform and manipulate text:

1. **Translation**: We tested the model's proficiency in translating text between languages, both simple phrases and more complex sentences requiring formal or informal language outputs.

2. **Universal Translator**: Simulating a multilingual IT support scenario, we evaluated the model's accuracy in identifying languages and translating messages into English and Korean.

3. **Tone Transformation**: We examined the model's capability to adjust the tone of a message, transforming casual slang into a formal business letter format.

4. **Format Conversion**: The model was tasked with converting data structures, such as Python dictionaries, into HTML tables, showcasing its ability to handle structured data transformation.

5. **Spellcheck/Grammar Check**: We assessed the model's performance in identifying and correcting grammar and spelling errors in English text samples, ranging from single sentences to a full product review.

## Findings
### Strengths
- **Translation Accuracy**: GPT-3.5 Turbo reliably translated text between languages, maintaining semantic accuracy and adjusting for formal and informal speech.
  
- **Multilingual Support**: It successfully identified various languages and provided accurate translations into English and Korean, demonstrating robust multilingual capabilities.

- **Tone Adaptation**: The model effectively transformed casual language into formal business communication, suitable for professional correspondence.

- **Data Format Conversion**: It adeptly converted Python dictionaries into well-structured HTML tables, demonstrating its versatility in data manipulation tasks.

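- **Grammar and Spellcheck**: Results were mixed. GPT-3.5 Turbo fixed the subject-verb agreement error and the "your/you're" confusion, and it cleaned up the informal product review. However, it returned "No errors found" for the "its/it's" and "effects/affect" sentences and for the deliberately misspelled sentence, and it only partly corrected the "their/there/they're" sentence.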

### Limitations
- **Ambiguity Handling**: The model struggled with ambiguous or incomplete prompts. In the LaTeX exercise, the prompt never included the data, and the model invented a table whose columns and rows appear nowhere in the notebook.

- **Contextual Understanding**: While generally accurate, the model occasionally misinterpreted context-specific nuances, requiring careful validation of outputs in sensitive applications.

## Conclusion
OpenAI's GPT-3.5 Turbo proves to be a useful tool for text transformation tasks, with strong results in translation, tone adjustment, and format conversion, and more uneven results in proofreading. Its multilingual support and its ability to convert structured data make it valuable for a range of applications. However, outputs should be validated before use, since the model can miss errors or invent content when the prompt is ambiguous or incomplete.

# **What I Learned**

Through experimentation with OpenAI's GPT-3.5 Turbo model for text transformation tasks, several key insights were gained:

1. **Versatility**: GPT-3.5 Turbo demonstrates remarkable versatility across diverse text transformation tasks, including translation, tone adjustment, format conversion, and spellcheck/grammar correction.

2. **Accuracy in Translation**: The model excels in accurately translating text between languages, maintaining semantic integrity and adjusting for formal or informal language nuances.

3. **Multilingual Capabilities**: It effectively handles multilingual scenarios, identifying languages and providing accurate translations into target languages like English and Korean.

4. **Adaptability in Tone**: GPT-3.5 Turbo can adapt the tone of text, transforming casual language into formal formats suitable for professional communication.

5. **Data Manipulation**: It showcases strong capabilities in manipulating data structures, such as converting Python dictionaries into structured HTML tables.

6. **Grammar and Spellcheck**: The model corrected some grammar errors and polished a longer review, but it missed several homonym and spelling errors in the single-sentence tests, so proofreading output still needs review.

7. **Limitations**: Challenges include occasional misinterpretation of ambiguous prompts and nuances in context-specific applications, requiring careful validation of outputs.

Overall, GPT-3.5 Turbo proves to be a powerful tool for enhancing text quality and transforming content across various applications, underscoring its utility in natural language processing tasks.