## AI-Powered Language Tutor: A Personalized Learning Assistant Using OpenAI's Whisper and GPT

### Introduction

Learning a language is a personal and often difficult process that takes sustained practice and tools that respond to the learner. In this project, we use **OpenAI's language models and speech technologies** to build a **Personalized Language Tutor**: an AI-driven assistant that supports the learner with transcription, translation, and feedback.

The assistant integrates multiple features that support various aspects of language acquisition:

- **Speech-to-Text Transcription**: Using OpenAI's `whisper-1` model, the assistant transcribes spoken English from an audio file (`data/sample.wav`) into readable text.
- **Real-time Translation**: The transcribed text is translated into the user’s target language (e.g., French) using the `gpt-4o-mini` model, facilitating multilingual comprehension and practice.
- **Grammar Check**: The translated text is evaluated for grammatical correctness, and AI-generated feedback is provided to help the user learn and improve their writing skills.
- **Pronunciation Feedback**: The transcription of the user’s spoken input is compared with a target sentence, and the model offers constructive feedback to refine pronunciation and intonation.

By combining these capabilities, the assistant becomes a powerful ally for learners, enabling practice in listening, speaking, writing, and reading—all supported by AI. It is particularly beneficial for users seeking immersive, responsive learning environments that mimic one-on-one tutoring experiences.


In [19]:
import os
from openai import OpenAI

# Initialize the OpenAI client; the API key is read from the OPENAI_API_KEY environment variable
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

#### First, we transcribe the audio with the `whisper-1` model and store the result in `transcription_text`

In [21]:
with open("data/sample.wav", "rb") as audio_file:
    # Call the Whisper-1 model for audio transcription
    transcription_response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    
    # Store the transcribed text
    transcription_text = transcription_response.text 
print("Original English Text:", transcription_text)

Original English Text: The stale smell of old beer lingers. It takes heat to bring out the odor. A cold dip restores health and zest. A salt pickle tastes fine with ham. Tacos al pastor are my favorite. A zestful food is the hot cross bun.


#### Now we can translate the original `transcription_text` into another language, for example French

In [3]:
target_language = 'French'

completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": f"Please translate the following text to {target_language}: {transcription_text}"}        
        ]
    )

translated_text = completion.choices[0].message.content

print(f"Translated Text into {target_language}: {translated_text}")

Translated Text into French: Here is the translation of your text into French:

L'odeur rance de la vieille bière persiste. Il faut de la chaleur pour faire ressortir l'odeur. Un bain froid restaure la santé et le dynamisme. Un cornichon au sel se marie bien avec le jambon. Les tacos al pastor sont mes préférés. Un aliment plein de peps est le petit pain de Pâques.
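
Note that the reply above starts with a conversational preamble ("Here is the translation of your text into French:"), which is stored in `translated_text` and then passed on to the grammar check. One possible way to reduce this is a stricter system prompt. The sketch below only builds the message list; `build_translation_messages` is a hypothetical helper that is not part of the original notebook, and whether the model always honors the instruction is an assumption.

```python
def build_translation_messages(text, target_language):
    """Build chat messages that ask for the translation only (hypothetical helper)."""
    system_prompt = (
        f"You are a translator. Translate the user's text into {target_language}. "
        "Reply with the translation only, without any introduction or commentary."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": text},
    ]


messages = build_translation_messages("A cold dip restores health and zest.", "French")
# The list can be passed as `messages=` to client.chat.completions.create(...)
print(messages[0]["content"])
```

Keeping the text to translate alone in the user message, with the instruction in the system message, also makes it easier to reuse the same helper for other target languages.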


#### Next, we get grammar feedback for the translated text

In [4]:
completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a language learning assistant providing grammar feedback."},
            {"role": "user", "content":  f"Please correct any grammar mistakes in the following text and provide feedback: {translated_text}"}
        ]
    )
grammar_feedback = completion.choices[0].message.content

print(f"Grammar Checked Text: {grammar_feedback}")

Grammar Checked Text: Your translation into French is quite well done! However, there are a couple of points to consider for clarity and grammatical accuracy:

1. **Cornichon au sel**: While this phrase is grammatically correct, it might flow better as "des cornichons au sel" if you are speaking about them in general. This depends on whether you refer to a specific pickle or pickles in general.

2. **Aliment plein de peps**: This phrase is correct, but you might consider rephrasing it slightly to enhance clarity. A simple alternative could be "Un aliment qui a du peps est le petit pain de Pâques".

Here’s a revised version of your text:

**Revised Translation:**
L'odeur rance de la vieille bière persiste. Il faut de la chaleur pour faire ressortir l'odeur. Un bain froid restaure la santé et le dynamisme. Des cornichons au sel se marient bien avec le jambon. Les tacos al pastor sont mes préférés. Un aliment qui a du peps est le petit pain de Pâques.

**Overall Feedback:**
- The structur

#### Finally, we can request pronunciation feedback by comparing the original transcription text with a target sentence

In [5]:
target_text = "Heat brings out flavor, cold restores, salt complements ham, tacos are a favorite, and hot cross buns are zestful."  

completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a language learning assistant providing pronunciation feedback."},
            {"role": "user", "content": f"I said: {transcription_text}. How close is it to: {target_text}? Give feedback."}
        ]
    )
pronunciation_feedback = completion.choices[0].message.content

print(f"Pronunciation Feedback: {pronunciation_feedback}")

Pronunciation Feedback: Your pronunciation of the original sentences conveys the intended meaning well, but there are a few areas where you could improve clarity and fluency:

1. **Rhythm and Intonation**: Pay attention to the natural rhythm of speech. In English, we often emphasize certain words to convey meaning. For example, in "stale smell" and "brings out flavor," emphasize "stale" and "brings" respectively to enhance clarity.

2. **Vowel Sounds**: Ensure you are pronouncing vowel sounds distinctly. For words like "zest" and "ham," make sure the vowels are clear and adequately pronounced.

3. **Consonant Clarity**: In words like "pickle" and "tacos," ensuring that the initial consonant sounds are crisp will make your speech more comprehensible.

4. **Pacing**: Try to maintain a steady pace. Avoid rushing through parts of the sentence to allow listeners to absorb the information.

5. **Natural Connection of Phrases**: In more conversational contexts, you might want to connect relat
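
Because the chat model in the cell above receives only the transcribed text, a simple text-level comparison can complement its feedback with a number. The following is a minimal sketch, assuming English input and using Python's standard `difflib`; the helper names `tokenize`, `word_similarity`, and `missing_words` are hypothetical and do not come from the original notebook. It measures how closely the spoken words match the target wording, not acoustic pronunciation quality.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation (English only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_similarity(spoken, target):
    """Return a ratio between 0 and 1 describing how closely the word sequences match."""
    matcher = difflib.SequenceMatcher(None, tokenize(spoken), tokenize(target))
    return matcher.ratio()


def missing_words(spoken, target):
    """List target words that do not appear anywhere in the spoken text."""
    spoken_words = set(tokenize(spoken))
    return [word for word in tokenize(target) if word not in spoken_words]


spoken = "A salt pickle tastes fine with ham."
target = "Salt complements ham."
print(round(word_similarity(spoken, target), 2))
print(missing_words(spoken, target))
```

In the notebook, `transcription_text` and `target_text` could be passed in place of the two sample strings, and the resulting score could be included in the prompt to give the model something concrete to comment on.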

### Conclusion

This project demonstrates the potential of AI in transforming language education through accessible, personalized tools. By integrating OpenAI's models—`whisper-1` for speech recognition and `gpt-4o-mini` for language processing—we've created a comprehensive assistant that not only transcribes and translates spoken language but also provides intelligent grammar corrections and nuanced pronunciation feedback.

The use of **Harvard Sentences** as a starting point ensures phonetically balanced input for robust pronunciation analysis. From this foundation, the assistant empowers users to:

- Hear and understand how language sounds in real use,
- Reflect on their written grammar through real-time feedback,
- Learn how to pronounce and enunciate words correctly.

More than just a technical exercise, this project underscores how **AI can serve as a supportive, always-available tutor**, helping language learners gain confidence and proficiency at their own pace. The modular structure of this assistant also opens the door to future enhancements, such as conversation simulations, personalized vocabulary drills, and gamified progress tracking, making it a scalable solution for diverse educational needs.
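
As one illustration of that modular structure, the three chat calls in this notebook (translation, grammar feedback, pronunciation feedback) share the same shape and could be routed through a single helper. The sketch below is an assumption about how this might be organized, not the notebook's actual code: `ask_model` is a hypothetical name, and `FakeCompletions` is an offline stand-in used only so the example runs without an API key.

```python
from types import SimpleNamespace


def ask_model(client, system_prompt, user_prompt, model="gpt-4o-mini"):
    """Send one system + user message pair and return the reply text."""
    completion = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return completion.choices[0].message.content


class FakeCompletions:
    """Offline stand-in for client.chat.completions (illustration only)."""

    def create(self, model, messages):
        # Echo the model name and the last message instead of calling the API
        reply = f"[{model}] {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
reply = ask_model(fake_client, "You are a helpful assistant.", "Bonjour")
print(reply)
```

With the real `client` from the first cell, the same call would look like `ask_model(client, "You are a language learning assistant providing grammar feedback.", translated_text)`, and new features such as vocabulary drills would only need a new pair of prompts.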
