# Import

In [2]:
%load_ext autoreload

In [73]:
%autoreload 2
import os
import sys
from dotenv import load_dotenv
import networkx as nx
import ipywidgets as widgets
from collections import defaultdict

# Add the parent directory of 'src' to the Python path
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
# Load environment variables from .env file
load_dotenv()

from src.dialogue_generation import generate_story_plan, generate_dialogue_prompt, generate_dialogue, generate_recap
from src.audio_generation import text_to_speech, play_audio
from src.phrase import correct_grammar, correct_phrases
from src.language_graph import get_sentences_from_dialogue, get_merged_graph_from_sentences, create_interactive_sentence_graph, enrich_graph, generate_phrases, generate_phrase_progression
from src.initialise import initialise_usage_data
from src.utils import save_json, convert_defaultdict, save_defaultdict


## Set up Google Cloud credentials and prerequisites
You will need a Google Cloud project with the following APIs enabled:
* Text to Speech
* Translate
* Vertex AI with the following Anthropic models enabled (from the Model Garden)
    * Sonnet 3.5
    * Haiku

Add your GOOGLE_PROJECT_ID to the .env file, and edit src/config.json to set your target language.
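
Before running the cells below, it can help to confirm that the project ID is actually visible to the notebook. The following is a minimal sketch, assuming the variable is named GOOGLE_PROJECT_ID as described above. The helper `require_env` is hypothetical and is not part of the `src` package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of the project's src package.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Example call (run after load_dotenv() has populated the environment):
# project_id = require_env("GOOGLE_PROJECT_ID")
```

Failing early with a named variable is usually easier to debug than a credentials error raised later from inside a Google client library.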


In [10]:
from google.auth import default
credentials, project = default()

# Audio Language Trainer Workflow

The aim of this project is to create audio material for practising a foreign language. The material needs to be engaging and tailored to the words you want to practise.

The overall steps we follow are:

1. Create an outline story plan based on a theme you select (e.g. 'an adventure', 'a holiday in Rome'). An LLM produces a story plan following a typical story arc (exposition, rising action, climax, falling action, resolution). This ensures an engaging plot.
2. Flesh out the story using your practice vocabulary and grammatical concepts. Vocab and concepts are sampled from lists you provide in the 'data' folder (vocab_usage.json and grammar_concepts_usage.json), with sampling being skewed towards words you haven't heard yet. The output here is a dialogue between two people (Sam and Alex).

Recaps are generated between story parts so that each new dialogue continues logically from the previous one.

3. The dialogue is broken up into shorter practice phrases via a 'language graph' concept. You get not just the long-form dialogue to listen to and practise with, but also smaller, mixed-up phrases based on the vocab in the story, starting small and building to more complex phrases.
4. Your vocab list is updated based on the produced dialogue.
5. The smaller phrases and main dialogue are translated into your target language and converted to speech.
6. Research suggests that listening to double-speed audio (of words you already know) can help with listening comprehension in a foreign language, as it trains the brain to separate distinct words. We therefore create a fast version of the dialogue for listening practice.
7. The audio files are stitched together to create an MP3 file for each part of the story (there are 5 parts). The stages of each audio lesson are:
* dialogue in the target language
* practice phrases of the form 'how do you say "practice phrase" in "target language"?', followed by a pause (where you speak in the foreign language), then the correct translation played twice, first fast, then slow
* a repeat of the dialogue in the target language so you can satisfy yourself that you understand it properly
* 12 repeated playings of the fast version of the dialogue to improve your listening comprehension

You would then move on to the next audio lesson in the story.
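
The lesson structure described above can be sketched as an ordered list of segment labels. This is only an illustration: `build_lesson_sequence` and the label strings are hypothetical, and the real project stitches audio segments rather than strings.

```python
from typing import List


def build_lesson_sequence(phrases: List[str], fast_repeats: int = 12) -> List[str]:
    """Return the order of audio segments for one story part (labels only)."""
    sequence = ["dialogue (target language)"]
    for phrase in phrases:
        sequence.append(f"prompt: how do you say '{phrase}'?")
        sequence.append("pause for the learner to answer")
        sequence.append(f"answer (fast): {phrase}")
        sequence.append(f"answer (slow): {phrase}")
    sequence.append("dialogue (target language, repeat)")
    # The fast dialogue is repeated at the end for listening practice.
    sequence.extend(["dialogue (fast)"] * fast_repeats)
    return sequence


lesson = build_lesson_sequence(["i see", "the trip looks hard"])
for segment in lesson[:6]:
    print(segment)
```

Keeping the ordering in one place like this would make it easy to change, for example, the number of fast repeats without touching the audio code.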


## Set up your vocab and grammatical concepts
You should populate or edit:
* known_vocab_list.json 
* grammar_concepts.json

### Initialise the vocab and grammar counters
This creates vocab_usage.json (setting all values to 0) and grammar_concepts_usage.json (setting all values to 'true' and counts to 0).

You can tweak these files to limit which words and concepts you are exposed to.

In [19]:
initialise_usage_data(overwrite=False)  # overwrite=False stops you wiping your usage data if it already exists

Usage files already exist. Set overwrite=True to reinitialize.
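
To illustrate how usage counts could skew sampling towards words you have not heard yet, here is a minimal sketch. The flat word-to-count mapping, the weighting `1 / (1 + count)`, and the function name `sample_vocab` are all assumptions made for illustration; the project's actual sampling logic in `src` may differ.

```python
import random
from typing import Dict, List, Optional


def sample_vocab(usage: Dict[str, int], k: int, seed: Optional[int] = None) -> List[str]:
    """Sample up to k distinct words, favouring those with low usage counts."""
    rng = random.Random(seed)
    pool = dict(usage)  # copy so the caller's counters are untouched
    chosen: List[str] = []
    for _ in range(min(k, len(pool))):
        words = list(pool)
        # Assumed weighting: unheard words (count 0) get the largest weight.
        weights = [1.0 / (1 + pool[w]) for w in words]
        pick = rng.choices(words, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


usage_counts = {"hiking": 0, "banana": 0, "the": 25, "and": 40}
print(sample_vocab(usage_counts, k=2, seed=1))
```

With this weighting a word heard 25 times is 26 times less likely to be drawn than an unheard one, yet it can still appear occasionally for revision.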


# Begin Lesson Generation

## Create a story plan

In [74]:
STORY_DATA_PATH = "../data/story_data.json"
#this is where all the text data goes (prompts, dialogue, recaps etc)

In [25]:
story_plan = generate_story_plan(story_guide = "an outdoor adventure", test = True) #the test parameter will provide pre-canned responses avoiding LLM costs
story_plan

Data saved to ../data/story_plan.json


{'exposition': 'Two friends, Alex and Sam, decide to learn a new language together.',
 'rising_action': 'They face challenges in their studies and personal lives that test their commitment.',
 'climax': 'A language competition is announced, pushing them to their limits.',
 'falling_action': 'They prepare for the competition, supporting each other through difficulties.',
 'resolution': 'They participate in the competition, growing closer as friends and more confident in their language skills.'}

## Create all dialogue

1. Create the dialogue LLM prompt based on the story part
2. LLM generates dialogue
3. LLM generates recap
4. Move to the next story part and repeat


In [61]:
PAY_FOR_LLM = False

if PAY_FOR_LLM:
    story_dialogue = defaultdict(lambda: defaultdict(str))
    recap = "This is the beginning of the story."
    for step, story_part in enumerate(story_plan.keys()):
        prompt = generate_dialogue_prompt(story_part=story_part,
                                        story_part_outline=story_plan[story_part],
                                        last_recap = recap,
                                        verb_count=10,
                                        verb_use_count=3,
                                        vocab_count=100,
                                        vocab_use_count=10,
                                        grammar_concept_count=10,
                                        grammar_use_count=3)
        dialogue = generate_dialogue(prompt)
        recap = generate_recap(dialogue, test=False)
        story_dialogue[story_part]["dialogue_generation_prompt"] = prompt
        story_dialogue[story_part]["dialogue"] = dialogue
        story_dialogue[story_part]["recap"] = recap


In [64]:
story_data_dict = convert_defaultdict(story_dialogue)
save_json(story_data_dict, STORY_DATA_PATH)

Data saved to ../data/story_data.json
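
The nested `defaultdict` is convenient while filling in each story part, but a plain dict is easier to inspect and will not silently create keys on lookup. Below is a sketch of what a helper like `convert_defaultdict` might do. `to_plain_dict` is a hypothetical stand-in written for this example, not the project's implementation.

```python
import json
from collections import defaultdict


def to_plain_dict(obj):
    """Recursively convert defaultdicts (and dicts) into plain dicts."""
    if isinstance(obj, dict):
        return {key: to_plain_dict(value) for key, value in obj.items()}
    return obj


nested = defaultdict(lambda: defaultdict(str))
nested["exposition"]["recap"] = "Alex and Sam plan a hike."
plain = to_plain_dict(nested)
print(json.dumps(plain, indent=2))
```

After conversion, looking up a missing story part raises `KeyError` instead of quietly adding an empty entry, which is usually what you want once generation is finished.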


### Update the vocab lists based on the dialogue

The grammatical concepts are updated during prompt creation, as they are harder to extract from the dialogue.

In [65]:
%autoreload 2
from src.dialogue_generation import get_vocab_from_dialogue, update_vocab_usage

for story_part in story_plan.keys():
    dialogue = story_dialogue[story_part]["dialogue"]
    vocab_used = get_vocab_from_dialogue(dialogue)
    update_vocab_usage(vocab_used)

Data saved to ../data/vocab_usage.json
Data saved to ../data/vocab_usage.json
Data saved to ../data/vocab_usage.json
Data saved to ../data/vocab_usage.json
Data saved to ../data/vocab_usage.json
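
As a rough picture of the update step, the sketch below counts how often each known word appears in a dialogue and increments its counter. It assumes the dialogue is a list of `{'speaker', 'text'}` dicts (the shape shown in the outputs later in this notebook) and uses a simple regex tokeniser. `count_known_words` is hypothetical; the project's `get_vocab_from_dialogue` and `update_vocab_usage` may work differently.

```python
import re
from collections import Counter
from typing import Dict, List


def count_known_words(dialogue: List[Dict[str, str]], usage: Dict[str, int]) -> Dict[str, int]:
    """Return a copy of usage with counts incremented for each known word heard."""
    updated = dict(usage)
    for turn in dialogue:
        # Naive tokenisation: lower-case runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", turn["text"].lower())
        for word, n in Counter(tokens).items():
            if word in updated:
                updated[word] += n
    return updated


dialogue_example = [{"speaker": "Sam", "text": "Yes, I am. The trip looks hard."}]
print(count_known_words(dialogue_example, {"trip": 0, "hard": 2, "banana": 0}))
```

Words outside the known list are ignored here, so the counters only ever track vocabulary you chose to practise.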


### Build phrases from dialogue

Here we:
1. Break up the dialogue into separate sentences. At this stage we don't care who the speaker is; we just want to create phrases of different lengths and combinations based on the vocab in the dialogue.
2. Some of the resulting phrases may not make sense, so we pass them to a small LLM for minor corrections (e.g. he love haves three piece of cake -> he loves having three pieces of cake).
3. We then have a list of phrases for each story part.
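
The 'language graph' idea behind the first step can be sketched in plain Python. Each word becomes a node, an edge means one word followed another somewhere in the dialogue, and walking the edges produces short, mixed-up phrases. This is a simplified illustration under stated assumptions: the function names are hypothetical, and the project's own graph (which, judging by the output below, also tracks part-of-speech tags and priority edges) is more involved.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_word_graph(sentences: List[str]) -> Dict[str, Set[str]]:
    """Directed graph: an edge a -> b means word b followed word a somewhere."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            graph[current].add(following)
    return graph


def walk_phrases(graph: Dict[str, Set[str]], start: str, max_len: int) -> List[str]:
    """Enumerate phrases from start by following edges, never repeating a word."""
    phrases: List[str] = []

    def extend(path: List[str]) -> None:
        if len(path) > 1:
            phrases.append(" ".join(path))
        if len(path) == max_len:
            return
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in path:
                extend(path + [nxt])

    extend([start])
    return phrases


graph = build_word_graph(["i see the trip", "i want the food"])
for phrase in walk_phrases(graph, "i", max_len=4):
    print(phrase)
```

Because the two sentences share the word "the", the walk yields combinations such as "i see the food" that never appeared in the original dialogue, which is why a grammar-correction pass is useful afterwards.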

In [72]:
for story_part in story_data_dict:
    dialogue = story_data_dict[story_part]["dialogue"]
    original_sentences = get_sentences_from_dialogue(dialogue)
    merged_graph = get_merged_graph_from_sentences(original_sentences)
    enriched_graph = enrich_graph(merged_graph, num_enrichments=5)
    generated_phrases = generate_phrases(enriched_graph)
    phrase_progression = generate_phrase_progression(original_sentences, generated_phrases, max_edge_use=4)
    story_data_dict[story_part]["uncorrected_phrase_list"] = phrase_progression

Added edge from it (PRON) to let (VERB)
Added priority edge from together (ADV) to call (VERB)
Added priority edge from let (VERB) to work (NOUN)
Added priority edge from too (ADV) to done (VERB)
Added priority edge from must (AUX) to forget (VERB)
Added edge from my (PRON) to is (AUX)
Added priority edge from french (ADJ) to breakfast (NOUN)
Added priority edge from our (PRON) to will (AUX)
Added edge from i (PRON) to are (AUX)
Added edge from that (PRON) to has (AUX)
Added priority edge from french (ADJ) to competition (NOUN)
Added edge from what (PRON) to 'm (AUX)
Added edge from that (PRON) to announced (VERB)
Added edge from 's (PRON) to improve (VERB)
Added edge from me (PRON) to have (AUX)
Added edge from me (PRON) to let (VERB)
Added edge from my (PRON) to understand (VERB)
Added priority edge from 'm (AUX) to like (VERB)
Added priority edge from now (ADV) to like (VERB)
Added edge from what (PRON) to speaking (VERB)
Added priority edge from after (ADP) to time (NOUN)
Added edg

### Correct phrases using a small LLM
This step ensures the generated phrases are grammatically correct.

In [77]:
PAY_FOR_LLM = False

if PAY_FOR_LLM:
    for story_part in story_data_dict:
        phrase_list = story_data_dict[story_part]["uncorrected_phrase_list"]
        corrected_phrase_list = correct_phrases(phrase_list)
        story_data_dict[story_part]["corrected_phrase_list"] = corrected_phrase_list

In [80]:
save_defaultdict(story_data_dict, STORY_DATA_PATH)

Data saved to ../data/story_data.json


### Translate dialogue and phrases

In [18]:
%autoreload

# Get sentences from the dialogue
sentences = get_sentences_from_dialogue(input_text_restaurant)

# Create a merged graph from all of the sentences
G = get_merged_graph_from_sentences(sentences)


# Create the interactive graph for plotting
interactive_graph = create_interactive_sentence_graph(G)

# Display the interactive graph
display(interactive_graph)

VBox(children=(Button(description='Regenerate Graph', style=ButtonStyle()), CytoscapeWidget(cytoscape_layout={…

In [11]:
# Enrich the graph, maintaining grammatical directions
enriched_graph = enrich_graph(G, num_enrichments=5)

# Create the interactive graph
#interactive_graph = create_interactive_sentence_graph(enriched_graph)

# Display the interactive graph
#display(interactive_graph)

Added edge from what (PRON) to like (VERB)
Added priority edge from would (AUX) to order (VERB)
Added priority edge from anything (PRON) to order (VERB)
Added edge from anything (PRON) to choice (NOUN)
Added edge from anything (PRON) to order (NOUN)


Added priority edge from it (PRON) to 're (AUX)
Added edge from we (PRON) to are (AUX)
Added priority edge from 's (AUX) to saw (VERB)
Added priority edge from am (AUX) to says (VERB)
Added priority edge from some (DET) to hiking (NOUN)


In [199]:
phrase_progression

['i see',
 'i want',
 'we got',
 'you see',
 'i am says',
 'trip looks',
 'the summer',
 'you watching',
 "you 're right",
 'programme says',
 "sam it 's fine",
 'apples in my bag',
 'i never saw shoes',
 'what about the trip',
 'yes i see the hiking',
 'some lunch and i want',
 'apples in the weather',
 "that 's eight o'clock",
 'beautiful in the hiking',
 'you see some lunch and tea',
 'lunch and i see the hiking',
 'what about the programme ?',
 'we got bananas for the food',
 "shoes like that 's eight o'clock",
 'lunch and i see some green apples',
 "sam it 's saw shoes they are using",
 'they are you watching the long walk',
 "that 's important for the trip looks",
 "sam it 's important for the trip looks",
 'bananas for the expensive shoes they are you see',
 "sam , it 's eight o'clock .",
 'are you watching the hiking programme ?',
 'yes , i am .',
 'the trip looks hard .',
 'what about the food ?',
 'i see some lunch and tea .',
 'we got bananas for our trip , right ?',
 'yes ,

In [208]:
# now we correct the grammar in the phrases
%autoreload 2


resp = correct_phrases(phrase_progression)

Config file has been modified. Reloading...
Successfully loaded config from: c:\Users\i5\Documents\Python Scripts\audio-language-trainer\src\config.json
Multiple country codes available for en: en-AU, en-GB, en-IN, en-US


In [212]:
for new, old in zip(resp, phrase_progression):
    print(old, "---", new)

i see --- I see.
i want --- I want.
we got --- We got.
you see --- You see.
i am says --- I am saying.
trip looks --- The trip looks.
the summer --- The summer.
you watching --- You are watching.
you 're right --- You're right.
programme says --- The programme says.
sam it 's fine --- Sam, it's fine.
apples in my bag --- I have apples in my bag.
i never saw shoes --- I never saw those shoes.
what about the trip --- What about the trip?
yes i see the hiking --- Yes, I see the hiking.
some lunch and i want --- I want some lunch.
apples in the weather --- Apples in the weather.
that 's eight o'clock --- That's eight o'clock.
beautiful in the hiking --- It's beautiful in the hiking.
you see some lunch and tea --- You see, I want some lunch and tea.
lunch and i see the hiking --- I want lunch and I see the hiking.
what about the programme ? --- What about the programme?
we got bananas for the food --- We got bananas for the food.
shoes like that 's eight o'clock --- Shoes like that's eight 

In [None]:


corrected_phrases = []
for phrase in phrase_progression:
    corrected_phrase = correct_grammar(phrase)
    corrected_phrases.append(corrected_phrase)

corrected_phrases

In [9]:
%autoreload
from src.translation import translate_from_english


In [12]:
translate_from_english("hello there")

Config file has been modified. Reloading...
Successfully loaded config from: c:\Users\i5\Documents\Python Scripts\audio-language-trainer\src\config.json


'Hej där'

In [None]:

translated_phrases = []

for phrase in corrected_phrases:
    translated_phrase = translate_from_english(phrase)
    translated_phrases.append((phrase, translated_phrase))


In [67]:
%autoreload
import src.audio_generation
from src.audio_generation import get_voice_models, join_audio_segments, generate_translated_phrase_audio


Multiple country codes available for en: en-AU, en-GB, en-IN, en-US


In [30]:
dir(config)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_check_reload',
 '_file_modified_time',
 '_find_config_file',
 '_last_load_time',
 '_load_config',
 'config',
 'config_file']

In [None]:

# Get the voice models
english_models, target_models = config.get_voice_models()

# Use the models in your text-to-speech functions
# ...

# If you need to access other config values:
target_language = config.TARGET_LANGUAGE
use_cheap_voices = config.USE_CHEAP_VOICE_MODELS

In [78]:
tvm

{'language_code': 'sv-SE',
 'male_voice': 'sv-SE-Standard-D',
 'female_voice': 'sv-SE-Standard-A'}

In [54]:
#process a phrase
#english - pause - swedish slow - swedish fast

translated_phrase = translated_phrases[6]

In [55]:
x = generate_translated_phrase_audio(translated_phrase=translated_phrase,
                                     english_voice_models=evm,
                                     target_voice_models=svm)


In [56]:
play_audio(x)

In [148]:
%autoreload 2
from src.dialogue_generation import generate_story_plan, generate_dialogue_prompt, generate_recap, extract_json_from_llm_response, get_dialogue


In [None]:

story_plan = generate_story_plan(story_guide = "planning an exciting outdoor adventure", test=False)

In [137]:
dialogue_prompt1, words_list = generate_dialogue_prompt()

In [138]:
print(dialogue_prompt1)


    Create a brief dialogue for language learners using the following guidelines:
    1. This is for language practice. Only use words from the lists provided below.
    2. Pick from about 3 of these verbs:
    watch, stay, hurry, be, hope, cost, want, wake, can, become
    You can use other verbs if required to make the tenses work (e.g., auxiliary verbs).
    3. Use at least 10 words from this vocabulary list:
    yes, three, herself, warm, programme, app, so, outside, Friday, although, wife, evening, bed, everybody, boring, why, tired, about, o'clock, next, week, anything, shoe, difficult, at, when, here, course, angry, who, some, entire, warm, tonight, with, walk, father, leg, nothing, every, themselves, during, within, expensive, or, they, course, before, every, climate, who, Swedish, thanks, me, at, cigarette, today, outside, back, warm, first, we, are, important, why, glass, better, o'clock, sister, hard, yourself, now, today, to, herself, off, myself, time, television, differe

In [142]:
dialogue_exposition = anthropic_generate(dialogue_prompt1)

In [149]:
extract_json_from_llm_response(dialogue_exposition)

{'dialogue': [{'speaker': 'Alex',
   'text': 'Hey Sam, are you tired today? We can watch a programme about hiking tonight.'},
  {'speaker': 'Sam',
   'text': "Thanks, but I'm so sorry. I have to hurry to the university this evening."},
  {'speaker': 'Alex',
   'text': "That's okay. If we had watched it tonight, we would have become better hikers."},
  {'speaker': 'Sam',
   'text': "I know. Although I'm busy today, I hope we can watch it next week."},
  {'speaker': 'Alex',
   'text': "Sounds good! Why don't we watch it on Friday at 8 o'clock?"},
  {'speaker': 'Sam',
   'text': "Yes, that works for me. I'll be here with some snacks."}]}

In [151]:
dialogue_parsed_exposition = get_dialogue(dialogue_exposition)

In [153]:
(dialogue_parsed_exposition)

[{'speaker': 'Alex',
  'text': 'Hey Sam, are you tired today? We can watch a programme about hiking tonight.'},
 {'speaker': 'Sam',
  'text': "Thanks, but I'm so sorry. I have to hurry to the university this evening."},
 {'speaker': 'Alex',
  'text': "That's okay. If we had watched it tonight, we would have become better hikers."},
 {'speaker': 'Sam',
  'text': "I know. Although I'm busy today, I hope we can watch it next week."},
 {'speaker': 'Alex',
  'text': "Sounds good! Why don't we watch it on Friday at 8 o'clock?"},
 {'speaker': 'Sam',
  'text': "Yes, that works for me. I'll be here with some snacks."}]

In [155]:
recap1 = generate_recap(dialogue_parsed_exposition, test=False)

In [158]:
dialgoue_prompt2, _ = generate_dialogue_prompt()

In [159]:
print(dialgoue_prompt2)


    Create a brief dialogue for language learners using the following guidelines:
    1. This is for language practice. Only use words from the lists provided below.
    2. Pick from about 3 of these verbs:
    be, watch, smoke, stay, dream, meet, put, ate, see, want
    You can use other verbs if required to make the tenses work (e.g., auxiliary verbs).
    3. Use at least 10 words from this vocabulary list:
    o'clock, or, key, better, about, until, two, friend, lunch, leg, long, food, got, Sweden, fine, to, my, me, o'clock, never, summer, money, us, because, at, hard, from, child, telephone, some, school, mother, walk, shoe, myself, to, during, what, question, am, better, now, me, morning, glass, leg, your, expensive, lunch, glass, dining, house, banana, herself, month, year, today, policeman, anything, tea, important, problem, during, head, time, expensive, me, and, there, wife, computer, film, trip, programme, politics, down, theatre, radio, it's, often, beautiful, to, sick, oft

In [160]:
dialogue_rising_action = anthropic_generate(dialgoue_prompt2)

In [161]:
dialogue_parsed_2 = get_dialogue(dialogue_rising_action)
dialogue_parsed_2

[{'speaker': 'Alex',
  'text': "Sam, it's eight o'clock. Are you watching the hiking programme?"},
 {'speaker': 'Sam',
  'text': 'Yes, I am. The trip looks hard. What about the food?'},
 {'speaker': 'Alex',
  'text': 'I see some lunch and tea. We got bananas for our trip, right?'},
 {'speaker': 'Sam',
  'text': 'Yes, and I want to put some green apples in my bag.'},
 {'speaker': 'Alex',
  'text': "That's fine. Did you see the expensive shoes they are using?"},
 {'speaker': 'Sam',
  'text': "I never saw shoes like that. It's important for the long walk."},
 {'speaker': 'Alex',
  'text': "You're right. What about the weather during our trip?"},
 {'speaker': 'Sam',
  'text': "The programme says it's often beautiful in the summer."}]

In [162]:
recap2 = generate_recap(dialogue_parsed_2, test=False)

In [2]:
%autoreload 2
from src.audio_generation import generate_audio_from_dialogue, speed_up_audio, generate_normal_and_fast_audio
fast2a = speed_up_audio(audio2)


UsageError: Line magic function `%autoreload` not found.


In [1]:
fast2a

NameError: name 'fast2a' is not defined

In [188]:

audio_parts = generate_audio_from_dialogue(dialogue_parsed_2)


Config file has been modified. Reloading...
Successfully loaded config from: c:\Users\i5\Documents\Python Scripts\audio-language-trainer\src\config.json
Multiple country codes available for en: en-AU, en-GB, en-IN, en-US


In [195]:

normal, fast = generate_normal_and_fast_audio(audio_parts)

In [197]:
normal