# Story Generation
We remember things better as stories. The plan here is to pick a subset of our phrases, extract the vocabulary, and generate a story based on it. We can then pull in more flashcards / phrases to ensure more complete coverage of the story's vocabulary.

Each story is named story_some_title. When that name is added as a tag in Anki, the card shows a hyperlink to a Google Cloud Storage bucket, using the path format bucket/language/story_name/story_name.html

This makes it easy to add new stories to an existing flashcard deck: the links appear as soon as you add the tags.
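As a minimal sketch of the naming convention above, the link for a story can be derived from its tag. The helper name `story_url` is hypothetical (it is not part of the project's `src` package), and the lower-cased language folder is an assumption based on the URL printed after the upload step later in this notebook.

```python
from urllib.parse import quote

GCS_HOST = "https://storage.googleapis.com"


def story_url(bucket: str, language: str, story_name: str) -> str:
    """Build a public URL of the form bucket/language/story_name/story_name.html."""
    # Assumption: the language folder is lower-cased, as in the upload output.
    parts = [bucket, language.lower(), story_name, f"{story_name}.html"]
    # quote() percent-encodes any characters that are not URL-safe.
    return GCS_HOST + "/" + "/".join(quote(part) for part in parts)


print(story_url("audio-language-trainer-stories", "Swedish", "story_some_title"))
```

Because the URL depends only on the bucket, language and tag, nothing needs to be stored on the card itself apart from the tag.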

In [1]:
%load_ext autoreload
%autoreload 2
from dotenv import load_dotenv

load_dotenv()

PAY_FOR_API = True  # set to False to skip cells that cost money via API calls

In [2]:
import random
from pathlib import Path
from pprint import pprint

from src.anki_tools import AnkiCollectionReader, get_deck_contents
from src.config_loader import config
from src.nlp import (
    create_flashcard_index,
    find_missing_vocabulary,
    get_vocab_dict_from_dialogue,
    get_vocab_dictionary_from_phrases,
)
from src.utils import (
    load_json,
    load_text_file,
    save_json,
    save_pickle,
    load_pickle,
    upload_story_to_gcs,
    upload_to_gcs,
)
# Add the parent directory of 'src' to the Python path


FFmpeg path added to system PATH: C:\Program Files\ffmpeg-7.0-essentials_build\bin


### Add directories
Story images can be reused between languages, but audio files are language-specific. We therefore structure the story directory as story_name/language: audio files go in the language subdirectory, while the images and the English JSON file go in the story_name directory.

In [3]:
notebook_dir = Path().absolute()  # This gives src/notebooks
phrase_dir = notebook_dir.parent / "data" / "phrases" #where we store text files of phrases
story_dir = notebook_dir.parent / "outputs" / "stories" # where we store our stories
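The layout described above can be summarised in one place. This is only a sketch: `story_paths` is a hypothetical helper, and the file names simply mirror the paths used in the cells of this notebook.

```python
from pathlib import Path


def story_paths(story_dir: Path, story_name: str, language: str) -> dict:
    """Return the main locations used for one story in one language."""
    story_root = story_dir / story_name    # shared: images + English JSON
    language_root = story_root / language  # language-specific: audio, pickle, HTML
    return {
        "english_json": story_root / f"{story_name}.json",
        "image_dir": story_root,
        "language_dir": language_root,
        "pickle": language_root / f"{story_name}.pkl",
        "html": language_root / f"{story_name}.html",
    }


paths = story_paths(Path("outputs") / "stories", "story_some_title", "Swedish")
for name, path in paths.items():
    print(f"{name}: {path}")
```

Adding a second language for the same story then only creates a new language subdirectory; the images and English JSON are left untouched.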

We already have flashcards generated for some phrases.
A flashcard index lets us select flashcards that cover a specific vocabulary range. Building it with create_flashcard_index is computationally expensive, so we build it once and save it to disk.

In [4]:
PHRASE_LIST_NAME = "swedish_language_learning"
phrase_file = phrase_dir / f"{PHRASE_LIST_NAME}.txt"
phrases = load_text_file(phrase_file)
pprint(f"First few phrases {phrases[:10]}")



("First few phrases ['a friendly Swedish conversation partner', 'Are there any "
 "Swedish events nearby?', 'Are there any Swedish holidays soon?', 'aromatic "
 "cinnamon buns for fika', 'authentic Swedish pronunciation practice', 'Can "
 "you recommend a Swedish book?', 'Can you suggest any Swedish films?', "
 "'colourful wildflowers in a sunny meadow', 'cosy language exchange meetup', "
 "'Could you explain this Swedish phrase?']")


## Create the flashcard index
This makes it very fast to find matching flashcards from a given vocab list

In [5]:
# long process, so only create the index if it doesn't already exist
index_file = phrase_dir / f"{PHRASE_LIST_NAME}_index.json"

if index_file.exists():
    phrase_index = load_json(index_file)
else:
    phrase_index = create_flashcard_index(phrases)
    save_json(data=phrase_index, file_path=index_file)



Indexes phrases...: 100%|██████████| 52/52 [01:06<00:00,  1.27s/it]
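To illustrate why an index makes lookups fast, here is a simplified inverted index that maps each word to the positions of the phrases containing it. This is an assumption about the general idea only: the real `create_flashcard_index` takes about a second per phrase, so it presumably does heavier NLP work than this regex tokeniser, and it keeps separate `verb_index` and `vocab_index` entries. `build_word_index` is a hypothetical name.

```python
import re
from collections import defaultdict


def build_word_index(phrases):
    """Map each lower-cased word to the positions of the phrases that contain it."""
    index = defaultdict(list)
    for position, phrase in enumerate(phrases):
        # set() so a word repeated inside one phrase is recorded only once
        for word in sorted(set(re.findall(r"[a-z']+", phrase.lower()))):
            index[word].append(position)
    return {"phrases": list(phrases), "vocab_index": dict(index)}


demo = build_word_index(
    ["Can you recommend a Swedish book?", "Can you suggest any Swedish films?"]
)
print(demo["vocab_index"]["swedish"])  # positions of phrases containing "swedish"
```

Finding every phrase for a word is then a single dictionary lookup rather than a scan over all phrases.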


## Sample some phrases to generate the story from
This will pin the story to the vocab found in some pre-existing phrases

In [11]:
#we can obtain phrases we know to create a story from:
# NOTE: you must close Anki Desktop when trying to form a connection here
with AnkiCollectionReader() as reader:
    pprint(reader.get_deck_names())

#this will print out deck_id : deck_name -> we want to copy the relevant deck_name

{1: 'Default',
 1731524665442: 'Swedish EAL',
 1732020971325: 'RapidRetention - Swedish - LM1000',
 1732316149591: 'RapidRetention - Russian - LM1000',
 1732316936163: 'RapidRetention - Italian - LM1000',
 1732637740663: 'RapidRetention - Welsh - LM1000',
 1732980361514: 'RapidRetention - Russian - GCSE',
 1732993700879: 'Persian Alphabet',
 1734260227418: 'RapidRetention - Swedish - NumbersDays',
 1734261644938: 'RapidRetention - Russian - NumbersDays',
 1734264578929: 'RapidRetention - Italian - NumbersDays',
 1734426251278: 'RapidRetention - French - NumbersDays',
 1735660105659: 'RapidRetention - Swedish - EatingOut',
 1735684325990: 'RapidRetention - Czech - EatingOut',
 1735687489606: 'RapidRetention - Welsh - EatingOut',
 1735998451053: 'RapidRetention - Italian - EatingOut',
 1737410923009: 'RapidRetention - Swedish - LanguageMeetUp'}


In [12]:
DECK_NAME = "RapidRetention - Swedish - LanguageMeetUp"
df = get_deck_contents(deck_name=DECK_NAME) #calculates knowledge score
df.head()

Unnamed: 0,note_id,model_name,tags,n_cards,avg_ease,total_reps,avg_reps,total_lapses,avg_lapses,avg_interval,TargetText,TargetAudio,TargetAudioSlow,EnglishText,WiktionaryLinks,Picture,TargetLanguageName,knowledge_score
0,1737410886674,Language Practice With Images,,3,0.0,0,0.0,0,0.0,0.0,välsliten svensk grammatiklärobok,[sound:3710f25e-8920-424f-82b3-28a5c4cbb42c.mp3],[sound:7945e360-168c-47c9-8d61-be13f098bec4.mp3],well-worn Swedish grammar textbook,"välsliten <a href=""https://en.wiktionary.org/w...","<img src=""f8fcab2a-1821-4337-80dc-9ecd5ddea90e...",Swedish,0.0
1,1737410886678,Language Practice With Images,,3,0.0,0,0.0,0,0.0,0.0,Svenska köttbullar med lingon,[sound:1f3e2c5d-3956-49a3-93f0-6b457f4cea40.mp3],[sound:600e641d-2c3c-4ed9-99e0-0fee2c726742.mp3],Swedish meatballs with lingonberries,"<a href=""https://en.wiktionary.org/wiki/svensk...","<img src=""339cd2f9-f3ce-4eb8-a1ae-77b9a5448ff4...",Swedish,0.0
2,1737410886682,Language Practice With Images,,3,0.0,0,0.0,0,0.0,0.0,autentisk svensk uttalspraxis,[sound:7ddbf66b-dc74-4ede-b673-0188148120a1.mp3],[sound:57f0b762-db0d-4143-ab8f-4825c2da72bb.mp3],authentic Swedish pronunciation practice,"<a href=""https://en.wiktionary.org/wiki/autent...","<img src=""cbed7cf0-8c6b-4fdf-bb1c-c5a71d4fae93...",Swedish,0.0
3,1737410886686,Language Practice With Images,,3,0.0,0,0.0,0,0.0,0.0,mysig språkutbytesträff,[sound:586c6239-d2c8-4e24-94d6-1e81010e399c.mp3],[sound:1347e7e3-65ab-4f64-a102-5c2f60680957.mp3],cosy language exchange meetup,"<a href=""https://en.wiktionary.org/wiki/mysig#...","<img src=""b44ca58d-2e29-481e-8376-236ee8954557...",Swedish,0.0
4,1737410886690,Language Practice With Images,,3,0.0,0,0.0,0,0.0,0.0,Har du besökt Sverige förut?,[sound:095c5937-4b37-412d-8378-e7136061687b.mp3],[sound:d28d9166-a773-4d86-883d-9a9cabe31885.mp3],Have you visited Sweden before?,"<a href=""https://en.wiktionary.org/wiki/har#Sw...","<img src=""8c33fd7e-b6bf-40de-ad72-547103679a57...",Swedish,0.0


Find phrases we know, and limit the flashcard index to those

In [13]:
from src.phrase import get_phrase_indices

known_phrases = df.query("knowledge_score > 0.2").sort_values(by="knowledge_score", ascending=False)['EnglishText'].tolist()

#we need to know the location of each phrase as an integer in the phrase_index
known_phrase_indicies = get_phrase_indices(known_phrases = known_phrases, all_phrases = phrase_index['phrases'])

In [14]:
from copy import deepcopy
from src.nlp import remove_unknown_index_values

#if we don't know a phrase, we don't want to retrieve that from the index and link it to a story
known_index = deepcopy(phrase_index)
known_index['verb_index'] = remove_unknown_index_values(known_phrase_indicies, known_index['verb_index'])
known_index['vocab_index'] = remove_unknown_index_values(known_phrase_indicies, known_index['vocab_index'])
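The two steps above (locating known phrases, then pruning the index) could look roughly like the following. These are hypothetical stand-ins for `get_phrase_indices` and `remove_unknown_index_values`, assuming an index of the form word -> list of phrase positions; the project's own implementations may differ.

```python
def phrase_positions(known_phrases, all_phrases):
    """Return the integer position of each known phrase within all_phrases."""
    lookup = {phrase: i for i, phrase in enumerate(all_phrases)}
    # Phrases that are not in the index are skipped rather than raising.
    return [lookup[p] for p in known_phrases if p in lookup]


def keep_known(known_positions, word_index):
    """Drop phrase positions we do not know; drop words left with no phrases."""
    known = set(known_positions)
    filtered = {}
    for word, positions in word_index.items():
        kept = [p for p in positions if p in known]
        if kept:
            filtered[word] = kept
    return filtered


positions = phrase_positions(["cosy meetup"], ["a book", "cosy meetup"])
print(keep_known(positions, {"cosy": [1], "book": [0], "a": [0]}))
```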

In [16]:
sampled_phrases = random.sample(known_phrases, min(75, len(known_phrases)))


In [6]:
# here we use the full phrase list; pass sampled_phrases instead to pin the story to known vocab
vocab_dict_flashcards = get_vocab_dictionary_from_phrases(phrases)  # around 75 phrases should give a decent amount of vocab

Now generate the story

In [7]:
from src.dialogue_generation import generate_story

story_name, story_dialogue = generate_story(vocab_dict_flashcards)


Function that called this one: generate_story. Sleeping for 20 seconds
generated story: Swedish Adventure in Winter Wilderness


In [8]:
#story_name = "lost_in_stockholm"
clean_story_name = f"story_{story_name.lower().replace(' ', '_')}"
story_path = story_dir / clean_story_name / f"{clean_story_name}.json"

save_json(story_dialogue, story_path)
print(f"saved story to {story_path}")

saved story to y:\Python Scripts\audio-language-trainer\outputs\stories\story_swedish_adventure_in_winter_wilderness\story_swedish_adventure_in_winter_wilderness.json
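The name above doubles as an Anki tag and as a folder name, and Anki tags cannot contain spaces. Since the title comes from an LLM, it may also contain punctuation. A slightly more defensive variant is sketched below; `story_tag` is a hypothetical helper, and it assumes English titles (non-ASCII letters would be replaced by underscores).

```python
import re


def story_tag(title: str) -> str:
    """Turn a story title into a tag / folder name such as story_some_title."""
    # Collapse every run of characters other than a-z and 0-9 into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"story_{slug}"


print(story_tag("Swedish Adventure in Winter Wilderness"))
print(story_tag("Lost in Stockholm: Part 2!"))
```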


The LLM goes well beyond the vocab found in the flashcards: in the run below, the flashcards cover only 27.0% of the story's verbs and 44.2% of its other vocabulary.

In [9]:
vocab_dict_story = get_vocab_dict_from_dialogue(story_dialogue, limit_story_parts=None)
vocab_overlap = find_missing_vocabulary(vocab_dict_flashcards, vocab_dict_story)

=== VOCABULARY COVERAGE ANALYSIS ===
Target verbs covered by flashcards: 27.0%
Target vocabulary covered by flashcards: 44.2%

Verbs needing new flashcards:
['think', "'ve", 'get', 'freeze', 'let'] ...

Vocabulary needing new flashcards:
['an', 'back', 'really', 'beautiful', 'great'] ...
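The percentages above are a set comparison between flashcard vocab and story vocab. As a rough sketch, assuming each side is just a collection of words (the project's vocab dictionaries may well carry more structure), the calculation is:

```python
def coverage(source_words, target_words):
    """Return (percent of target words found in source, sorted missing words)."""
    target = set(target_words)
    if not target:
        return 100.0, []  # nothing to cover counts as fully covered
    missing = sorted(target - set(source_words))
    covered_pct = 100.0 * (len(target) - len(missing)) / len(target)
    return covered_pct, missing


pct, missing = coverage(["go", "see", "eat"], ["go", "think", "get", "eat"])
print(f"{pct:.1f}% covered; missing: {missing}")
```

The function name `coverage` and the sample words are illustrative only.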


Let's retrieve flashcards we know that better fit the story vocab

In [15]:
from src.nlp import get_matching_flashcards_indexed

# Let's pull all the existing phrases we need to cover the vocab in our story
# remember we modified the index to only use flashcards we know
known_results = get_matching_flashcards_indexed(vocab_dict_story, known_index)
known_flashcards = [card.get('phrase') for card in known_results['selected_cards']]
print("Average knowledge: ", df.loc[df['EnglishText'].isin(known_flashcards)].knowledge_score.mean())
known_vocab_dict = get_vocab_dictionary_from_phrases(known_flashcards)
missing_vocab = find_missing_vocabulary(vocab_dict_source=known_vocab_dict, vocab_dict_target=vocab_dict_story)
missing_vocab_dict = missing_vocab["missing_vocab"]


Average knowledge:  nan
=== VOCABULARY COVERAGE ANALYSIS ===
Target verbs covered by flashcards: 0.0%
Target vocabulary covered by flashcards: 0.0%

Verbs needing new flashcards:
['think', "'ve", 'get', 'freeze', 'let'] ...

Vocabulary needing new flashcards:
['an', 'snowy', 'back', 'pine', 'great'] ...


Now supplement these with any remaining flashcards we don't yet know

In [None]:
# the known flashcards should cover part of the story vocab (see the cell above); draw whatever is still missing from the full index

additional_results = get_matching_flashcards_indexed(missing_vocab_dict, phrase_index)
additional_flashcards = [card.get('phrase') for card in additional_results['selected_cards']]
print(len(additional_flashcards))

all_flashcards = additional_flashcards + known_flashcards
all_flashcards_vocab_dict = get_vocab_dictionary_from_phrases(all_flashcards)
final_missing_vocab = find_missing_vocabulary(all_flashcards_vocab_dict, vocab_dict_story)
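Choosing a small set of flashcards that covers a vocabulary list is a set-cover problem. The sketch below uses a greedy heuristic over a word -> phrase-position index. This is an assumption about the kind of selection `get_matching_flashcards_indexed` performs, not a description of its actual algorithm, and `greedy_cover` is a hypothetical name.

```python
def greedy_cover(target_words, word_index):
    """Pick phrase positions that together cover every coverable target word."""
    coverable = {word for word in target_words if word_index.get(word)}
    uncovered = sorted(set(target_words) - coverable)  # no phrase has these words

    # Invert word -> positions into position -> target words it would cover.
    words_by_phrase = {}
    for word in coverable:
        for position in word_index[word]:
            words_by_phrase.setdefault(position, set()).add(word)

    selected, remaining = [], set(coverable)
    while remaining:
        # Take the phrase covering the most still-missing words (ties: lowest position).
        best = max(sorted(words_by_phrase), key=lambda p: len(words_by_phrase[p] & remaining))
        selected.append(best)
        remaining -= words_by_phrase[best]
    return selected, uncovered


index = {"snow": [0, 2], "pine": [2], "lake": [1]}
print(greedy_cover(["snow", "pine", "lake", "fog"], index))
```

Greedy selection is not guaranteed to find the smallest possible set, but it is simple and usually close, which matters when each extra flashcard is something the learner has to study.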


In [None]:
print(f"We need {len(all_flashcards)} flashcards to cover the story")

## Generate the story files
Once you are happy with the flashcard coverage, you can:
* translate and add audio
* create the story images
* create the story album files (M4A files with synced lyrics)
* create the story HTML file using those previous files, and upload to Google Cloud Storage
* tag the flashcards with the story name, so that you can link to the story from within Anki (the template uses tags to auto-create hyperlinks)

In [16]:
from src.generate import add_audio, add_translations

story_dialogue_audio = add_translations(story_dialogue)
story_dialogue_audio = add_audio(story_dialogue_audio)

adding translations:   0%|          | 0/3 [00:00<?, ?it/s]

Beginning translation for introduction


adding translations:  33%|███▎      | 1/3 [00:01<00:03,  2.00s/it]

Translated dialogue
Beginning translation for development


adding translations:  67%|██████▋   | 2/3 [00:03<00:01,  1.96s/it]

Translated dialogue
Beginning translation for resolution


adding translations: 100%|██████████| 3/3 [00:05<00:00,  1.97s/it]


Translated dialogue


adding audio:   0%|          | 0/3 [00:00<?, ?it/s]

Beginning text-to-speech for introduction


Generating dialogue audio: 100%|██████████| 6/6 [00:18<00:00,  3.02s/it]
adding audio:  33%|███▎      | 1/3 [00:33<01:06, 33.16s/it]

Text-to-speech for dialogue done
Beginning text-to-speech for development


Generating dialogue audio: 100%|██████████| 7/7 [00:13<00:00,  1.98s/it]
adding audio:  67%|██████▋   | 2/3 [00:47<00:22, 22.01s/it]

Text-to-speech for dialogue done
Beginning text-to-speech for resolution


Generating dialogue audio: 100%|██████████| 6/6 [00:11<00:00,  1.98s/it]
adding audio: 100%|██████████| 3/3 [00:59<00:00, 19.84s/it]

Text-to-speech for dialogue done





In [17]:
# this now contains target-language content, so we save it in the language directory
save_pickle(data=story_dialogue_audio, file_path=story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.pkl")
#story_dialogue_audio = load_pickle(story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.pkl")

Image files for each part of the story:

In [20]:
from src.images import generate_and_save_story_images
    
image_data = generate_and_save_story_images(story_dict=story_dialogue_audio, output_dir = story_dir / clean_story_name, story_name=clean_story_name)


Generating story images:   0%|          | 0/3 [00:00<?, ?it/s]

Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 15/15 [00:15<00:00,  1.01s/it]
Generating story images:  33%|███▎      | 1/3 [00:50<01:41, 50.66s/it]

Generated image with imagen using prompt: Looking out over a snow-covered Swedish lakeside from the window of a cozy wooden resort cabin, with distant pine forests and mountains, warm golden light from inside contrasting with the crisp blue winter twilight outside, and a few bundled-up figures visible in the distance enjoying the serene landscape in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for introduction


Waiting for API cooldown: 100%|██████████████| 8/8 [00:08<00:00,  1.01s/it]


Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 16/16 [00:16<00:00,  1.00s/it]
Generating story images:  67%|██████▋   | 2/3 [01:48<00:55, 55.03s/it]

Generated image with imagen using prompt: A fog-shrouded pine forest trail in winter, with a small wooden cabin barely visible through the thick mist; the atmosphere is eerie and disorienting, with a sense of isolation and uncertainty pervading the cold, damp air in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for development


Waiting for API cooldown: 100%|██████████████| 8/8 [00:08<00:00,  1.00s/it]


Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 16/16 [00:16<00:00,  1.02s/it]
Generating story images: 100%|██████████| 3/3 [02:47<00:00, 55.86s/it]

Generated image with imagen using prompt: View from a cozy resort restaurant in Sweden, evening light casting a warm glow on rustic wooden tables set with traditional meatballs and lingonberry sauce, large windows showcasing snow-capped mountains in the distance, a crackling fireplace adding to the comfortable atmosphere in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for resolution





Next we create M4A audio files, which you can download and play in a media player.
They have synced lyrics, which can be viewed in the Oto Music Player app.

In [21]:
from PIL import Image
from src.story import create_album_files, generate_index_html

FIRST_STORY_PART = list(image_data.keys())[0]
#may need to change depending on size of story made and what parts there are
album_image = Image.open(story_dir / clean_story_name / f"{clean_story_name}_{FIRST_STORY_PART}.png")
#create m4a file:
create_album_files(story_data_dict=story_dialogue_audio, cover_image=album_image, output_dir=story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME, story_name=clean_story_name)

creating album:  33%|███▎      | 1/3 [00:03<00:06,  3.02s/it]

Saved M4A file track number 1


creating album:  67%|██████▋   | 2/3 [00:06<00:03,  3.42s/it]

Saved M4A file track number 2


creating album: 100%|██████████| 3/3 [00:09<00:00,  3.24s/it]

Saved M4A file track number 3
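Synced lyrics are commonly written as LRC-style lines, where each line of text is prefixed with the time at which it starts. How `create_album_files` stores its lyrics is not shown here, so the following is only a sketch under that assumption: it turns (text, duration) pairs into `[mm:ss.xx]` timestamped lines. `lrc_lines` and the sample durations are hypothetical.

```python
def lrc_lines(segments):
    """segments: list of (text, duration_in_seconds) in playback order."""
    lines, start = [], 0.0
    for text, duration in segments:
        minutes, seconds = divmod(start, 60)
        # [mm:ss.xx] is the usual LRC timestamp layout
        lines.append(f"[{int(minutes):02d}:{seconds:05.2f}]{text}")
        start += duration  # the next line starts when this one ends
    return lines


for line in lrc_lines([("Hej!", 1.5), ("Hur mår du?", 62.25), ("Bra.", 1.0)]):
    print(line)
```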





Now we generate the main HTML file. It embeds the M4A files and image files, so it is self-contained.

In [22]:
from src.story import create_html_story

create_html_story(
    story_data_dict=story_dialogue_audio,
    image_dir=story_dir / clean_story_name,  # the language sub-folders will be picked up automatically
    story_name=clean_story_name,
)

Preparing HTML data: 100%|██████████| 3/3 [04:12<00:00, 84.09s/it] 


HTML story created at: y:\Python Scripts\audio-language-trainer\outputs\stories\story_swedish_adventure_in_winter_wilderness\Swedish\story_swedish_adventure_in_winter_wilderness.html
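One common way to make an HTML file self-contained is to embed binary assets as base64 data URIs. Whether `create_html_story` does exactly this is an assumption; the sketch below only shows the general technique, with hypothetical helper names.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data: URI that can be placed in src attributes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def audio_tag(m4a_bytes: bytes) -> str:
    """Return an <audio> element with the M4A content embedded inline."""
    uri = to_data_uri(m4a_bytes, "audio/mp4")
    return f'<audio controls src="{uri}"></audio>'


print(to_data_uri(b"hi", "text/plain"))
```

Base64 inflates the payload by roughly a third, which is the price of having a single file to upload and share.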


Upload to a public Google Cloud Storage bucket

In [23]:
html_story_path = story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.html"
assert html_story_path.exists()
upload_story_to_gcs(html_file_path=html_story_path)

'https://storage.googleapis.com/audio-language-trainer-stories/swedish/story_swedish_adventure_in_winter_wilderness/story_swedish_adventure_in_winter_wilderness.html'

Now update and re-upload our index.html, which lets users navigate all the stories

In [24]:
generate_index_html()
#will default to public GCS bucket
upload_to_gcs(
    file_path="../outputs/stories/index.html",
    content_type="text/html"
)


'https://storage.googleapis.com/audio-language-trainer-stories/index.html'

## Linking stories to flash cards
We will use the Anki tag feature. Given a list of English phrases that are required to understand a story, we can tag each of those phrases within a specific Anki deck.

The card template turns any tag starting with story_ into a hyperlink to the public Google Cloud Storage bucket.
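The tag-to-link logic can be sketched as follows. The real card template runs inside Anki rather than in Python, so this only illustrates the idea: Anki exposes a note's tags as a space-separated string, and each tag with the story_ prefix is turned into an anchor. `story_links` and its parameters are hypothetical.

```python
from html import escape


def story_links(tags: str, base_url: str, language: str) -> list:
    """Return one HTML anchor per story_ tag in a space-separated tag string."""
    links = []
    for tag in tags.split():
        if tag.startswith("story_"):
            url = f"{base_url}/{language.lower()}/{tag}/{tag}.html"
            links.append(f'<a href="{escape(url)}">{escape(tag)}</a>')
    return links


base = "https://storage.googleapis.com/audio-language-trainer-stories"
print(story_links("verbs story_some_title", base, "Swedish"))
```

Tags without the prefix are ignored, so ordinary organisational tags can coexist with story tags on the same note.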

In [25]:
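# requires all_flashcards from the earlier cell; run that cell first, otherwise this raises a NameError
# sometimes this needs running twice...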
from src.anki_tools import add_tag_to_matching_notes

updates, errors = add_tag_to_matching_notes(
    deck_name=DECK_NAME,
    phrases=all_flashcards,
    tag=clean_story_name
)

print(f"Updated {updates} notes")
if errors:
    print("Errors encountered:")
    for error in errors:
        print(f"- {error}")

NameError: name 'all_flashcards' is not defined

In [None]:
df_deck = get_deck_contents(DECK_NAME)
df_deck.query("tags == @clean_story_name").shape

In [None]:
#we should know most of the vocab...
df_deck.query("tags == @clean_story_name").knowledge_score.hist()