# Story Generation
We remember things better as stories. The plan here is to pick a subset of our phrases, extract the vocabulary, and generate a story based on it. We can then pull in more flashcards / phrases to ensure more complete phrase coverage.

The story name will be of the form story_some_title. When added as a tag in Anki, this creates a hyperlink to a Google Cloud bucket using the path format bucket/language/story_name/story_name.html

This makes it easy to add new stories to an existing flashcard deck, and the links update as soon as you add the tags.
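
As a rough illustration of the link format described above, the sketch below builds a story URL from a tag. The `story_url` helper and its parameters are hypothetical and not part of the project code; the base URL and bucket name are copied from the upload output shown later in this notebook, and lowercasing the language is an assumption based on that output.

```python
def story_url(bucket: str, language: str, story_name: str) -> str:
    """Build the public URL for a story's HTML page (hypothetical helper)."""
    # Assumed layout: bucket/language/story_name/story_name.html
    return (
        f"https://storage.googleapis.com/{bucket}/"
        f"{language.lower()}/{story_name}/{story_name}.html"
    )


print(story_url("audio-language-trainer-stories", "Swedish", "story_rainy_football_match"))
```

Because the URL depends only on the tag and the language, adding a tag to a note is enough to give that note a working link.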

In [1]:
%load_ext autoreload
%autoreload 2
from dotenv import load_dotenv

load_dotenv()

PAY_FOR_API = True  # set to True to run cells that cost money via API calls
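
The cells below do not show how the flag is consumed, so here is one possible way to gate a paid call on it. This is a minimal sketch: `run_if_paid` and `fake_paid_call` are hypothetical names invented for the example, not functions from the project.

```python
def run_if_paid(enabled, func, *args, **kwargs):
    """Call func only when enabled is True; otherwise skip it and return None."""
    if not enabled:
        print(f"Skipping {func.__name__}: paid API calls are disabled")
        return None
    return func(*args, **kwargs)


def fake_paid_call(text):
    # Stand-in for a real API call, used only for this example
    return text.upper()


print(run_if_paid(False, fake_paid_call, "hello"))
print(run_if_paid(True, fake_paid_call, "hello"))
```

In the notebook you would pass `PAY_FOR_API` as the first argument.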

In [None]:
import random
from pathlib import Path
from pprint import pprint

from src.anki_tools import AnkiCollectionReader, get_deck_contents
from src.config_loader import config
from src.nlp import (
    create_flashcard_index,
    find_missing_vocabulary,
    get_vocab_dict_from_dialogue,
    get_vocab_dictionary_from_phrases,
    get_index_subset
)
from src.utils import (
    load_json,
    load_text_file,
    save_json,
    save_pickle,
    load_pickle,
    upload_story_to_gcs,
    upload_to_gcs,
)

from src.phrase import get_phrase_indices
from copy import deepcopy


### Add directories
Story images can be reused between languages, but audio files are language-specific. We therefore structure each story directory as story_name/language: audio files go in the language sub-directory, while images and the English JSON file live in the story_name directory.
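
The layout can be sketched with `pathlib` as follows. The `story_paths` helper is hypothetical, and the language folder name `Swedish` is only an example value; the JSON and HTML locations mirror the paths used later in this notebook.

```python
import tempfile
from pathlib import Path


def story_paths(story_dir: Path, story_name: str, language: str) -> dict:
    """Return the assumed locations of a story's files (hypothetical helper)."""
    base = story_dir / story_name  # shared between languages: images, English JSON
    lang_dir = base / language     # language-specific: audio, HTML
    return {
        "json": base / f"{story_name}.json",
        "language_dir": lang_dir,
        "html": lang_dir / f"{story_name}.html",
    }


with tempfile.TemporaryDirectory() as tmp:
    paths = story_paths(Path(tmp), "story_rainy_football_match", "Swedish")
    paths["language_dir"].mkdir(parents=True, exist_ok=True)
    print(paths["json"].parent.name)  # the story directory
    print(paths["html"].parent.name)  # the language sub-directory
```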

In [3]:
notebook_dir = Path().absolute()  # This gives src/notebooks
phrase_dir = notebook_dir.parent / "data" / "phrases" #where we store text files of phrases
story_dir = notebook_dir.parent / "outputs" / "stories" # where we store our stories

We already have flashcards generated for some phrases.
A flashcard index lets us select flashcards that cover a specific vocabulary range. It is quite computationally expensive to build, and is generated
using create_flashcard_index

In [4]:
PHRASE_LIST_NAME = "longman_1000_phrases"
phrase_file = phrase_dir / f"{PHRASE_LIST_NAME}.txt"
phrases = load_text_file(phrase_file)
pprint(f"First few phrases {phrases[:10]}")



("First few phrases ['Do you want to become a famous writer?', 'Let me show "
 "you around the city', 'We need to handle this situation carefully', 'Stop "
 'wasting time on this\', \'Do you like playing the guitar at night?\', "I\'m '
 'taking a vacation next month", "Don\'t forget to wear a helmet while '
 'cycling", "Let\'s cut unnecessary expenses this year", "We\'re producing a '
 'new product soon", \'Did you remember to turn off the stove?\']')


## Create the flashcard index
This makes it very fast to find matching flashcards for a given vocab list

In [5]:
# long process, so only create the index if it doesn't already exist
index_file = phrase_dir / f"{PHRASE_LIST_NAME}_index.json"

if index_file.exists():
    phrase_index = load_json(index_file)
else:
    phrase_index = create_flashcard_index(phrases)
    save_json(data=phrase_index, file_path=index_file)
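
To show why an index makes lookups fast, here is a toy inverted index mapping each word to the positions of the phrases that contain it. This is only a sketch under simplifying assumptions: it splits on whitespace, whereas the real create_flashcard_index presumably does proper linguistic processing and keeps separate verb and vocab indexes. The key names `phrases` and `vocab_index` appear in this notebook, but the value shapes used here are assumed.

```python
from collections import defaultdict


def build_toy_index(phrases):
    """Map each lowercased word to the sorted positions of phrases containing it."""
    word_index = defaultdict(set)
    for position, phrase in enumerate(phrases):
        for raw_word in phrase.lower().split():
            word = raw_word.strip("?.,!'\"")  # drop surrounding punctuation
            if word:
                word_index[word].add(position)
    return {
        "phrases": list(phrases),
        "vocab_index": {w: sorted(p) for w, p in word_index.items()},
    }


toy = build_toy_index(["Stop wasting time on this", "Do you have time?"])
print(toy["vocab_index"]["time"])  # [0, 1]
```

Finding every phrase that contains a word is then a single dictionary lookup rather than a scan over all phrases.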



## Sample some phrases to generate the story from
This will pin the story to the vocab found in some pre-existing phrases

In [6]:
# We can obtain phrases we already know to create a story from.
# NOTE: you must close Anki Desktop before opening a connection here
with AnkiCollectionReader() as reader:
    pprint(reader.get_deck_names())

# This prints deck_id: deck_name pairs; copy the relevant deck_name

{1: 'Default',
 1731524665442: 'Swedish EAL',
 1732020971325: 'RapidRetention - Swedish::LM1000',
 1732637740663: 'RapidRetention - Welsh - LM1000',
 1734260227418: 'RapidRetention - Swedish::NumbersDays',
 1734261644938: 'RapidRetention - Russian::NumbersDays',
 1734264578929: 'RapidRetention - Italian - NumbersDays',
 1734426251278: 'RapidRetention - French - NumbersDays',
 1735660105659: 'RapidRetention - Swedish::EatingOut',
 1735684325990: 'RapidRetention - Czech - EatingOut',
 1735687489606: 'RapidRetention - Welsh - EatingOut',
 1735998451053: 'RapidRetention - Italian - EatingOut',
 1737410923009: 'RapidRetention - Swedish::LanguageMeetUp',
 1737964362773: 'RapidRetention - Swedish',
 1739041921475: 'RapidRetention - Russian::EatingOut',
 1739041939923: 'RapidRetention - Russian',
 1739715099478: 'RapidRetention - Spanish',
 1739715099479: 'RapidRetention - Spanish::EatingOut',
 1741030244238: 'RapidRetention - Chinese',
 1741030244239: 'RapidRetention - Chinese::EatingOut'}


## RESTART here to refresh the list of phrases without tags

In [297]:
DECK_NAME = "RapidRetention - Swedish::LM1000"
df = get_deck_contents(deck_name=DECK_NAME) #calculates knowledge score
df.head()

Unnamed: 0,note_id,model_name,tags,n_cards,avg_ease,total_reps,avg_reps,total_lapses,avg_lapses,avg_interval,TargetText,TargetAudio,TargetAudioSlow,EnglishText,WiktionaryLinks,Picture,TargetLanguageName,knowledge_score
0,1732020511348,Language Practice With Images,story_rainy_football_match,3,93.3,6,2.0,0,0.0,56.7,Var mer uppmärksam på detaljer,[sound:a821f020-5a84-44fb-af42-c6c4133e4379.mp3],[sound:3d6997c2-92c7-43c9-b31e-a20cc4f0bf9e.mp3],Pay more attention to details,"<a href=""https://en.wiktionary.org/wiki/var#Sw...","<img src=""f7153993-cfee-40f4-841c-1bd6cfaeb5a9...",Swedish,0.394
1,1732020511352,Language Practice With Images,story_unexpected_wedding_guests,3,280.0,16,5.3,0,0.0,166.3,Kommer kunden att känna igen mig?,[sound:fa15b936-ef4e-44d5-932d-e94a6b477c9d.mp3],[sound:79a26555-55ef-43d6-a9ca-6ee02ade7721.mp3],Will the customer recognize me?,"<a href=""https://en.wiktionary.org/wiki/kommer...","<img src=""cd5e83e4-813e-4097-962a-cf536f866e99...",Swedish,0.517
2,1732020511356,Language Practice With Images,story_birthday_party_planning_mishap,3,83.3,5,1.7,0,0.0,24.0,Vänligen svara ärligt på alla frågor,[sound:f54483fc-3303-427d-acef-6710ae244bc9.mp3],[sound:41c55da5-347a-4d0b-b6c8-40b27b46c1b9.mp3],Please answer all questions honestly,"<a href=""https://en.wiktionary.org/wiki/v%C3%A...","<img src=""1a6640ec-fe2d-4808-b638-33a38b694224...",Swedish,0.342
3,1732020511360,Language Practice With Images,story_fishing_trip_gone_awry,3,186.7,6,2.0,0,0.0,54.0,Sluta slösa tid på detta,[sound:b34a331b-6dd9-44a2-a588-f92be1b11d06.mp3],[sound:a1a6714c-9bfd-4cd7-b30e-1a209c0ab42b.mp3],Stop wasting time on this,"<a href=""https://en.wiktionary.org/wiki/sluta#...","<img src=""8de5a6d3-b10c-45ea-860c-512cc2673be7...",Swedish,0.389
4,1732020511364,Language Practice With Images,story_workplace_stress_vacation,3,270.0,13,4.3,0,0.0,129.3,Vi producerar en ny produkt snart,[sound:10281f18-fddc-4a69-ab7b-6098f63b948f.mp3],[sound:0e472f07-e35b-4ab0-9161-e0a1474c6e34.mp3],We're producing a new product soon,"<a href=""https://en.wiktionary.org/wiki/vi#Swe...","<img src=""329bfcb3-7cb4-4174-923f-567a1bfe7ec9...",Swedish,0.475


In [298]:
print(f"""{df.query("tags == ''").shape[0]} phrases left""")

8 phrases left


# We want every phrase to end up assigned to a story (via tags)
So we create an untagged index: an index of flashcards that do not have a tag. We will use these to link to story vocabulary.

In [289]:

phrases_with_tags = df.query("tags != ''")['EnglishText'].tolist()
phrases_without_tags = df.query("tags == ''")['EnglishText'].tolist()
# how many words are yet to be assigned to a story?
available_vocab = get_vocab_dictionary_from_phrases(phrases_without_tags)
print(len(available_vocab['verbs']),  len(available_vocab['vocab']))

# we need to know the position of each phrase, as an integer, in the phrase_index
phrases_without_tags_indices = get_phrase_indices(known_phrases=phrases_without_tags, all_phrases=phrase_index['phrases'])

# phrases already linked to a story should not be retrieved from the index and linked to another story
untagged_index = deepcopy(phrase_index)
untagged_index['verb_index'] = get_index_subset(phrases_without_tags_indices, untagged_index['verb_index'])
untagged_index['vocab_index'] = get_index_subset(phrases_without_tags_indices, untagged_index['vocab_index'])

36 65
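
The idea of restricting an index to untagged phrases can be sketched as below. This is not the project's get_index_subset; it assumes each index maps a word to a list of phrase positions, which the notebook does not confirm.

```python
def subset_index(allowed_positions, index):
    """Keep only phrase positions in allowed_positions; drop words left with none."""
    allowed = set(allowed_positions)
    subset = {}
    for word, positions in index.items():
        kept = [p for p in positions if p in allowed]
        if kept:
            subset[word] = kept
    return subset


vocab_index = {"time": [0, 1], "stop": [0], "you": [1, 2]}
print(subset_index([1, 2], vocab_index))  # {'time': [1], 'you': [1, 2]}
```

Here phrase 0 is treated as already tagged, so "stop" disappears from the subset entirely.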


## If generating a new story - randomly sample some new phrases

We want to sample from phrases that have no tags

In [290]:
sampled_phrases = random.sample(phrases_without_tags, min(100, len(phrases_without_tags)))

#or use sampled_phrases
vocab_dict_flashcards = get_vocab_dictionary_from_phrases(sampled_phrases)  # up to 100 phrases should give a decent amount of vocab

Now generate the story

In [291]:
from src.dialogue_generation import generate_story

story_name, story_dialogue = generate_story(vocab_dict_flashcards)


Function that called this one: generate_story. Sleeping for 20 seconds
generated story: Unexpected Power Outage


## If using a pre-generated story that we want to assign tags to
Then overwrite the story name and load the JSON dialogue file

In [292]:
#story_name = "unexpected_music_project"
clean_story_name = f"story_{story_name.lower().replace(' ', '_')}"

story_path = story_dir / clean_story_name / f"{clean_story_name}.json"

#story_dialogue = load_json(story_path)
save_json(story_dialogue, story_path)
print(f"saved {clean_story_name} to {story_path}")

vocab_dict_story = get_vocab_dict_from_dialogue(story_dialogue, limit_story_parts=None)

saved story_unexpected_power_outage to y:\Python Scripts\audio-language-trainer\outputs\stories\story_unexpected_power_outage\story_unexpected_power_outage.json


Let's retrieve flashcards we know that better fit the story vocab

In [293]:
from src.nlp import get_matching_flashcards_indexed

# Let's find the minimal set of flashcards that we need to learn for the story
candidate_flashcards = get_matching_flashcards_indexed(vocab_dict_story, untagged_index)
candidate_phrases = [card.get('phrase') for card in candidate_flashcards['selected_cards']]
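
How get_matching_flashcards_indexed chooses cards is not shown here. One common way to approximate a minimal covering set is a greedy set cover, sketched below on a toy word-to-positions index. Treat this as an illustration of the general technique rather than the project's implementation; greedy selection is not guaranteed to find the true minimum.

```python
def greedy_cover(target_words, index):
    """Greedily pick phrase positions until no phrase adds new target words.

    Returns (selected_positions, uncovered_words).
    """
    remaining = set(target_words)
    # Invert word -> positions into position -> target words it contains
    words_by_phrase = {}
    for word in remaining:
        for position in index.get(word, []):
            words_by_phrase.setdefault(position, set()).add(word)

    selected = []
    while remaining and words_by_phrase:
        # Most new words first; ties go to the lowest position for determinism
        best = max(
            words_by_phrase,
            key=lambda p: (len(words_by_phrase[p] & remaining), -p),
        )
        gained = words_by_phrase.pop(best) & remaining
        if not gained:
            break
        selected.append(best)
        remaining -= gained
    return selected, remaining


toy_index = {"light": [0, 2], "building": [2], "find": [1], "know": [1]}
selected, uncovered = greedy_cover(
    ["light", "building", "find", "know", "flicker"], toy_index
)
print(selected, uncovered)  # [1, 2] {'flicker'}
```

Words left in the second return value are the ones that would need new flashcards.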


We can check the coverage below. We want stories to stretch learners, so roughly 70% is fine.

In [294]:

known_vocab_dict = get_vocab_dictionary_from_phrases(candidate_phrases)
missing_vocab = find_missing_vocabulary(vocab_dict_source=known_vocab_dict, vocab_dict_target=vocab_dict_story)
missing_vocab_dict = missing_vocab["missing_vocab"]


=== VOCABULARY COVERAGE ANALYSIS ===
Target verbs covered by flashcards: 15.6%
Target vocabulary covered by flashcards: 26.4%

Verbs needing new flashcards:
['find', 'know', 'flicker', "'ve", 'make'] ...

Vocabulary needing new flashcards:
['my', 'introduction', 'light', 'building', 'some'] ...
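
The percentages above can be read as the share of story words that the selected flashcards already contain. A minimal sketch of that calculation with plain sets follows. It assumes words are compared as simple strings, which may differ from how find_missing_vocabulary represents vocabulary.

```python
def coverage_percent(known_words, target_words):
    """Percentage of target words that also appear in known_words."""
    target = set(target_words)
    if not target:
        return 100.0  # nothing to cover
    return 100.0 * len(target & set(known_words)) / len(target)


known = {"find", "light"}
story = {"find", "know", "light", "make"}
print(round(coverage_percent(known, story), 1))  # 50.0
print(sorted(story - known))  # ['know', 'make']
```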


Now supplement these with any remaining flashcards we don't yet know

# Add tags to the flashcard deck

In [295]:
print(f"We are going to add '{clean_story_name}' tag to {len(candidate_phrases)} phrases within '{DECK_NAME}'")

We are going to add 'story_unexpected_power_outage' tag to 17 phrases within 'RapidRetention - Swedish::LM1000'


In [296]:
#sometimes this needs running twice...
from src.anki_tools import add_tag_to_matching_notes

updates, errors = add_tag_to_matching_notes(
    deck_name=DECK_NAME,
    phrases=candidate_phrases,
    tag=clean_story_name
)

print(f"Updated {updates} notes")
if errors:
    print("Errors encountered:")
    for error in errors:
        print(f"- {error}")

audio-language-trainer\src\anki_tools.py:217:save() is deprecated: saving is automatic
Updated 17 notes


## Repeat
We can now update phrases without tags at the top of this notebook and generate another story

## Generate the story files
Once you are happy with the flashcard coverage, you can:
* translate and add audio
* create the story images
* create the story album files (M4A files with synced lyrics)
* create the story HTML file using those previous files, and upload to Google Cloud Storage
* tag the flashcards with the story name. You can then link to the story from within Anki (the template uses tags to auto-create hyperlinks)

In [304]:
str(df['TargetText'].sample().values[0])

'Han kan bli av med jobbet om han är sen'

In [321]:
from src.config_loader import config

In [322]:
config._load_config()

setting voice override: sv-SE-SofieNeural
setting voice override: sv-SE-MattiasNeural


In [320]:
config.get_voice_models()

(VoiceInfo(name='en-GB-Studio-B', provider=<VoiceProvider.GOOGLE: 'google'>, voice_type=<VoiceType.STUDIO: 'studio'>, gender='MALE', language_code='en-GB', country_code='GB', voice_id='en-GB-Studio-B'),
 VoiceInfo(name='Hillevi', provider=<VoiceProvider.AZURE: 'azure'>, voice_type=<VoiceType.NEURAL: 'neural'>, gender='FEMALE', language_code='sv-SE', country_code='SE', voice_id='sv-SE-HilleviNeural'),
 VoiceInfo(name='Mattias', provider=<VoiceProvider.AZURE: 'azure'>, voice_type=<VoiceType.NEURAL: 'neural'>, gender='MALE', language_code='sv-SE', country_code='SE', voice_id='sv-SE-MattiasNeural'))

In [324]:
from src.audio_generation import text_to_speech

text_to_speech("Hej! Hur mår du idag? Jag hoppas att allt är bra med dig. Det är en vacker dag ute, och jag tänkte ta en promenad i parken senare.", config_language="target", gender="MALE")

In [325]:
clean_story_name = "story_midnight_garden_mystery"
# story_workplace_stress_vacation 
# story_unexpected_music_project
# story_rainy_football_match 
# story_unexpected_train_adventure 
# story_unexpected_marathon_adventure
# story_sunset_wedding_blues 
# story_unexpected_wedding_guests 
# story_unexpected_career_change
# story_unexpected_coffee_adventure
# story_unexpected_movie_adventure
# story_surprise_hospital_adventure
# story_unexpected_power_outage

print(f"About to generate {clean_story_name}")

About to generate story_midnight_garden_mystery


In [326]:
story_path = story_dir / clean_story_name / f"{clean_story_name}.json"
story_dialogue = load_json(story_path)

In [327]:
from src.generate import add_audio, add_translations

story_dialogue_audio = add_translations(story_dialogue)
story_dialogue_audio = add_audio(story_dialogue_audio)

adding translations:   0%|          | 0/3 [00:00<?, ?it/s]

Beginning translation for introduction


adding translations:  33%|███▎      | 1/3 [00:01<00:03,  1.98s/it]

Translated dialogue
Beginning translation for development


adding translations:  67%|██████▋   | 2/3 [00:03<00:01,  1.94s/it]

Translated dialogue
Beginning translation for resolution


adding translations: 100%|██████████| 3/3 [00:05<00:00,  1.97s/it]


Translated dialogue


adding audio:   0%|          | 0/3 [00:00<?, ?it/s]

Beginning text-to-speech for introduction


Generating dialogue audio: 100%|██████████| 8/8 [00:08<00:00,  1.06s/it]
adding audio:  33%|███▎      | 1/3 [00:23<00:47, 23.89s/it]

Text-to-speech for dialogue done
Beginning text-to-speech for development


Generating dialogue audio: 100%|██████████| 10/10 [00:12<00:00,  1.23s/it]
adding audio:  67%|██████▋   | 2/3 [00:36<00:17, 17.29s/it]

Text-to-speech for dialogue done
Beginning text-to-speech for resolution


Generating dialogue audio: 100%|██████████| 9/9 [00:10<00:00,  1.13s/it]
adding audio: 100%|██████████| 3/3 [00:47<00:00, 15.70s/it]

Text-to-speech for dialogue done





In [328]:
# this now contains target-language content, so we save it in the language dir
save_pickle(data=story_dialogue_audio, file_path=story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.pkl")
#story_dialogue_audio = load_pickle(story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.pkl")

Image files for each part of the story:

In [329]:
from src.images import generate_and_save_story_images

image_data = generate_and_save_story_images(story_dict=story_dialogue_audio, output_dir=story_dir / clean_story_name, story_name=clean_story_name)


Generating story images:   0%|          | 0/3 [00:00<?, ?it/s]

Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 15/15 [00:15<00:00,  1.01s/it]


No image generated using imagen-3.0-generate-001 with prompt: Interior of a bright, welcoming community center lobby with colorful posters advertising music classes, a reception desk with flyers, and glimpses of practice rooms through open doors, sunlight streaming through large windows, creating a warm and inviting atmosphere in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures


Waiting for API cooldown: 100%|██████████████| 8/8 [00:08<00:00,  1.01s/it]
Generating story images:  33%|███▎      | 1/3 [01:02<02:04, 62.24s/it]

Generated image with deepai using prompt: Interior of a bright, welcoming community center lobby with colorful posters advertising music classes, a reception desk with flyers, and glimpses of practice rooms through open doors, sunlight streaming through large windows, creating a warm and inviting atmosphere in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for introduction


Waiting for API cooldown: 100%|████████████| 15/15 [00:15<00:00,  1.01s/it]


Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 15/15 [00:15<00:00,  1.01s/it]
Generating story images:  67%|██████▋   | 2/3 [02:07<01:03, 63.74s/it]

Generated image with imagen using prompt: Late evening view of a dimly lit dorm room or small apartment, desk cluttered with school materials and open textbooks, a laptop glowing in the foreground, scattered sticky notes on a corkboard, and a window revealing city lights in the distance, creating an atmosphere of urgency and academic stress in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for development


Waiting for API cooldown: 100%|██████████████| 8/8 [00:08<00:00,  1.01s/it]


Function that called this one: create_image_generation_prompt_for_story_part. Sleeping for 20 seconds


Waiting for API cooldown: 100%|████████████| 19/19 [00:19<00:00,  1.01s/it]
Waiting for API cooldown: 100%|████████████| 16/16 [00:16<00:00,  1.01s/it]
Generating story images: 100%|██████████| 3/3 [03:05<00:00, 61.80s/it]

Generated image with imagen using prompt: View of a cheerful classroom after school hours, soft afternoon light filtering through colorful curtains, children's artwork and French vocabulary posters adorning the walls, scattered musical instruments and rain-making props on desks, an atmosphere of accomplishment and creativity lingering in the air in the style of Studio Ghibli art style, soft atmospheric colors, detailed backgrounds, gentle gradients, natural elements, dreamy lighting, painted textures
Successfully generated and saved image for resolution





Next we create M4A audio files, which you can download and play in a media player.
They have synced lyrics, which can be viewed in the Oto Music Player app.
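
Synced lyrics pair each line of text with a start time. One widely used plain-text representation is the LRC format, with lines such as `[mm:ss.xx]text`. The sketch below formats timed lines that way; whether create_album_files uses LRC or another representation is not shown in this notebook, so treat `to_lrc` as a hypothetical illustration of the concept.

```python
def to_lrc(timed_lines):
    """Format (start_seconds, text) pairs as LRC-style synced lyric lines."""
    lines = []
    for start, text in timed_lines:
        # Work in whole centiseconds to avoid rounding a value up to 60 seconds
        total_cs = round(start * 100)
        minutes, rest = divmod(total_cs, 6000)
        seconds, centis = divmod(rest, 100)
        lines.append(f"[{minutes:02d}:{seconds:02d}.{centis:02d}]{text}")
    return "\n".join(lines)


print(to_lrc([(0.0, "Hej!"), (2.5, "Hur mår du idag?"), (65.25, "Hej då")]))
```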

In [330]:
from PIL import Image
from src.story import create_album_files, generate_index_html

# use the first story part's image as the album cover;
# this may need changing depending on the size of the story and which parts it has
FIRST_STORY_PART = list(story_dialogue_audio.keys())[0]
album_image = Image.open(story_dir / clean_story_name / f"{clean_story_name}_{FIRST_STORY_PART}.png")
# create the M4A files:
create_album_files(story_data_dict=story_dialogue_audio, cover_image=album_image, output_dir=story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME, story_name=clean_story_name)

creating album:  33%|███▎      | 1/3 [00:01<00:03,  1.96s/it]

Saved M4A file track number 1


creating album:  67%|██████▋   | 2/3 [00:04<00:02,  2.53s/it]

Saved M4A file track number 2


creating album: 100%|██████████| 3/3 [00:07<00:00,  2.53s/it]

Saved M4A file track number 3





Now we generate the main HTML file. This wraps up the M4A files and image files within it, so it's self-contained.
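
A common way to make an HTML file self-contained is to inline binary assets as base64 data URIs. The sketch below shows that technique with a hypothetical `to_data_uri` helper and placeholder bytes. It is an assumption about how the bundling could work, not a description of create_html_story.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URI that can be used in src attributes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for the contents of a real M4A file
audio_uri = to_data_uri(b"fake-audio-bytes", "audio/mp4")
audio_tag = f'<audio controls src="{audio_uri}"></audio>'
print(audio_tag[:40])
```

The trade-off is file size: base64 encoding makes the embedded data roughly a third larger than the original files.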

In [None]:
from src.story import create_html_story

create_html_story(
    story_data_dict=story_dialogue_audio,
    image_dir=story_dir / clean_story_name,  # the language sub-folders will be picked up automatically
    story_name=clean_story_name,
)

Preparing HTML data:  33%|███▎      | 1/3 [01:51<03:42, 111.37s/it]

Upload to a public Google Cloud bucket

In [None]:
html_story_path = story_dir / clean_story_name / config.TARGET_LANGUAGE_NAME / f"{clean_story_name}.html"
assert html_story_path.exists()
upload_story_to_gcs(html_file_path=html_story_path)

'https://storage.googleapis.com/audio-language-trainer-stories/swedish/story_birthday_party_planning_mishap/story_birthday_party_planning_mishap.html'

Now update and re-upload our index.html, which allows users to navigate all the stories

In [None]:
generate_index_html()
#will default to public GCS bucket
upload_to_gcs(
    file_path="../outputs/stories/index.html",
    content_type="text/html"
)


'https://storage.googleapis.com/audio-language-trainer-stories/index.html'