wikiaudify

Generate audio summaries of Wikipedia articles using OpenAI and ElevenLabs

By Hay Kranen

Introduction

This is a hackathon project, made during the Wikimedia NL Mini Hackathon 2024, that generates audio summaries like the ones from NotebookLM. It's obviously not as good as NotebookLM, but it produces short and quite enjoyable audio conversations.

Here's an example of a generated audio summary about the article on Grilled cheese.

grilled_cheese.mp4

And you can find a transcription here.

Install

What you'll need:

There is an option to use a local LLM (for example through Ollama). I didn't get very good results with it, but you could try to make it work!

To use this script:

  1. Clone this repo:

     git clone https://github.com/hay/wikiaudify.git

  2. Make a virtual environment and install the requirements.txt:

     python -m venv env
     source env/bin/activate
     pip install -r requirements.txt

  3. Copy the example-config.toml to a new file (e.g. test.toml) and fill in your API keys and other details.

  4. Try running it from the command line like this:

     python generate.py -a "Grilled_Cheese" -q "At what temperature will my cheese melt?" -c test.toml

Note that the Wikipedia article title you pass with the -a option should use underscores instead of spaces, i.e. the form that appears in the path of the article's URL.
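If you start from a human-readable title or a full article URL, a small helper can produce the underscore form. This is a minimal sketch and not part of wikiaudify: the function name `article_slug` is hypothetical, and it assumes URLs follow the usual `/wiki/<title>` layout.

```python
from urllib.parse import unquote, urlparse


def article_slug(title_or_url: str) -> str:
    """Return the underscore form of a Wikipedia title or article URL.

    Hypothetical helper for illustration; it is not part of generate.py.
    """
    text = title_or_url.strip()
    if text.startswith(("http://", "https://")):
        # Keep only what follows /wiki/ in the URL path and decode %xx escapes.
        path = urlparse(text).path
        text = unquote(path.split("/wiki/", 1)[-1])
    # Wikipedia URL paths use underscores where titles have spaces.
    return text.replace(" ", "_")


print(article_slug("Grilled cheese"))
print(article_slug("https://en.wikipedia.org/wiki/Grilled_cheese"))
```

The result can then be passed to the -a option.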

By default this will generate two files in the root of this project: a summary.mp3 containing the summary and a summary.txt with a transcription.
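Because the default output names are fixed, running the script for several articles in a row would overwrite earlier results. One way around that is to pass explicit paths with the -o and -ot options. The sketch below only builds and prints the commands; `build_command`, the `out` directory and the lowercase file names are assumptions for illustration, not something the script does itself.

```python
import shlex
import sys
from pathlib import Path


def build_command(article: str, config: str, out_dir: str = "out") -> list[str]:
    """Build a generate.py command line with per-article output paths.

    Hypothetical helper; only the -a, -c, -o and -ot flags come from the script.
    """
    name = article.lower()
    return [
        sys.executable, "generate.py",
        "-a", article,
        "-c", config,
        "-o", str(Path(out_dir) / f"{name}.mp3"),
        "-ot", str(Path(out_dir) / f"{name}.txt"),
    ]


for article in ["Grilled_Cheese", "Fondue"]:
    # Print each command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(build_command(article, "test.toml")))
```

The output directory has to exist before the commands are run, unless the script creates it, which this sketch does not assume.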

Troubleshooting

If you add the -v (verbose) flag, the script will print much more debug information.

All options

This is what you get when running python generate.py -h:

usage: generate.py [-h] [-a ARTICLE] [-c CONFIG] [-na] [-nt] [-o OUT]
                   [-ot OUT_TRANSCRIPT] [-q QUESTION] [-v]

Generate an audio summary of a Wikipedia article

options:
  -h, --help            show this help message and exit
  -a, --article ARTICLE
                        Article you want the audio summary to be about
  -c, --config CONFIG   Path to a TOML file with configuration
  -na, --no-audio       Don't generate audio output
  -nt, --no-transcript  Don't generate an audio transcript
  -o, --out OUT         Path of output MP3 file
  -ot, --out-transcript OUT_TRANSCRIPT
                        Path of output transcript
  -q, --question QUESTION
                        User question that will be included in the summary
  -v, --verbose         Show debug information
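The help text above has the shape that Python's argparse module produces. As a rough sketch of how such a command line interface can be declared (a reconstruction from the help output, not the author's actual code), the options could look like this. The defaults for the two output paths are assumptions based on the summary.mp3 and summary.txt names mentioned earlier.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare options matching the help output above (illustrative reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="generate.py",
        description="Generate an audio summary of a Wikipedia article",
    )
    parser.add_argument("-a", "--article",
                        help="Article you want the audio summary to be about")
    parser.add_argument("-c", "--config",
                        help="Path to a TOML file with configuration")
    parser.add_argument("-na", "--no-audio", action="store_true",
                        help="Don't generate audio output")
    parser.add_argument("-nt", "--no-transcript", action="store_true",
                        help="Don't generate an audio transcript")
    # Assumed defaults, taken from the file names the README mentions.
    parser.add_argument("-o", "--out", default="summary.mp3",
                        help="Path of output MP3 file")
    parser.add_argument("-ot", "--out-transcript", default="summary.txt",
                        help="Path of output transcript")
    parser.add_argument("-q", "--question",
                        help="User question that will be included in the summary")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show debug information")
    return parser


args = build_parser().parse_args(["-a", "Grilled_Cheese", "-c", "test.toml", "-nt"])
print(args.article, args.out, args.no_transcript)
```

Flags such as -na and -nt are plain boolean switches here, so combining -a with -nt gives a run that writes audio only.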

License

MIT © Hay Kranen
