Generate audio summaries of Wikipedia articles using OpenAI and ElevenLabs
By Hay Kranen
This was a hackathon project made during the Wikimedia NL Mini Hackathon 2024 to generate audio summaries like the ones from NotebookLM. It's not as good as NotebookLM, but it produces short, enjoyable audio conversations.
Here's an example of a generated audio summary about the article on Grilled cheese.
grilled_cheese.mp4
And you can find a transcription here.
What you'll need:
- An OpenAI API key
- An ElevenLabs API key
- Python 3.13+ (it probably works with older versions too, but no guarantees)
There is an option to use a local LLM (for example via Ollama), but I didn't get very good results with it. Feel free to try to make it work!
To use this script:
- Clone this repo
git clone https://github.com/hay/wikiaudify.git
- Make a virtual environment and install the dependencies from requirements.txt
python -m venv env
source env/bin/activate
pip install -r requirements.txt
- Copy example-config.toml to a new file (e.g. test.toml) and fill in your API keys and other details
- Try running it from the command line like this:
python generate.py -a "Grilled_Cheese" -q "At what temperature will my cheese melt?" -c test.toml
Note that the Wikipedia article you give with the -a option should use underscores instead of spaces, i.e. the article's name as it appears in the path of its URL.
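If you start from a plain title or a full article URL, the underscore form can be derived with a few lines of Python. This is a minimal sketch and not part of the repo; the helper name article_name is made up for illustration, and it assumes the usual /wiki/ URL layout.

```python
from urllib.parse import unquote, urlparse


def article_name(title_or_url: str) -> str:
    """Return the underscore form of an article title or Wikipedia URL."""
    text = title_or_url.strip()
    if text.startswith(("http://", "https://")):
        # Keep only what follows /wiki/ in the URL path
        path = urlparse(text).path
        text = unquote(path.split("/wiki/", 1)[-1])
    # Wikipedia uses underscores where the title has spaces
    return "_".join(text.split())


print(article_name("Grilled cheese"))
print(article_name("https://en.wikipedia.org/wiki/Grilled_cheese"))
```

Both calls print Grilled_cheese, which is the value you would pass to -a.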
By default this will generate two files in the root of this project: summary.mp3, containing the audio summary, and summary.txt, with a transcription.
If you add the -v (verbose) flag, audio2text will give much more debug information.
Running python generate.py -h prints the following:
usage: generate.py [-h] [-a ARTICLE] [-c CONFIG] [-na] [-nt] [-o OUT]
                   [-ot OUT_TRANSCRIPT] [-q QUESTION] [-v]

Generate an audio summary of a Wikipedia article

options:
  -h, --help            show this help message and exit
  -a, --article ARTICLE
                        Article you want the audio summary to be about
  -c, --config CONFIG   Path to a TOML file with configuration
  -na, --no-audio       Don't generate audio output
  -nt, --no-transcript  Don't generate an audio transcript
  -o, --out OUT         Path of output MP3 file
  -ot, --out-transcript OUT_TRANSCRIPT
                        Path of output transcript
  -q, --question QUESTION
                        User question that will be included in the summary
  -v, --verbose         Show debug information
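The -o and -ot options make it possible to generate summaries for several articles without overwriting summary.mp3 each time. Here is a rough sketch of how that could look. It is not included in the repo; build_command and run_batch are made-up helper names, and it assumes you run it from the project root with a working config file.

```python
import shlex
import subprocess
import sys


def build_command(article: str, config: str) -> list[str]:
    """Build a generate.py call that writes per-article output files."""
    return [
        sys.executable, "generate.py",
        "-a", article,
        "-c", config,
        "-o", f"{article}.mp3",
        "-ot", f"{article}.txt",
    ]


def run_batch(articles: list[str], config: str) -> None:
    """Run generate.py once per article, one after the other."""
    for article in articles:
        # check=True stops the batch as soon as one run fails
        subprocess.run(build_command(article, config), check=True)


# Show the command that would be run for one article
print(shlex.join(build_command("Grilled_cheese", "test.toml")))
```

Calling run_batch(["Grilled_cheese", "Fondue"], "test.toml") would then produce one MP3 and one transcript per article. Keep in mind that every run uses your OpenAI and ElevenLabs credits.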
MIT © Hay Kranen