AudibleGraphics

Generate narrated infographics locally using Google Gemini for images, scripts, and text-to-speech. Enter any topic and get a full-screen infographic with an AI-generated narration script and audio.

Features

AI-generated infographics - Gemini generates the image, script, and narration from a single topic or question
Text-to-speech narration - Gemini voices bring the infographic to life
Multiple aspect ratios - 16:9 (landscape) and 9:16 (portrait)
4 voice options - Achird, Aoede, Charon, Laomedeia
MP4 export - Download your infographic as a shareable video
Local storage - Generated files are saved on your machine; no cloud storage account is needed
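
The voice and aspect-ratio choices listed above map naturally onto a small set of typed options. The following is a minimal sketch, not the project's actual code: the names `VOICES`, `ASPECT_RATIOS`, `GenerationOptions`, and `parseOptions` are hypothetical, and only the option values themselves come from the feature list.

```typescript
// Hypothetical option types; names are illustrative, not taken from the project source.
const VOICES = ["Achird", "Aoede", "Charon", "Laomedeia"] as const;
const ASPECT_RATIOS = ["16:9", "9:16"] as const;

type Voice = (typeof VOICES)[number];
type AspectRatio = (typeof ASPECT_RATIOS)[number];

interface GenerationOptions {
  topic: string;
  voice: Voice;
  aspectRatio: AspectRatio;
}

// Validate untrusted input (for example, a form submission) before calling any API.
function parseOptions(input: {
  topic?: string;
  voice?: string;
  aspectRatio?: string;
}): GenerationOptions {
  const topic = (input.topic ?? "").trim();
  if (topic.length === 0) throw new Error("Topic must not be empty");

  const voice = VOICES.find((v) => v === input.voice);
  if (voice === undefined) throw new Error(`Unknown voice: ${input.voice}`);

  const aspectRatio = ASPECT_RATIOS.find((r) => r === input.aspectRatio);
  if (aspectRatio === undefined) {
    throw new Error(`Unknown aspect ratio: ${input.aspectRatio}`);
  }

  return { topic, voice, aspectRatio };
}
```

Rejecting unknown values up front avoids spending a billed API call on a request that cannot succeed.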

Prerequisites

Node.js 20.9+ (required by Next.js 16)
npm (bundled with Node.js)
Windows, macOS, or Linux. FFmpeg is bundled through npm dependencies; no separate system FFmpeg install is required.
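
If you want to check the Node.js requirement programmatically, a version comparison along these lines would do. This is a sketch under the assumption that only the major and minor numbers matter; `meetsMinimumNode` and `MIN_NODE` are hypothetical names, not part of the project.

```typescript
// Minimum version taken from the prerequisites above.
const MIN_NODE: [number, number] = [20, 9];

// Returns true when a version string such as "20.11.1" or "v22.3.0" meets the minimum.
function meetsMinimumNode(
  version: string,
  min: [number, number] = MIN_NODE,
): boolean {
  const match = /^v?(\d+)\.(\d+)/.exec(version);
  if (match === null) return false; // unparseable input is treated as unsupported
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > min[0] || (major === min[0] && minor >= min[1]);
}
```

In a Node script it could be called as `meetsMinimumNode(process.versions.node)`.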

Setup

Clone the repository
```
git clone https://github.com/Cp557/audiblegraphics.git
cd audiblegraphics
```
Install dependencies
```
npm install
```
Add your API key

Open .env and fill in your key. The Google account or project behind this key must have billing enabled for Gemini API usage.
```
GEMINI_API_KEY=your-gemini-api-key-here
```
Generate voice samples
```
npm run generate:voice-samples
```
Start the app
```
npm run dev
```
Open http://localhost:3000

Getting API Keys

Gemini - aistudio.google.com -> Get API key. Make sure billing is enabled for the API key's Google Cloud project before generating infographics or voice samples.
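
A missing or placeholder key is easier to diagnose when it fails early with a clear message. The sketch below assumes the variable name `GEMINI_API_KEY` and the placeholder value from the `.env` template in Setup; the helper `requireGeminiKey` is hypothetical and not taken from the project.

```typescript
// Reads the key from an environment-like object so the check is easy to test.
function requireGeminiKey(env: Record<string, string | undefined>): string {
  const key = (env["GEMINI_API_KEY"] ?? "").trim();
  // The placeholder value comes from the .env template shown in Setup.
  if (key === "" || key === "your-gemini-api-key-here") {
    throw new Error(
      "GEMINI_API_KEY is missing. Add it to .env before starting the app.",
    );
  }
  return key;
}
```

On the server, Next.js loads `.env` into `process.env`, so a call such as `requireGeminiKey(process.env)` would be one way to use it. Note that this only checks that a key is present; it cannot tell whether billing is enabled.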

How It Works

Enter a topic or question in the input box
Choose a voice and aspect ratio
Click Generate - Gemini creates the script, image, and narration audio
Your infographic is saved locally and listed in the sidebar
Optionally, download it as an MP4 video
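
The generation step above can be pictured as a small orchestration function. This is a sketch only: the `Generators` interface and `generateInfographic` are hypothetical names, the actual Gemini calls are hidden behind that interface, and it assumes the image prompt depends only on the topic (so the script and image can be produced concurrently). The real project may order or structure these calls differently.

```typescript
// Hypothetical interfaces: the real project may structure its Gemini calls differently.
interface Generators {
  writeScript(topic: string): Promise<string>;
  drawImage(topic: string, aspectRatio: string): Promise<Uint8Array>;
  speak(script: string, voice: string): Promise<Uint8Array>;
}

interface Infographic {
  topic: string;
  script: string;
  image: Uint8Array;
  audio: Uint8Array;
}

async function generateInfographic(
  gen: Generators,
  topic: string,
  voice: string,
  aspectRatio: string,
): Promise<Infographic> {
  // Assumption: script and image depend only on the topic, so they run concurrently.
  const [script, image] = await Promise.all([
    gen.writeScript(topic),
    gen.drawImage(topic, aspectRatio),
  ]);
  // Narration needs the finished script, so it runs afterwards.
  const audio = await gen.speak(script, voice);
  return { topic, script, image, audio };
}
```

Passing the generators in as an object keeps the flow testable with stubs, without a network connection or an API key.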

Local Storage

Each infographic is stored as its own folder under public/uploads/, named after the topic:

public/uploads/
  history-of-rome/
    image.jpg
    audio.mp3
    meta.json    <- title, speaker notes, aspect ratio, creation date
    video.mp4    <- only present if you exported it
  black-holes-explained/
    image.jpg
    audio.mp3
    meta.json

Everything under public/uploads/ is gitignored, so your generated content stays local.
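
The folder layout above suggests a topic-to-slug mapping such as "History of Rome" to `history-of-rome`. The following sketch shows one way that mapping and the resulting file paths could be computed. The functions `slugify` and `pathsFor` are hypothetical; the project's actual slug rules (for example, how it handles duplicate topics) are not documented here.

```typescript
// Turn a topic into a folder-safe slug.
function slugify(topic: string): string {
  return topic
    .toLowerCase()
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into a hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

interface InfographicPaths {
  dir: string;
  image: string;
  audio: string;
  meta: string;
  video: string; // only exists on disk after an MP4 export
}

// Build the file paths for one infographic, following the layout shown above.
function pathsFor(topic: string, root = "public/uploads"): InfographicPaths {
  const dir = `${root}/${slugify(topic)}`;
  return {
    dir,
    image: `${dir}/image.jpg`,
    audio: `${dir}/audio.mp3`,
    meta: `${dir}/meta.json`,
    video: `${dir}/video.mp4`,
  };
}
```

For example, `pathsFor("History of Rome").image` evaluates to `public/uploads/history-of-rome/image.jpg`.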

Tech Stack

Next.js 16 (App Router)
TypeScript
Tailwind CSS v4
shadcn/ui
Google Gemini (@google/genai) - image, script, and text-to-speech generation
ffmpeg-static + Sharp - MP4 video export (bundled, no system install needed)
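
To illustrate how a still image and a narration track can become an MP4, the sketch below builds an FFmpeg argument list. The flags are standard FFmpeg options, but the function `buildFfmpegArgs` is hypothetical and the project's actual export command may differ. It also assumes the bundled FFmpeg build includes the `libx264` encoder and that the image has even pixel dimensions, which `yuv420p` requires.

```typescript
// Build ffmpeg arguments that turn one still image plus an audio track into an MP4.
function buildFfmpegArgs(
  image: string,
  audio: string,
  output: string,
): string[] {
  return [
    "-y", // overwrite the output file if it already exists
    "-loop", "1", // repeat the single image as a video stream
    "-i", image,
    "-i", audio,
    "-c:v", "libx264",
    "-tune", "stillimage", // x264 tuning for static content
    "-pix_fmt", "yuv420p", // pixel format most players can decode
    "-c:a", "aac",
    "-shortest", // stop encoding when the audio ends
    output,
  ];
}
```

The resulting array could be passed to `spawn` from `node:child_process`, together with the binary path that `ffmpeg-static` exports.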
