Built by Belgrave Warriors.
Public Voice turns a simple query into a visual pulse-check of what people are saying online: which topics dominate, and how those topics feel.
Serious insights, fun interface.
- Scrapes TikTok videos related to your query
- Extracts audio with ffmpeg
- Transcribes audio with whisper.cpp
- Cleans and processes text
- Runs sentiment analysis + topic modeling
- Serves topic coverage and sentiment via FastAPI
- Renders interactive topic bars in the React frontend
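The "cleans and processes text" step can be sketched as follows. This is an illustrative pass only (lowercase, strip URLs and punctuation, collapse whitespace); the function name and exact rules are assumptions, not the actual logic in `backend/clean_data.py`:

```python
import re

def clean_transcript(text: str) -> str:
    """Illustrative transcript cleaning: lowercase, drop URLs and
    punctuation/emoji, collapse whitespace. Hypothetical helper, not
    the repo's clean_data.py implementation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)    # strip URLs
    text = re.sub(r"[^a-z0-9\s']", " ", text)   # strip punctuation/emoji
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

clean_transcript("Check THIS out!! https://tiktok.com/x")  # "check this out"
```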
- Frontend: React + Vite + Recharts
- Backend API: FastAPI
- NLP/ML: Transformers + BERTopic
- Speech: whisper.cpp
- Media processing: ffmpeg / ffmpeg-python
```
bath_hack_26/
├── src/                # Frontend app
├── backend/
│   ├── centre_back.py  # FastAPI server
│   ├── model.py        # Main NLP + topic pipeline
│   ├── scraper.py      # TikTok scrape + audio + transcription
│   └── clean_data.py   # Text cleaning helpers
├── whisper.cpp/        # Local whisper.cpp source/build
├── pyproject.toml      # Python dependencies
├── package.json        # Frontend dependencies
└── README.md
```
On Ubuntu/Debian:

```sh
sudo apt update
sudo apt install -y python3 python3-venv python3-pip ffmpeg cmake build-essential pkg-config
```

From repo root, install frontend dependencies:

```sh
npm install
```

From repo root, set up the Python environment:

```sh
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install -e .
pip install fastapi uvicorn requests ffmpeg-python python-dotenv
```

Create `.env` in repo root and add:
```
API_KEY=your_rapidapi_key
API_HOST=free-tiktok-api-scraper-mobile-version.p.rapidapi.com
OPENROUTER_API_KEY=your_openrouter_key
```

From repo root, build whisper.cpp:
```sh
cmake -S whisper.cpp -B whisper.cpp/build
cmake --build whisper.cpp/build --config Release -- -j$(nproc)
```

Optional model download:

```sh
cd whisper.cpp
./models/download-ggml-model.sh tiny.en
cd ..
```

From repo root, start the backend:
```sh
source .venv/bin/activate
cd backend
python -m uvicorn centre_back:app --reload
```

Backend URL: http://localhost:8000
In a second terminal, from repo root:

```sh
npm run dev
```

Frontend URL: http://localhost:5173
This is the core flow inside Public Voice.
- The user enters a query in the frontend search bar.
- Frontend calls `GET /api/search?keyword=...`.
- The FastAPI route in `backend/centre_back.py` calls `main(keyword)` in `backend/model.py`.
- `backend/scraper.py`:
  - Finds TikTok videos for the keyword
  - Downloads audio with ffmpeg
  - Transcribes with whisper.cpp
- `backend/model.py`:
  - Cleans transcripts
  - Runs sentiment model (positive/negative)
  - Runs BERTopic for topic grouping
  - Computes topic coverage percentages (`values`)
  - Computes sentiment scores per topic (`sentiments`)
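The `values`/`sentiments` computation can be sketched as follows. This is a stdlib-only illustration; the helper name and the `(topic, sentiment)` input shape are assumptions, and the real computation over BERTopic output lives in `backend/model.py`:

```python
from collections import defaultdict

def summarize(docs):
    """docs: list of (topic_label, sentiment_score) pairs, one per transcript.
    Returns topic labels, coverage fractions, and mean sentiment per topic.
    Illustrative sketch only, not the repo's actual model.py logic."""
    counts = defaultdict(int)
    sent_sum = defaultdict(float)
    for topic, score in docs:
        counts[topic] += 1
        sent_sum[topic] += score
    total = sum(counts.values())
    labels = sorted(counts, key=counts.get, reverse=True)  # biggest topics first
    return {
        "labels": labels,
        "values": [counts[t] / total for t in labels],        # coverage share
        "sentiments": [sent_sum[t] / counts[t] for t in labels],  # mean sentiment
    }
```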
- Backend returns JSON:

```json
{
  "labels": ["Topic A", "Topic B", "Topic C"],
  "values": [0.42, 0.33, 0.25],
  "sentiments": [0.81, 0.47, 0.22]
}
```

- Frontend graph logic (`src/components/GraphPanel.jsx`):
  - Bar height comes from `values` (coverage as percent of total inputs)
  - Bar color is scaled from `sentiments`:
    - lower sentiment -> redder
    - higher sentiment -> greener
So in one glance:
- Height tells you how much that topic is being discussed
- Color tells you how positive/negative that topic is
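The red-to-green color scale can be sketched in Python terms. The linear ramp below is an assumption for illustration; the actual bar styling lives in the React component:

```python
def sentiment_to_rgb(s: float) -> tuple:
    """Map a sentiment score in [0, 1] to an (r, g, b) color:
    0.0 -> pure red, 1.0 -> pure green. Illustrative only; not the
    exact scale used in GraphPanel.jsx."""
    s = max(0.0, min(1.0, s))  # clamp out-of-range scores
    return (round(255 * (1 - s)), round(255 * s), 0)

sentiment_to_rgb(0.0)  # (255, 0, 0) -- very negative topic
sentiment_to_rgb(1.0)  # (0, 255, 0) -- very positive topic
```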
Troubleshooting:

- `API_KEY` not set: ensure `.env` exists and contains `API_KEY`.
- whisper executable not found: verify `whisper.cpp/build/bin/whisper-cli` exists.
- `ValueError: k must be less than or equal to the number of training points`: the input set is too small for BERTopic/HDBSCAN; rerun with more scraped samples.
- `ffmpeg` failures: verify system ffmpeg is installed and available on PATH.
Made with intent by Belgrave Warriors.
Product: Public Voice.