modumir/bath_hack_26

Public Voice

Built by Belgrave Warriors.

Public Voice turns a simple query into a visual pulse-check of what people are saying online: which topics dominate, and how those topics feel.

Serious insights, fun interface.

What This Project Does

  • Scrapes TikTok videos related to your query
  • Extracts audio with ffmpeg
  • Transcribes audio with whisper.cpp
  • Cleans and processes text
  • Runs sentiment analysis + topic modeling
  • Serves topic coverage and sentiment via FastAPI
  • Renders interactive topic bars in the React frontend

Tech Stack

  • Frontend: React + Vite + Recharts
  • Backend API: FastAPI
  • NLP/ML: Transformers + BERTopic
  • Speech: whisper.cpp
  • Media processing: ffmpeg / ffmpeg-python

Project Structure

bath_hack_26/
├── src/                         # Frontend app
├── backend/
│   ├── centre_back.py           # FastAPI server
│   ├── model.py                 # Main NLP + topic pipeline
│   ├── scraper.py               # TikTok scrape + audio + transcription
│   └── clean_data.py            # Text cleaning helpers
├── whisper.cpp/                 # Local whisper.cpp source/build
├── pyproject.toml               # Python dependencies
├── package.json                 # Frontend dependencies
└── README.md

Setup

1) System dependencies

On Ubuntu/Debian:

sudo apt update
sudo apt install -y python3 python3-venv python3-pip ffmpeg cmake build-essential pkg-config

2) Frontend dependencies

From repo root:

npm install

3) Python environment

From repo root:

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install -e .
pip install fastapi uvicorn requests ffmpeg-python python-dotenv

4) Environment variables

Create .env in repo root and add:

API_KEY=your_rapidapi_key
API_HOST=free-tiktok-api-scraper-mobile-version.p.rapidapi.com
OPENROUTER_API_KEY=your_openrouter_key
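A quick way to confirm the three variables are visible to Python is sketched below. It uses only the standard library and reads from os.environ, so it assumes the .env file has already been loaded (for example by python-dotenv's load_dotenv()) or the values exported in the shell. The helper name missing_env_vars is hypothetical and not part of this repository.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("API_KEY", "API_HOST", "OPENROUTER_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

An empty list means all three names are set to non-blank values; it does not check that the keys are valid.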

5) whisper.cpp build + model

From repo root:

cmake -S whisper.cpp -B whisper.cpp/build
cmake --build whisper.cpp/build --config Release -- -j$(nproc)

Optional model download:

cd whisper.cpp
./models/download-ggml-model.sh tiny.en
cd ..
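With the build and the tiny.en model in place, the audio and transcription steps can be exercised by hand. The sketch below is an assumption about how the two tools might be chained, not a copy of backend/scraper.py: it converts a media file to the 16 kHz mono 16-bit WAV that whisper.cpp expects, then runs whisper-cli on it. The function names are hypothetical, and the model path assumes the download script wrote ggml-tiny.en.bin into whisper.cpp/models.

```python
import subprocess
from pathlib import Path

# Paths assumed from the build and download steps above.
WHISPER_CLI = Path("whisper.cpp/build/bin/whisper-cli")
WHISPER_MODEL = Path("whisper.cpp/models/ggml-tiny.en.bin")


def ffmpeg_command(source, wav_path):
    """Build an ffmpeg call that writes 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", str(source),
        "-vn",                 # drop any video stream
        "-ar", "16000",        # 16 kHz sample rate
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit PCM
        str(wav_path),
    ]


def whisper_command(wav_path):
    """Build a whisper-cli call that prints the transcript without timestamps."""
    return [str(WHISPER_CLI), "-m", str(WHISPER_MODEL), "-f", str(wav_path), "-nt"]


def transcribe(source, wav_path="audio.wav"):
    """Convert `source` to WAV, run whisper-cli, and return its stdout."""
    subprocess.run(ffmpeg_command(source, wav_path), check=True)
    result = subprocess.run(
        whisper_command(wav_path), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Run it from the repo root so the relative paths resolve, for example transcribe("clip.mp4"). Both calls use check=True, so a failing ffmpeg or whisper-cli raises subprocess.CalledProcessError instead of returning an empty transcript.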

Run Public Voice

Backend

From repo root:

source .venv/bin/activate
cd backend
python -m uvicorn centre_back:app --reload

Backend URL: http://localhost:8000
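To smoke-test the running backend without the frontend, a small standard-library client is enough. This is a sketch that assumes the search route described in the Data Flow section (GET /api/search with a keyword parameter) and the default uvicorn address; the function names are hypothetical, and the long timeout is a guess, since scraping and transcription can be slow.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BACKEND_URL = "http://localhost:8000"  # backend address from above


def search_url(keyword, base=BACKEND_URL):
    """Build the search URL, encoding the keyword as a query parameter."""
    query = urlencode({"keyword": keyword})
    return f"{base}/api/search?{query}"


def fetch_topics(keyword, timeout=600):
    """Call the backend and return the decoded JSON body."""
    with urlopen(search_url(keyword), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(json.dumps(fetch_topics("electric cars"), indent=2))
```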

Frontend

From repo root:

npm run dev

Frontend URL: http://localhost:5173

Data Flow: Query -> Bars -> Colors

This is the core flow inside Public Voice.

  1. User query enters frontend search bar.
  2. Frontend calls GET /api/search?keyword=....
  3. FastAPI route in backend/centre_back.py calls main(keyword) in backend/model.py.
  4. backend/scraper.py:
    • Finds TikTok videos for the keyword
    • Downloads audio with ffmpeg
    • Transcribes with whisper.cpp
  5. backend/model.py:
    • Cleans transcripts
    • Runs sentiment model (positive/negative)
    • Runs BERTopic for topic grouping
    • Computes topic coverage percentages (values)
    • Computes sentiment scores per topic (sentiments)
  6. Backend returns JSON:
{
  "labels": ["Topic A", "Topic B", "Topic C"],
  "values": [0.42, 0.33, 0.25],
  "sentiments": [0.81, 0.47, 0.22]
}
  7. Frontend graph logic (src/components/GraphPanel.jsx):
    • Bar height comes from values (each topic's share of total inputs)
    • Bar color is scaled from sentiments
      • lower sentiment -> redder
      • higher sentiment -> greener
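The aggregation in steps 5 and 6 can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the code in backend/model.py: it assumes each transcript already has one topic label and one sentiment score between 0 and 1, takes coverage to be the topic's share of transcripts, and takes topic sentiment to be the mean score. The function name summarize_topics is hypothetical.

```python
from collections import defaultdict


def summarize_topics(topic_labels, sentiment_scores):
    """Aggregate per-transcript results into the labels/values/sentiments shape."""
    if len(topic_labels) != len(sentiment_scores):
        raise ValueError("need one sentiment score per transcript")
    if not topic_labels:
        return {"labels": [], "values": [], "sentiments": []}

    # Collect the sentiment scores of every transcript assigned to each topic.
    scores_by_topic = defaultdict(list)
    for label, score in zip(topic_labels, sentiment_scores):
        scores_by_topic[label].append(score)

    total = len(topic_labels)
    # Largest topics first, so the tallest bar comes first.
    ordered = sorted(
        scores_by_topic.items(), key=lambda item: len(item[1]), reverse=True
    )
    return {
        "labels": [label for label, _ in ordered],
        "values": [round(len(scores) / total, 2) for _, scores in ordered],
        "sentiments": [round(sum(scores) / len(scores), 2) for _, scores in ordered],
    }
```

For example, five transcripts labelled Price, Price, Battery, Price, Battery with scores 0.9, 0.7, 0.2, 0.8, 0.4 give values of 0.6 and 0.4 and sentiments of 0.8 and 0.3.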

So in one glance:

  • Height tells you how much that topic is being discussed
  • Color tells you how positive/negative that topic is
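One simple way to get that red-to-green scale is linear interpolation between the two colors. The sketch below is written in Python for consistency with the backend examples, although the real mapping lives in the React component and may differ. It assumes sentiment scores lie in the range 0 to 1; both function names are hypothetical.

```python
def sentiment_to_rgb(score):
    """Map a sentiment score in [0, 1] to an (r, g, b) tuple from red to green."""
    clamped = min(1.0, max(0.0, float(score)))  # guard against out-of-range scores
    red = round(255 * (1.0 - clamped))
    green = round(255 * clamped)
    return (red, green, 0)


def sentiment_to_hex(score):
    """Format the color as a CSS hex string, for example for a bar fill."""
    return "#{:02x}{:02x}{:02x}".format(*sentiment_to_rgb(score))
```

A score of 0 maps to #ff0000 and a score of 1 maps to #00ff00, with values in between blending the two.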

Troubleshooting

  • API_KEY not set: ensure .env exists and has API_KEY.
  • whisper executable not found: verify whisper.cpp/build/bin/whisper-cli exists.
  • ValueError: k must be less than or equal to the number of training points: input set is too small for BERTopic/HDBSCAN; rerun with more scraped samples.
  • ffmpeg failures: verify system ffmpeg is installed and available on PATH.
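Several of these checks can be run in one pass before starting the backend. The following is a minimal sketch using only the standard library; the preflight function is hypothetical, and the whisper-cli path is the one named in the list above.

```python
import shutil
from pathlib import Path

WHISPER_CLI = Path("whisper.cpp/build/bin/whisper-cli")  # path from the list above


def preflight(repo_root="."):
    """Return a list of human-readable problems; empty means the checks passed."""
    root = Path(repo_root)
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not (root / WHISPER_CLI).is_file():
        problems.append(f"missing {WHISPER_CLI}; rebuild whisper.cpp")
    if not (root / ".env").is_file():
        problems.append("missing .env in repo root")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("-", problem)
```

It only checks that the files and the ffmpeg binary exist; it cannot detect an invalid API key or a dataset that is too small for BERTopic.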

Team

Made with intent by Belgrave Warriors.

Product: Public Voice.

About

Bath Hack 26 entry from the Belgrave Warriors team.
