Assign Topics & Subtopics to Each Podcast Episode (Episode-Level + Timestamp-Level Categorization) #94

@kavaivaleri

We have a large library of podcast episodes, each containing dozens of insights across data, AI, ML, open source, careers, and engineering. Today, these episodes have only manually assigned high-level topics, which are not enough to support deeper content discovery or to create thematic landing pages.

The goal of this task is to design a consistent topic taxonomy (topics + subtopics) and to categorize both whole podcast episodes and individual timestamps (clips) based on this taxonomy. This structured classification will eventually power a set of topic-based “insight hubs” on the website, similar to the topic pages used by Huberman Lab (example of a topic: https://www.hubermanlab.com/topics/fitness-and-workout-routines).

1. Create a Clean Topic Taxonomy

You need to propose a list of major topics and define at least 3 meaningful subtopics under each one.

Example:

Topic: Open-source
Subtopics:

  • Getting started with open-source
  • Contributing effectively
  • Roles and responsibilities in open-source
  • Growing a career through open-source
  • etc.

The taxonomy should be:

  • broad enough to cover all podcast episodes
  • specific enough to categorize timestamps accurately
  • consistent with the long-term /insights/ structure

This taxonomy becomes our “topics vocabulary” across the whole platform.
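
One possible way to keep the taxonomy machine-checkable is sketched below. It is only an illustration: the `Topic` class, the `validate_taxonomy` helper, and the in-code storage format are assumptions, not something this issue prescribes. The only rule taken from the description is the minimum of 3 subtopics per topic.

```python
from dataclasses import dataclass, field

MIN_SUBTOPICS = 3  # from the task description: at least 3 subtopics per topic


@dataclass
class Topic:
    """Hypothetical container for one taxonomy entry."""
    name: str
    subtopics: list[str] = field(default_factory=list)


def validate_taxonomy(topics: list[Topic]) -> list[str]:
    """Return human-readable problems; an empty list means the taxonomy passes."""
    problems = []
    seen_names = set()
    for topic in topics:
        key = topic.name.strip().lower()
        if key in seen_names:
            problems.append(f"duplicate topic: {topic.name}")
        seen_names.add(key)

        if len(topic.subtopics) < MIN_SUBTOPICS:
            problems.append(
                f"{topic.name}: needs at least {MIN_SUBTOPICS} subtopics, "
                f"has {len(topic.subtopics)}"
            )

        # Case-insensitive duplicate check within one topic.
        unique = {s.strip().lower() for s in topic.subtopics}
        if len(unique) != len(topic.subtopics):
            problems.append(f"{topic.name}: duplicate subtopics")
    return problems


# The example topic from this issue, used as sample data.
taxonomy = [
    Topic("Open-source", [
        "Getting started with open-source",
        "Contributing effectively",
        "Roles and responsibilities in open-source",
        "Growing a career through open-source",
    ]),
]
```

Running `validate_taxonomy(taxonomy)` before categorizing anything would catch topics that are too thin or accidentally repeated.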

2. Categorize Podcast Episodes

For each podcast episode:

  • assign 1-3 topics from the taxonomy
  • assign relevant subtopics
  • write them into the front matter of each episode file
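
The steps above could be sketched roughly as follows. This assumes the episode files use YAML-style front matter and that the keys are called `topics` and `subtopics`; neither is confirmed in this issue, and the `TAXONOMY` dictionary and both helper functions are hypothetical.

```python
import json

# Hypothetical taxonomy: topic name -> allowed subtopics.
TAXONOMY = {
    "Open-source": [
        "Getting started with open-source",
        "Contributing effectively",
        "Roles and responsibilities in open-source",
        "Growing a career through open-source",
    ],
}


def check_episode_labels(topics, subtopics, taxonomy):
    """Return a list of problems with an episode's labels (empty if fine)."""
    problems = []
    if not 1 <= len(topics) <= 3:
        problems.append(f"expected 1-3 topics, got {len(topics)}")
    for topic in topics:
        if topic not in taxonomy:
            problems.append(f"unknown topic: {topic}")
    # A subtopic is valid only if it belongs to one of the assigned topics.
    allowed = {s for t in topics if t in taxonomy for s in taxonomy[t]}
    for subtopic in subtopics:
        if subtopic not in allowed:
            problems.append(f"subtopic not under assigned topics: {subtopic}")
    return problems


def render_front_matter_lines(topics, subtopics):
    """Render the two assumed front matter keys as YAML list entries."""
    lines = ["topics:"]
    # json.dumps yields a double-quoted string, which YAML also accepts.
    lines += [f"  - {json.dumps(t)}" for t in topics]
    lines.append("subtopics:")
    lines += [f"  - {json.dumps(s)}" for s in subtopics]
    return "\n".join(lines)
```

Validating before writing keeps labels that are not in the taxonomy out of the episode files; how the rendered lines get merged into an existing front matter block depends on the site generator and is left out here.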

3. Categorize Individual Timestamps (Clips)

Each timestamp in the transcript should also receive the following, based on the content of the clip:

  • one topic
  • one or more subtopics

This creates a dense dataset that allows us to surface insights by theme rather than by episode.

Later, these will feed into the topic landing pages at /insights/<topic>/.
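
To illustrate how labeled clips could be regrouped by theme rather than by episode, here is a minimal sketch. The `Clip` fields, the slug rule, and the `group_clips_by_topic` helper are all assumptions made for the example; only the `/insights/<topic>/` path pattern and the "one topic, one or more subtopics" rule come from this issue.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical record for one labeled timestamp."""
    episode: str          # episode identifier (assumed field)
    timestamp: str        # timestamp as written in the transcript
    topic: str            # exactly one topic
    subtopics: list[str]  # one or more subtopics


def topic_slug(topic: str) -> str:
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def group_clips_by_topic(clips: list[Clip]) -> dict[str, list[Clip]]:
    """Map each /insights/<topic>/ path to the clips that belong on it."""
    hubs = defaultdict(list)
    for clip in clips:
        if not clip.subtopics:
            raise ValueError(
                f"{clip.episode} @ {clip.timestamp}: needs at least one subtopic"
            )
        hubs[f"/insights/{topic_slug(clip.topic)}/"].append(clip)
    return dict(hubs)
```

Because the grouping key is the topic and not the episode, clips from different episodes end up under the same path, which is what the landing pages would need.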

Long-Term Direction

This topic system will power:

  • SEO-optimized topic hubs
  • user-friendly landing pages
  • “insights from multiple episodes” sections
  • cross-episode internal linking
  • newsletter compilation automation
  • and content discovery tools

We are following the same approach as Huberman Lab Topics: https://www.hubermanlab.com/topics

A full description of the direction is documented here:
https://docs.google.com/document/d/1Ewh6-6fNLu8qLab3C8gJgKZqIFHdL5qbH7gIZQS0E_o/edit?tab=t.0

After this hackathon task, the next milestone will be building the /insights/ pages and automating how podcasts feed into them.
