Description
We have a large library of podcast episodes, each containing dozens of insights across data, AI, ML, open source, careers, and engineering. Today, these episodes have only manually assigned high-level topics, which are not enough to support deeper content discovery or to create thematic landing pages.
The goal of this task is to design a consistent topic taxonomy (topics + subtopics) and categorize both whole podcast episodes and individual timestamps (clips) based on this taxonomy. This structured classification will eventually power a set of topic-based “insight hubs” on the website, similar to the topic pages used by Huberman Lab (example of a topic: https://www.hubermanlab.com/topics/fitness-and-workout-routines).
1. Create a Clean Topic Taxonomy
You need to propose a list of major topics and define at least 3 meaningful subtopics under each one.
Example:
Topic: Open-source
Subtopics:
- Getting started with open-source
- Contributing effectively
- Roles and responsibilities in open-source
- Growing a career through open-source
- etc.
The taxonomy should be:
- broad enough to cover all podcast episodes
- specific enough to categorize timestamps accurately
- consistent with the long-term /insights/ structure
This taxonomy becomes our “topics vocabulary” across the whole platform.
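To make the requirements above easier to check, here is a minimal sketch of how the taxonomy could be held in code and validated against the "at least 3 subtopics" rule. The names `TAXONOMY`, `MIN_SUBTOPICS`, and `validate_taxonomy` are hypothetical, and only the Open-source entry comes from the example in this issue; the real storage format is still to be decided.

```python
# Hypothetical in-memory taxonomy: topic name -> list of subtopics.
# Only the "Open-source" entry is taken from the example in this issue.
TAXONOMY = {
    "Open-source": [
        "Getting started with open-source",
        "Contributing effectively",
        "Roles and responsibilities in open-source",
        "Growing a career through open-source",
    ],
}

MIN_SUBTOPICS = 3  # each topic needs at least 3 meaningful subtopics


def validate_taxonomy(taxonomy):
    """Return a list of problems; an empty list means the taxonomy passes."""
    problems = []
    for topic, subtopics in taxonomy.items():
        if len(subtopics) < MIN_SUBTOPICS:
            problems.append(
                f"{topic}: has {len(subtopics)} subtopics, "
                f"needs at least {MIN_SUBTOPICS}"
            )
        if len(set(subtopics)) != len(subtopics):
            problems.append(f"{topic}: duplicate subtopics")
    return problems


if __name__ == "__main__":
    for problem in validate_taxonomy(TAXONOMY):
        print(problem)
```

A check like this could run before any episode is categorized, so that the vocabulary stays consistent as topics are added.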
2. Categorize Podcast Episodes
For each podcast episode:
- assign 1-3 topics from the taxonomy
- assign relevant subtopics
- write them into the front matter of each episode file
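The episode rules above can be sketched as a small check plus a front matter renderer. This is only an illustration under assumptions: the front matter keys `topics` and `subtopics`, the helper names, and the one-entry taxonomy are hypothetical, and the hand-written YAML-style output only works for simple values (a YAML library would be safer for real episode files).

```python
# Hypothetical one-entry taxonomy, reused from the example in this issue.
TAXONOMY = {
    "Open-source": [
        "Getting started with open-source",
        "Contributing effectively",
        "Roles and responsibilities in open-source",
        "Growing a career through open-source",
    ],
}


def check_episode(topics, subtopics, taxonomy):
    """Return a list of rule violations for one episode's assignment."""
    errors = []
    if not 1 <= len(topics) <= 3:
        errors.append(f"expected 1-3 topics, got {len(topics)}")
    for topic in topics:
        if topic not in taxonomy:
            errors.append(f"unknown topic: {topic}")
    # Subtopics are only valid if they sit under one of the assigned topics.
    allowed = {s for t in topics if t in taxonomy for s in taxonomy[t]}
    for subtopic in subtopics:
        if subtopic not in allowed:
            errors.append(f"subtopic not under an assigned topic: {subtopic}")
    return errors


def render_front_matter(topics, subtopics):
    """Render the assignment as YAML-style lines (assumed key names)."""
    lines = ["topics:"] + [f"  - {t}" for t in topics]
    lines += ["subtopics:"] + [f"  - {s}" for s in subtopics]
    return "\n".join(lines)


if __name__ == "__main__":
    topics = ["Open-source"]
    subtopics = ["Contributing effectively"]
    if not check_episode(topics, subtopics, TAXONOMY):
        print(render_front_matter(topics, subtopics))
```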
3. Categorize Individual Timestamps (Clips)
Each timestamp in the transcript should also receive:
- one topic
- and one or more subtopics
based on the content of the clip.
This creates a dense dataset that allows us to surface insights by theme rather than by episode.
Later, these will feed into the topic landing pages at /insights/<topic>/.
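As a rough sketch of how categorized clips could be surfaced by theme rather than by episode, the snippet below groups clips by their landing page path. The `Clip` fields, the episode identifiers, and the slug rule in `topic_path` are all assumptions made for illustration; the issue only fixes the /insights/<topic>/ shape.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One categorized timestamp (hypothetical field names)."""
    episode: str      # episode identifier, e.g. a file name
    timestamp: str    # position in the transcript, e.g. "12:34"
    topic: str        # exactly one topic per clip
    subtopics: tuple  # one or more subtopics

    def __post_init__(self):
        if not self.subtopics:
            raise ValueError("a clip needs at least one subtopic")


def topic_path(topic):
    """Build the landing page path; the slug rule here is an assumption."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"/insights/{slug}/"


def group_by_topic(clips):
    """Map each topic landing page path to the clips that belong on it."""
    hubs = defaultdict(list)
    for clip in clips:
        hubs[topic_path(clip.topic)].append(clip)
    return dict(hubs)


if __name__ == "__main__":
    clips = [
        Clip("episode-a", "03:10", "Open-source", ("Contributing effectively",)),
        Clip("episode-b", "17:45", "Open-source", ("Getting started with open-source",)),
    ]
    for path, items in group_by_topic(clips).items():
        print(path, [(c.episode, c.timestamp) for c in items])
```

Grouping by path keeps clips from different episodes together under one topic, which is what the "insights from multiple episodes" sections would need.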
Long-Term Direction
This topic system will power:
- SEO-optimized topic hubs
- user-friendly landing pages
- “insights from multiple episodes” sections
- cross-episode internal linking
- newsletter compilation automation
- and content discovery tools
We are following the same approach as Huberman Lab Topics: https://www.hubermanlab.com/topics
A full description of the direction is documented here:
https://docs.google.com/document/d/1Ewh6-6fNLu8qLab3C8gJgKZqIFHdL5qbH7gIZQS0E_o/edit?tab=t.0
After this hackathon task, the next milestone will be building the /insights/ pages and automating how podcasts feed into them.