This repository contains the source code and content for datatalks.club, a Rustkyll-built community website for data science, machine learning, AI, and data engineering practitioners.
- Static website built with Rustkyll
- Content-first structure: markdown, data files, and reusable templates
- Main entities are modeled as collections (`_posts`, `_podcast`, `_books`, `_people`, etc.)
- Navigation, events, announcements, and sponsors are managed via YAML files in `_data`

| URL | Source file | What it means | How it works |
|---|---|---|---|
| `/` | `index.md` | Main landing page for the community | Uses Liquid loops to aggregate data from multiple sources: upcoming events (`_data/events.yaml`), latest podcast episodes (`_podcast`), latest posts (`_posts`), sponsors (`_data/sponsors.yaml`), and active books (`_books`). |
| `/articles.html` | `articles.md` | Full article index | Iterates over `site.posts` and links to each article with author references from `_people`. |
| `/podcast.html` | `podcast.md` | Podcast hub page | Lists all episodes by season from `_podcast`; each episode gets its own detail page via collection permalink rules. |
| `/books.html` | `books.md` | "Book of the Week" program | Splits books into upcoming vs archive using date filters (`book.end > site.time` and `book.end < site.time`). |
| `/events.html` | `events.md` | Public events calendar page | Reads `_data/events.yaml` and divides events into upcoming and past based on event timestamp relative to `site.time`. |
| `/people.html` | `people.md` | Community people directory | Renders all person profiles from `_people`, each with an auto-generated profile URL. |
| `/slack.html` | `slack.md` | Slack onboarding page | Uses the `subscribe.html` include for the invite flow and documents key channels and participation guidelines. |
| `/support.html` | `support.md` | Community support and sponsorship page | Static content page for funding model, sponsor principles, and contact details. |
| `/tools.html` | `tools.md` | Open-source spotlight page | Iterates through `_tools` collection entries (tool links, demos, maintainers). |
| `/blog/guide-to-free-online-courses-at-datatalks-club.html` | Post in `_posts` | Primary courses landing page in navigation | The top nav "Courses" item points here; individual Zoomcamp pages live mostly in `_posts` plus legacy `_courses` docs. |

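
The date-based split used by `books.md` and `events.md` can be sketched outside Liquid. The Python below is an illustration only, not the site's code: the `split_by_end` helper and the sample records are hypothetical, and it assumes each record carries an `end` timestamp that can be compared to the build time (the role `site.time` plays in the templates).

```python
from datetime import datetime, timezone


def split_by_end(items, now):
    """Partition records into (upcoming, archive) by comparing `end` to `now`.

    Mirrors the template conditions `book.end > site.time` and
    `book.end < site.time`; a record whose end equals `now` lands in neither.
    """
    upcoming = [item for item in items if item["end"] > now]
    archive = [item for item in items if item["end"] < now]
    return upcoming, archive


# Hypothetical sample data; real entries live in _books/*.md front matter.
books = [
    {"title": "Book A", "end": datetime(2030, 1, 10, tzinfo=timezone.utc)},
    {"title": "Book B", "end": datetime(2020, 5, 1, tzinfo=timezone.utc)},
]

build_time = datetime(2025, 1, 1, tzinfo=timezone.utc)  # stand-in for site.time
upcoming, archive = split_by_end(books, build_time)
print([b["title"] for b in upcoming])  # ['Book A']
print([b["title"] for b in archive])   # ['Book B']
```

Because the comparison runs at build time, an item only moves from upcoming to archive when the site is rebuilt.
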
| Layer | Folder/files | Responsibility |
|---|---|---|
| Content pages | `*.md` in repo root | Entry pages and hubs (`index.md`, `events.md`, `podcast.md`, etc.). |
| Blog posts | `_posts/*.md` | Long-form articles, course landing pages, and announcements; rendered under `/blog/:title.html`. |
| Domain collections | `_podcast`, `_books`, `_people`, `_courses`, `_tools`, `_conferences` | Structured content types with dedicated layouts and permalinks. |
| Data sources | `_data/*.yaml` | Site-wide data for menus, events, sponsors, and header announcements. |
| Layouts | `_layouts/*.html` | High-level page skeletons (`home`, `page`, `post`, `podcast`, `book`, `author`). |
| Reusable components | `_includes/*.html` | Shared snippets (header/footer, authors, event cards, subscribe blocks, etc.). |
| Assets | `images`, `assets` | Static media, styles, and supporting files. |
| Generated output | `_site` | Local build output generated by Rustkyll. |

| Type | Location | URL shape | Typical usage |
|---|---|---|---|
| Posts | `_posts/*.md` | `/blog/:title.html` | Articles, guides, Zoomcamp pages, editorial content. |
| Podcast episodes | `_podcast/*.md` | `/podcast/:title.html` | Episode pages linked from `/podcast.html` and the homepage. |
| Books | `_books/*.md` | `/books/:title.html` | Book of the Week detail pages and archive entries. |
| People | `_people/*.md` | `/people/:title.html` | Author/speaker profiles used across posts, episodes, and events. |
| Courses | `_courses/*.md` | `/courses/:title.html` | Legacy standalone course pages; many newer course pages are posts. |
| Tools | `_tools/*.md` | `/tools/:title.html` | Open-source tool spotlights. |
| Conferences | `_conferences/*.md` | `/conferences/:title.html` | Conference-specific pages. |

| Global data file | Purpose | Used by |
|---|---|---|
| `_data/navigation.yaml` | Top and bottom navigation links | `header.html`, `footer.html` includes |
| `_data/events.yaml` | Event records and metadata | `index.md`, `events.md`, event include |
| `_data/header.yaml` | Optional announcement bar | `header.html` include |
| `_data/sponsors.yaml` | Sponsor names/logos/links | Homepage sponsors section |

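
Since `index.md` and `events.md` sort events into upcoming and past by comparing timestamps, a quick consistency check on the event records can catch entries written in a different format. The sketch below is hypothetical throughout: the `time` and `title` field names, the timestamp format, and the sample records are assumptions for illustration, not the actual schema of `_data/events.yaml`.

```python
from datetime import datetime

# Assumed format for illustration; substitute whatever the data file uses.
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"


def badly_formatted(events, fmt=ASSUMED_FORMAT):
    """Return titles of events whose `time` value does not match `fmt`."""
    bad = []
    for event in events:
        try:
            datetime.strptime(str(event["time"]), fmt)
        except ValueError:
            bad.append(event["title"])
    return bad


# Hypothetical records standing in for parsed YAML entries.
events = [
    {"title": "Example webinar", "time": "2024-03-01 17:00:00"},
    {"title": "Example workshop", "time": "01/03/2024 5pm"},
]
print(badly_formatted(events))  # ['Example workshop']
```
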
- Shared page layouts live in `_layouts` (`home`, `page`, `post`, `podcast`, `book`, `author`)
- Reusable fragments live in `_includes` (`header`, `footer`, `authors`, `event`, subscribe forms, etc.)
- Pages and collection documents combine front matter + markdown/html + Liquid loops/filters

- The global permalink rule in `_config.yml` is `/blog/:title.html` for posts.
- Collections define their own permalinks in `_config.yml` (`/:collection/:title.html`).
- This means each content type can have both:
  - a hub/list page (e.g. `podcast.md` -> `/podcast.html`)
  - item detail pages (e.g. `_podcast/*.md` -> `/podcast/<slug>.html`)

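
The two permalink rules above can be sketched as a small mapping from source path to URL. This is an illustration, not the generator's implementation: the `detail_url` helper and the file names are hypothetical, and it assumes `:title` is the file name without its extension, with a leading `YYYY-MM-DD-` prefix dropped for posts.

```python
from pathlib import PurePosixPath


def detail_url(source_path):
    """Return the detail-page URL for a post or collection document.

    Assumption: the slug is the file stem; posts additionally lose a
    leading YYYY-MM-DD- date prefix if one is present.
    """
    path = PurePosixPath(source_path)
    folder = path.parts[0]
    slug = path.stem
    if folder == "_posts":
        # Posts follow the global rule /blog/:title.html.
        parts = slug.split("-", 3)
        if len(parts) == 4 and all(p.isdigit() for p in parts[:3]):
            slug = parts[3]
        return f"/blog/{slug}.html"
    # Collections follow /:collection/:title.html.
    return f"/{folder.lstrip('_')}/{slug}.html"


print(detail_url("_podcast/example-episode.md"))        # /podcast/example-episode.html
print(detail_url("_posts/2024-01-15-example-post.md"))  # /blog/example-post.html
```
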
- Rustkyll release binary for site builds
- Python environment manager (`uv`) for helper scripts

Install the pinned Rustkyll release binary:

    make install

Run the development server:

    make serve

Open http://localhost:4000.

| Task | Edit this | Notes |
|---|---|---|
| Publish a new article | `_posts` | Include front matter (`title`, `description`, `authors`, `tags`, `layout`, `date`). |
| Publish a new podcast episode | `_podcast` | Make sure `season` and `episode` are set for correct grouping on `/podcast.html`. |
| Add/update event | `_data/events.yaml` | Event type controls styling (`webinar`, `podcast`, `workshop`, `conference`). |
| Add/update person profile | `_people` | Required for author/speaker linking across pages and includes. |
| Add/update a book | `_books` | `start`/`end` dates determine upcoming vs archived display. |
| Update top menu links | `_data/navigation.yaml` | Header links are rendered from `top` entries. |
| Update homepage blocks | `index.md` | Homepage sections are manually structured and data-driven via Liquid. |
| Update announcement bar | `_data/header.yaml` | Shown in header only when announcement data exists. |

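
Before publishing a post, it can help to confirm that the front matter fields listed in the table are all present. The following is a minimal sketch under stated assumptions: `front_matter_keys` and `missing_post_keys` are hypothetical helpers (not part of `scripts/`), the draft is made up, and the line scan is intentionally simpler than a real YAML parser.

```python
REQUIRED_POST_KEYS = ("title", "description", "authors", "tags", "layout", "date")


def front_matter_keys(text):
    """Collect top-level keys from a document's leading `---` block.

    A simple line scan, not a YAML parser: it only looks at unindented
    `key:` lines between the first two `---` delimiters.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break
        is_top_level = line and not line[0].isspace()
        if is_top_level and not line.startswith(("-", "#")) and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_post_keys(text):
    """Return the listed keys absent from the front matter, in table order."""
    present = front_matter_keys(text)
    return [key for key in REQUIRED_POST_KEYS if key not in present]


# Hypothetical draft post with two fields forgotten.
draft = """---
title: "Example post"
layout: post
date: 2024-01-15
authors:
  - janedoe
---
Body text.
"""
print(missing_post_keys(draft))  # ['description', 'tags']
```
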
Install script dependencies:

    uv sync
    cd previews
    npm install
    cd ..

Run helper creator script:

    uv run python scripts/create.py

This script helps create/update content entities such as people, books, and events from templates.

| Script/command | Purpose |
|---|---|
| `uv run python scripts/create.py` | Interactive helper to create people, books, and events using templates. |
| `uv run python scripts/pandoc_full.py ...` | Generate post draft content from a DOCX source. |
| `scripts/generate-book-preview.sh` (called internally) | Creates book preview assets for newly added books. |


    uv run python scripts/pandoc_full.py \
      --input ~/Downloads/template.docx \
      --author angelicaloduca \
      --tags "mlops,devops,process"

- Add/edit article: `_posts`
- Add/edit podcast episode: `_podcast`
- Add/edit person profile: `_people`
- Add/edit book discussion: `_books`
- Add/edit event: `_data/events.yaml`
- Edit top menu/footer links: `_data/navigation.yaml`
- Edit homepage content blocks: `index.md`
- Edit global page structure/header/footer: `_layouts` and `_includes`

- Site URL is configured in `_config.yml` as `https://datatalks.club`
- GitHub-specific files (like `.github`) are excluded from Rustkyll output
- Generated site output is in `_site` during local builds

- The repository includes many pages written in markdown with embedded HTML and Liquid; this is expected and used heavily for SEO and rich formatting.
- Author references across posts/podcast/books depend on `_people` records; missing person entries usually cause broken attribution links.
- Event rendering logic is date-driven (`site.time` comparisons), so event timestamp format consistency in `_data/events.yaml` is important.
- Navigation is fully data-driven from `_data/navigation.yaml`, which keeps menu edits separate from template code.
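
The dependency of author references on `_people` records can be checked mechanically. The sketch below is hypothetical: `missing_people` is not an existing script, the ids and file names are made up, and it assumes a person's id matches the file name of their `_people/*.md` record.

```python
def missing_people(documents, people_ids):
    """Map each document to the author ids that have no person record."""
    known = set(people_ids)
    problems = {}
    for name, authors in documents.items():
        unknown = [author for author in authors if author not in known]
        if unknown:
            problems[name] = unknown
    return problems


# Hypothetical ids; in the repo they would come from _people/*.md file names
# and from the authors lists in post, podcast, and book front matter.
people = ["janedoe", "johnsmith"]
docs = {
    "_posts/example-post.md": ["janedoe"],
    "_podcast/example-episode.md": ["johnsmith", "newguest"],
}
print(missing_people(docs, people))
# {'_podcast/example-episode.md': ['newguest']}
```

An empty result means every referenced author has a matching person record.
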