Description
We want to build an automated system that extracts frequently asked questions from our Slack workspace and syncs them to the website's FAQ section. The goal is to run this process daily, pull new questions and answers, and automatically update the existing FAQ files or create new entries.
We already have an early prototype of an agent that fetches Slack data, but it’s not fully integrated and isn't running in an automated or reliable way. This hackathon task is to improve, extend, or rewrite that agent so that it becomes a production-ready component of our documentation workflow.
What Needs to Be Built
The system should connect to Slack, scan selected channels (for example #course-ml-zoomcamp, #data-engineering, #llm-zoomcamp), identify messages that represent user questions and authoritative answers, and export them into structured files.
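One possible way to identify question and answer pairs is sketched below. It assumes each thread has already been fetched as a list of Slack message dictionaries (root message first) with the standard `text`, `user`, `ts`, and `reactions` fields. The `FaqCandidate` type, the `trusted_users` set, and the heuristic itself (a root message containing a question mark, answered by a trusted user or by the most-reacted reply) are illustrative assumptions, not the existing agent's logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaqCandidate:
    """Hypothetical container for one extracted question/answer pair."""
    question: str
    answer: str
    thread_ts: str


def reaction_count(message: dict) -> int:
    # Slack attaches reactions as a list of {"name": ..., "count": ...} dicts.
    return sum(r.get("count", 0) for r in message.get("reactions", []))


def extract_candidate(thread: list[dict], trusted_users: set[str]) -> Optional[FaqCandidate]:
    """Return a candidate if the thread looks like an answered question."""
    if len(thread) < 2:
        return None  # no replies, so nothing can count as an answer
    root, replies = thread[0], thread[1:]
    if "?" not in root.get("text", ""):
        return None  # crude question check (assumption)
    # Prefer replies from trusted users (e.g. instructors), if there are any.
    trusted = [m for m in replies if m.get("user") in trusted_users]
    pool = trusted or replies
    best = max(pool, key=reaction_count)
    # Without a trusted author, require at least one reaction as a weak signal.
    if not trusted and reaction_count(best) == 0:
        return None
    return FaqCandidate(root["text"].strip(), best["text"].strip(), root["ts"])
```

A classifier or LLM could replace the question-mark check later without changing the surrounding pipeline.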
The exported content should match the format required by our website FAQ engine. Ideally, the system should distinguish between new items and updates to existing ones.
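A minimal sketch of how new items could be told apart from updates, assuming the existing FAQ is available as a mapping from a stable question key to the stored answer. The `question_key` and `classify` helpers and the 12-character hash prefix are hypothetical choices for illustration.

```python
import hashlib
import re


def question_key(question: str) -> str:
    """Derive a stable key from a question, ignoring case and punctuation."""
    normalized = re.sub(r"[^a-z0-9]+", " ", question.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def classify(existing: dict[str, str], question: str, answer: str) -> str:
    """Return "new", "update", or "unchanged" for an incoming FAQ item."""
    key = question_key(question)
    if key not in existing:
        return "new"
    if existing[key].strip() != answer.strip():
        return "update"
    return "unchanged"
```

Exact-key matching only catches rewordings that differ in case or punctuation; paraphrased questions would need the semantic matching mentioned under Participation.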
The implementation can reuse the existing agent logic, build new functionality on top of it, or replace it entirely if the team finds a better approach.
Expected Functionality
The solution should run daily, ideally via GitHub Actions or another CI system. It should pull new content from Slack, clean and normalize the text, and produce a commit or a pull request with the updated FAQ files.
The system must handle basic formatting, avoid duplicates, and store questions in a consistent, searchable form.
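The cleaning and deduplication steps might look roughly like this. The sketch converts Slack's angle-bracket markup (links, channel references, user mentions) and HTML entities into plain Markdown-friendly text, then drops items whose questions match after normalization. The function names and the `@user` placeholder for mentions are assumptions; resolving mentions to display names would need an extra Slack lookup.

```python
import html
import re


def clean_slack_text(text: str) -> str:
    """Convert common Slack markup into plain, Markdown-friendly text."""
    # <https://url|label> becomes [label](https://url)
    text = re.sub(r"<(https?://[^|>]+)\|([^>]+)>", r"[\2](\1)", text)
    # <https://url> becomes https://url
    text = re.sub(r"<(https?://[^>]+)>", r"\1", text)
    # <#C123|channel-name> becomes #channel-name
    text = re.sub(r"<#[A-Z0-9]+\|([^>]+)>", r"#\1", text)
    # <@U123> becomes a generic placeholder (assumption: names are not resolved)
    text = re.sub(r"<@[A-Z0-9]+>", "@user", text)
    # Slack escapes &, < and > as HTML entities
    text = html.unescape(text)
    return re.sub(r"[ \t]+", " ", text).strip()


def dedupe(items: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first (question, answer) pair for each normalized question."""
    seen: set[str] = set()
    unique = []
    for question, answer in items:
        key = re.sub(r"\W+", " ", question.lower()).strip()
        if key not in seen:
            seen.add(key)
            unique.append((question, answer))
    return unique
```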
Integration With the Website
The website already supports FAQ content through Markdown or HTML blocks. The task includes making sure the exported data is compatible with our existing FAQ structure and that the integration is smooth.
If needed, the team can extend or adjust the FAQ format to make the pipeline easier or more robust.
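As an illustration only, an exported entry could be rendered as a Markdown file with a small front-matter header. The field names (`question`, `source`) and the overall layout are assumptions and would need to be aligned with the website's actual FAQ structure.

```python
import json


def render_faq_entry(question: str, answer: str, source_channel: str) -> str:
    """Render one FAQ item as Markdown with a front-matter header (hypothetical format)."""
    # json.dumps produces a double-quoted, escaped string, which YAML parsers
    # generally accept as a scalar value.
    return (
        "---\n"
        f"question: {json.dumps(question)}\n"
        f"source: {json.dumps(source_channel)}\n"
        "---\n\n"
        f"{answer.strip()}\n"
    )
```

Writing one file per entry keeps the daily diff small and makes each pull request easy to review.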
Participation
Teams may choose to:
- improve the existing agent,
- completely rewrite it using a more reliable approach,
- or build additional intelligence (classification, semantic matching, deduplication, etc.).
All improvements are welcome as long as the final result is automated, reliable, and easy to maintain.
Outcome
A working daily pipeline that automatically extracts FAQ items from Slack and updates our website with fresh, high-quality entries.
This solution will significantly reduce manual effort, improve learner experience, and ensure that frequently asked questions across all Zoomcamps remain accurate and up to date.