Welcome to the first virtual Hack Day hosted by Materialize and our good friends at Redpanda and dbt Labs! The goal of this event is to encourage knowledge-sharing between our communities (we've already learned so much just putting it together!), and give you a taste of what building streaming analytics pipelines with this stack looks like.
Maybe you've never used dbt. Maybe you're new to streaming. Maybe you're even new to dbt and streaming. But guess what: it doesn't really matter!
We'll kick things off with a quick intro to each of the projects and go over the details of the event to make sure you're all set! From there, you can choose your own Hack Day adventure. We are also giving you somewhere to start!
👾 Build
Throughout the day, folks from all three projects will be available to bounce ideas off, support you with your project or just...chat. To get in touch with us, join the official Slack channel or reach out in Troubleshooting!
🤲 Share
At the end of the event, we encourage you to share your projects, experiments and learnings in Show and tell! This can be a link to a GitHub repo with your project, a blog post, or just a plain text recap of your Hack Day...whatever feels right.
💥 Get out there!
As a "Thank you!" for joining us and getting your hack on, we'd love to send you some swag! We might also reach out about promoting and showcasing your work more widely in the data community.
Our goal is to make sure that everyone can get up and running in a reasonable amount of time, and find something fun to work on regardless of their level of expertise with each tool. For this reason, you can find a sample project in the repo with enough plumbing to spin up an end-to-end setup that you can play around with, extend or completely modify.
To get started, fork this repo, clone it and navigate to the sample_project directory:
git clone https://github.com/<github-username>/mz-hack-day-2022.git
cd mz-hack-day-2022/sample_project
There's a lot more you can do as you ramp up! In case you need some ideas, here are a few seed challenges:
Tool | Challenge |
---|---|
Materialize | Replace the JSON file with a Postgres database that pushes changes to the aircraft reference data into Materialize, either through Redpanda+Debezium or directly. |
Materialize | Push data from a materialized view to a web app using TAIL. You can use our Node.js and Materialize guide as a reference! |
Redpanda | Create a producer for a new data source or adapt the existing one to use pandaproxy instead. |
Redpanda | Give WASM transforms (beta) a try for data pre-processing (e.g. cleaning, masking). |
dbt | Add a sink model that outputs the results of the fct_flight materialized view back to Redpanda. |
dbt | Incorporate macros from the materialize-dbt-utils package into your models. |
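If you pick the pandaproxy challenge, here is a minimal Python sketch of what producing over HTTP could look like. It assumes pandaproxy is reachable on `localhost:8082` and accepts the Kafka REST-style JSON content type; the topic name `hack_day_events` and the helper names are hypothetical, so adjust all of them to match your setup.

```python
import json
import urllib.request

# Assumptions (not taken from the sample project): the proxy address and the
# topic name are placeholders for illustration only.
PROXY_URL = "http://localhost:8082"
TOPIC = "hack_day_events"


def build_produce_request(records, topic=TOPIC, proxy_url=PROXY_URL):
    """Build an HTTP POST that wraps each record as {"value": record}."""
    body = json.dumps({"records": [{"value": r} for r in records]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{proxy_url}/topics/{topic}",
        data=body,
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


def produce(records, **kwargs):
    """Send the records and return the proxy's decoded JSON response."""
    with urllib.request.urlopen(build_produce_request(records, **kwargs)) as resp:
        return json.loads(resp.read())
```

Calling `produce([{"icao24": "abc123"}])` would send a single record; building the request separately from sending it makes the payload easy to inspect before anything goes over the wire.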
Source | Requires authentication? | Rate limited? | Link |
---|---|---|---|
Citi Bike NYC | | | Citi Bike GBFS real-time feed |
Network Rail UK | ☑️ | | RTPPM |
Twitter | ☑️ | ☑️ | Twitter API v2 |
Twitch | ☑️ | ☑️ | Twitch API |
If you know about other cool data sources you'd like to add to the list, feel free to open an issue or a pull request with suggestions!
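To give a feel for adapting one of these feeds, here is a rough Python sketch that flattens a GBFS `station_status` document into one JSON message per station, ready to hand to a producer. It assumes the standard GBFS layout (`last_updated` at the top level, stations under `data.stations`); the function name and the choice of fields are just an illustration, not part of the sample project.

```python
import json


def flatten_station_status(feed):
    """Turn one GBFS station_status document into one JSON string per station."""
    last_updated = feed["last_updated"]
    messages = []
    for station in feed["data"]["stations"]:
        record = {
            "station_id": station["station_id"],
            # .get() keeps the sketch tolerant of feeds that omit a field.
            "num_bikes_available": station.get("num_bikes_available"),
            "num_docks_available": station.get("num_docks_available"),
            "last_updated": last_updated,
        }
        messages.append(json.dumps(record, sort_keys=True))
    return messages


# Hand-written sample payload (made up for illustration, not real feed data).
sample_feed = {
    "last_updated": 1640995200,
    "ttl": 5,
    "data": {
        "stations": [
            {"station_id": "72", "num_bikes_available": 3, "num_docks_available": 10},
            {"station_id": "79", "num_bikes_available": 0},
        ]
    },
}

messages = flatten_station_status(sample_feed)
```

Emitting one message per station, rather than the whole document, tends to make the data easier to key and aggregate downstream.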