In [None]:
import pandas as _hex_pandas
import datetime as _hex_datetime
import json as _hex_json

In [None]:
hex_scheduled = _hex_json.loads("false")

In [None]:
hex_user_email = _hex_json.loads("\"example-user@example.com\"")

In [None]:
hex_run_context = _hex_json.loads("\"logic\"")

In [None]:
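import datetime
import pytz

# Current time as a timezone-aware datetime in Berlin. Passing the zone to
# now() avoids labelling the server's local clock as Berlin time.
berlin = pytz.timezone("Europe/Berlin")
d_aware = datetime.datetime.now(berlin)

# Scheduled arrival in Austin (US Central time).
date_arrival = datetime.datetime(2022, 3, 22, 15, 21, 0)
central = pytz.timezone("US/Central")
da_aware = central.localize(date_arrival)

# Time elapsed since the scheduled arrival, with microseconds dropped.
diff = d_aware - da_aware
diff_trunc = str(diff).split(".")[0]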

I should have landed in Austin {{diff_trunc}} hours ago. But the universe had bigger plans, so I'm stuck at home trying to ignore the FOMO, while 99.9% of my Twitter timeline (and probably yours, if you're reading this!) is out and about at [**Data Council**](https://www.datacouncil.ai/).

To try and live vicariously through everyone there, I sat down and built a streaming demo using ✨[**Redpanda**](https://redpanda.com/), [**Materialize**](https://materialize.com/) and [**Hex**](https://hex.tech/)✨ to keep up with the buzz around the conference as folks post about it on Twitter. This means that, if there is no buzz and no one tweets, the demo will be an absolute failure (@pedram_navid, I'm counting on you).

If you want a sneak peek into what's running behind the scenes, check out [this GitHub repo](https://github.com/morsapaes/hex-data-council).

<hr>

#### Are you also not there?

If you're also **not** at the conference, you can at least make this demo do something cool! Enter your **Twitter handle** below (without the `@`) and **click** the red button.

In [None]:
twitter_handle = _hex_json.loads("\"@random\"")

In [None]:
not_there = _hex_json.loads("false")

In [None]:
twitter_handle_clean=twitter_handle.replace('@','')

In [None]:
# import jinja2
# raw_query = """
#     {% if not_there %}
#     
#     INSERT INTO users_not_there
#     VALUES ({{twitter_handle_clean}});
#     
#     {% endif %}
# """
# sql_query = jinja2.Template(raw_query).render(vars())

Nothing to see yet, but we'll make good use of this in a bit!

<hr>

#### Staying in the (k)now

Okay, so we have data streaming in from Twitter in real time: what now? Let's start by keeping an updated list of tweets being posted about the conference as they slip off fingers far, far away.

**What is picked up?**

This is not an exact science, so for your tweets to show up you'll need to either mention `@DataCouncilAI` or include the words `data council`. Note: only _real_ tweets, quoted retweets and replies are picked up!
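
As a rough illustration of that rule, here is a minimal Python sketch. The real filtering happens upstream of this notebook, so the function name, the tweet type labels and the case-insensitive matching are all assumptions made for the example.

```python
# Hypothetical labels for the tweet types that are kept; the labels used by
# the actual pipeline are not shown in this notebook.
PICKED_UP_TYPES = {"tweet", "quoted retweet", "reply"}


def is_picked_up(text: str, tweet_type: str) -> bool:
    """Return True if a tweet would plausibly match the rule described above."""
    if tweet_type not in PICKED_UP_TYPES:
        return False  # e.g. plain retweets are skipped
    lowered = text.lower()  # assumption: matching ignores case
    return "@datacouncilai" in lowered or "data council" in lowered


print(is_picked_up("Wish I was at Data Council!", "tweet"))
print(is_picked_up("Data Council looks fun", "retweet"))
```

The first call matches on the words `data council`; the second is rejected because of its type, even though the text matches.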

Just **try tweeting** something! It should appear 👇 in a heartbeat (well, you'll need to re-run the notebook <sup>1</sup>):

In [None]:
# import jinja2
# raw_query = """
#     SELECT tweet AS "Tweet",
#            tweet_type AS "Tweet type",
#            username AS "Username",
#            created_at AS "Created at"
#     FROM twitter_tweets_enriched
#     WHERE username NOT IN (SELECT username FROM users_not_there)
#     --Hashtag classic data processing :sweatsmile:
#     AND username <> 'Artha__Global'
#     ORDER BY created_at DESC;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

> <sup>1</sup> _Re-running a notebook doesn't incur any significant "cost" in Materialize; we're just reading data out of the self-updated materialized views. And, although the results are updated with sub-second latency in the database, it is currently not possible to schedule runs anywhere near that frequency. We're in touch with the Hex team to change that!_ ✨

Remember the red button up there, and how we're not at Data Council? Marking yourself as "not there" lets us dynamically route your tweets into a commiseration feed of sorts (thanks for the [inspiration](https://twitter.com/jillzzy/status/1506021651288272899?s=20&t=x-edWkssrAP2u71XWR4HkA), @jillzzy!).
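
In the demo, the routing is done in SQL with `IN` / `NOT IN` against `users_not_there`. The same idea in plain Python looks roughly like the sketch below, where `split_feeds` and the sample rows are hypothetical and only the `username` key is borrowed from the queries.

```python
def split_feeds(tweets, users_not_there):
    """Split tweets into (main feed, commiseration feed) by username."""
    not_there = set(users_not_there)  # set membership checks are O(1)
    main_feed, commiseration_feed = [], []
    for tweet in tweets:
        if tweet["username"] in not_there:
            commiseration_feed.append(tweet)
        else:
            main_feed.append(tweet)
    return main_feed, commiseration_feed


# Made-up rows, shaped loosely like the output of the queries in this notebook.
sample = [
    {"username": "at_the_conf", "tweet": "Great keynote at Data Council!"},
    {"username": "stuck_home", "tweet": "Watching Data Council from my couch"},
]
main_feed, commiseration_feed = split_feeds(sample, ["stuck_home"])
```

Every tweet lands in exactly one of the two lists, which mirrors how the two queries partition `twitter_tweets_enriched`.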

In [None]:
# import jinja2
# raw_query = """
#     SELECT * 
#     FROM twitter_tweets_enriched
#     WHERE username IN (SELECT username FROM users_not_there)
#     ORDER BY created_at DESC;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

#### Keeping score

How do we know who's ahead of the Twitter game at the conference? Or how much Twitter activity there is in the first place? 

Materialize is pretty good at keeping track of events over time and maintaining query results **incrementally updated** in (you've guessed it) [materialized views](https://materialize.com/why-use-a-materialized-view/). This means that it can handle heavy-duty computations, like running aggregations, with minimal effort as new events stream in, ditching the need for scheduled refreshes or full rescans of the source data each time.
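
To make the difference between incremental updates and full rescans concrete, here is a toy Python sketch. It only illustrates the idea and is not how Materialize is implemented; the class, the function and the usernames are invented for the example.

```python
from collections import Counter


class RunningTweetCounts:
    """Toy aggregate that is updated once per incoming event."""

    def __init__(self):
        self.per_user = Counter()
        self.total_tweets = 0

    def apply(self, username):
        # Constant work per event, no matter how much history exists.
        self.per_user[username] += 1
        self.total_tweets += 1


def recompute_from_scratch(usernames):
    """The alternative: rescan every event each time the answer is needed."""
    per_user = Counter(usernames)
    return per_user, sum(per_user.values())


events = ["ana", "bo", "ana"]  # made-up usernames, one per tweet
running = RunningTweetCounts()
for username in events:
    running.apply(username)

# Both approaches agree; only the amount of work per new event differs.
assert (running.per_user, running.total_tweets) == recompute_from_scratch(events)
```

The incremental version touches one counter per new tweet, while the rescan grows more expensive as the event history grows.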

Let's get an idea of the bigger picture as time moves forward:

In [None]:
# import jinja2
# raw_query = """
#     SELECT SUM(total_tweets) AS cnt_tweets
#     FROM agg_tweets;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

In [None]:
# import jinja2
# raw_query = """
#     SELECT COUNT(username) AS cnt_users
#     FROM agg_tweets;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

In [None]:
cnt_tweets=total_tweets.iat[0,0]

In [None]:
cnt_users=total_users.iat[0,0]

In [None]:
# import jinja2
# raw_query = """
#     SELECT COUNT(DISTINCT username) 
#     FROM users_not_there;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

In [None]:
cnt_users_nt=total_users_nt.iat[0,0]

In [None]:
cnt_tweets

In [None]:
cnt_users

In [None]:
cnt_users_nt

In [None]:
# import jinja2
# raw_query = """
#     SELECT username, 
#            total_tweets 
#     FROM agg_tweets 
#     ORDER BY total_tweets DESC;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

In [None]:
# import jinja2
# raw_query = """
#     SELECT time_bucket AS "Time bucket",
#            total_tweets AS "Total tweets" 
#     FROM tweets_hourly;
# """
# sql_query = jinja2.Template(raw_query).render(vars())

In [None]:
import altair
chart_query_result_6 = altair.Chart.from_json("""
{
    "width": 500,
    "height": 500,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "layer": [
        {
            "data": {
                "name": "layer00"
            },
            "mark": {
                "type": "bar",
                "clip": true,
                "tooltip": true
            },
            "encoding": {
                "x": {
                    "field": "Time bucket",
                    "type": "temporal"
                },
                "y": {
                    "field": "Total tweets",
                    "type": "quantitative"
                },
                "color": {
                    "value": "#C84654"
                }
            }
        }
    ],
    "resolve": {
        "scale": {}
    },
    "datasets": {
        "layer00": [
            {
                "name": "dummy",
                "value": 0
            }
        ]
    },
    "config": {
        "legend": {
            "disable": false
        }
    }
}
""")
chart_query_result_6.datasets.layer00 = query_result_6.to_json(orient='records')
chart_query_result_6.display(actions=False)

#### Missing out

To finish off this self-flagellation experiment, let's turn to what happens _beyond_ the conference program itself. What are folks planning to do after @JLDLaughlin's talk on [dbt+Materialize](https://www.datacouncil.ai/talks/materializedbt-streaming-for-the-modern-data-stack?hsLang=en)? What vendors are using BBQ as bait?

It's admittedly naive to do text processing in SQL, but we can roll with a [regex](https://www.tiktok.com/@nimay.ndolo/video/7042702152774552838) pattern `beer|drink|food|bbq|bats` to capture some of the gatherings we're missing out on. Same as before, whenever a new tweet comes around and it matches the given pattern, it'll end up 👇:

In [None]:
# import jinja2
# raw_query = """
#     SELECT *
#     FROM twitter_tweets_enriched
#     WHERE regexp_match(tweet,'beer|drink|food|bbq|bats','i') IS NOT NULL
#     ORDER BY created_at DESC;
# """
# sql_query = jinja2.Template(raw_query).render(vars())
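
If you want to poke at the pattern outside the database, the same check can be sketched with Python's `re` module. The helper name and the sample tweets are made up, and `re.IGNORECASE` stands in for the `'i'` flag used in the query.

```python
import re

# Same alternation as in the SQL query, compiled case-insensitively.
PATTERN = re.compile(r"beer|drink|food|bbq|bats", re.IGNORECASE)


def mentions_gathering(tweet: str) -> bool:
    """Return True if any of the keywords appears anywhere in the tweet."""
    return PATTERN.search(tweet) is not None


print(mentions_gathering("Free BBQ at our booth tonight"))
print(mentions_gathering("Great talk on streaming SQL"))
print(mentions_gathering("This tool combats data drift"))
```

The last example is the "naive" part in action: the pattern has no word boundaries, so `bats` also matches inside `combats`. In Python, wrapping the alternation as `\b(?:...)\b` would tighten that up.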

### That's it!

Though I might not be on a cruise to see a cloud of bats fly out from under a bridge while sipping on a Jester King, or get the chance to see Benn Stancil fight someone IRL, learning Hex was a lot of **FUN**.

**And we'll always have Data Council.**

<hr>

[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/morsapaes.svg?style=social&label=Follow%20%40morsapaes)](https://twitter.com/morsapaes)