feat: mentions handling on tinybird #3614
Changes from all commits
b0ceb09
3a4f0a1
42d3c34
2586547
ed79abe
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| DESCRIPTION > | ||
| - `mentions` contains community mentions from various sources tracked via Octolens integration. | ||
| - Raw datasource only exists in Tinybird - pushed directly from Octolens webhook processing. | ||
| - Tracks mentions across platforms like Reddit, HackerNews, Twitter, and other community sources. | ||
| - Includes sentiment analysis and relevance scoring for each mention. | ||
| - `sourceId` is the unique identifier from the source platform. | ||
| - `url` is the direct link to the mention on the source platform. | ||
| - `timestamp` is when the mention occurred on the source platform. | ||
| - `source` indicates the source platform (reddit, hackernews, twitter, etc.) using LowCardinality. | ||
| - `author` is the username/display name of the person who created the mention. | ||
| - `authorProfileLink` is the URL to the author's profile on the source platform. | ||
| - `title` contains the mention's title or subject line. | ||
| - `body` contains the full text content of the mention. | ||
| - `imageUrl` contains the URL to any associated image (empty string if not available). | ||
| - `relevanceScore` is the computed relevance score from Octolens (string representation). | ||
| - `relevanceComment` contains the explanation for the relevance score. | ||
| - `keyword` is the keyword that triggered this mention match. | ||
| - `sentimentLabel` provides the sentiment classification (positive, negative, neutral, mixed). | ||
| - `subreddit` contains the subreddit name for Reddit mentions (empty string for other sources). | ||
| - `viewId` is the Octolens view identifier that captured this mention. | ||
| - `viewName` is the human-readable name of the Octolens view. | ||
| - `projectSlug` identifies which project this mention belongs to. | ||
| - `createdAt` is the timestamp when the record was created in Tinybird. | ||
|
|
||
| TAGS "Octolens integration", "Community", "Sentiment analysis" | ||
|
|
||
| SCHEMA > | ||
| `sourceId` String `json:$.sourceId` DEFAULT '', | ||
| `url` String `json:$.url` DEFAULT '', | ||
| `timestamp` DateTime `json:$.timestamp`, | ||
| `source` LowCardinality(String) `json:$.source` DEFAULT '', | ||
| `author` String `json:$.author` DEFAULT '', | ||
| `authorProfileLink` String `json:$.authorProfileLink` DEFAULT '', | ||
| `title` String `json:$.title` DEFAULT '', | ||
| `body` String `json:$.body` DEFAULT '', | ||
| `imageUrl` String `json:$.imageUrl` DEFAULT '', | ||
| `relevanceScore` String `json:$.relevanceScore` DEFAULT '', | ||
| `relevanceComment` String `json:$.relevanceComment` DEFAULT '', | ||
| `keyword` String `json:$.keyword` DEFAULT '', | ||
| `sentimentLabel` LowCardinality(String) `json:$.sentimentLabel` DEFAULT '', | ||
| `subreddit` String `json:$.subreddit` DEFAULT '', | ||
| `viewId` Int64 `json:$.viewId` DEFAULT 0, | ||
| `viewName` String `json:$.viewName` DEFAULT '', | ||
| `language` String `json:$.language` DEFAULT '', | ||
| `projectSlug` LowCardinality(String) `json:$.projectSlug` DEFAULT '', | ||
| `createdAt` DateTime64(3) `json:$.createdAt` DEFAULT now64(3), | ||
| `bookmarked` UInt8 `json:$.bookmarked`, | ||
| `keywords` Array(String) `json:$.keywords[:]` | ||
|
Bug: Missing DEFAULT value for keywords array
||
|
|
||
| ENGINE ReplacingMergeTree | ||
| ENGINE_PARTITION_KEY toYear(timestamp) | ||
| ENGINE_SORTING_KEY projectSlug, timestamp, sourceId | ||
| ENGINE_VER createdAt | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,17 @@ | ||
| NODE health_score_select_fields | ||
| SQL > | ||
| SELECT | ||
| id, | ||
| segmentId, | ||
| slug, | ||
| if(isNaN(overallScore), null, overallScore) as overallScore, | ||
| toStartOfDay(now()) as date | ||
| FROM health_score_copy_ds | ||
|
|
||
| SELECT id, segmentId, slug, if (isNaN(overallScore), null, overallScore) as overallScore, toStartOfDay(now()) as date FROM health_score_copy_ds | ||
|
|
||
| TYPE sink | ||
| TYPE SINK | ||
| EXPORT_SERVICE kafka | ||
| EXPORT_CONNECTION_NAME lfx-oracle-kafka-streaming | ||
| EXPORT_KAFKA_TOPIC health_score_sink | ||
| EXPORT_SCHEDULE 30 0 * * * | ||
|
|
||
|
|
||
| EXPORT_FORMAT csv | ||
| EXPORT_STRATEGY @new | ||
| EXPORT_KAFKA_TOPIC health_score_sink |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,13 +1,12 @@ | ||
| NODE insights_projects_select_fields | ||
| SQL > | ||
|
|
||
| SELECT id, collectionsSlugs, name, slug, segmentId, softwareValue, toStartOfDay(now()) as date | ||
| FROM insights_projects_populated_ds | ||
|
|
||
| TYPE sink | ||
| TYPE SINK | ||
| EXPORT_SERVICE kafka | ||
| EXPORT_CONNECTION_NAME lfx-oracle-kafka-streaming | ||
| EXPORT_KAFKA_TOPIC insights_projects_populated_sink | ||
| EXPORT_SCHEDULE 30 0 * * * | ||
|
|
||
|
|
||
| EXPORT_FORMAT csv | ||
| EXPORT_STRATEGY @new | ||
| EXPORT_KAFKA_TOPIC insights_projects_populated_sink |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| NODE mentions_list_results | ||
| SQL > | ||
| % | ||
| SELECT * | ||
| FROM mentions FINAL | ||
| WHERE | ||
| 1 = 1 | ||
| {% if defined(projectSlug) %} | ||
| AND projectSlug | ||
| = {{ String(projectSlug, description="Filter by project slug", required=False) }} | ||
| {% end %} | ||
| {% if defined(platforms) %} | ||
| AND source | ||
| IN {{ Array(platforms, 'String', description="Filter by platforms", required=False) }} | ||
| {% end %} | ||
| {% if defined(keywords) %} | ||
| AND keyword | ||
| IN {{ Array(keywords, 'String', description="Filter by keywords", required=False) }} | ||
| {% end %} | ||
| {% if defined(sentiments) %} | ||
| AND sentimentLabel | ||
| IN {{ Array(sentiments, 'String', description="Filter by sentiments", required=False) }} | ||
| {% end %} | ||
| {% if defined(languages) %} | ||
| AND language | ||
| IN {{ Array(languages, 'String', description="Filter by languages", required=False) }} | ||
| {% end %} | ||
| ORDER BY timestamp DESC | ||
| LIMIT {{ Int32(pageSize, 20) }} | ||
| OFFSET {{ Int32(page, 0) * Int32(pageSize, 20) }} |
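
To illustrate how the template parameters in `mentions_list_results` resolve, here is a rough sketch of the SQL the node would produce for one request. The parameter values (`example-project`, `reddit`, `hackernews`, `page=2`, `pageSize=20`) are hypothetical and not taken from the PR, and the exact rendered text may differ from what Tinybird emits. Filters whose parameters are not supplied are skipped by the `{% if defined(...) %}` guards.

```sql
-- Hypothetical request parameters (assumptions, not from the PR):
--   projectSlug=example-project, platforms=reddit,hackernews, page=2, pageSize=20
-- Approximate query after template rendering:
SELECT *
FROM mentions FINAL
WHERE
    1 = 1
    AND projectSlug = 'example-project'
    AND source IN ['reddit', 'hackernews']
ORDER BY timestamp DESC
LIMIT 20
OFFSET 40  -- page * pageSize = 2 * 20
```

Because the offset is computed as `page * pageSize`, `page` is zero-based: `page=0` returns the first `pageSize` rows.
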
Bug: Missing DEFAULT value for bookmarked field
The `bookmarked` field is defined as `UInt8` without a `DEFAULT` value. All other `UInt8` fields in the codebase consistently have `DEFAULT 0` or `DEFAULT 1`. Without a default, the field becomes required in incoming JSON, which will cause insertion failures when the field is missing from webhook payloads.
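
A minimal sketch of what the two review comments appear to ask for, written as the last two columns of the `mentions` schema. The specific defaults are assumptions: `DEFAULT 0` for `bookmarked` follows the comment's note about other `UInt8` fields, and `DEFAULT []` for `keywords` is a guess at an empty-array default, since the PR does not state the intended value.

```
    `bookmarked` UInt8 `json:$.bookmarked` DEFAULT 0,
    `keywords` Array(String) `json:$.keywords[:]` DEFAULT []
```

With defaults in place, a webhook payload that omits either field should still be ingested instead of being rejected.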