feat(twitter): expose has_media and media_urls columns #1115

Merged
jackwener merged 1 commit into jackwener:main from Dylanwooo:feat/twitter-media-columns
Apr 21, 2026

@Dylanwooo Dylanwooo commented Apr 21, 2026

Summary

Adds has_media (boolean) and media_urls (string[]) columns to the Twitter read commands — search, timeline, tweets, thread, likes — so downstream pipelines can filter tweets that carry photos / videos / GIFs without launching twitter download per tweet.
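As a rough illustration of the downstream use case, the sketch below filters rows on the new columns. It assumes the command output has already been parsed into an array of row objects; the `id` and `text` fields and the example URLs are placeholders, and only `has_media` and `media_urls` come from this PR.

```javascript
// Hypothetical rows, shaped like parsed output from one of the read commands.
// Only has_media and media_urls are taken from this PR; id and text are placeholders.
const rows = [
  { id: '1', text: 'plain text', has_media: false, media_urls: [] },
  { id: '2', text: 'one photo', has_media: true, media_urls: ['https://example.com/a.jpg'] },
  { id: '3', text: 'one clip', has_media: true, media_urls: ['https://example.com/b.mp4'] },
];

// Keep only tweets that carry at least one photo, video, or GIF.
const withMedia = rows.filter((row) => row.has_media);

// Flatten every media URL into a single list, e.g. to feed a batch downloader.
const allUrls = withMedia.flatMap((row) => row.media_urls);

console.log(withMedia.map((row) => row.id));
console.log(allUrls);
```

Because `media_urls` is always an array (empty when there is no media), consumers can call `flatMap` or `length` on it without a null check.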

Closes #1107.

This is an additive, non-breaking column change. The existing INTERCEPT / COOKIE payloads already carry legacy.extended_entities.media; the adapter-side change is purely in the row-mapping layer, so no new network work is introduced. Pattern mirrors #465 (the time / created_at column).

Approach

  • New shared helper extractMedia(legacy) in clis/twitter/shared.js:
    • Photos: media_url_https
    • Videos / animated GIFs: prefer the video/mp4 variant from video_info.variants, fall back to media_url_https thumbnail if no mp4 is present
    • Fall back to legacy.entities.media when extended_entities is missing
    • Returns { has_media: false, media_urls: [] } for the empty case
  • All five read adapters import the helper and spread its output into each row.
  • New columns are appended to the end of each columns array so existing column-index consumers aren't disrupted.
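The approach above could be sketched roughly as follows. This is not the merged code from clis/twitter/shared.js, only a minimal reading of the bullets. Assumptions: the first `video/mp4` variant is chosen (the PR does not say how multiple mp4 variants are ranked), `has_media` is derived from the collected URLs, and `toRow`, `BASE_COLUMNS`, `rest_id`, and `full_text` are illustrative names for the row-mapping layer.

```javascript
// Sketch of a shared media extractor, following the bullets above.
function extractMedia(legacy) {
  // Prefer extended_entities.media; fall back to entities.media when it is missing.
  const media = legacy?.extended_entities?.media ?? legacy?.entities?.media ?? [];
  const media_urls = [];

  for (const item of media) {
    if (item.type === 'video' || item.type === 'animated_gif') {
      // Assumption: take the first video/mp4 variant; the real helper may rank by bitrate.
      const variants = item.video_info?.variants ?? [];
      const mp4 = variants.find((v) => v.content_type === 'video/mp4');
      // With no mp4 variant, fall back to the thumbnail URL.
      const url = mp4?.url ?? item.media_url_https;
      if (url) media_urls.push(url);
    } else if (item.media_url_https) {
      // Photos (and any other media type) use media_url_https directly.
      media_urls.push(item.media_url_https);
    }
  }

  // Assumption: has_media reflects whether any URL was collected.
  return { has_media: media_urls.length > 0, media_urls };
}

// Hypothetical existing columns; the new ones are appended at the end
// so positions of the earlier columns do not shift.
const BASE_COLUMNS = ['id', 'text'];
const COLUMNS = [...BASE_COLUMNS, 'has_media', 'media_urls'];

// Hypothetical row mapper: spreads the helper output into each row.
function toRow(tweet) {
  const legacy = tweet.legacy ?? {};
  return { id: tweet.rest_id, text: legacy.full_text, ...extractMedia(legacy) };
}

const example = toRow({
  rest_id: '42',
  legacy: {
    full_text: 'clip',
    extended_entities: {
      media: [{
        type: 'video',
        media_url_https: 'https://example.com/thumb.jpg',
        video_info: {
          variants: [
            { content_type: 'application/x-mpegURL', url: 'https://example.com/v.m3u8' },
            { content_type: 'video/mp4', url: 'https://example.com/v.mp4' },
          ],
        },
      }],
    },
  },
});
console.log(COLUMNS, example);
```

Keeping the extraction in one function means each adapter only adds a spread to its row mapper, which is what lets all five commands stay consistent.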

Tests

  • New clis/twitter/shared.test.js covers the helper directly: empty case, photos, video mp4-variant selection, animated_gif, entities.media fallback, and the no-variants fallback.
  • Updated search.test.js result assertions and tweets.test.js columns assertion to include the new fields.
npm test          → 225 files, 1769 passed
npm run typecheck → clean

Adds two additive columns to the Twitter read commands (search, timeline,
tweets, thread, likes):

- has_media: boolean — true if the tweet contains any photo, video, or GIF
- media_urls: string[] — photo URLs and mp4 variant URLs for videos/GIFs,
  extracted from legacy.extended_entities.media (falls back to entities.media)

The INTERCEPT/COOKIE payloads already carry this data; this change only
extends the row-mapping layer, so no new network work is needed. Pattern
mirrors jackwener#465 (time column).

Shared extraction helper lives in clis/twitter/shared.js so all five
adapters stay consistent, with unit coverage for photo, video (mp4 variant
selection), animated_gif, entities.media fallback, and the empty case.

Closes jackwener#1107
@jackwener jackwener merged commit 666a955 into jackwener:main Apr 21, 2026
11 checks passed
@Dylanwooo Dylanwooo deleted the feat/twitter-media-columns branch April 21, 2026 15:04

Development

Successfully merging this pull request may close these issues.

[Feature]: expose has_media and media_urls in twitter search/timeline/thread/likes output
