This repository was archived by the owner on Apr 25, 2026. It is now read-only.
Summary
The CLI snapshot does not expose media playback state, so the caller cannot tell whether a media element is currently playing or paused.
Why this looks real
Current snapshot collection only records generic DOM structure plus a limited set of attributes.
Relevant code:
extension/src/content/content-script.ts:797-818 collects a snapshot from the page DOM
extension/src/content/content-script.ts:822-859 walks regular element children and open shadow roots
extension/src/content/content-script.ts:987-1027 extracts generic attributes such as href, type, role, tabindex, aria-*, onclick, value, etc.
There is no media-specific state extraction for things like:
HTMLMediaElement.paused
HTMLMediaElement.ended
HTMLMediaElement.currentTime
HTMLMediaElement.duration
a normalized play/pause state exposed to the Rust side
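To make the missing fields concrete, here is a minimal sketch of what a media-state extractor could look like. The names `MediaLike`, `MediaState`, `PlaybackState`, and `extractMediaState` are hypothetical and do not exist in the repository. The sketch reads only the four `HTMLMediaElement` properties listed above through a structural interface, so it can be exercised without a DOM.

```typescript
// Hypothetical names throughout; nothing here is part of the existing protocol.
type PlaybackState = "playing" | "paused" | "ended";

// Structural subset of HTMLMediaElement, limited to the properties listed above.
interface MediaLike {
  paused: boolean;
  ended: boolean;
  currentTime: number;
  duration: number; // NaN before metadata loads, Infinity for unbounded streams
}

// Assumed wire shape for the normalized state sent to the Rust side.
interface MediaState {
  playback: PlaybackState;
  currentTime: number;
  duration: number | null; // null when the duration is unknown or unbounded
}

function extractMediaState(el: MediaLike): MediaState {
  // Check `ended` first: an element that has finished usually also reports paused.
  // "playing" here only means "not paused and not ended"; it may still be buffering.
  const playback: PlaybackState = el.ended ? "ended" : el.paused ? "paused" : "playing";
  return {
    playback,
    currentTime: el.currentTime,
    // NaN and Infinity do not survive JSON serialization, so map them to null.
    duration: Number.isFinite(el.duration) ? el.duration : null,
  };
}
```

In a content script, a real `HTMLMediaElement` satisfies `MediaLike` structurally, so an `instanceof HTMLMediaElement` check followed by `extractMediaState(el)` would be one way to wire this in.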
Actual
A page may contain a visible audio/video player, but the snapshot does not tell the CLI whether it is currently playing or paused.
As a result, higher-level automation cannot reliably decide whether it should:
press play
press pause
leave the player unchanged
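With a normalized state available, the three-way decision above reduces to a small pure function. This is only a sketch: `PlaybackState`, `MediaAction`, and `decideAction` are hypothetical names, and treating "ended" as "not playing" is an assumption rather than anything the current code defines.

```typescript
// Hypothetical types; the snapshot does not currently provide a playback state.
type PlaybackState = "playing" | "paused" | "ended";
type MediaAction = "press-play" | "press-pause" | "none";

// Decide what the automation should do, given the observed state and the goal.
function decideAction(current: PlaybackState, wantPlaying: boolean): MediaAction {
  // Assumption: "ended" counts as not playing, so a play request restarts it.
  const isPlaying = current === "playing";
  if (wantPlaying === isPlaying) {
    return "none"; // already in the desired state; leave the player unchanged
  }
  return wantPlaying ? "press-play" : "press-pause";
}
```

Without the state input, the caller can only toggle blindly, which is the failure mode described here.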
Expected
The snapshot or protocol should expose enough media state for the CLI to determine the current playback state.
This could be done by either:
attaching media-state fields to relevant nodes, or
adding a dedicated media-state path in the protocol
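The two options could be sketched as the following types. Every name here (`SnapshotNode`, `MediaStateRequest`, `MediaStateResponse`, `collectMediaStates`, the `kind` tags, and the path format) is an assumption made for illustration; the actual snapshot and protocol types in the repository may look quite different.

```typescript
// Hypothetical shapes for both options; not the repository's real types.
interface MediaState {
  playback: "playing" | "paused" | "ended";
  currentTime: number;
  duration: number | null;
}

// Option A: attach media-state fields to the relevant snapshot nodes.
interface SnapshotNode {
  tag: string;
  attributes: Record<string, string>;
  children: SnapshotNode[];
  media?: MediaState; // set only for audio/video elements
}

// Option B: a dedicated media-state path in the protocol.
interface MediaStateRequest {
  kind: "get-media-state";
}
interface MediaEntry {
  nodePath: string; // assumed addressing scheme: child indices joined by "/"
  state: MediaState;
}
interface MediaStateResponse {
  kind: "media-state";
  elements: MediaEntry[];
}

// Walk an option-A tree and build the payload an option-B response would carry.
function collectMediaStates(root: SnapshotNode, path: string = "0"): MediaEntry[] {
  const found: MediaEntry[] = [];
  if (root.media) {
    found.push({ nodePath: path, state: root.media });
  }
  root.children.forEach((child, i) => {
    found.push(...collectMediaStates(child, `${path}/${i}`));
  });
  return found;
}
```

Option A keeps media state next to the node the CLI already addresses, while option B avoids growing every snapshot and lets the caller poll playback state on its own.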
Scope note
This issue is about state visibility first. Whether native browser media controls are directly clickable is a separate concern.