English | 中文
VideoDecoder is a native Android playback experiment built with Android + JNI + FFmpeg + OpenGL ES + AAudio. Instead of wrapping the system media player, it separates demuxing, audio/video decoding, audio output, OpenGL rendering, playback control, and UI interaction into an observable and debuggable native playback pipeline.
This project is evolved through a multi-agent, cross-domain full-stack workflow designed for mixed technology stacks. The system bridges low-level C++ cross-compilation, strict arm64-v8a native builds, multi-threaded FFmpeg packet/frame queues, OpenGL / AAudio clock-sensitive playback, and modern Jetpack Compose UI with dynamic shader lighting and Liquid Glass interaction.
The workflow relies on long-chain reasoning across layers. A frontend agent analyzes open-source motion libraries, spatial math, spring damping, refraction, highlights, and gesture deformation. A cross-stack agent traces JNI state locks, native playback clocks, async queues, seek wakeups, EGL surfaces, and the rendering pipeline. When the input is an ambiguous perceptual issue such as "speed switching feels abrupt", "dragging stalls", "there is too much black area", or "the button does not feel like liquid glass", the system can trace from Compose Spring behavior and gesture state all the way down to native queues, Surface lifecycle, OpenGL viewport logic, and AAudio sync.
This turns cross-environment code edits, NDK build-chain adaptation, Gradle validation, and UI feel iteration into a tight engineering loop, compressing work that would normally require repeated rounds of senior full-stack debugging into iterations measured in minutes.
- Local video selection and playback.
- FFmpeg demuxing and software decoding for audio and video streams.
- OpenGL ES YUV frame rendering to Android Surface.
- Visible playback surface migrated to `TextureView`, with the `SurfaceTexture` buffer size synchronized to the UI window.
- Low-latency audio output through AAudio.
- Play, pause, resume, seek, and playback speed control.
- Speed changes through the FFmpeg `atempo` filter, preserving pitch while changing playback rate.
- Native playback progress polling and seek through the progress bar.
- Modern Liquid Glass interaction built with Jetpack Compose and AndroidLiquidGlass.
- Four-zone player layout: top text/status area, video area, progress area, and button area.
- Liquid Glass progress slider with drag deformation, release-to-seek, and stable drag-state cleanup.
- Select, decode, play, and pause buttons use AndroidLiquidGlass-style physical drag deformation, elastic rebound, and real backdrop refraction.
- Speed tabs use a bright translucent glass container with a liquid indicator and drag-to-change interaction.
The project uses a Java UI layer + Native media core architecture. Java handles UI, file selection, and user interaction. C++ handles media processing, thread orchestration, synchronization, and rendering.
```mermaid
flowchart TB
  subgraph Java["Java UI Layer"]
    Activity["MainActivity"]
    SurfaceView["SurfaceView"]
    TextureView["TextureView"]
    Controls["Play / Pause / Seek / Speed / Progress"]
    PFD["ParcelFileDescriptor"]
  end
  subgraph JNI["JNI Boundary"]
    DecodeApi["decodeVideo(videoPath, outputPath)"]
    SurfaceApi["setSurface(surface)"]
    ControlApi["pause / resume / seek / speed / release"]
  end
  subgraph Native["Native Media Core"]
    Session["PlaybackSession"]
    State["PlaybackState"]
    Window["NativeWindowHolder"]
    Demuxer["Demuxer"]
    VideoDecoder["Video Decoder"]
    AudioDecoder["Audio Decoder"]
    Renderer["OpenGL ES Renderer"]
    AudioRenderer["AudioRenderer (AAudio)"]
  end
  subgraph FFmpeg["FFmpeg"]
    Avio["Custom AVIO(fd)"]
    Format["AVFormatContext"]
    Codec["AVCodecContext"]
    Sws["sws_scale"]
    Swr["SwrConvert"]
    Atempo["atempo filter"]
  end
  Activity --> DecodeApi
  SurfaceView --> SurfaceApi
  TextureView --> SurfaceApi
  Controls --> ControlApi
  PFD --> DecodeApi
  DecodeApi --> Session
  SurfaceApi --> Window
  ControlApi --> State
  Session --> State
  Session --> Demuxer
  Session --> VideoDecoder
  Session --> AudioDecoder
  Session --> Renderer
  Session --> AudioRenderer
  Demuxer --> Avio --> Format
  VideoDecoder --> Codec --> Sws
  AudioDecoder --> Codec --> Swr --> Atempo
  Renderer --> Window
  AudioRenderer --> State
```
The project integrates the AndroidLiquidGlass style to create an iOS/visionOS-like liquid glass player interface.
- **Hybrid Layout Architecture**
  - `activity_main.xml` keeps the required legacy View IDs so Java can reuse event, state, and JNI wiring.
  - The video area uses `TextureView` for the native rendering Surface. `ComposeView` is a full-screen Liquid Glass overlay that draws the top information area, progress area, and button area.
  - The visible layout is organized into four stable zones: top text/status, video, progress, and controls.
- **Liquid Glass Interaction**
  - `LiquidControls.kt`: Compose Liquid Glass panels and buttons based on `rememberLayerBackdrop()` and `drawBackdrop`.
  - `LiquidSlider.kt`: Liquid Glass progress slider with direct drag tracking and release-to-seek.
  - `DampedDragAnimation.kt`: drag deformation, press/release animation, cancellation of stale value animations, and stable cleanup during long drags.
  - `InteractiveHighlight.kt`: press highlights and elastic deformation using nonlinear displacement and direction-aware scaling.
  - `DragGestureInspector.kt`: shared gesture parsing utilities.
- **Visual Consistency**
  - Top status panel, progress panel, and button panel share the same Liquid Glass visual language.
  - Select, decode, play/pause, and speed controls use transparent glass styling instead of strong red/blue tint blocks.
  - The background uses `wallpaper_light.webp` from AndroidLiquidGlass as the root/window background with edge-to-edge window rendering.
  - The Compose overlay records the same wallpaper into a `LayerBackdrop` source around the video region, so buttons and panels refract real background pixels instead of a transparent layer.
  - Playback state is synchronized through `LiquidGlassHelper` and reflected in button activity and status text.
  - Active decode/playing/pause button text and speed tab state use accent blue (`#0091FF` / system-blue variants) instead of plain white.
- **Ripple Effect (`Ripple.kt`)**
  - Custom `glassRipple()` with reduced alpha values (pressed 0.1, dragged 0.16, hovered 0.08) for a subtler glass feel.
  - Uses `createRippleModifierNode` from `androidx.compose.material.ripple`.
- **Parameter Alignment with AndroidLiquidGlass Catalog**
  - `LiquidSlider.kt`: thumb 40×24dp, track 6dp, shadow alpha 0.05f.
  - `LiquidButton`: restores the catalog-style `InteractiveHighlight.gestureModifier` path so finger movement drives nonlinear displacement, stretch, and rebound.
  - `SpeedBottomTabs`: container 64dp, pressedScale 78/56, indicator highlight/shadow alpha follows `progress`, with a bright translucent glass surface and subtle Plus-mode sheen.
  - The invisible tab layer retains `layerBackdrop` capture with `ColorFilter.tint(accentColor)` for the indicator backdrop.
- **Edge-to-Edge Window**
  - `configureEdgeToEdgeWindow()` sets transparent status/navigation bars with `LAYOUT_STABLE | LAYOUT_FULLSCREEN | LAYOUT_HIDE_NAVIGATION`.
  - `applySystemBarInsets()` applies system bar insets to `player_content` so content is properly positioned below system bars.
  - The theme enforces `enforceStatusBarContrast=false` and `enforceNavigationBarContrast=false` to prevent the system from drawing opaque bar backgrounds.
  - The former gray strip at the top of the wallpaper was caused by Android drawing a separate opaque status-bar background over the page, not by the wallpaper image.
  - The fix keeps the wallpaper edge-to-edge while adding status-bar spacing back to View and Compose content (`player_content` and `WindowInsets.statusBars`), so controls stay below system icons.
```text
app/
├─ src/main/java/com/example/videodecoder/
│  ├─ MainActivity.java         # Java entry, TextureView, JNI calls, edge-to-edge, state sync
│  ├─ LiquidControls.kt         # Compose Liquid Glass panels and buttons
│  ├─ LiquidSlider.kt           # Liquid Glass progress slider
│  ├─ DampedDragAnimation.kt    # Drag deformation and release animation
│  ├─ LiquidGlassHelper.kt      # Java -> Compose state bridge
│  ├─ InteractiveHighlight.kt   # Press highlight and elastic deformation
│  ├─ DragGestureInspector.kt   # Gesture parser
│  ├─ Ripple.kt                 # Custom glass ripple with reduced alpha
│  ├─ UISensor.kt               # Accelerometer-driven lighting angle with threshold filter
│  ├─ PlaybackInputPolicy.java
│  ├─ PlaybackTimeFormatter.java
│  └─ PlaybackUiPolicy.java
├─ src/main/res/layout/
│  └─ activity_main.xml         # Four-zone layout skeleton and ComposeView container
├─ src/main/res/drawable/
│  ├─ wallpaper_light.webp      # Light gradient wallpaper background
│  ├─ bg_deadliner_surface.xml
│  ├─ bg_deadliner_chip.xml
│  └─ bg_liquid_player_surface.xml  # Legacy dark player background
└─ src/main/cpp/
   ├─ native-lib.cpp
   ├─ MediaInput.cpp/.h
   ├─ NativeEgl.cpp/.h
   ├─ NativeWindowHolder.cpp/.h
   ├─ ScopeExit.h
   ├─ JniStringChars.h
   ├─ Demuxer.cpp/.h
   ├─ Decoder.cpp/.h
   ├─ queue.cpp/.h
   ├─ videoRender.cpp/.h
   └─ AudioRender.cpp/.h
```
The liquid glass UI renders multiple backdrop panels with blur, lens refraction, vibrancy, highlights, shadows, and animated shaders. The following optimizations reduce per-frame GPU and recomposition overhead to keep the interface smooth:
- **Sensor-driven recomposition throttling (`UISensor.kt`)**
  - Accelerometer updates arrive at ~60 Hz and previously triggered a full overlay recomposition every frame.
  - A 2-degree angle threshold filters out sub-threshold changes; internal smoothing continues, but Compose state only updates on meaningful orientation shifts.
- **Gesture animation cancellation (`DampedDragAnimation.kt`)**
  - Progress dragging previously risked accumulating value animations during long touch sessions.
  - The current animation controller cancels stale value/press jobs, keeps only the latest drag target, and releases deterministically.
- **Backdrop source scoping (`LiquidControls.kt`)**
  - AndroidLiquidGlass effects need a Compose-recorded backdrop source; XML-only backgrounds are not enough for real lens sampling.
  - The overlay records `wallpaper_light.webp` into a `LayerBackdrop` around the video region, giving glass controls real pixels while avoiding a wallpaper layer over the native video surface.
- **Single-pass highlight shader (`InteractiveHighlight.kt`)**
  - Button highlights now use one radial shader pass driven by the actual gesture position.
  - The same state drives nonlinear displacement, stretch, and release instead of running a separate decorative trail animation.
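The 2-degree threshold described above is a small stateful filter. A minimal C++ sketch of the same idea follows; the class name and smoothing factor are illustrative assumptions, since the project's actual implementation lives in Kotlin in `UISensor.kt`:

```cpp
#include <cmath>

// Hypothetical sketch of the UISensor.kt throttling idea: smooth every
// sample internally, but only publish a new value to the UI layer when
// the angle has moved at least `thresholdDeg` since the last published value.
class AngleThrottle {
public:
    explicit AngleThrottle(double thresholdDeg) : threshold_(thresholdDeg) {}

    // Feed one raw sensor sample; returns true when UI state should
    // actually be updated (i.e. a recomposition is warranted).
    bool onSample(double angleDeg) {
        // Internal exponential smoothing continues on every sample.
        smoothed_ = 0.8 * smoothed_ + 0.2 * angleDeg;
        if (!hasPublished_ || std::fabs(smoothed_ - published_) >= threshold_) {
            published_ = smoothed_;
            hasPublished_ = true;
            return true;
        }
        return false;  // sub-threshold change: suppress recomposition
    }

    double published() const { return published_; }

private:
    double threshold_;
    double smoothed_ = 0.0;
    double published_ = 0.0;
    bool hasPublished_ = false;
};
```

The filter publishes the first sample unconditionally, then suppresses updates until the smoothed angle drifts past the threshold again.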
- Android Studio or command-line Gradle.
- Android SDK 36.
- Kotlin 2.3.10.
- Android NDK + CMake.
- Java 11.
- Gradle Wrapper from this repository: `gradlew` / `gradlew.bat`.
```shell
# Windows
.\gradlew.bat clean
.\gradlew.bat assembleDebug

# Linux / macOS
./gradlew clean
./gradlew assembleDebug
```

The project currently supports arm64-v8a only:
- `app/build.gradle` fixes `abiFilters "arm64-v8a"`.
- The prebuilt FFmpeg library is located at `app/src/main/jniLibs/arm64-v8a/libffmpeg.so`.
- Do not test on x86/x86_64 emulators, because the native FFmpeg library will not be found or loaded.
- Use an arm64 device or compatible arm64 environment for playback validation.
Playback starts four major native threads:
- Demux thread: reads packets through `av_read_frame` and dispatches them to the audio/video `PacketQueue`s.
- Video decode thread: decodes video packets, converts frames to tightly packed `YUV420P` with `sws_scale`, and pushes frames into `FrameQueue`.
- Audio decode thread: decodes audio packets, resamples to S16 with `Swr`, applies `atempo` for speed control, and writes to AAudio.
- Render thread: reads frames from `FrameQueue`, synchronizes against the audio clock, uploads YUV textures, and calls `eglSwapBuffers`.
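On the `atempo` path: a single `atempo` filter instance classically accepts tempo factors in 0.5–2.0, so larger speed changes are usually expressed as a chain of instances. The helper below is a hedged sketch of that decomposition (the function name is hypothetical, not the project's actual code):

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build an FFmpeg filtergraph description for a given
// playback speed. One atempo instance classically covers 0.5..2.0, so
// speeds outside that range are decomposed, e.g. 4.0 -> "atempo=2,atempo=2".
std::string buildAtempoChain(double speed) {
    std::string desc;
    auto append = [&](double factor) {
        char buf[32];
        std::snprintf(buf, sizeof(buf), "atempo=%g", factor);
        if (!desc.empty()) desc += ',';
        desc += buf;
    };
    while (speed > 2.0) { append(2.0); speed /= 2.0; }  // split fast speeds
    while (speed < 0.5) { append(0.5); speed /= 0.5; }  // split slow speeds
    append(speed);                                      // remaining in-range factor
    return desc;
}
```

The resulting string is the kind of description that would be handed to `avfilter_graph_parse_ptr` when (re)building the audio filter graph after a speed change.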
```mermaid
flowchart LR
  Input["SAF Uri / fd"] --> DemuxThread["Demux Thread"]
  subgraph Session["PlaybackSession"]
    VideoPackets["video PacketQueue"]
    AudioPackets["audio PacketQueue"]
    Frames["FrameQueue"]
    AudioOut["AudioRenderer"]
  end
  DemuxThread --> VideoPackets
  DemuxThread --> AudioPackets
  VideoPackets --> VideoThread["Video Decode Thread"]
  VideoThread --> Convert["Convert to tight YUV420P"]
  Convert --> Frames
  Convert -. "debug export enabled" .-> YuvFile["output.yuv"]
  AudioPackets --> AudioThread["Audio Decode Thread"]
  AudioThread --> Resample["S16 resample"]
  Resample --> Tempo["speed via atempo"]
  Tempo --> AudioOut
  Frames --> RenderThread["Render Thread"]
  RenderThread --> Sync["wait by audio clock"]
  Sync --> GLES["YUV texture upload + shader"]
  GLES --> Surface["TextureView / Surface"]
  AudioOut --> Speaker["Device audio output"]
```
PacketQueue and FrameQueue both apply backpressure to prevent demux/decode from growing memory without bounds. Seek and stop paths clear queues and wake waiting threads so playback can recover or exit.
The current sync strategy uses audio playback progress as the main reference. The render thread estimates the real audio position by subtracting the pending AAudio/software queue duration from the latest submitted audio PTS:
```
exact_audio_pts = audioClock - pending_audio_duration
diff = video_pts - exact_audio_pts
```

```mermaid
sequenceDiagram
  participant A as Audio Decoder
  participant R as AudioRenderer
  participant V as Render Thread
  participant C as Clocks
  A->>R: writeData(pcm, samples)
  A->>C: update audioClock
  V->>C: read videoClock
  V->>R: getPendingAudioDurationUs()
  R-->>V: pending audio duration
  V->>V: exact_audio_pts = audioClock - pending
  V->>V: diff = video_pts - exact_audio_pts
  alt video is ahead of audio
    V->>V: sleep scaled by playback speed
  else video is not ahead
    V->>V: render immediately to catch up
  end
```
- `diff > 0`: video is ahead, so the render thread sleeps briefly.
- `diff <= 0`: video renders immediately and catches up to audio.
- During speed playback, the video wait duration is scaled by `playbackSpeed`, while audio timing is adjusted through `atempo`.
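The decision above condenses into one function. A minimal sketch under the same clock model (function and parameter names are illustrative, not the project's exact API):

```cpp
#include <cstdint>

// Sketch of the sync decision: all times are in microseconds.
// Returns how long the render thread should sleep before presenting
// the next video frame; 0 means render immediately to catch up.
int64_t videoWaitUs(int64_t videoPtsUs,
                    int64_t audioClockUs,    // PTS of the last submitted audio
                    int64_t pendingAudioUs,  // queued but not yet played by AAudio
                    double playbackSpeed) {
    const int64_t exactAudioPts = audioClockUs - pendingAudioUs;
    const int64_t diff = videoPtsUs - exactAudioPts;
    if (diff <= 0) return 0;  // video is late or on time: no wait
    // Video is ahead: the media-time gap maps to less wall-clock time
    // at higher playback speeds, so scale the wait accordingly.
    return static_cast<int64_t>(diff / playbackSpeed);
}
```

For example, a frame 100 ms ahead of the estimated audio position waits 100 ms at 1x speed but only 50 ms at 2x.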
Seek uses a two-stage handshake:
- Java calls `seekToPosition(ms)`.
- Native stores the target time and sets `isSeeking = true`.
- Demuxer runs `avformat_seek_file`, falling back to `av_seek_frame` if needed.
- Old packet/frame queues are cleared, decoders are flushed, audio buffers are rebuilt, and clocks are reset.
- Seek state is cleared and playback resumes.
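The shared state behind this handshake can be modeled minimally as follows, with `avformat_seek_file` and the flush work stubbed out behind a callback; the struct and function names are hypothetical, not the project's exact ones:

```cpp
#include <atomic>
#include <cstdint>

// Minimal model of the two-stage seek handshake.
struct SeekState {
    std::atomic<bool> isSeeking{false};
    std::atomic<int64_t> targetMs{0};
};

// Stage 1: called from the JNI entry point on the Java side's behalf.
void requestSeek(SeekState& s, int64_t ms) {
    s.targetMs.store(ms);
    s.isSeeking.store(true);  // demux loop and blocked queues observe this flag
}

// Stage 2: executed on the demux thread once it observes isSeeking.
// `doSeek` stands in for avformat_seek_file plus queue/decoder flushing
// and clock resets. Returns true if a seek was serviced.
template <typename SeekFn>
bool serviceSeek(SeekState& s, SeekFn doSeek) {
    if (!s.isSeeking.load()) return false;
    doSeek(s.targetMs.load());
    s.isSeeking.store(false);  // playback resumes with reset clocks
    return true;
}
```

Keeping the flag set for the whole flush window is what lets blocked producers (see the `PacketQueue::push()` note below) bail out instead of deadlocking against a full queue.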
```mermaid
stateDiagram-v2
  [*] --> Playing
  Playing --> Seeking: seekToPosition(ms)
  Seeking --> DemuxSeek: Demuxer executes seek
  DemuxSeek --> FlushQueues: clear PacketQueue / FrameQueue
  FlushQueues --> FlushDecoders: avcodec_flush_buffers
  FlushDecoders --> ResetClocks: reset audio/video clocks
  ResetClocks --> Playing
  Playing --> Paused: pauseDecoding()
  Paused --> Playing: resumeDecoding()
  Playing --> Stopping: nativeReleaseAudio() / Surface destroyed
  Paused --> Stopping: nativeReleaseAudio() / Surface destroyed
  Seeking --> Stopping: stop requested
  Stopping --> JoinThreads: wake queues and terminate FrameQueue
  JoinThreads --> ReleaseSession: release AudioRenderer / EGL / AVIO after join
  ReleaseSession --> [*]
```
`PacketQueue::push()` now wakes periodically while full and checks `isSeeking`, allowing seek requests to interrupt a full queue instead of waiting for another user interaction.
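A sketch of such an interruptible bounded queue, assuming a 100 ms periodic wake; the type and its interface are illustrative, and the project's real `PacketQueue` may differ:

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Bounded queue whose push() wakes periodically while full and aborts
// when a seek (or stop) is requested, instead of blocking forever.
template <typename T>
class BoundedPacketQueue {
public:
    explicit BoundedPacketQueue(size_t capacity) : capacity_(capacity) {}

    // Returns false when the wait was interrupted by a seek/stop request.
    bool push(T value, const std::atomic<bool>& interrupted) {
        std::unique_lock<std::mutex> lock(mutex_);
        while (items_.size() >= capacity_) {
            if (interrupted.load()) return false;  // seek interrupts a full queue
            // Timed wait: even a missed notify only delays the check by 100 ms.
            notFull_.wait_for(lock, std::chrono::milliseconds(100));
        }
        items_.push_back(std::move(value));
        notEmpty_.notify_one();
        return true;
    }

    bool tryPop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        notFull_.notify_one();
        return true;
    }

    // Seek path: drop everything and wake any blocked producer/consumer.
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.clear();
        notFull_.notify_all();
        notEmpty_.notify_all();
    }

private:
    size_t capacity_;
    std::deque<T> items_;
    std::mutex mutex_;
    std::condition_variable notFull_, notEmpty_;
};
```

The `wait_for` plus flag check is the backpressure-with-escape-hatch pattern the seek path relies on: a producer stuck on a full queue re-evaluates the interrupt flag at least every 100 ms.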
```mermaid
flowchart TB
  DecodeCall["decodeVideo call"] --> LocalSession["stack PlaybackSession"]
  DecodeCall --> Input["MediaInput"]
  Input --> InputResource["AVFormatContext / AVIOContext / fd"]
  LocalSession --> State["shared_ptr<PlaybackState>"]
  LocalSession --> Queues["PacketQueue / FrameQueue"]
  LocalSession --> AudioPtr["unique_ptr<AudioRenderer>"]
  State --> DemuxerState["Demuxer / Decoder / PacketQueue read state explicitly"]
  SurfaceCall["setSurface call"] --> WindowHolder["NativeWindowHolder"]
  WindowHolder --> SharedWindow["shared_ptr<ANativeWindow>"]
  SharedWindow --> RenderSnapshot["Render thread copies window snapshot"]
  RenderSnapshot --> NativeEgl["NativeEgl"]
  NativeEgl --> EGLSurface["EGLSurface / EGLContext"]
  Stop["stop requested"] --> Wake["notifyAll / FrameQueue terminate"]
  Wake --> Join["join demux / decode / audio / render"]
  Join --> Cleanup["ScopeExit releases session, EGL, AVCodec, AVIO, fd"]
```
The video window now prioritizes filling the player area while preserving the video aspect ratio.
```mermaid
flowchart LR
  Texture["TextureView size"] --> Buffer["SurfaceTexture.setDefaultBufferSize"]
  Buffer --> Surface["Surface"]
  Surface --> NativeWindow["ANativeWindow"]
  NativeWindow --> Geometry["ANativeWindow_setBuffersGeometry"]
  Geometry --> EGL["EGLSurface"]
  EGL --> Renderer["Renderer.init(width, height)"]
  Frame["AVFrame + sample_aspect_ratio"] --> Viewport["AspectFill viewport"]
  Renderer --> Viewport
  Viewport --> Stage["Fill video_stage; crop edges if needed"]
```
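The AspectFill viewport math can be sketched as follows; the struct and function names are illustrative assumptions rather than the project's actual renderer code:

```cpp
// Scale the video so it covers the whole surface while preserving the
// display aspect ratio (including the sample aspect ratio from AVFrame),
// cropping whichever edges overflow. Negative x/y center the crop.
struct Viewport { int x, y, w, h; };

Viewport aspectFillViewport(int videoW, int videoH,
                            int sarNum, int sarDen,  // AVFrame sample_aspect_ratio
                            int surfW, int surfH) {
    if (sarNum <= 0 || sarDen <= 0) { sarNum = sarDen = 1; }  // unknown SAR
    const double videoAspect =
        static_cast<double>(videoW) * sarNum / (static_cast<double>(videoH) * sarDen);
    const double surfAspect = static_cast<double>(surfW) / surfH;
    Viewport vp{};
    if (videoAspect > surfAspect) {
        // Video is wider than the surface: fill height, crop left/right.
        vp.h = surfH;
        vp.w = static_cast<int>(surfH * videoAspect + 0.5);
        vp.x = (surfW - vp.w) / 2;  // negative offset crops both sides
        vp.y = 0;
    } else {
        // Video is taller (or equal): fill width, crop top/bottom.
        vp.w = surfW;
        vp.h = static_cast<int>(surfW / videoAspect + 0.5);
        vp.x = 0;
        vp.y = (surfH - vp.h) / 2;
    }
    return vp;
}
```

A 16:9 landscape video on a portrait 1080×1920 surface, for instance, fills the full height and crops the sides, instead of leaving letterbox bars.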
- Render initialization failures set stop flags, wake packet queues, and terminate `FrameQueue`.
- `AudioRenderer` ownership is scoped inside `PlaybackSession`, avoiding cross-session raw pointer risks.
- `FrameQueue::clear()` notifies waiting threads after clearing.
- Video decode no longer assumes source frames are tightly packed `YUV420P`; frames are converted through `sws_scale`.
- U/V planes are uploaded with `(width + 1) / 2` and `(height + 1) / 2`, supporting odd frame sizes.
- Debug YUV export is opt-in to reduce I/O and storage pressure.
- Native windows are held through shared handles in `NativeWindowHolder`; render threads use snapshots to avoid stale global window access.
- EGL display, surface, and context setup/cleanup are extracted into `NativeEgl`.
- Visible rendering moved to `TextureView`, with SurfaceTexture buffer sizing kept in sync.
- OpenGL viewport uses `AspectFill / CenterCrop` to reduce black borders.
- Liquid slider drag uses a 40dp touch target, freezes external progress sync while interacting, cancels stale value animations, and releases deterministically.
- Select/decode/play/pause buttons restore AndroidLiquidGlass physical drag: finger-position highlight, nonlinear displacement, stretch, and elastic rebound.
- The Compose overlay records the wallpaper into `LayerBackdrop` around the video window, giving Liquid Glass controls real background pixels for blur/lens refraction.
- Speed tabs use a bright translucent glass container and a subtle blue-accented liquid indicator instead of a dark tinted panel.
- `PacketQueue::push()` periodically checks seek state while blocked by queue backpressure.
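The odd-size chroma rule mentioned above is just rounding-up integer division: YUV420P subsamples the chroma planes by 2 in both dimensions, so a 1279-pixel-wide luma plane needs a 640-pixel-wide U/V plane, not 639. As a one-line illustration:

```cpp
// YUV420P chroma planes are subsampled 2x in each dimension; rounding up
// with (n + 1) / 2 keeps odd-sized frames correct (e.g. 1279 -> 640).
inline int chromaDim(int lumaDim) { return (lumaDim + 1) / 2; }
```

Using plain `n / 2` here would drop the last chroma column/row on odd-sized frames and misalign the texture upload.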
- Four-zone layout: top text/status, video, progress, and button control areas.
- Edge-to-edge window: transparent status/navigation bars with system bar inset handling; `enforceStatusBarContrast=false` prevents opaque system bar backgrounds.
- Wallpaper background: `wallpaper_light.webp` from AndroidLiquidGlass as the full-screen page background.
- Top gray strip fix: the wallpaper now renders behind the status bar; system bar insets are applied to content instead of letting Android fill the status bar with a separate gray theme color.
- Top information panel: Liquid Glass surface synchronized through `LiquidGlassHelper.setStatusText()`.
- Progress placement: progress stays close to the video area; buttons stay close to progress.
- Unified transparent glass controls: select, decode, play/pause, and speed controls share the same backdrop language.
- Physical button interaction: select, decode, play, and pause buttons keep the AndroidLiquidGlass drag/rebound model, with click pulse only as a short-tap fallback.
- Real backdrop sampling: Compose records the wallpaper backdrop around the video window so glass blur and lens effects scatter actual background pixels.
- State linkage: `MainActivity` maps playback state to chips, button activity, progress, and Compose state.
- Motion rhythm: entrance and state transitions use subtle fade/slide animation; drag controls use spring release and elastic cleanup.
- Visual parameter alignment: slider thumb 40×24dp, track 6dp; speed tab container 64dp, indicator alpha follows press progress with a bright glass surface.
- Custom ripple: `glassRipple()` with reduced alpha for subtler glass feedback.
```shell
.\gradlew.bat assembleDebug
.\gradlew.bat testDebugUnitTest
```

- `decodeVideo(String videoPath, String outputPath)`
- `setSurface(Surface surface)`
- `pauseDecoding()`
- `resumeDecoding()`
- `setPlaybackSpeed(float speed)`
- `seekToPosition(int progressMs)`
- `getDurationMs()`
- `getCurrentPositionMs()`
- `nativeReleaseAudio()`
- Video input uses native custom AVIO over an authorized file descriptor. Seek may be limited if a content provider returns a non-seekable fd.
- YUV export remains as a native debug capability and is disabled by default in normal UI playback.
- End-to-end testing should focus on continuous seek, speed switching, Surface destruction/recreation, and long video playback on arm64 devices.
- `native-lib.cpp` still carries JNI, thread orchestration, synchronization, and resource cleanup responsibilities; it can be further split into dedicated session/controller modules.
- FFmpeg
- Android NDK
- AndroidLiquidGlass
- Jetpack Compose