📚 Daily Paper Review - 2026-05-27
Found 10 relevant papers today. Please review and approve/reject.
1. EchoPilot: Training-Free Ultrasound Video Segmentation via Scale-Space Semantic Prompting and Reliability-Gated Memory
Score: 5.5/10 | arXiv: 2605.25944v1
Authors: Ruiqiang Xiao, Zhaohu Xing, Yijun Yang...
Relevance:
- 🎯 Field Match: 0.51/10 - Matches: segmentation
- 🏆 Venue: MICCAI (10/10)
- 💻 Code: ✅ Available
AI Summary:
Ultrasound video segmentation is clinically valuable yet difficult due to speckle noise, weak boundaries, and rapid anatomical deformation. Recent promptable foundation models enable point-guided segmentation, but their direct deployment in ultrasound remains unreliable: a single point provides insufficient spatial context to resolve scale ambiguity, and greedy memory updates amplify early errors ...
Key Contributions:
- Ultrasound video segmentation is clinically valuable yet difficult due to speckle noise, weak boundaries, and rapid anatomical deformation.
- Recent promptable foundation models enable point-guided segmentation, but their direct deployment in ultrasound remains unreliable: a single point provides insufficient spatial context to resolve scale ambiguity, and greedy memory updates amplify early errors into severe temporal drift.
- We present EchoPilot, a training-free framework for ultrasound video segmentation under sparse first-frame interaction, requiring only a single point click and an anatomical category name.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
2. Global Structure-from-Motion Meets Feedforward Reconstruction
Score: 4.9/10 | arXiv: 2605.26103v1
Authors: Linfei Pan, Johannes Schönberger, Marc Pollefeys
Relevance:
- 🎯 Field Match: 1.02/10 - Matches: 3d reconstruction, computer vision
- 🏆 Venue: CVPR (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Structure-from-Motion -- the process of simultaneously estimating camera poses and 3D scene structure from a collection of images -- remains a central challenge in computer vision, with many open problems yet to be solved. Recent advances in feedforward 3D reconstruction have made significant strides in overcoming persistent failure cases of classical SfM methods, particularly in scenarios charact...
Key Contributions:
- Structure-from-Motion -- the process of simultaneously estimating camera poses and 3D scene structure from a collection of images -- remains a central challenge in computer vision, with many open problems yet to be solved.
- Recent advances in feedforward 3D reconstruction have made significant strides in overcoming persistent failure cases of classical SfM methods, particularly in scenarios characterized by low texture, limited overlap, and symmetries.
- However, while feedforward approaches excel in these challenging conditions, they often face limitations regarding scalability, accuracy, or robustness, and typically fall short of classical methods in standard reconstruction settings.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
3. Hidden in Plain Tokens: Simply Robust, Gradient-Free Watermark for Synthetic Audio
Score: 4.3/10 | arXiv: 2605.25967v1
Authors: Georgios Milis, Yubin Qin, Yihan Wu...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
As policy catches up with the capabilities of generative AI, watermarking is central to content provenance efforts. Inference-time watermarks for autoregressive models are unfit for continuous modalities due to discretization inconsistencies. Existing methods overcome this by finetuning the modality tokenizers, nullifying the watermark's training-free advantage. In this work, motivated by the voca...
Key Contributions:
- As policy catches up with the capabilities of generative AI, watermarking is central to content provenance efforts.
- Inference-time watermarks for autoregressive models are unfit for continuous modalities due to discretization inconsistencies.
- Existing methods overcome this by finetuning the modality tokenizers, nullifying the watermark's training-free advantage.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
4. Conditional KRR: Injecting Unpenalized Features into Kernel Methods with Applications to Kernel Thresholding
Score: 4.2/10 | arXiv: 2605.26067v1
Authors: Rustem Takhanov, Zhenisbek Assylbekov
Relevance:
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Conditionally positive definite (CPD) kernels are defined with respect to a function class $\mathcal{F}$. It is well known that such a kernel $K$ is associated with its native space (defined analogously to an RKHS), which in turn gives rise to a learning method -- called conditional kernel ridge regression (conditional KRR) due to its analogy with KRR -- where the estimated regression function is ...
Key Contributions:
- Conditionally positive definite (CPD) kernels are defined with respect to a function class $\mathcal{F}$.
- It is well known that such a kernel $K$ is associated with its native space (defined analogously to an RKHS), which in turn gives rise to a learning method -- called conditional kernel ridge regression (conditional KRR) due to its analogy with KRR -- where the estimated regression function is penalized by the square of its native space norm.
- This method is of interest because it can be viewed as classical linear regression, with features specified by $\mathcal{F}$, followed by the application of standard KRR to the residual (unexplained) component of the target variable.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
5. Towards 3D heart mesh generation using contactless radar imaging and physics-informed neural network
Score: 4.1/10 | arXiv: 2605.26003v1
Authors: Jinye Li, Chenxi Fu, Minghang Zheng...
Relevance:
- 🎯 Field Match: 1.44/10 - Matches: cardiac, heart
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Cardiac function evaluation necessitates continuous, non-invasive monitoring, a capability limited in MRI. Millimeter-wave (mmWave) radar and its Synthetic Aperture Radar (SAR) mode offer privacy-preserving and portable point-of-care clinical applications. However, reconstructing high-fidelity 3D cardiac geometry from SAR remains an open challenge. Traditional radar methods generate sparse point...
Key Contributions:
- Cardiac function evaluation necessitates continuous, non-invasive monitoring, a capability limited in MRI.
- Millimeter-wave (mmWave) radar and its Synthetic Aperture Radar (SAR) mode offer privacy-preserving and portable point-of-care clinical applications.
- However, reconstructing high-fidelity 3D cardiac geometry from SAR remains an open challenge.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
6. Reinforcing Few-step Generators via Reward-Tilted Distribution Matching
Score: 4.1/10 | arXiv: 2605.26108v1
Authors: Yushi Huang, Xiangxin Zhou, Ruoyu Wang...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available
AI Summary:
Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging. We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a two-stage framework that unifies distribution matching distillation with reward-guided reinforcement learning for few-step flow generators. We show that minimizing the...
Key Contributions:
- Recent advances in few-step diffusion distillation have enabled efficient image generation, yet aligning these models with human preferences remains challenging.
- We propose Reward-Tilted Distribution Matching Distillation (RTDMD), a two-stage framework that unifies distribution matching distillation with reward-guided reinforcement learning for few-step flow generators.
- We show that minimizing the KL divergence to a reward-tilted teacher distribution naturally decomposes into a distribution matching term and a reward maximization term.
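The decomposition named in the last bullet can be sketched with a short derivation. The notation is an assumption made for illustration and may differ from the paper's: $q_\theta$ is the few-step generator's distribution, $p_T$ the teacher distribution, $r$ a reward function, and $\beta > 0$ a temperature. Assuming the reward-tilted teacher takes the usual exponential-tilting form, the KL divergence splits into a distribution matching term, a reward term, and a constant:

$$
\begin{aligned}
p_r(x) &= \frac{p_T(x)\,\exp\big(r(x)/\beta\big)}{Z}, \qquad Z = \mathbb{E}_{x \sim p_T}\big[\exp\big(r(x)/\beta\big)\big], \\
\mathrm{KL}\big(q_\theta \,\|\, p_r\big)
&= \mathbb{E}_{x \sim q_\theta}\Big[\log q_\theta(x) - \log p_T(x) - \tfrac{1}{\beta}\, r(x) + \log Z\Big] \\
&= \underbrace{\mathrm{KL}\big(q_\theta \,\|\, p_T\big)}_{\text{distribution matching}}
\;-\; \tfrac{1}{\beta}\, \underbrace{\mathbb{E}_{x \sim q_\theta}\big[r(x)\big]}_{\text{reward maximization}}
\;+\; \log Z .
\end{aligned}
$$

Since $\log Z$ does not depend on $\theta$, minimizing the left-hand side under these assumptions amounts to matching the teacher while maximizing expected reward, with $\beta$ setting the trade-off.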
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
7. TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
Score: 4.1/10 | arXiv: 2605.26115v1
Authors: Weijie Wang, Zimu Li, Jinchuan Shi...
Relevance:
- 🎯 Field Match: 0.59/10 - Matches: 3d reconstruction
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available
AI Summary:
Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation, physics reasoning, or embodied interaction still requires expensive post-hoc steps that break the ...
Key Contributions:
- Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images.
- Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation, physics reasoning, or embodied interaction still requires expensive post-hoc steps that break the feed-forward promise.
- This limitation is especially pronounced in pose-free settings, where scene structure and camera parameters must be estimated jointly from sparse observations.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
8. AnyScene: Towards Highly Controllable Driving Scene Generation at Anywhere and Beyond
Score: 4.1/10 | arXiv: 2605.26113v1
Authors: Haiming Zhang, Junfei Zhou, Feng Jiang...
Relevance:
- 🎯 Field Match: 0.59/10 - Matches: 3d reconstruction
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available
AI Summary:
Generating high-fidelity and controllable synthetic data is critical for advancing end-to-end autonomous driving, particularly for addressing the long tail of rare safety-critical scenarios. Existing occupancy-guided methods typically rely on shallow conditioning mechanisms and reference-frame-dependent video synthesis, which limits fine-grained controllability from arbitrary BEV layouts and restr...
Key Contributions:
- Generating high-fidelity and controllable synthetic data is critical for advancing end-to-end autonomous driving, particularly for addressing the long tail of rare safety-critical scenarios.
- Existing occupancy-guided methods typically rely on shallow conditioning mechanisms and reference-frame-dependent video synthesis, which limits fine-grained controllability from arbitrary BEV layouts and restricts their applicability for scalable simulation.
- In this paper, we propose AnyScene, a unified occupancy-centric framework for driving scene generation.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
9. Where Concept Erasure Should Occur: Concept-Layer Alignment in Text-to-Video Diffusion Models
Score: 4.0/10 | arXiv: 2605.25941v1
Authors: Yiwei Xie, Ping Liu, Zheng Zhang
Relevance:
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Text-to-video diffusion transformers encode semantic information unevenly across model depth, which constrains effective concept erasure. We identify a representational bottleneck, termed concept-layer topological alignment, under which target concepts exhibit higher separability at certain representational depths. Outside these depths, concept and non-target signals remain strongly entangled, lim...
Key Contributions:
- Text-to-video diffusion transformers encode semantic information unevenly across model depth, which constrains effective concept erasure.
- We identify a representational bottleneck, termed concept-layer topological alignment, under which target concepts exhibit higher separability at certain representational depths.
- Outside these depths, concept and non-target signals remain strongly entangled, limiting the effectiveness of depth-specific erasure.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
10. F-RNG: Feed-Forward Relightable Neural Gaussians
Score: 4.0/10 | arXiv: 2605.25975v1
Authors: Guangming Fu, Jiahui Fan, Jian Yang...
Relevance:
- 🎯 Field Match: 1.69/10 - Matches: 3d gaussian, gaussian splatting
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Capturing relightable 3D assets from real-world objects is a widely researched problem. Several per-scene optimization-based methods, based on 3D Gaussian splatting (3DGS), support relighting; however, they usually require dense input views, and their overfitting nature makes it difficult to generalize across scenes. Unlike per-scene optimization methods, generalized feed-forward models can direct...
Key Contributions:
- Capturing relightable 3D assets from real-world objects is a widely researched problem.
- Several per-scene optimization-based methods, based on 3D Gaussian splatting (3DGS), support relighting; however, they usually require dense input views, and their overfitting nature makes it difficult to generalize across scenes.
- Unlike per-scene optimization methods, generalized feed-forward models can directly reconstruct Gaussians from sparse input views.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
How to Review
- Read the summaries above
- Check paper links for more details
- Add labels to indicate your decision:
  - approved - Add to collection
  - rejected - Skip this paper
  - starred - Mark as particularly important
- Comment "approve" or "reject" to trigger automation
Note: Papers with approved label will be automatically added to the collection.
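The label-and-comment rules above can be sketched as a small decision function. This is a minimal illustration, not the actual automation: the names `Decision` and `resolve_decision` are hypothetical, and it assumes that an approval or rejection takes effect only when the label and the matching comment are both present, while the starred label needs no comment.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Decision:
    """Outcome of a review (hypothetical field names, for illustration only)."""
    add_to_collection: bool
    skip: bool
    important: bool


def resolve_decision(labels: Iterable[str], comment: str = "") -> Decision:
    """Map labels plus a trigger comment to a review outcome.

    Assumption: "approved" needs the comment "approve" and "rejected" needs
    the comment "reject"; "starred" is honored on its own.
    """
    # Normalize case and surrounding whitespace before comparing.
    label_set = {label.strip().lower() for label in labels}
    command = comment.strip().lower()
    return Decision(
        add_to_collection="approved" in label_set and command == "approve",
        skip="rejected" in label_set and command == "reject",
        important="starred" in label_set,
    )


if __name__ == "__main__":
    # A paper that is both approved and flagged as important.
    print(resolve_decision(["approved", "starred"], "approve"))
```

Requiring both the label and the comment is the stricter reading of the instructions. If the real automation acts on the label alone, the `command` checks can be dropped.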