Table of Contents
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-06-11 | A3VLM: Actionable Articulation-Aware Vision Language Model | Siyuan Huang et.al. | 2406.07549v1 | null |
2024-06-11 | Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? | Xingyu Fu et.al. | 2406.07546v1 | null |
2024-06-11 | Situational Awareness Matters in 3D Vision Language Reasoning | Yunze Man et.al. | 2406.07544v1 | null |
2024-06-11 | QuickLLaMA: Query-aware Inference Acceleration for Large Language Models | Jingyao Li et.al. | 2406.07528v1 | link |
2024-06-11 | Neural Gaffer: Relighting Any Object via Diffusion | Haian Jin et.al. | 2406.07520v1 | null |
2024-06-11 | TextGrad: Automatic "Differentiation" via Text | Mert Yuksekgonul et.al. | 2406.07496v1 | link |
2024-06-11 | Paraphrasing in Affirmative Terms Improves Negation Understanding | MohammadHossein Rezaei et.al. | 2406.07492v1 | null |
2024-06-11 | VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | Zesen Cheng et.al. | 2406.07476v1 | link |
2024-06-11 | On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations | Shiao Meng et.al. | 2406.07444v1 | null |
2024-06-11 | GemNet: Menu-Based, Strategy-Proof Multi-Bidder Auctions Through Deep Learning | Tonghan Wang et.al. | 2406.07428v1 | null |