OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
-
Updated
Jun 1, 2024 - Python
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Aircraft design optimization made fast through modern automatic differentiation. Composable analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvment, and skill curation, in a standardized general environment with minimal requirements.
A reading list for large models safety, security, and privacy.
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Famous Vision Language Models and Their Architectures
[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
Matlab implementation to simulate the non-linear dynamics of a fixed-wing unmanned areal glider. Includes tools to calculate aerodynamic coefficients using a vortex lattice method implementation, and to extract longitudinal and lateral linear systems around the trimmed gliding state.
Official code for Paper "Mantis: Multi-Image Instruction Tuning"
🧘🏻♂️KarmaVLM (相生):A family of high efficiency and powerful visual language model.
This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
[ICRA 2024] Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
Add a description, image, and links to the vlm topic page so that developers can more easily learn about it.
To associate your repository with the vlm topic, visit your repo's landing page and select "manage topics."