Currently pursuing a PhD in Visual Perception and Reasoning.
My research interests revolve around Vision-Language Models, Embodied Agents, and Scene Graph Generation. I am passionate about creating generalist AI models capable of understanding and interacting with complex visual data.
- Visual Generalist Models: Developing models that process diverse visual data (e.g., images, videos, 3D, audio, IMU) to tackle various tasks in perception, reasoning, generation, robotics, and gaming. Notable projects include EgoLife, Octopus, FunQA, and Otter.
- AI Safety for Foundation Models: Investigating how to mitigate hallucinations in large language models (LLMs) and large multimodal models (LMMs). A key contribution is the introduction of UPD (Unsolvable Problem Detection), which encourages models to withhold answers when faced with unsolvable questions.
- PSG Series (2022-2023): Led the development of the PSG, PVSG, and PSG4D projects, focusing on relation modeling for scene understanding. I also collaborated on related works such as Relate-Anything and PairNet.
- OOD Detection (2021-2022): Led a comprehensive survey on out-of-distribution (OOD) detection and developed OpenOOD, a popular codebase for OOD detection in AI safety.
- Prompt Tuning (2022): Contributed to foundational works like CoOp and CoCoOp for prompt tuning in vision-language models.
- Email: yangjingkang001@gmail.com
- LinkedIn: Jingkang Yang
- Twitter: @JingkangY
Feel free to reach out for collaboration or just to chat about AI and technology!
Thanks for visiting my profile!