Asynchronous 3D understanding of photorealistic tiles #561

@ngoiyaeric

Look at the example and find a solution using this codebase

3D bounding boxes are a new experimental feature in Gemini 2.0 and will continue to improve in future models.

To get 3D bounding boxes, you need to tell the model exactly what output format you want. The format used in the linked example is the recommended one, as it is the one the model knows best.
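
As a rough sketch of what "telling the model the output format" can look like, the snippet below spells out a format instruction and parses a reply that follows it. The instruction wording, the `label` / `box_3d` keys, and the nine-value box layout are assumptions based on the linked notebook and should be checked against it; `FORMAT_INSTRUCTIONS` and `parse_boxes` are hypothetical names, not part of this codebase.

```python
import json

# Assumed format instruction, modelled on the linked notebook (verify there).
FORMAT_INSTRUCTIONS = (
    'Output a json list where each entry contains the object name in "label" '
    'and its 3D bounding box in "box_3d". '
    "The 3D bounding box format should be "
    "[x_center, y_center, z_center, x_size, y_size, z_size, roll, pitch, yaw]."
)

FENCE = "`" * 3  # models often wrap JSON in a Markdown code fence


def parse_boxes(reply_text):
    """Parse a model reply into a list of {'label', 'box_3d'} dicts.

    Entries that do not carry exactly nine box values are skipped.
    """
    text = reply_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (e.g. "```json") and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    items = json.loads(text)
    boxes = []
    for item in items:
        if not isinstance(item, dict):
            continue
        box = item.get("box_3d")
        if isinstance(box, list) and len(box) == 9:
            boxes.append(
                {
                    "label": str(item.get("label", "")),
                    "box_3d": [float(value) for value in box],
                }
            )
    return boxes
```

Skipping malformed entries instead of raising is a design choice for an experimental feature: one bad box does not discard the rest of the detections for a tile.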

To prevent the model from repeating itself, use a temperature above 0 (0.5 in this case). Limiting the number of items (10 in this case) also helps prevent the model from looping and speeds up decoding of the bounding boxes. You can experiment with these parameters to find what works best for your use case.
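
The two settings above could be wired into a request roughly as follows. This is a minimal sketch, assuming the `google-genai` Python SDK and the model id `gemini-2.0-flash`; `make_generation_settings` and `request_boxes` are hypothetical helpers, and the item-limit sentence is an assumed wording, not a fixed API parameter.

```python
def make_generation_settings(temperature=0.5, max_items=10):
    """Bundle the two knobs from the text: temperature and item limit."""
    if temperature <= 0:
        # The text recommends a temperature above 0 to avoid repetition.
        raise ValueError("temperature should be above 0")
    if max_items < 1:
        raise ValueError("max_items should be at least 1")
    return {
        "temperature": temperature,
        # Assumed wording: the limit is expressed in the prompt itself.
        "item_limit_clause": (
            f"Detect the 3D bounding boxes of no more than {max_items} items."
        ),
    }


def request_boxes(image, format_instructions, settings, model="gemini-2.0-flash"):
    """Send one image plus prompt to the model and return the raw reply text.

    Assumes the google-genai SDK is installed and an API key is available
    in the environment; the model id is an assumption.
    """
    # Imported lazily so the settings helper works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()
    prompt = f"{settings['item_limit_clause']} {format_instructions}"
    response = client.models.generate_content(
        model=model,
        contents=[image, prompt],
        config=types.GenerateContentConfig(temperature=settings["temperature"]),
    )
    return response.text
```

A call might look like `request_boxes(tile_image, format_text, make_generation_settings())`, where `tile_image` is a PIL image of one tile and `format_text` is whatever output-format instruction you settle on; both names are placeholders.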

https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Spatial_understanding_3d.ipynb#scrollTo=WbWHzjtT6ELv

Image depth estimation and geometry in 3D tiles
