WIP: Map comprehension experiments #860
base: dev
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Greptile Summary
This PR adds point detection capabilities to the VLM (Vision-Language Model) system, enabling VLMs to identify specific point locations in images rather than just bounding boxes. Key changes:
Critical issue:
Confidence Score: 2/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant VlModel
participant MoondreamVlModel
participant Detection2DPoint
participant ImageDetections2D
participant ImageAnnotations
User->>VlModel: query_points(image, query)
VlModel->>VlModel: _prepare_image(image)
VlModel->>VlModel: query_json(scaled_image, full_query)
loop For each point tuple
VlModel->>Detection2DPoint: vlm_point_to_detection2d_point()
Detection2DPoint-->>VlModel: Detection2DPoint instance
VlModel->>ImageDetections2D: append detection
end
VlModel-->>User: ImageDetections2D
User->>ImageDetections2D: to_foxglove_annotations()
ImageDetections2D->>Detection2DPoint: to_circle_annotation()
ImageDetections2D->>Detection2DPoint: to_text_annotation()
ImageDetections2D-->>User: ImageAnnotations
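The point-query loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `PointDetection`, `PointDetections`, the `ask_json` callable, the prompt wording, the `[name, x, y]` reply shape, and the rescaling by `scale` are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PointDetection:
    """Hypothetical stand-in for Detection2DPoint."""
    x: float
    y: float
    name: str
    confidence: float = 1.0


@dataclass
class PointDetections:
    """Hypothetical stand-in for ImageDetections2D."""
    detections: List[PointDetection] = field(default_factory=list)


def query_points(
    ask_json: Callable[[str], str], query: str, scale: float = 1.0
) -> PointDetections:
    """Ask a VLM for points and wrap each one as a detection.

    ask_json: assumed callable that sends a prompt to the model and returns
        JSON text such as [["cup", 120, 80]] in scaled-image pixels.
    scale: assumed factor by which the image was resized before querying;
        coordinates are mapped back to the original image by dividing.
    """
    full_query = f"Return a JSON list of [name, x, y] points for: {query}"
    raw_points = json.loads(ask_json(full_query))

    result = PointDetections()
    for name, x, y in raw_points:  # one detection per point tuple
        result.detections.append(
            PointDetection(x=x / scale, y=y / scale, name=str(name))
        )
    return result
```

Passing the model call in as a plain callable keeps the sketch testable without loading a real model.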
11 files reviewed, 2 comments
class Detection2DPoint(Detection2D):
    """A 2D point detection, visualized as a circle."""

    x: float
    y: float
    name: str
    ts: float
    image: Image
    track_id: int = -1
    class_id: int = -1
    confidence: float = 1.0
logic: Missing abstract method to_ros_detection2d() required by base class Detection2D. This will cause TypeError: Can't instantiate abstract class Detection2DPoint with abstract method to_ros_detection2d at runtime when creating instances.
See Detection2DBBox.to_ros_detection2d() at dimos/perception/detection/type/detection2d/bbox.py:395 for reference implementation.
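The failure mode described in this comment can be reproduced in isolation. The sketch below uses a hypothetical stand-in for `Detection2D` with a single abstract method; the real base class and the real return type of `to_ros_detection2d()` are not shown in this thread, so the dictionary payload is only a placeholder.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Detection2D(ABC):
    """Hypothetical stand-in for the real base class."""

    @abstractmethod
    def to_ros_detection2d(self) -> dict:
        ...


@dataclass
class IncompletePoint(Detection2D):
    # No to_ros_detection2d(): the class stays abstract.
    x: float
    y: float


@dataclass
class CompletePoint(Detection2D):
    x: float
    y: float

    def to_ros_detection2d(self) -> dict:
        # Placeholder payload; the real method would build a ROS message.
        return {"center": (self.x, self.y)}


def try_instantiate(cls):
    """Return an instance, or the TypeError raised while creating it."""
    try:
        return cls(1.0, 2.0)
    except TypeError as exc:
        return exc
```

Defining the class succeeds either way; the `TypeError` only appears when an instance is created, which is why the problem can go unnoticed until the new code path runs.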
if state.target:
    pose: LCMTransport[PoseStamped] = LCMTransport("/target", PoseStamped)
    pose.publish(target)
syntax: target is undefined. Should be state.target.
- pose.publish(target)
+ pose.publish(state.target)
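A small sketch of why this slips through: Python resolves the bare name only when the line executes, so the error is a `NameError` raised on the first call where `state.target` is set. `State` and `RecordingTransport` below are hypothetical stand-ins for the PR's state object and `LCMTransport`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class State:
    """Hypothetical state object with an optional target."""
    target: Optional[str] = None


class RecordingTransport:
    """Hypothetical transport that records what was published."""

    def __init__(self) -> None:
        self.published: List[str] = []

    def publish(self, msg: str) -> None:
        self.published.append(msg)


def publish_buggy(state: State, pose: RecordingTransport) -> None:
    if state.target:
        # Bare name lookup fails here, but only when this branch runs.
        pose.publish(target)  # noqa: F821


def publish_fixed(state: State, pose: RecordingTransport) -> None:
    if state.target:
        pose.publish(state.target)
```

With no target set, `publish_buggy` returns silently, so a test that never sets a target would not catch the bug.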
* captioner modules implemented in models/vl, flake.nix fixes
* model structure rework
* refactor
* bugfix
* removed double update_intrinsic on metric3d
* mypy
* typing fixes
* embedding models rewrite
* mobileclip preprocess accessor rewrite
* torch reid models added to lfs, reid/embedding model cleanup
* mobileclip upload
* batch vlm querying
* moondream batch queries and tests
* type fixes
* proper model resource management, speed tests, auto-resizing, plotting
* type fixes
* tests, mypy, correct cleanup
* metric3d tests
* attempting to remove dead code
* scaling bugfix for visual models
* docstring fix
* plotext dep
* open clip dep
* open clip dep fix
* gdown dep
* tensorboard dep
* typing fixes for detections and plotter
* person tracker typing fix
* py 3.10 typing fix
* last type fix
* ignore missing imports (for ros deps)
* nicer init for florence
* type fixes
* mypy ignore ros/mujoco
* addressing PR comments
* image is a fixture
* captioner fixtures
* all PR comments addressed

Former-commit-id: 8be510d [formerly cbe68d2]
Former-commit-id: b70ed02
37afea0 to d27ee89
Too many files changed for review.
Former-commit-id: 22e5cd2
22e5cd2 to 9c27672
9c27672 to 72353ba
run foxglove-bridge
export ALIBABA_API_KEY=...
pytest -sv dimos/agents2/skills/interpret_map/eval/test_ivan_eval.py