Ml 685 agent commands move mouse to target without click #28

Merged
programminx-askui merged 22 commits into ML-649-ask-ui-integration-add-ai-element-command-via-askui-inference from ML-685-Agent-Commands-Move-mouse-to-target-without-click
Feb 27, 2025
Conversation

@programminx-askui
Collaborator

Hello,

I added a mouse_move command

@programminx-askui programminx-askui marked this pull request as ready for review February 25, 2025 20:26
@programminx-askui programminx-askui changed the base branch from ML-718-agent-os-make-controller-compatible-with-25-2-1-remote-device-controller to main February 25, 2025 20:38
Contributor

@adi-wan-askui adi-wan-askui left a comment

Solid :)

programminx-askui and others added 15 commits February 26, 2025 17:51
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: Samir Mlika <105347215+mlikasam-askui@users.noreply.github.com>
Co-authored-by: adi-wan-askui <105295410+adi-wan-askui@users.noreply.github.com>
Co-authored-by: adi-wan-askui <105295410+adi-wan-askui@users.noreply.github.com>
…patible-with-25-2-1-remote-device-controller

feat: #ML-718 add support for new RemoteDeviceController
Contributor

@adi-wan-askui adi-wan-askui left a comment

Nice :) It is coming together

@@ -94,7 +94,7 @@ def click(self, instruction: Optional[str] = None, button: Literal['left', 'midd
def __mouse_move(self, instruction: str, model_name: Optional[str] = None) -> None:
Contributor

What about also renaming instruction in all of the public api that used ModelRouter.locate() underneath to locator?

Collaborator Author

I thought about it, too, but I wouldn't do this. I want to avoid users having to deal with the concept of the locator.

We (PM and I) want to expose as few internals as possible, such as GroundingModel and the locator concept, to users, so that they do not have to learn new concepts.

Contributor

@adi-wan-askui adi-wan-askui Feb 27, 2025

I agree that we don't want the user to learn new concepts if we can avoid it. But in this case I think we actually have different concepts and just gave them the same name, which I think is even less optimal.

  • act() actually accepts an instruction: either a high-level goal, e.g., Book a flight from Frankfurt to San Francisco, or a step-by-step description of what to do, e.g., Open skyscanner.com, search for flights from Frankfurt to San Francisco, choose the first one etc. In my experience the step-by-step form works better than just a goal.
  • get() (or ask()) actually needs a query, e.g., Is there a flight cheaper than $ 1000?
  • Last but not least, click(), mouse_move() etc. are actually about describing the element that you want to interact with, e.g., click on the magnifying glass icon.

That is why I would go for different names for the parameters. I would propose the following:

  • act() ⟶ instruction (I like it better than goal, as it makes no assumption about how abstract it needs to be)
  • ask() ⟶ query
  • click(), mouse_move() etc. ⟶ locator (which for me is a description of the element that you would like to interact with, and which may also be an image)
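
To make the proposed naming concrete, here is a minimal sketch of how the three parameter names could read at the call site. SketchAgent is a hypothetical stand-in that only records calls; it is not the actual agent class from this repository, and the real signatures (button, model_name, etc.) are omitted.

```python
from typing import List, Tuple


class SketchAgent:
    """Hypothetical stand-in for the agent; it only records what was called."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def act(self, instruction: str) -> None:
        # A high-level goal or a step-by-step description of what to do.
        self.calls.append(("act", instruction))

    def ask(self, query: str) -> None:
        # A question about what is currently visible on screen.
        self.calls.append(("ask", query))

    def click(self, locator: str) -> None:
        # A description of the element to interact with.
        self.calls.append(("click", locator))

    def mouse_move(self, locator: str) -> None:
        # Same kind of element description, but no click is performed.
        self.calls.append(("mouse_move", locator))


agent = SketchAgent()
agent.act(instruction="Open skyscanner.com, search for flights from Frankfurt to San Francisco")
agent.ask(query="Is there a flight cheaper than $ 1000?")
agent.mouse_move(locator="magnifying glass icon")
agent.click(locator="magnifying glass icon")
```

With keyword arguments, each call states which kind of input it expects, which is the distinction argued for above.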

Also the term locator is not super new to users that are engineers working on automation (RPA, test etc.), see

programminx-askui and others added 6 commits February 27, 2025 09:11
Co-authored-by: adi-wan-askui <105295410+adi-wan-askui@users.noreply.github.com>
Co-authored-by: adi-wan-askui <105295410+adi-wan-askui@users.noreply.github.com>
Co-authored-by: adi-wan-askui <105295410+adi-wan-askui@users.noreply.github.com>
…k' of github.com:askui/vision-agent into ML-685-Agent-Commands-Move-mouse-to-target-without-click
@programminx-askui programminx-askui changed the base branch from main to ML-649-ask-ui-integration-add-ai-element-command-via-askui-inference February 27, 2025 08:42
@programminx-askui
Collaborator Author

I will merge this into the AI element branch and then test everything end to end, so that we can move toward the release.

…kui-inference' into ML-685-Agent-Commands-Move-mouse-to-target-without-click
@programminx-askui programminx-askui merged commit dd6651e into ML-649-ask-ui-integration-add-ai-element-command-via-askui-inference Feb 27, 2025
@adi-wan-askui adi-wan-askui deleted the ML-685-Agent-Commands-Move-mouse-to-target-without-click branch February 28, 2025 17:53
2 participants