Hi, just dropping a friendly pointer to my project at Tophness/ChatGPT-PC-Controller.
I noticed you're using pyautogui.
I experimented with this for a long time, but it isn't compatible with newer Windows UI frameworks like UIA and WPF, only the much older Win32 API, which is deprecated in Windows 11.
I notice you're just using pyautogui.click() and pyautogui.write() rather than directly finding, reading, editing, and triggering Windows control elements, but it's much more powerful if you do.
GPT-Vision wasn't even available when I made my project; it simply knew what to do without ever seeing the screen.
Driving control elements directly means it can run in the background without taking over the user's mouse and keyboard, and it can even hide the app it's controlling entirely.
It could just browse the web, crunch some numbers on some data it found and send off emails about it in parallel while you're editing a video, and you wouldn't even notice the difference.
It's also instant (no need to wait for delays between clicks and presses), and there's no room for error.
Even an unexpected window popping up is no issue, since it doesn't need the window to be active to control it, and it can activate it automatically and wait for it to be active if need be.
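For comparison, here's a minimal sketch of what driving control elements directly can look like from Python, using the third-party pywinauto library with its UIA backend (neither project currently uses this; the window title and control are hypothetical). The target window never needs focus, so the user's mouse and keyboard stay free:

```python
# Sketch only: requires Windows and `pip install pywinauto`.
from pywinauto import Application

# Attach to an already-running Notepad instance by title (hypothetical example).
app = Application(backend="uia").connect(title="Untitled - Notepad")
dlg = app.window(title="Untitled - Notepad")

# Write text through the UIA value pattern: no focus, no mouse movement,
# and no artificial delays between simulated keystrokes.
edit = dlg.child_window(control_type="Edit")
edit.set_edit_text("Written in the background.")

# Read it back to verify, again without activating the window.
print(edit.get_value())
```

The same `child_window` lookup also handles the unexpected-popup case: you match the dialog by title or control type and act on it whether or not it is in the foreground.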
ChatGPT can also write out whole scripts that do the job from a single response.
For this and many other reasons (like reading pixel RGB values), I recommend using AutoIt.
I started off making an interpreter for AutoIt that took natural-language input and wrote the code itself, but it turns out ChatGPT is well versed enough in AutoIt that you can just hook the DLL calls directly using an AST.
I did this before OpenAI's Function Calling API existed, so it would only be that much more powerful now.
Feel free to copy anything I did, and hit me up if you'd like to merge these two projects in some way.
I would say you might as well keep a separate pyautogui mode that works as-is while we merge things like vision over to an AutoIt mode.