idea #66

alafortu · 2023-12-02T19:35:25Z

I played with GPT-4V on other projects and it definitely has a hard time figuring out coordinates. I used another model, trained on image identification, to find the coordinates of the bounding box around each detected object, and then passed those to GPT-4 to perform an action. For your use case, I just tested this model: https://huggingface.co/foduucom/web-form-ui-field-detection. It is far from perfect, but maybe an idea to build on. If your auto computer can detect and get the proper coordinates of the input fields in an image, it could help, or at least add a level of redundancy, to improve accuracy in clicking and inputting stuff at the right places.
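
To make the idea concrete, here is a minimal sketch of the step between the detector and GPT-4: turning each detected bounding box into a click point. It assumes the detector returns pixel boxes as (x1, y1, x2, y2) with a label and a confidence score. The `Box` class, the function names, the label strings, and the 0.5 confidence cutoff are all hypothetical and not taken from the linked model or from this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical detection result: pixel corners plus label and score."""
    label: str
    confidence: float
    x1: float
    y1: float
    x2: float
    y2: float


def click_targets(boxes, image_width, image_height, min_confidence=0.5):
    """Convert detections into click points at the center of each box.

    Returns both pixel coordinates and percentages of the image size,
    so the caller can use whichever form its click routine expects.
    """
    targets = []
    for box in boxes:
        # Skip weak detections (the cutoff value is an assumption).
        if box.confidence < min_confidence:
            continue
        cx = (box.x1 + box.x2) / 2
        cy = (box.y1 + box.y2) / 2
        targets.append({
            "label": box.label,
            "x_px": round(cx),
            "y_px": round(cy),
            "x_pct": round(100 * cx / image_width, 1),
            "y_pct": round(100 * cy / image_height, 1),
        })
    return targets


def describe_targets(targets):
    """Format the targets as numbered lines that could be added to a prompt."""
    return "\n".join(
        f"{i}: {t['label']} at ({t['x_px']}, {t['y_px']})"
        for i, t in enumerate(targets)
    )


if __name__ == "__main__":
    detections = [
        Box("text_input", 0.9, 100, 200, 300, 240),
        Box("button", 0.3, 0, 0, 10, 10),  # dropped: below the cutoff
    ]
    print(describe_targets(click_targets(detections, 1000, 800)))
```

With a list like this in the prompt, the language model only has to pick an index instead of guessing raw coordinates, which is where the redundancy would come from.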

Bunger-Beesechurger · 2023-12-02T21:01:45Z

@rohanarun
I'm not a contributor to this GitHub repo, just part of the audience usually, but this seems earlier than your video. Early August is when this article came out, so it's been in the works even earlier than that. Stop spamming every issue. You said you've been working on your thing for over a year, but how much of the info came out before your video? I don't know whether it's plagiarism or not, and if it is, I'm sorry. However, I can still be annoyed that, on what should be a cool new project for tech advancement, we have to figure out whether something is stolen or not.

Article says "HyperWriteAI" and from this github's own main page: "Ongoing Development
At HyperwriteAI, we are developing Agent-1-Vision a multimodal model with more accurate click location predictions" so it is referencing this project.

Kreijstal · 2023-12-03T11:05:21Z

I mean, you are saying you have a custom model, but all I see is proprietary business products. Your custom model is handwritten for the cases, but this is GPT-4V, so it's not a rip-off; they just had the idea (wouldn't it be cool if GPT-4 could control computers) and open sourced it first 🤷. It can't be a rip-off, because you started without GPT-4V and trained a proprietary custom model; these guys just did prompt engineering and got it to work with GPT-4V, without using any custom models.

If these guys get more fame, it's because they open sourced it first, and then it's first come, first served. I think it's fair, imho.

Also, your insecurity is showing. If your product were really good, there would be no need to spam it on every issue. Just give us something better and people will naturally flock to it.

James4Ever0 · 2023-12-03T14:40:29Z

Posting these over and over will not help. AGI is for everyone, truly democratic.
For a long time, not a single company has taken on the field of autonomous computers, until now. I have been waiting for this very moment for so long. It must be open source, and it will change human history for good.

James4Ever0 · 2023-12-03T14:41:20Z

For inspiration, please check #37 #32

michaelhhogue · 2023-12-03T16:57:36Z

@alafortu Thanks for the suggestion. Low accuracy with GPT-4V is a known issue at the moment, and support for other models is planned for the future.

michaelhhogue closed this as completed Dec 3, 2023

OthersideAI deleted a comment from rohanarun Dec 4, 2023
