
Please add install instructions for Linux on the https://openadapt.ai/#start page #631

Open
hemangjoshi37a opened this issue Apr 19, 2024 · 7 comments
Labels: enhancement

Comments

@hemangjoshi37a

Feature request

Please add install instructions for Linux on the https://openadapt.ai/#start page.

Motivation

Please add install instructions for Linux on the https://openadapt.ai/#start page.

hemangjoshi37a added the enhancement label on Apr 19, 2024
@abrichr (Contributor) commented Apr 23, 2024

Hi @hemangjoshi37a, thank you for your interest!

Unfortunately, OpenAdapt does not currently support Linux, for two reasons:

  1. Supporting Linux would be non-trivial additional effort for minimal gain. According to https://gs.statcounter.com/os-market-share/desktop/worldwide, Linux currently holds about 4% of the global desktop OS market share.

  2. The input control library we use (pynput) does not support differentiating between "injected" (synthetic) and regular (human) input on Linux (see "Using event suppression to create a remapper", moses-palmer/pynput#105 (comment)). While we do not yet make use of this functionality, we plan to in the near future; a sketch of the distinction is shown below.
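
For context, a minimal sketch of what this distinction looks like on Windows, where pynput's `win32_event_filter` exposes the raw low-level hook data; the Linux backends expose no equivalent flag. This is illustrative only, not OpenAdapt code:

```python
# Sketch: detecting synthetic vs. human keystrokes with pynput on Windows.
# The low-level hook passes a KBDLLHOOKSTRUCT whose `flags` field has the
# LLKHF_INJECTED bit set for programmatically injected events; pynput's
# Linux backends provide no equivalent, hence the limitation above.
from pynput import keyboard

LLKHF_INJECTED = 0x10  # Win32 flag set on injected keystrokes

def win32_event_filter(msg, data):
    injected = bool(data.flags & LLKHF_INJECTED)
    print("synthetic" if injected else "human", "key event")
    return True  # propagate the event to the regular listener callbacks

# `win32_event_filter` is a Windows-only argument; other platforms ignore it.
listener = keyboard.Listener(win32_event_filter=win32_event_filter)
listener.start()
```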

That said, we would welcome a Pull Request to add installation instructions for Linux! The relevant repo is at https://github.com/OpenAdaptAI/OpenAdapt.web. Of course, this would require testing the core library on Linux as well.

@hemangjoshi37a (Author) commented Apr 25, 2024

But you should consider that 96% of the developers who will ultimately use OpenAdapt are on Linux, not your average everyday Karens. LOL. Also, I believe your response is AI-generated and that no actual person is responsible for it.

@metatrot commented Jul 8, 2024

The strength of this project's approach seems to be that it uses SAM and multimodal models to visually parse GUI layouts, instead of relying on OS-specific features like Windows' accessibility API. Every month I check again to see whether I can use it on my OS yet. The feature I'm really excited about is simply a way to parse a whole-screen screenshot into something that can be described in detail by an LLM. The automation/interactive parts of the project aren't necessary for me. I just want a super powerful OCR-like tool that works on whole-screen screenshots and gives structured output: text, window, button, and other input field locations.
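
For illustration, a minimal sketch of that idea using plain Tesseract OCR (via pytesseract) rather than SAM or a multimodal model: it recovers only text and its on-screen locations, not windows or buttons, and the 60% confidence cutoff is an arbitrary choice.

```python
# Sketch: turn a whole-screen screenshot into structured text-plus-location
# records. Real GUI parsing (windows, buttons, input fields) would layer a
# segmentation model such as SAM on top of this.
from PIL import ImageGrab  # full-screen capture (Windows/macOS; X11 with recent Pillow)
import pytesseract

screenshot = ImageGrab.grab()
data = pytesseract.image_to_data(screenshot, output_type=pytesseract.Output.DICT)

for i, text in enumerate(data["text"]):
    if text.strip() and float(data["conf"][i]) > 60:  # drop low-confidence noise
        print({
            "text": text,
            "left": data["left"][i],
            "top": data["top"][i],
            "width": data["width"][i],
            "height": data["height"][i],
        })
```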

@hemangjoshi37a (Author)

@metatrot I have tried building something very similar, but it's at a very early stage: https://github.com/microsoft/graphrag

@abrichr (Contributor) commented Jul 12, 2024

@metatrot thank you for the information!

> I just want a super powerful OCR-like tool that works on whole-screen screenshots and gives structured output: text, window, button, and other input field locations.

Can you please clarify what this would be useful for?

@abrichr (Contributor) commented Jul 12, 2024

@hemangjoshi37a I believe you pasted the wrong link.

@metatrot

@abrichr I would use it for the same purposes as this: https://github.com/louis030195/screen-pipe
(sadly, that project is still Mac-only at the moment)
