Suggest depictions using image recognition #75
Comments
A friend has left Google to create an AI company and is looking for people to test his library. He promises to open source it soon. Unlike Google libraries, it is usable offline. This looks like a great opportunity to develop this feature, since no such library existed so far (as far as I know) |
Sounds great, and yeah should definitely be opt-in. I could chuck this into my IEG renewal proposal, but that probably won't be for another couple of months, so anyone who wants to work on it sooner is most welcome. |
There is a grant proposal to create an API for that: |
@nicolas-raoul Sounds very useful! How did you hear of it? I wanted to post an endorsement, but their community notifications section is still empty so I was hesitant. :) |
@misaochan I learned about it here: https://www.wikidata.org/wiki/Wikidata:Project_chat#Interesting_initiative |
I did the same. :) Even if the grant is approved though, it will probably be about a year before the API is usable (the grant is 8 months, and I believe the next Project Grant round starts in July). |
Thanks for the endorsement @nicolas-raoul! I am one of the guys behind the proposal. We welcome any suggestions and advice! |
Recent WMF blog post https://blog.wikimedia.org.uk/2018/02/structured-data-on-commons-is-the-most-important-development-in-wikimedias-usability/ :
This sounds very similar to the present issue. The idea of swiping left/right is interesting, let's gather the pros/cons:
Cons of swiping:
The other new idea we can steal from this blog is that category suggestion could be used not only for the picture I just uploaded, but also for uncategorized pictures uploaded by other people. |
Hi, |
@aaronpp65 basic questions about this solution:
Also, if I understand correctly, that library gives you a word like "leopard" or "container ship", right? How do you propose matching these strings to:
|
It’s machine learning on the go, without the need for connectivity. |
Yes, the library gives you a word like "leopard" or "container ship", but that happens when we use a pre-trained Inception v3. It's trained on the ImageNet dataset. |
@aaronpp65 Very impressive, thanks!
So I guess we'd be better off trying to match from ImageNet categories to Commons or Wikidata. |
Yeah... so mapping ImageNet categories to Commons should do the trick. |
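One way to do that mapping at runtime would be to search the Commons `Category:` namespace (namespace 14) through the standard MediaWiki search API. A minimal Python sketch, assuming we feed the classifier label straight into the search (the helper names are my own, and a real implementation would need error handling and rate limiting):

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"

def build_category_search_url(label: str, limit: int = 5) -> str:
    """Build a Commons API URL that searches the Category namespace (14)
    for pages matching a classifier label such as "leopard"."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": label,
        "srnamespace": 14,   # the Category: namespace
        "srlimit": limit,
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)

def extract_category_titles(response_json: dict) -> list:
    """Pull category titles out of a parsed list=search API response."""
    return [hit["title"] for hit in response_json.get("query", {}).get("search", [])]

# Example with a canned response instead of a real network call:
sample = {"query": {"search": [{"title": "Category:Leopards"}]}}
print(build_category_search_url("leopard"))
print(extract_category_titles(sample))  # ['Category:Leopards']
```

The plain-text search is a crude stand-in for a curated ImageNet-to-Commons mapping; it will return false positives for ambiguous labels.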
@nicolas-raoul Could you please check my draft and give feedback? |
@aaronpp65 Could you please post a link to your draft? Thanks! |
https://docs.google.com/document/d/1am3EbhBrwaYn2_LLKAmnrXlzTGVWgttCdALAV4fy_NU/edit?usp=sharing |
Yes, please post it on Phabricator, thanks :-)
|
Could you please explain in more detail the following steps:
- Convert the model to the TensorFlow Lite file format.
- Integrate the converted model into the Android application.
Also, please add a step-by-step description of what the user will see, what screen they will go to, and what button they click, so that we understand what this project will bring to the app. Feel free to include hand-drawn screens to make it clearer if necessary.
Thanks! :-)
|
@nicolas-raoul I have made the required changes and have added a basic wireframe. |
Here is a web-based tool that suggests categories for any image: https://youtu.be/Y9lvXVJCiyc?t=1932
If I understand correctly, the wiki page calls a MediaWiki API which in turn calls a third-party image recognition tool. Having MediaWiki in the middle prevents the user's IP address from being leaked, so I guess we could actually use this right now. |
It looks like this uses a Toolforge tool (https://tools.wmflabs.org/imagery/api.php) which is currently down(?) - it returns a 500 error on a query from the script for me. It's been a long time, I believe it was meant to be a proof of concept that was not going to be maintained as it was. |
I hope the source code is still available somewhere and someone turns it into a more permanent tool :-) |
My understanding is that we still need to find either:
The API or library must output either Commons category(ies) (example: "the submitted image contains a https://commons.wikimedia.org/wiki/Category:Dogs") or Wikipedia/Wikidata item(s) (example: "the submitted image contains a https://www.wikidata.org/wiki/Q144"). |
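To make the required output shape concrete, here is a minimal hand-maintained mapping table in Python. The `dog` / `Category:Dogs` / `Q144` entry comes from the example above; the table structure and function name are my own illustration, and a real mapping would need hundreds of entries plus a fallback search for unmapped labels:

```python
# Illustrative, hand-maintained mapping from classifier labels to a
# (Commons category, Wikidata item) pair. Q144 is the item for "dog".
LABEL_TO_COMMONS = {
    "dog": ("Category:Dogs", "Q144"),
}

def suggest(label: str):
    """Return the (Commons category, Wikidata item) pair for a classifier
    label, or None if the label is not in the mapping."""
    return LABEL_TO_COMMONS.get(label.lower())

print(suggest("Dog"))      # ('Category:Dogs', 'Q144')
print(suggest("leopard"))  # None -- unmapped labels need a fallback
```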
@nicolas-raoul I agree that using a third-party API such as Azure would be a privacy concern. There is an alternative: https://wadehuang36.github.io/2017/07/20/offline-image-classifier-on-android.html |
Thanks @madhurgupta10 ! |
@nicolas-raoul I managed to build it. I can share the APK file if you would like. |
Thanks! Did you modify any of that project's files? If yes, please fork/commit and push your fork to GitHub, thanks :-) Wow, 32 MB is very big. I am sure the TensorFlow libraries contain many unused classes/network types/etc. Ideally image recognition should not add more than a few MB to the total size of our app's APK. Anyone willing to take on this issue for GSoC/Outreachy, please include that trimming task in your schedule, thanks! |
@nicolas-raoul Sure, I will add that to my proposal :) and will commit the files soon. Also, TF 2.0 is out, so it would be much more optimized and better than this example, which is pretty old. |
I believe the project above uses regular TensorFlow. Using TensorFlow Lite will certainly reduce the size a lot, but still not enough, I am afraid. Other things to try: |
@nicolas-raoul |
There is a pre-built APK at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#bazel |
I would take a look at this, and hopefully we can incorporate it in the app for category suggestions, and later maybe to suggest depicts for Wikidata. |
Most image classification implementations output WordNet 3.0 concepts. I just wrote this query that shows the mapping between WordNet concepts, Wikidata items, and Commons categories. It takes a while to execute, so here is a screenshot: There are currently 474 mappings, and it has not increased in a year. I will try to motivate people to add more mappings. |
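I cannot reproduce the linked query here, but a mapping query along those lines could be built on the real properties P2888 ("exact match", which can point at Princeton WordNet RDF URIs) and P373 ("Commons category"). Whether the query above uses exactly these properties is an assumption; this is a sketch, not that query:

```python
from urllib.parse import quote

# Sketch of a WordNet -> Wikidata -> Commons mapping query.
# Assumption: mappings use P2888 ("exact match") pointing at Princeton
# WordNet RDF URIs, plus P373 ("Commons category").
WORDNET_MAPPING_QUERY = """
SELECT ?item ?itemLabel ?wordnet ?commonsCat WHERE {
  ?item wdt:P2888 ?wordnet .                      # exact match
  FILTER(STRSTARTS(STR(?wordnet), "http://wordnet-rdf.princeton.edu/"))
  OPTIONAL { ?item wdt:P373 ?commonsCat . }       # Commons category
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

def query_url(sparql: str) -> str:
    """Build a Wikidata Query Service URL for a SPARQL query."""
    return "https://query.wikidata.org/sparql?format=json&query=" + quote(sparql)

print(query_url(WORDNET_MAPPING_QUERY)[:60])
```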
Good news, this is starting to get implemented on commons.wikimedia.org : |
This page seems to do exactly what we want: https://commons.wikimedia.org/wiki/Special:SuggestedTags I have asked whether an API could be made for us: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data/Computer-aided_tagging/Archive_2020#API_to_retrieve_%22depicts%22_suggestions_from_Commons_app? (no reply unfortunately) |
Wow! The suggestions look quite useful. For each of the first 5 images, I found at least 1 relevant tag suggested. |
Their algorithm is fantastic IMO!! Had at least 2 relevant tags for 3/3 of the photos I saw. @macgills , is this something that you and Mark have identified as a future potential task for you (after getting our SDC branch merged)? |
I couldn't say! Will for sure discuss it with him at our next meeting on monday. |
Awesome! Let us know how that goes. :) |
In my attempt at testing 6 or 7 images, the suggestions were mostly relevant. In some cases there were even 10 appropriate suggestions! Also, none of the suggestions could be called totally irrelevant. This looks great!
Yeah, guess what they are using in the backend: Google Cloud Vision. 😎 [ref] On a related note, the Wikipedia app is adding a new option in their Suggested edits feature that allows users to tag Commons images with suggested image tags [ref 1] [ref 2] [ref 3]. This is already in their alpha app. Not sure if it's in the production version, though. I suppose they're using an API related to the |
Ideally in the future we could use on-device models to do this. This would remove the need to either call a web service or embed a bulky model in our APK. https://ai.google.dev/tutorials/android_aicore :
Hopefully image-to-text will come soon. |
The idea is nice. I'm just unsure what the community consensus is about using machine assistance to edit depictions. Do you happen to be aware of any guidelines about it, Nicolas? |
@sivaraam I don't think there are any guidelines about this currently. The AICAT experiment was stopped due to some strong opposing voices, but I believe our app is a very different use case. In our app:
|
Embedded models would be a good way to avoid privacy issues (everything is done on the device), but currently it only supports text-to-text, not image-to-text: https://developer.android.com/ai/gemini-nano https://ai.google.dev/edge |
It would be great if the app suggested the item Elephant when I take a picture of an elephant.
There are some APIs for this, not sure if any is usable for free.
The API would provide a few words such as {elephant, zoo} and we would perform a Wikidata item search on these words and add the resulting items to the list of suggestions.
If using an online service, the feature should probably be opt-in, since the privacy policy of the API will most probably be incompatible with the Wikimedia privacy policy.
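The search step described above (words like {elephant, zoo} turned into item suggestions) could use the real `wbsearchentities` module of the Wikidata API. A sketch, assuming hypothetical helper names; the canned responses stand in for real network calls (Q7378 and Q43501 are, I believe, the items for elephant and zoo):

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_item_search_url(word: str, limit: int = 3) -> str:
    """Build a wbsearchentities URL for one word from the recognition API."""
    params = {
        "action": "wbsearchentities",
        "search": word,
        "language": "en",
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)

def merge_suggestions(words, results_by_word):
    """Flatten per-word API results into one de-duplicated list of item IDs.
    `results_by_word` maps each word to a parsed wbsearchentities response."""
    seen, items = set(), []
    for word in words:
        for entry in results_by_word.get(word, {}).get("search", []):
            if entry["id"] not in seen:
                seen.add(entry["id"])
                items.append(entry["id"])
    return items

# Canned responses instead of live calls:
canned = {"elephant": {"search": [{"id": "Q7378"}]},
          "zoo": {"search": [{"id": "Q43501"}]}}
print(build_item_search_url("elephant"))
print(merge_suggestions(["elephant", "zoo"], canned))  # ['Q7378', 'Q43501']
```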