
Testing dataset likely poorly represents the set of images that a Firefox description writer would be used for #1

Closed
hkolbeck opened this issue Jun 5, 2024 · 2 comments

Comments

@hkolbeck

hkolbeck commented Jun 5, 2024

The dataset used in the tool is located here: https://huggingface.co/datasets/Mozilla/alt-text-validation

Having scanned through the photos being used to test the AI under development, I find they are similar in important ways to the dataset I used when testing direct-comparison algorithms for very close but not equal images (think the same image resized and/or compressed), explained here: https://github.com/alt-text-org/image-algo-testing.
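For context on what "very close but not equal" means in practice, here is a minimal illustrative sketch, not the exact algorithms from the linked repo, assuming the Pillow and imagehash packages: it compares a perceptual hash of an image against a resized, recompressed copy of itself.

```python
# Illustrative sketch only (assumed libraries: pillow, imagehash).
# Shows the "same image, resized and/or compressed" comparison case.
from io import BytesIO

from PIL import Image
import imagehash


def near_duplicate(path: str, scale: float = 0.5, jpeg_quality: int = 40) -> bool:
    """Return True if a resized, recompressed copy still hashes close to the original."""
    original = Image.open(path).convert("RGB")

    # Simulate a degraded copy: downscale, then re-encode as a low-quality JPEG.
    resized = original.resize(
        (max(1, int(original.width * scale)), max(1, int(original.height * scale)))
    )
    buffer = BytesIO()
    resized.save(buffer, format="JPEG", quality=jpeg_quality)
    buffer.seek(0)
    degraded = Image.open(buffer)

    # Perceptual hashes of near-identical images differ by a small Hamming distance.
    distance = imagehash.phash(original) - imagehash.phash(degraded)
    return distance <= 8  # threshold is an assumption; tuned per dataset in practice
```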

In exploring those results, I realized that the image data I used, pulled from a Facebook algorithm-invention event, poorly represented the categories of photos posted by social media users, my prioritized userbase. I prioritized those users in part because I believed the vast majority of uses of the tool I was building would be pictures posted to social media. I believe an AI description engine would serve a very similar population, dominated by people posting images to social media. That belief is based on my last three years of deep involvement in building alt-text writing tools, but it deserves more in-depth, documented research into the classes of images likely to be fed to the AI for description.

After some consideration I decided that, given the nature of the tests mentioned above, the dataset I used was not reliable; but given the still-clear results of that testing and the existence of several other "in future testing..." items, I did not re-run the tests with a more representative collection at that time. I assert that image type matters much more here, due to the nature of the algorithms being tested and their likely usage.

I think decisions on this project should be based not just on my claims here, but on active research and outreach to the community about how such a tool is likely to be used, with the results integrated into the design and testing of this tool. I assert that a public release before that research and communication would not be responsible if the goal is increasing web accessibility.

@tarekziade
Collaborator

Thanks a lot for your feedback. We'd love to engage more with the community to make sure we do the right thing. Do you have some tips on how we can improve the training, or some folks we can reach out to?

@hkolbeck
Author

hkolbeck commented Jun 6, 2024

> Thanks a lot for your feedback. We'd love to engage more with the community to make sure we do the right thing. Do you have some tips on how we can improve the training, or some folks we can reach out to?

I just emailed, but I want to say here that I now know my understanding of Mozilla's plans was mistaken; so while the concerns above are valid, I don't think this issue applies to this repo, and I will close it. As I said in the email, I still think it's worth discussing with y'all and hope to do so.

@hkolbeck hkolbeck closed this as completed Jun 6, 2024