
Testing dataset likely poorly represents the set of images that a Firefox description writer would be used for #1

Closed
hkolbeck opened this issue Jun 5, 2024 · 2 comments

Comments

@hkolbeck

hkolbeck commented Jun 5, 2024

The dataset used in the tool is located here: https://huggingface.co/datasets/Mozilla/alt-text-validation

Having scanned through the photos being used to test the AI under development, I find they are similar in important ways to the dataset I used when testing direct-comparison algorithms for very close but not equal images (think the same image resized and/or compressed), explained here: https://github.com/alt-text-org/image-algo-testing.
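For context on what "very close but not equal" means in practice, here is a minimal illustrative sketch, not the exact algorithms from the linked repo, assuming the Pillow and imagehash packages: it compares a perceptual hash of an image against a resized, recompressed copy of itself.

```python
# Illustrative sketch only (assumed libraries: pillow, imagehash).
# Shows the "same image, resized and/or compressed" comparison case.
from io import BytesIO

from PIL import Image
import imagehash


def near_duplicate(path: str, scale: float = 0.5, jpeg_quality: int = 40) -> bool:
    """Return True if a resized, recompressed copy still hashes close to the original."""
    original = Image.open(path).convert("RGB")

    # Simulate a degraded copy: downscale, then re-encode as a low-quality JPEG.
    resized = original.resize(
        (max(1, int(original.width * scale)), max(1, int(original.height * scale)))
    )
    buffer = BytesIO()
    resized.save(buffer, format="JPEG", quality=jpeg_quality)
    buffer.seek(0)
    degraded = Image.open(buffer)

    # Perceptual hashes of near-identical images differ by a small Hamming distance.
    distance = imagehash.phash(original) - imagehash.phash(degraded)
    return distance <= 8  # threshold is an assumption; tuned per dataset in practice
```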

In exploring those results, I realized that the image data I used, pulled from a Facebook algorithm-invention event, poorly represented the categories of photos posted by social media users, my prioritized userbase. I prioritized those users in part because I believed the vast majority of uses of the tool I was building would be pictures posted to social media. I believe an AI description engine would serve a very similar population, dominated by people posting images to social media. That belief is based on my last three years of deep involvement in building alt-text writing tools, but it deserves more in-depth, documented research into the classes of images likely to be fed to the AI for description.

After some consideration I decided that, given the nature of the tests mentioned above, the dataset I used was not reliable; but given the still-clear results of that testing and the existence of several other "in future testing..." items, I did not re-run the tests with a more representative collection at that time. I assert that image type matters much more here, due to the nature of the algorithms being tested and their likely usage.

I think decisions on this project should be based not just on my claims here, but on active research and outreach to the community about how such a tool is likely to be used, with the results integrated into the design and testing of this tool. I assert that a public release before that research and communication would not be responsible if the goal is increasing web accessibility.

@tarekziade
Collaborator

Thanks a lot for your feedback. We'd love to engage more with the community to make sure we do the right thing. Do you have some tips on how we can improve the training, or some folks we can reach out to?

@hkolbeck
Author

hkolbeck commented Jun 6, 2024

> Thanks a lot for your feedback. We'd love to engage more with the community to make sure we do the right thing. Do you have some tips on how we can improve the training, or some folks we can reach out to?

I just emailed, but I want to say here that I now know my understanding of Mozilla's plans was mistaken; so while the concerns above are valid, I don't think this issue applies to this repo, and I will close it. As I said in the email, I still think it's worth discussing with y'all and hope to do so.

@hkolbeck hkolbeck closed this as completed Jun 6, 2024