Image Recommendation for Wikipedia Articles

Abstract

Multimodal learning, i.e., simultaneous learning from different data sources such as audio, text, and images, is a rapidly emerging field of Machine Learning. It is also considered learning at the next level of abstraction, which would allow us to tackle more complicated problems such as generating cartoons from a plot or recognizing speech from lip movements.

In this paper, we propose to research whether state-of-the-art multimodal learning techniques can solve the problem of recommending the most relevant images for a Wikipedia article. In other words, we need to create a shared text-image representation of the abstract notion an article describes, so that, given only a text description, a machine would "understand" which images visualize the same notion accurately.
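To make the idea concrete, below is a minimal sketch of such a shared representation, assuming a dual-encoder setup that ranks candidate images by cosine similarity to the article text. This is an illustration only, not the model developed in this work; the dimensions, the bag-of-words text encoder, and the precomputed image features are all assumptions.

```python
# Illustrative sketch of a shared text-image embedding space (not the
# project's actual model): encode the article and each candidate image
# into the same vector space, then rank images by cosine similarity.
# Weights here are random; in practice they would be trained.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    def __init__(self, vocab_size=30000, dim=256):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # bag-of-words pooling
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):
        return self.proj(self.embed(token_ids))

class ImageEncoder(nn.Module):
    def __init__(self, feat_dim=2048, dim=256):
        super().__init__()
        # assumes precomputed CNN features (e.g., a ResNet pooling layer)
        self.proj = nn.Linear(feat_dim, dim)

    def forward(self, feats):
        return self.proj(feats)

def rank_images(text_vec, image_vecs):
    # cosine similarity between the article and every candidate image
    sims = F.cosine_similarity(text_vec, image_vecs)
    return sims.argsort(descending=True)

text_enc, img_enc = TextEncoder(), ImageEncoder()
article = torch.randint(0, 30000, (1, 64))  # 64 token ids for one article
candidates = torch.randn(10, 2048)          # features of 10 candidate images
order = rank_images(text_enc(article), img_enc(candidates))
print(order)  # candidate indices, most relevant first
```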

Data

The dataset was collected from Wikipedia and uploaded to Kaggle using the pyWikiMM library. For more details about the dataset's collection process and structure, please refer to its description on Kaggle.
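For reference, here is a hedged sketch of fetching a dataset with the official Kaggle API client; the dataset slug below is a placeholder, so substitute the one shown on the dataset's Kaggle page.

```python
# Hypothetical download sketch using the official Kaggle API client
# (pip install kaggle; requires ~/.kaggle/kaggle.json credentials).
import kaggle

kaggle.api.dataset_download_files(
    "owner/wikipedia-image-dataset",  # placeholder slug, not the real one
    path="data/",
    unzip=True,
)
```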

Reproduce Results

You can reproduce the results either by setting up the environment locally or by cloning the notebooks from Kaggle. The following Kaggle notebooks are available:

More Details

You can find more details about the problem statement and our solution approach in the "Image Recommendation for Wikipedia Articles" thesis.

Acknowledgments

Special thanks to Miriam Redi for actively mentoring me in this project.