
sabrinasu424/Picture-Your-Way

Picture-Your-Way

Building a picture-based attraction recommendation system by combining LDA and K-Means in the structure of Scatter/Gather

Webpage | 3-minute demo/introduction

Abstract

The main purpose of Picture Your Way is to build a picture-based attraction recommendation system. The proposed algorithm combines Latent Dirichlet Allocation (LDA) and K-Means in the structure of Scatter/Gather. Through this system, thousands of tourist attractions in Taiwan are recommended to users through a direct and quick process: selecting pictures. We used the attraction information listed in the government's open tourist attraction data and selected corresponding pictures from Instagram. The picture rankings produced by the Elo Rating System are moderately correlated with actual public preferences. The accuracy of the LDA topics is 66%. The precision of the proposed algorithm is 74%. The system effectiveness calculated from user feedback is PR 72. These results indicate that the system, using the proposed algorithm, succeeds in recommending suitable and beautiful attractions in Taiwan to users.

Features

  • Instant Demonstration: Distinguish attraction types by browsing pictures, without reading long descriptions.
  • Convenience: Greatly reduce search time for users.
  • Accuracy: Achieve optimal clustering and recommendation through machine learning algorithm training and careful evaluation.
  • Aesthetics: Weight the displayed pictures by rankings aligned with public preferences to better meet users' needs.

Core Methods

  • Scatter/Gather: A document clustering algorithm used to cluster a massive number of documents within a short period of time, which allows us to provide suitable tourist attractions to users efficiently.
  • LDA Topic Modeling: A machine learning algorithm used to extract latent topics from text data, which we use for the first "Scatter" step in the Scatter/Gather structure. (Unlike TF-IDF, LDA considers term distributions in addition to term frequencies.)
  • K-Means Clustering: An unsupervised learning algorithm that provides instant clustering in this system. Our evaluation showed that it is feasible to use each tourist spot's LDA topic distribution as the attributes for the Euclidean distance calculation in K-Means. This replaces the traditional sparse term-document matrix and reduces time and space costs.
  • Elo Rating System: A picture ranking algorithm inspired by the film The Social Network (2010). Because aesthetics is an important factor in picture selection, we rated the pictures with the Elo rating algorithm and integrated the resulting rankings into the system as one basis for choosing which pictures to display.
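
The Elo-based picture ranking described in the last item can be illustrated with a minimal sketch. It uses the standard Elo update formula; the starting rating of 1500, the K-factor of 32, and the function names are assumptions chosen for illustration, not values or code taken from this project.

```python
from typing import Dict, Iterable, List, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that picture A is preferred over picture B (standard Elo model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return the new (winner, loser) ratings after one pairwise comparison."""
    surprise = 1.0 - expected_score(winner, loser)  # large when an underdog wins
    return winner + k * surprise, loser - k * surprise


def rank_pictures(
    comparisons: Iterable[Tuple[str, str]],
    initial: float = 1500.0,  # assumed starting rating
    k: float = 32.0,          # assumed K-factor
) -> List[str]:
    """Rank picture ids from (preferred, other) pairs, best first."""
    ratings: Dict[str, float] = {}
    for preferred, other in comparisons:
        w = ratings.setdefault(preferred, initial)
        l = ratings.setdefault(other, initial)
        ratings[preferred], ratings[other] = update_elo(w, l, k)
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical votes: each tuple means "the first picture was preferred".
votes = [("beach", "temple"), ("beach", "market"), ("temple", "market")]
print(rank_pictures(votes))
```

Each vote moves rating points from the less-preferred picture to the preferred one, and an unexpected result moves more points than an expected one.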

Techniques

Areas | Techniques
Analytics | Python
Front-end | JavaScript, CSS, HTML
Back-end | PHP
Database | MySQL
Cloud Service | Google Cloud Platform
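
As a rough Python illustration of the clustering idea in Core Methods, the sketch below runs a plain K-Means over per-attraction topic distributions using Euclidean distance. The topic vectors are made-up values standing in for LDA output, and `kmeans` is a hypothetical helper written for this example, not the project's implementation.

```python
import math
import random
from typing import List


def euclidean(p: List[float], q: List[float]) -> float:
    """Euclidean distance between two topic distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points: List[List[float]], k: int, iterations: int = 20, seed: int = 0) -> List[int]:
    """Return a cluster index for every point (basic Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


# Hypothetical topic distributions (3 topics) for six attractions.
topic_vectors = [
    [0.90, 0.05, 0.05], [0.85, 0.10, 0.05], [0.80, 0.10, 0.10],  # mostly topic 0
    [0.05, 0.90, 0.05], [0.10, 0.85, 0.05], [0.10, 0.80, 0.10],  # mostly topic 1
]
labels = kmeans(topic_vectors, k=2)
print(labels)
```

Because each attraction is described by a short dense vector with one entry per topic, instead of one entry per vocabulary term, the distance computation stays cheap. In a Scatter/Gather loop, the attractions in the cluster a user picks could be passed to `kmeans` again to produce the next, narrower set of groups.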
