Letterboxd & Twitter data collection project for a movie recommendation system

celikfatih/data-collector

data-collector

Introduction

data-collector is an application that gathers data for research on movie recommendation systems.

What does it use?

It collects its data from the movie review site Letterboxd. It gathers Letterboxd profiles that link to a Twitter profile, along with 1800 movie ratings for each profile.

It then uses the Twitter usernames of those users to collect their Twitter profiles and each user's last 3000 tweets.
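
The hand-off between the two steps depends on turning the Twitter link found on a Letterboxd profile into a username. A minimal sketch of that step in plain Java follows. The class name `TwitterHandleExtractor`, the method `extract`, and the accepted link shapes are assumptions made for illustration; they are not taken from this repository.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwitterHandleExtractor {

    // Assumed link shape: http(s)://[www.|mobile.]twitter.com/<handle>,
    // optionally with a leading @, a trailing slash, or a query string.
    // Twitter handles are 1 to 15 characters: letters, digits, underscore.
    private static final Pattern PROFILE_LINK = Pattern.compile(
            "^https?://(?:www\\.|mobile\\.)?twitter\\.com/@?([A-Za-z0-9_]{1,15})/?(?:\\?.*)?$",
            Pattern.CASE_INSENSITIVE);

    // Returns the handle if the URL looks like a Twitter profile link,
    // otherwise an empty Optional (for example, for tweet or non-Twitter URLs).
    public static Optional<String> extract(String url) {
        if (url == null) {
            return Optional.empty();
        }
        Matcher m = PROFILE_LINK.matcher(url.trim());
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("https://twitter.com/some_user"));  // Optional[some_user]
        System.out.println(extract("https://example.com/some_user"));  // Optional.empty
    }
}
```

Returning an `Optional` keeps profiles without a usable Twitter link out of the second collection step without relying on null checks.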

Used technologies

This is a Spring Boot project. The framework provides support for database I/O and scheduled tasks.

Cassandra is used as the database. It was chosen because it is a NoSQL store, performs well on large datasets, and is open source.

Jsoup is a widely used Java library for HTML parsing and web scraping. It is used to scrape data from the Letterboxd site.
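
As a rough illustration of how Jsoup can pull a link out of a page, the sketch below parses an HTML string and returns the first link that points at twitter.com. The class `ProfileLinkScraper`, the method `findTwitterLink`, and the sample markup are hypothetical; the real Letterboxd markup and this project's selectors may differ. It assumes Jsoup 1.11.1 or later on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ProfileLinkScraper {

    // Returns the href of the first <a> whose href contains "twitter.com",
    // or null when the page has no such link.
    public static String findTwitterLink(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.selectFirst("a[href*=twitter.com]");
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        // Hypothetical markup, used only so the example runs offline.
        String html = "<div><a href=\"https://twitter.com/some_user\">Twitter</a></div>";
        System.out.println(findTwitterLink(html));  // https://twitter.com/some_user
    }
}
```

Against a live page, the `Document` would come from `Jsoup.connect(url).get()` instead of `Jsoup.parse(html)`.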

Twitter4J was used to collect profiles and related tweets of Twitter users.

Finally, Lombok generates boilerplate such as getters, setters, and constructors, so they do not have to be written by hand.
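
To make concrete what Lombok saves, here is a sketch of a small model class written out by hand, with comments naming the Lombok annotation that would generate each part. The class `MovieRating` and its fields are hypothetical and are not taken from this repository.

```java
public class MovieRating {

    private String username;
    private String filmTitle;
    private double rating;

    // With Lombok: @NoArgsConstructor
    public MovieRating() {
    }

    // With Lombok: @AllArgsConstructor
    public MovieRating(String username, String filmTitle, double rating) {
        this.username = username;
        this.filmTitle = filmTitle;
        this.rating = rating;
    }

    // With Lombok: @Getter
    public String getUsername() { return username; }
    public String getFilmTitle() { return filmTitle; }
    public double getRating() { return rating; }

    // With Lombok: @Setter
    public void setUsername(String username) { this.username = username; }
    public void setFilmTitle(String filmTitle) { this.filmTitle = filmTitle; }
    public void setRating(double rating) { this.rating = rating; }
}
```

With Lombok on the classpath, the same class would reduce to the three field declarations plus `@Getter`, `@Setter`, `@NoArgsConstructor`, and `@AllArgsConstructor` on the class.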

How to use?

Soon.
