louisdo/MoviesSearch

MoviesSearch

A simple movie search engine built on term frequency-inverse document frequency (TF-IDF).
I saw my cousin building one, got curious, and tried to build my own.
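
For readers new to TF-IDF, here is a minimal, self-contained sketch of the general idea. It is not the code in this repository: the function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, the smoothed IDF formula, and the cosine-similarity ranking are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_index(docs):
    """Build TF-IDF vectors for a dict mapping title -> description."""
    tokenized = {title: tokenize(text) for title, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    # Rare terms get a higher weight; the +1.0 keeps every weight positive.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}

    vectors = {}
    for title, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[title] = {t: (c / len(tokens)) * idf[t] for t, c in tf.items()}
    return vectors, idf


def search(query, vectors, idf, top_k=5):
    """Rank documents by cosine similarity between query and document vectors."""
    q_tf = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: c * idf[t] for t, c in q_tf.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))

    scores = []
    for title, vec in vectors.items():
        dot = sum(w * vec.get(t, 0.0) for t, w in q_vec.items())
        d_norm = math.sqrt(sum(w * w for w in vec.values()))
        norm = q_norm * d_norm
        scores.append((dot / norm if norm else 0.0, title))

    # Highest score first; ties broken alphabetically by title.
    scores.sort(key=lambda s: (-s[0], s[1]))
    top = scores[:top_k]
    return [(rank, title) for rank, (score, title) in enumerate(top, start=1) if score > 0]
```

Terms that appear in few documents (such as a character name) dominate the score, which is why a query made of distinctive words tends to surface the matching movie first.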

Prepare data

rem "Download the general data from: https://drive.google.com/file/d/1CZsJGWS9hZ7z2t_fJcmxnn-fVo_EtM5P/view?usp=sharing"
rem "Create a folder named 'data'"
rem "Move the downloaded CSV file into 'data'"

python prepare_data/create_tokenize_data.py --csv-in ./data/general_movies_data.csv --csv-out ./data/tokenized_data.csv

How to run

>>> from MoviesSearch import MoviesSearchEngine
>>> search_engine=MoviesSearchEngine("path/to/tokenized/data","path/to/general/data")
>>> search_engine.search("woody and buzz lightyear")
[(1, 'Toy Story'), (2, 'Toy Story 3'), (3, 'Toy Story 2'), (4, 'In the Shadow of the Moon'), (5, 'For Your Consideration')]
