ndvinh98/CS336.J21-Music-Search-Engine

CS336.J21-Music-Search-Engine

Build a simple search engine based on TF-IDF

Introduction

The data is crawled from the top-100 song chart on nhaccuatui.com.

The code editor used for this project is Visual Studio Code.

The crawling methods are in crawl.py.

The string normalization methods are in textprocessing.py.

The methods for constructing and indexing the data are in indexing.py.

The methods for building the GUI, querying, and ranking results are in main.py.
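
To illustrate how normalization, indexing, and TF-IDF ranking fit together, here is a minimal, self-contained sketch. It is not the code from textprocessing.py, indexing.py, or main.py: the function names (`normalize`, `build_index`, `search`), the sample songs, and the plain `tf * log(N / df)` weighting are all assumptions made for illustration. The project lists underthesea as a dependency, presumably for Vietnamese word segmentation; this sketch uses simple whitespace splitting instead so that it runs with the standard library only.

```python
import math
import re
from collections import Counter, defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, split on whitespace.

    Hypothetical helper; the project's own normalization may differ.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return text.split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(normalize(text)).items():
            index[term][doc_id] = count
    return index


def search(query, index, n_docs):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in normalize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data, not taken from the crawled chart.
songs = {
    "s1": "Em oi Ha Noi pho",
    "s2": "Ha Noi mua thu",
    "s3": "Mua thu cho em",
}
index = build_index(songs)
results = search("ha noi pho", index, len(songs))
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this sample, "s1" ranks above "s2" because it contains the rarer term "pho" in addition to the two terms the documents share. A term that occurs in every document gets an IDF of zero and contributes nothing to the score.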


You will need to install the following packages:

pip install requests

pip install beautifulsoup4

pip install selenium

pip install underthesea

pip install PyQt5
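
Before running the scripts, it can help to confirm that the packages above are importable. The following is a small optional sketch, not part of the repository; the helper name `missing_packages` is hypothetical. Note that beautifulsoup4 is imported under the module name `bs4`.

```python
import importlib.util

# pip package name -> import module name
REQUIRED = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "selenium": "selenium",
    "underthesea": "underthesea",
    "PyQt5": "PyQt5",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```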

How to run

Step 1: Crawl the data (if using existing crawled data, skip this step):

python crawl.py

Step 2: Build the data and inverted_index (if using existing crawled data, skip this step):

python indexing.py

Step 3: Start the GUI:

python main.py

Demo
