Jiadong Yan
Jiaming Xu
Xinyi Jiang
May 10th 2017
- basic search: match a free-text description against title, lyric, album and artist_name
- advanced search: search on every available field, including title, lyric, album, artist_name, location, duration, genres, year
- More like this: find tracks similar to a given track, based on its title, lyrics and the similarity of the artist
- Sorting:
  - default: sort by relevance
  - sort by song hotness
  - sort by danceability
- install Homebrew, then install hdf5 and Cython:
brew tap homebrew/science
brew install hdf5
pip install Cython
- install PyTables:
sudo pip install git+https://github.com/PyTables/PyTables
- install other packages mentioned in Dependency.
- build elasticsearch as mentioned in Build Elasticsearch
- in a terminal, type:
redis-server
- open another terminal window and type:
cd elasticsearch-<version>
./bin/elasticsearch
python query.py
cp lib/name_syn.txt [your elasticsearch path]/config/name_syn.txt
cp lib/cat_syn.txt [your elasticsearch path]/config/cat_syn.txt
(This is not recommended, but if you really want your web application to access a folder outside its deployment directory, you need to add a permission to the java.policy file. For details, see http://stackoverflow.com/questions/10454037/java-security-accesscontrolexception-access-denied-java-io-filepermission)
- start the elasticsearch server:
cd elasticsearch-<version>
./bin/elasticsearch
- run
python ./lib/buildElaticSearch.py
- build time: 9s
- use another terminal to run
redis-server
- We support basic title and lyrics search for whatever you want!
- We support many filters, like duration, artist, genre, etc!
- You can find hotttest songs near your position!!!
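One way to express "near your position" is as a latitude/longitude window, since the advanced search query has min_/max_ latitude and longitude fields. The sketch below is an assumption about how such a window could be computed, not the project's own code; `near_position` and `delta_deg` are hypothetical names, and the window does not wrap around the 180° meridian.

```python
def near_position(lat, lon, delta_deg=1.0):
    """Return latitude/longitude bounds of a square window centred on (lat, lon).

    delta_deg is half the window size in degrees (hypothetical parameter).
    Bounds are clamped to valid coordinates; no wrap-around at +/-180 degrees.
    """
    return {
        "min_latitude": max(lat - delta_deg, -90.0),
        "max_latitude": min(lat + delta_deg, 90.0),
        "min_longitude": max(lon - delta_deg, -180.0),
        "max_longitude": min(lon + delta_deg, 180.0),
    }


# Bounds for a window around an example position.
window = near_position(42.36, -71.06, delta_deg=0.5)
print(window)
```

These bounds could then be merged into an advanced search query and sorted with 'hot'. The web form sends its values as strings, so a conversion may be needed first.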
- hdf5
- Cython
- Flask
- PyTables
- elasticsearch_dsl
- elasticsearch
- json
- math
- redis
{"1":{
"trackID": string,
"title": (song's name) string,
"year": int,
"song_hotttnesss": float,
"artistName": string,
"artistID": string,
"artist_hotttnesss": float,
"artist_location": string,
"duration": (seconds) int,
"release": (album name) string,
"similar_artists": (artistIDs) list of string,
"lyrics": string,
"artist_longitude": float,
"artist_latitude": float,
"danceability": float
},
"2":{
},
...
}
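To check that a corpus file follows this format before indexing it, a small validator can help. This is a minimal sketch: the field names and types come from the format above, while `check_track`, the tolerance for `None` values and the sample record are assumptions made for illustration.

```python
import json

# Field names and types taken from the corpus format described above.
TRACK_FIELDS = {
    "trackID": str,
    "title": str,
    "year": int,
    "song_hotttnesss": float,
    "artistName": str,
    "artistID": str,
    "artist_hotttnesss": float,
    "artist_location": str,
    "duration": int,
    "release": str,
    "similar_artists": list,
    "lyrics": str,
    "artist_longitude": float,
    "artist_latitude": float,
    "danceability": float,
}


def check_track(track):
    """Return a list of problems found in one track record (empty if none)."""
    problems = []
    for name, expected in TRACK_FIELDS.items():
        if name not in track:
            problems.append("missing field: " + name)
            continue
        value = track[name]
        if value is None:
            continue  # assumption: missing values may be stored as null
        if expected is float and isinstance(value, int):
            continue  # accept 0 where 0.0 is expected
        if not isinstance(value, expected):
            problems.append("wrong type for " + name)
    return problems


# Hypothetical, deliberately incomplete record to show the output shape.
corpus = json.loads('{"1": {"trackID": "TR0001", "title": "Example"}}')
for key, track in corpus.items():
    print(key, check_track(track))
```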
We have a test corpus, sample_corpus.json. To build the elasticsearch index with this corpus, call the build method with the path of the sample corpus as its parameter: build("sample_corpus.json")
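As a rough illustration of what a build step might do with the corpus, the sketch below turns each track into a bulk-index action. It is not the project's build method: `corpus_to_actions` and the index name `music_index` are hypothetical, and older Elasticsearch versions may also require a `_type` entry in each action.

```python
import json


def corpus_to_actions(corpus, index_name="music_index"):
    """Yield one bulk-index action per track.

    The corpus keys ("1", "2", ...) are reused as document ids.
    index_name is a hypothetical default, not taken from the project.
    """
    for doc_id, track in corpus.items():
        yield {"_index": index_name, "_id": doc_id, "_source": track}


def load_actions(path, index_name="music_index"):
    """Read a corpus JSON file and return its bulk actions as a list."""
    with open(path) as f:
        return list(corpus_to_actions(json.load(f), index_name))
```

A list produced this way, for example `load_actions("sample_corpus.json")`, could be handed to `elasticsearch.helpers.bulk` together with a client object.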
- simple query: {'description': u'love'}
- advanced search query: d_query = {'title': u'', 'lyric': u'', 'album': u'', 'max_longitude': u'', 'min_longitude': u'', 'description': u'love', 'max_duration': u'','min_duration': u'', 'artist_name': u'', 'min_latitude': u'', 'max_latitude': u'', 'year': u'', 'genre': u'', 'artist_location': u''}
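To make the query format concrete, here is a sketch of how such a dict could be translated into an Elasticsearch request body. It is an illustration under assumptions, not the code in this repository: `build_query` is a hypothetical helper, and the mapping from form keys to index fields (for example album to release, and a `genre` field) is guessed from the corpus format.

```python
# Fields searched by the free-text description (basic search).
BASIC_FIELDS = ["title", "lyrics", "release", "artistName"]

# Assumed mapping from query keys to index field names.
TEXT_FIELDS = {
    "title": "title",
    "lyric": "lyrics",
    "album": "release",
    "artist_name": "artistName",
    "artist_location": "artist_location",
    "genre": "genre",
}

# min_<name>/max_<name> pairs become range filters on these fields.
RANGE_FIELDS = {
    "duration": "duration",
    "latitude": "artist_latitude",
    "longitude": "artist_longitude",
}


def build_query(d_query):
    """Translate a query dict into an Elasticsearch bool query body."""
    must, filters = [], []
    if d_query.get("description"):
        must.append({"multi_match": {"query": d_query["description"],
                                     "fields": BASIC_FIELDS}})
    for key, field in TEXT_FIELDS.items():
        if d_query.get(key):  # empty strings are skipped
            must.append({"match": {field: d_query[key]}})
    if d_query.get("year"):
        filters.append({"term": {"year": int(d_query["year"])}})
    for name, field in RANGE_FIELDS.items():
        bounds = {}
        if d_query.get("min_" + name):
            bounds["gte"] = float(d_query["min_" + name])
        if d_query.get("max_" + name):
            bounds["lte"] = float(d_query["max_" + name])
        if bounds:
            filters.append({"range": {field: bounds}})
    bool_query = {"must": must or [{"match_all": {}}]}
    if filters:
        bool_query["filter"] = filters
    return {"query": {"bool": bool_query}}


print(build_query({'description': u'love', 'min_duration': u'120'}))
```

Empty form values are ignored, so the simple query and the advanced query can share one code path.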
For More like this, the parameter is the track_id of the track to compare against.
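A similar-tracks lookup by track_id could be expressed with Elasticsearch's `more_like_this` query on title and lyrics, combined with a match on similar artists. The sketch below is one possible formulation and not necessarily what this project does; `more_like_this_body`, the index name `music_index` and the way artist similarity is folded in are assumptions.

```python
def more_like_this_body(track_id, similar_artists=(), index_name="music_index"):
    """Build a request body that looks for tracks similar to track_id.

    similar_artists: artistIDs from the source track's similar_artists field.
    index_name: hypothetical index name.
    """
    should = [{
        "more_like_this": {
            "fields": ["title", "lyrics"],
            # refer to the already indexed document by id
            "like": [{"_index": index_name, "_id": str(track_id)}],
            "min_term_freq": 1,
        }
    }]
    if similar_artists:
        # tracks by artists listed as similar also count as candidates
        should.append({"terms": {"artistID": list(similar_artists)}})
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


print(more_like_this_body(1, ["ARTIST_A", "ARTIST_B"]))
```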
Pass 'hot' or 'dance' as the second parameter to sort the results by song_hotttnesss or danceability, respectively.
search({'description': u'love'},'hot')
search(d_query,'hot')
search_track(1,'dance')
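The sort parameter can be thought of as a lookup from 'hot' or 'dance' to a field name, which is then added to the request as a descending sort. A minimal sketch, assuming the index uses the corpus field names; `with_sort` is a hypothetical helper and not part of the project's API.

```python
SORT_FIELDS = {"hot": "song_hotttnesss", "dance": "danceability"}


def with_sort(body, sort_by=None):
    """Return a copy of a request body with a sort clause added.

    Unknown or missing sort_by values leave the default relevance ordering.
    """
    result = dict(body)
    field = SORT_FIELDS.get(sort_by)
    if field is not None:
        # highest value first, relevance score as tie-breaker
        result["sort"] = [{field: {"order": "desc"}}, "_score"]
    return result


print(with_sort({"query": {"match_all": {}}}, "hot"))
```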
- Main entry or runtime app of the Music Information Retrieval System.
- Integrates all the models and handles all HTTP requests.
- Search Algorithm implementation.
- Session management.
css file and images
- search box view
- search results view
the corpus of 10000 songs; the format is described above.
This file defines the doc_type track and its fields, as well as the analyzers.
This file builds the elasticsearch index from music_corpus.json.
This file takes different types of query as input, builds the elasticsearch search query, runs it against elasticsearch and returns the results. It is responsible for searching on all fields and sorting by specific features.
get raw information from hdf5 file ==> raw_music_corpus.json
get lyrics information according to train data.txt file ==> music_corpus.json
get lyrics information and genre information using API to MXM website ==> new_music_corpus.json
get desired attributes from the corpus