To build and query the ES_example application:
- Run the Elasticsearch server in the background (e.g., double-click bin\elasticsearch.bat on Windows)
- Run: python index.py (builds an index called simple_film_index)
- Run: python query.py
- Open http://127.0.0.1:5000 in a browser to query the application
Title: Assignment 5: Learning Elasticsearch
Author: Zhiheng Wang
Date: 4/14/2019
Description: Build a search engine over 2018 movie data with Elasticsearch and Flask.
Dependencies:
Python 3.6.5
Flask (http://flask.pocoo.org)
Elasticsearch server (https://www.elastic.co/downloads/elasticsearch)
Elasticsearch Python client (https://pypi.org/project/elasticsearch)
Elasticsearch-dsl (https://elasticsearch-dsl.readthedocs.io/en/latest)
Build Instructions: Install these packages in any order.
Run Instructions:
Run the Elasticsearch server in the background.
index.py: builds an inverted index for the database.
query.py: starts the search engine (served at http://127.0.0.1:5000).
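As a rough illustration of what the indexing step involves, the sketch below turns film records into bulk-index actions for the simple_film_index index. It is not the contents of index.py: the function name, the sample records, and the field names (title, runtime) are assumptions made for the example.

```python
INDEX_NAME = "simple_film_index"  # index name used by this application


def film_actions(films, index_name=INDEX_NAME):
    """Yield one bulk-index action per film.

    `films` is assumed to map a document id to a dict of fields.
    """
    for film_id, fields in films.items():
        yield {"_index": index_name, "_id": film_id, "_source": fields}


# Hypothetical sample data; the real corpus and its field names may differ.
sample_films = {
    "1": {"title": "Example Film", "runtime": 131},
    "2": {"title": "Another Film", "runtime": 95},
}

actions = list(film_actions(sample_films))
print(actions[0])
```

With a running server, actions in this shape can be passed to elasticsearch.helpers.bulk(client, actions) from the Python client. Older Elasticsearch versions may also expect a "_type" key in each action.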
Tokenization:
The Elasticsearch standard tokenizer is used for free-text search fields, and the whitespace tokenizer for all other fields.
Text Normalization:
Text fields: Porter stemming, lowercasing, and ASCII folding. Other fields: lowercasing only.
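The tokenization and normalization choices above could be expressed as Elasticsearch analysis settings along the following lines. This is a minimal sketch, not the configuration in index.py: the analyzer names (text_analyzer, basic_analyzer) are hypothetical, and the filter order (lowercase, then asciifolding, then porter_stem) is an assumption.

```python
import json

# Analysis settings as a plain dict, in the shape Elasticsearch expects
# under the "settings" key when an index is created.
ANALYSIS_SETTINGS = {
    "analysis": {
        "analyzer": {
            # Hypothetical name: analyzer for free-text fields.
            "text_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "asciifolding", "porter_stem"],
            },
            # Hypothetical name: analyzer for all other fields.
            "basic_analyzer": {
                "type": "custom",
                "tokenizer": "whitespace",
                "filter": ["lowercase"],
            },
        }
    }
}

print(json.dumps(ANALYSIS_SETTINGS, indent=2))
```

Stemming is placed last so that the stemmer sees lowercased, accent-free tokens.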
Testing:
Top 3 search results for the search text crime drama “philip roth” with a minimum runtime of 130:
Drama, score: 10.024916
Abrahaminte Santhathikal, score: 9.678776
My Brother's Name Is Robert and He Is an Idiot, score: 8.974666
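A test query like the one above, combining free text, a quoted phrase, and a minimum runtime, might translate into an Elasticsearch bool query roughly as sketched below. This is an assumption about how such a query could be built, not the logic in query.py; the function name and the field names text and runtime are hypothetical.

```python
import re

# Matches phrases wrapped in straight or curly double quotes.
PHRASE_RE = re.compile(r'["“]([^"”]+)["”]')


def build_query(search_text, min_runtime=None,
                text_field="text", runtime_field="runtime"):
    """Build a bool query body from free text, quoted phrases, and a runtime floor."""
    phrases = PHRASE_RE.findall(search_text)
    free_terms = PHRASE_RE.sub(" ", search_text).split()

    must = []
    if free_terms:
        # Unquoted words are matched as ordinary analyzed text.
        must.append({"match": {text_field: " ".join(free_terms)}})
    for phrase in phrases:
        # Quoted phrases must appear as consecutive terms.
        must.append({"match_phrase": {text_field: phrase}})

    filters = []
    if min_runtime is not None:
        # Range filters restrict results without affecting the score.
        filters.append({"range": {runtime_field: {"gte": min_runtime}}})

    return {"query": {"bool": {"must": must, "filter": filters}}}


body = build_query('crime drama "philip roth"', min_runtime=130)
print(body)
```

The resulting dict has the shape of a search request body. Putting the runtime condition in the filter clause keeps it from contributing to the relevance scores.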