Dato/Turi DS Conf talk on NLP and Elasticsearch analysis of reviews, plus JS implementation

arnicas/nlp_elasticsearch_reviews

NLP in Python and ElasticSearch

Conference talk materials for the Dato/Turi data science symposium in SF, July 12-13, 2016. Online slides to go with: http://ghostweather.slides.com/lynncherny/deck-6.

This talk is really a long tutorial -- how to go from a giant mess of data to a search app on the web. The ES API is a little tricky, and there aren't many good end-to-end tutorials and use cases out there. If you are fluent in Python, this will get you to the web.

Three parts:

    1. Mostly pandas exploratory analysis of Yelp reviews in Python, plus some simple NLP: dictionary-based sentiment, trends over time, and other exploration, including aggregations by user and business. It also creates dataframes that can be read into Elasticsearch. This is the "1. Yelp Reviews Explored in Python" notebook.
    2. Elasticsearch demo and examples in a second notebook, "2. ElasticSearch-Tricks-and-Tips.ipynb", using the saved dataframes. It explores how to set up your own ES indexing, custom analyzers in ES, non-trivial queries, etc.
    3. Web demo using the elasticsearch-js library, in the web directory. Assumes a local Elasticsearch instance is running with your data indexed.
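
For a feel of what "simple sentiment by dictionary" means in part 1, here is a minimal sketch. It is not the notebook's code: the word lists, the `tokenize` and `sentiment_score` helpers, and the sample reviews are all hypothetical, chosen only to show the idea of counting lexicon hits and normalizing by review length.

```python
import re

# Hypothetical mini-lexicon for illustration only; the notebook's actual
# word lists are not reproduced here.
POSITIVE = {"good", "great", "delicious", "friendly", "love"}
NEGATIVE = {"bad", "rude", "slow", "terrible", "bland"}


def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return (pos - neg) / len(tokens)


# Made-up sample reviews, not taken from the Yelp data.
reviews = [
    "Great tacos and friendly staff",
    "Slow service and bland food",
    "",
]
scores = [sentiment_score(r) for r in reviews]
```

With pandas, the same function could be applied to a review text column via something like `df["text"].apply(sentiment_score)`, assuming the column is named `text`.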

All three parts assume localhost usage.
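
As a rough sketch of how dataframe rows can be handed to Elasticsearch in part 2, the snippet below shapes plain records into the action dicts that the `elasticsearch.helpers.bulk` helper accepts. The `to_bulk_actions` function, the index name `reviews`, and the field names are assumptions made for illustration, not the notebook's actual code. The records list stands in for the output of `df.to_dict(orient="records")`.

```python
def to_bulk_actions(records, index_name, id_field):
    """Yield one bulk action dict per record, using id_field as the document id."""
    for rec in records:
        # Keep every field except the id inside the document body.
        source = {k: v for k, v in rec.items() if k != id_field}
        yield {"_index": index_name, "_id": rec[id_field], "_source": source}


# Stand-in for df.to_dict(orient="records"); field names are hypothetical.
records = [
    {"review_id": "r1", "stars": 5, "text": "Great tacos"},
    {"review_id": "r2", "stars": 2, "text": "Slow service"},
]
actions = list(to_bulk_actions(records, index_name="reviews", id_field="review_id"))

# Sending the actions to a local node would look roughly like this (not run here):
# from elasticsearch import Elasticsearch, helpers
# es = Elasticsearch(["http://localhost:9200"])
# helpers.bulk(es, actions)
```

Older Elasticsearch versions, such as those current when this talk was given, also expected a `_type` key in each action, so adjust the dict to match the version you are running.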

Python Environment

Install and use miniconda from Continuum. Create your environment with:

conda env create -f environment.yml
[.... say yes to everything...]
source activate esnlp

The environment is called "esnlp".

Slides

Again: online slides to go with: http://ghostweather.slides.com/lynncherny/deck-6.

I'm @arnicas on Twitter.
