01_Introduction updated ch7 May 20, 2018
03_Implementing_tf_idf updated ch7 May 20, 2018
05_Working_With_CBOW_Embeddings final edits Aug 29, 2018
07_Sentiment_Analysis_With_Doc2Vec final edits Aug 29, 2018

Ch 7: Natural Language Processing

  1. Introduction
  • We introduce methods for turning text into numerical vectors, along with TensorFlow's 'embedding' feature.
  2. Working with Bag-of-Words
  • Here we use TensorFlow to build a one-hot encoding of words, called bag-of-words. We use this representation with logistic regression to predict whether a text message is spam or ham.
  3. Implementing TF-IDF
  • We implement Term Frequency-Inverse Document Frequency (TF-IDF) with a combination of scikit-learn and TensorFlow. We perform logistic regression on the TF-IDF vectors to improve on our spam/ham text-message predictions.
  4. Working with Skip-Gram
  • Our first implementation of Word2Vec, called "skip-gram", on a movie review database.
  5. Working with CBOW
  • Next, we implement a form of Word2Vec called "CBOW" (Continuous Bag of Words) on a movie review database. We also introduce a method for saving and loading word embeddings.
  6. Implementing Word2Vec Example
  • In this example, we use the previously saved CBOW word embeddings to improve on our TF-IDF logistic regression of movie review sentiment.
  7. Performing Sentiment Analysis with Doc2Vec
  • Here, we introduce a Doc2Vec method (concatenation of document and word embeddings) to improve our logistic model of movie review sentiment.
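
As a rough orientation for the sections listed above, here is a minimal sketch in plain Python of how text becomes vectors and training pairs. It is not the chapter's TensorFlow code. The toy corpus, the function names, and the unsmoothed `log(N / df)` weighting are assumptions made for illustration, and scikit-learn's TF-IDF uses a different, smoothed formula by default.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the chapter itself works with spam/ham text
# messages and movie reviews.
docs = ["free prize call now", "call me later", "see you later"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})


def bag_of_words(tokens):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def tf_idf(tokens):
    """Term frequency weighted by log(N / document frequency), unsmoothed."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vec = []
    for w in vocab:
        df = sum(1 for doc in tokenized if w in doc)  # docs containing w
        tf = counts[w] / len(tokens)
        vec.append(tf * math.log(n_docs / df))
    return vec


def skip_gram_pairs(tokens, window=1):
    """(center, context) pairs: each neighbor is predicted from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def cbow_pairs(tokens, window=1):
    """(context, center) pairs: the center is predicted from its neighbors."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.append(([tokens[j] for j in range(lo, hi) if j != i], center))
    return pairs


print(vocab)
print(bag_of_words(tokenized[0]))
print(skip_gram_pairs(tokenized[1]))
print(cbow_pairs(tokenized[1]))
```

Skip-gram and CBOW use the same sliding window and differ only in which side is the input and which is the target. In this sketch the bag-of-words and TF-IDF vectors would be the input features for a logistic regression classifier.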