Ch 7: Natural Language Processing

  1. Introduction
  • We introduce methods for turning text into numerical vectors, including the TensorFlow 'embedding' feature.
  1. Working with Bag-of-Words
  • Here we use TensorFlow to build a bag-of-words representation, in which each text is encoded as the sum of the one-hot vectors of its words. We use these vectors with logistic regression to predict whether a text message is spam or ham.
  1. Implementing TF-IDF
  • We implement Term Frequency - Inverse Document Frequency (TF-IDF) with a combination of scikit-learn and TensorFlow. We perform logistic regression on the TF-IDF vectors to improve on our spam/ham text-message predictions.
  1. Working with Skip-Gram
  • We build our first implementation of Word2Vec, called "skip-gram", on a movie review database.
  1. Working with CBOW
  • Next, we implement a form of Word2Vec called "CBOW" (Continuous Bag of Words) on a movie review database. We also introduce a method for saving and loading word embeddings.
  1. Implementing Word2Vec Example
  • In this example, we use the previously saved CBOW word embeddings to improve on our TF-IDF logistic regression model of movie review sentiment.
  1. Performing Sentiment Analysis with Doc2Vec
  • Here, we introduce a Doc2Vec method (concatenation of document and word embeddings) to improve our logistic model of movie review sentiment.
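
As a rough illustration of the bag-of-words idea from section 2, the sketch below counts words per text using only the Python standard library. It is not the chapter's TensorFlow code: the helper names `build_vocabulary` and `bag_of_words`, the whitespace tokenizer, and the sample messages are all assumptions made for this example.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index, in first-seen order."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def bag_of_words(text, vocab):
    """Return a count vector with one slot per vocabulary word.

    This equals the sum of the one-hot vectors of the words in the text.
    Words missing from the vocabulary are ignored.
    """
    counts = Counter(text.lower().split())
    vector = [0] * len(vocab)
    for token, count in counts.items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector


# Hypothetical sample messages, invented for this sketch.
texts = ["win cash now", "call me now", "win win prize"]
vocab = build_vocabulary(texts)
vectors = [bag_of_words(t, vocab) for t in texts]
```

Each resulting vector has the same length as the vocabulary, so it can be fed directly to a linear model such as logistic regression.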
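
The TF-IDF weighting from section 3 can be sketched in plain Python as well. This version assumes the textbook formula, term frequency times log(N / document frequency), with a whitespace tokenizer; the function name `tf_idf` and the sample documents are hypothetical. scikit-learn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ from this sketch.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    score = (term count / document length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample documents, invented for this sketch.
docs = ["free prize now", "see you now", "free free entry"]
scores = tf_idf(docs)
```

Terms that appear in fewer documents receive a larger weight, which is why "prize" outscores "now" in the first document even though each occurs once there.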
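
Sections 4 and 5 differ mainly in how training examples are built: skip-gram predicts a context word from a target word, while CBOW predicts the target word from its surrounding context. The sketch below shows only that data-preparation step, not the embedding training itself. The function names, the `window_size` parameter, and the sample sentence are assumptions for illustration.

```python
def skip_gram_pairs(tokens, window_size=1):
    """Return (target, context) pairs: one pair per neighbor within the window."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window_size)
        end = min(len(tokens), i + window_size + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_examples(tokens, window_size=1):
    """Return (context_words, target) examples: all neighbors predict the center word."""
    examples = []
    for i, target in enumerate(tokens):
        start = max(0, i - window_size)
        end = min(len(tokens), i + window_size + 1)
        context = [tokens[j] for j in range(start, end) if j != i]
        examples.append((context, target))
    return examples


# Hypothetical sample sentence, invented for this sketch.
tokens = "the movie was great".split()
pairs = skip_gram_pairs(tokens)
examples = cbow_examples(tokens)
```

With a window of 1, the word "movie" yields the skip-gram pairs `("movie", "the")` and `("movie", "was")`, and the single CBOW example `(["the", "was"], "movie")`.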