andrewdblevins/beyond_word2vec

beyond_word2vec

Slides for my doc2vec workshop/talk

Setup Instructions

1) Clone repo

git clone git@github.com:andrewdblevins/beyond_word2vec.git

Once you have cloned this repo, you can follow along with these instructions in this notebook.

2) Python Installs:

Please note that we'll need several packages installed on your laptop.

If you already have Python, gensim, and Keras installed on your laptop, you are probably good to go.

If you haven't installed them yet, here are some steps to follow:

I generally recommend using the Anaconda distribution.

Create a conda environment:

conda create -n beyondw2v python=3
source activate beyondw2v
conda install anaconda

Then install additional packages:

conda install gensim
conda install -c conda-forge keras
conda install -c conda-forge theano
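
To confirm the installs worked, a quick check like the one below can be run from the notebook or from a Python prompt inside the `beyondw2v` environment. This is only a sketch: the `REQUIRED` list and the `missing_packages` helper are names made up for this example, and the list is taken from the conda commands above.

```python
import importlib.util

# Packages the install steps above ask for (assumed from the conda commands).
REQUIRED = ["gensim", "keras", "theano"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported as missing, re-run the matching `conda install` line with the environment activated.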

3) Pretrained Word Vectors

Download the pretrained Google News word vectors.

Warning: this file is 3.6GB.

!wget https://s3.amazonaws.com/mordecai-geo/GoogleNews-vectors-negative300.bin.gz

Optionally, you may want to test other pretrained vectors. If so, download those as well.
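
Once the Google News file is downloaded, it can be loaded with gensim roughly as follows. This is a minimal sketch, not necessarily how the notebook does it: `VECTORS_PATH` assumes the file sits in the current working directory, and `load_vectors` and its default `limit` are hypothetical choices for this example. `KeyedVectors.load_word2vec_format` is gensim's loader for this format and can read the `.gz` file directly.

```python
import os

# Assumed location: the file downloaded above, in the current working directory.
VECTORS_PATH = "GoogleNews-vectors-negative300.bin.gz"


def load_vectors(path=VECTORS_PATH, limit=200000):
    """Load word2vec-format binary vectors with gensim.

    `limit` keeps only the first N vectors in the file to reduce memory use;
    pass None to load everything.
    """
    if not os.path.exists(path):
        raise FileNotFoundError("Vectors file not found: " + path)
    # Imported lazily so the path check works even before gensim is installed.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path, binary=True, limit=limit)
```

For example, `vectors = load_vectors()` followed by `vectors.most_similar("king", topn=3)` gives a quick sanity check that the vectors loaded. Loading the full file with `limit=None` needs several GB of RAM.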

4) Evaluation Dataset

https://nlp.stanford.edu/projects/snli/

Download the following file (~100MB):

!wget https://nlp.stanford.edu/projects/snli/snli_1.0.zip
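
The sentence pairs can be read straight from the zip without unpacking it. The sketch below assumes the archive keeps its usual layout (`snli_1.0/snli_1.0_<split>.jsonl`, one JSON object per line with `sentence1`, `sentence2`, and `gold_label` fields); `read_snli_pairs` is a hypothetical helper written for this example.

```python
import io
import json
import zipfile


def read_snli_pairs(zip_path, split="dev"):
    """Yield (sentence1, sentence2, gold_label) tuples from the SNLI archive."""
    # Assumed member name inside the zip; adjust if your copy is laid out differently.
    member = "snli_1.0/snli_1.0_{}.jsonl".format(split)
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                record = json.loads(line)
                # "-" marks pairs where annotators reached no consensus label.
                if record["gold_label"] == "-":
                    continue
                yield record["sentence1"], record["sentence2"], record["gold_label"]
```

For example, `pairs = list(read_snli_pairs("snli_1.0.zip", split="dev"))` loads the development split into memory; use `split="train"` or `split="test"` for the other files.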
