# Word2vec Algorithm: Made as simple as possible, but no simpler


## Summary

This talk gives a Pythonic introduction to the word2vec algorithm. Word2vec, which translates words (strings) into vectors (lists of floats), is an algorithm that has proved very useful for making sense of text data (a minimal sketch of this mapping follows below). You will gain a conceptual understanding of the algorithm and be empowered to try it out on your favorite collection of text data.
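
As a concrete illustration of that words-to-vectors mapping, here is a minimal sketch using gensim's `Word2Vec` trained on a toy corpus. The use of gensim is an assumption; the demo notebook in this repo may use a different library or API.

```python
# Minimal sketch of the words -> vectors mapping, using gensim
# (assumed here; the demo notebook may differ).
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# vector_size is the length of each word vector (gensim >= 4.0 API).
model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, seed=42)

# Each word now maps to a vector of floats.
print(model.wv["cat"])        # numpy array of 50 floats
print(model.wv["cat"].shape)  # (50,)
```

A toy corpus this small will not produce meaningful vectors; it only demonstrates the shape of the API.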


## Description

“You shall know a word by the company it keeps” is a common refrain in natural language processing (NLP). The word2vec algorithm is a simple neural network that learns which words tend to co-occur and embeds those words in a vector space. From these word embeddings, you can use distance measures to compare words, find neighbors by clustering, and add or subtract word vectors to explore relationships between concepts (see the sketch below). In fact, word2vec is a general-purpose algorithm that can encode any sequential data into meaningful vectors - including emojis!
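
Here is a sketch of those operations (similarity, nearest neighbors, and vector arithmetic) using pretrained embeddings loaded through gensim's downloader. This is not the talk's demo code, and it uses GloVe vectors rather than the much larger Google News word2vec vectors purely to keep the download small (~66 MB); the `KeyedVectors` API is the same either way.

```python
# Sketch: comparing words with pretrained embeddings via gensim's
# downloader (requires internet access on first use; assumed setup,
# not the repo's demo notebook).
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")  # GloVe vectors; same vector-space idea

# Distance measure: cosine similarity between two words.
print(wv.similarity("cat", "dog"))

# Nearest neighbors in the embedding space.
print(wv.most_similar("python", topn=3))

# Add/subtract vectors: king - man + woman is close to queen.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```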


## Bio

Dr. Brian Spiering is a faculty member at GalvanizeU, which offers a Master of Science in Data Science. His passions are natural language processing (NLP), deep learning, and building data products. He is active in the San Francisco data science community through volunteering and mentoring.

Drop him a line at brian.spiering@galvanize.com


Presented at the SF Python meetup.

Disclaimer: These are interactive notebooks that are meant to be run. Some elements may not render correctly in GitHub's static notebook preview.