word2vec workshop - a conceptual introduction and practical application
images
.gitignore
0_welcome.ipynb
1_word2vec_intro.ipynb
2_word2vec_application.ipynb
3_word2vec_activity.ipynb
4_everything2vec.ipynb
LICENSE
README.md
environment.yml
setup_instructions.md
word2vec_basic.py


Word2vec Algorithm: Made as simple as possible, but no simpler


Description

A Pythonic introduction to the word2vec algorithm. Word2vec, which translates words (strings) into vectors (lists of floats), is a relatively new algorithm that has proven very useful for making sense of text data. By the end, you should have a conceptual understanding of the algorithm and feel empowered to try it out on your favorite collection of text data.

“You shall know a word by the company it keeps” is a common refrain in Natural Language Processing (NLP). word2vec puts that idea into practice by training a neural network to learn which words tend to co-occur, and then embedding the words in a meaningful vector space. From these "word embeddings", it is possible to compare words with distance measures, add and subtract words to explore relationships between concepts, and cluster words to find semantically related ones. In fact, word2vec is a general-purpose algorithm that allows any sequential data to be encoded into meaningful vectors - including emojis!
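To make the ideas above concrete, here is a minimal sketch in plain Python. It is not the workshop's code and it does not train anything: `skipgram_pairs`, `cosine_similarity`, `analogy`, and the `toy_vectors` table are hypothetical names introduced for illustration, and the three-dimensional vectors are made up by hand rather than learned. The first function shows one common way of turning a sentence into (center, context) co-occurrence pairs; the rest show how embeddings can be compared and combined once they exist.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for words at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1.0 mean 'similar'."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# ASSUMPTION: hand-made toy vectors, NOT real trained embeddings.
# Real word2vec vectors typically have tens to hundreds of dimensions.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.3, 0.3],
}


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))


if __name__ == "__main__":
    print(skipgram_pairs("the cat sat".split(), window=1))
    print(cosine_similarity(toy_vectors["king"], toy_vectors["queen"]))
    print(analogy("king", "man", "woman", toy_vectors))
```

With these toy values, `analogy("king", "man", "woman", toy_vectors)` picks `"queen"`, but only because the vectors were chosen to make that happen. With embeddings trained on a real corpus, the results depend on the data and the training settings.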


Bio

Dr. Brian Spiering is a faculty member at GalvanizeU, which offers a Master of Science in Data Science. His passions are Natural Language Processing (NLP), deep learning, and building data products. He is active in the San Francisco Data Science community through volunteering and mentoring.

Drop him a line at brian.spiering@galvanize.com.


Disclaimer: These are interactive notebooks that are meant to be run. Some elements might not render correctly on static GitHub pages.