Skip to content
Switch branches/tags
Go to file

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Hello Everyone,

Here is the list of packages needed for our Text Mining Lab Session scheduled for 6/29/2017 (2:00-5:00 p.m.)


  • I have uploaded some poster examples of some past students. (Check the posters folder)
  • For the guys intereted in the slack community, send me your email to ellfae@gmail and I will provide an invite
  • If you have any other questions or technical problems, feel free to stop by Idea Lab Delta 701. I will be more than happy to assist.
  • I may extend the python notebook based on the excellent questions you guys asked (e.g., more statistics, visuals, etc.)
  • Lastly, good luck and enjoy your stay here.


Computing Resources:

  • Operating System: Preferably Linux or MacOS (Windows break but you can try it out)
  • RAM: 4GB
  • Disk Space: 8GB (mostly to store word embeddings)


Once you have installed all the necessary packages, you can test to see if everything is working by running the following python code:

import logging
logging.root.handlers = []  # Jupyter messes up logging so needs a reset
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
from smart_open import smart_open
import pandas as pd
import numpy as np
from numpy import random
import gensim
import nltk
from sklearn.cross_validation import train_test_split
from sklearn import linear_model
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
from gensim.models import Word2Vec
from sklearn.neighbors import KNeighborsClassifier
from sklearn import linear_model
from nltk.corpus import stopwords
%matplotlib inline

If you have any further questions please feel free to contact me at

Have Fun,

Elvis Saravia (Text Mining TA)


Requirements for Text Mining Summer Course (Lab Session)




No releases published


No packages published