Data Science Workshop
Intro to Data Science.pdf

Open Data Day DC


  • Pri Oberoi, Data Scientist, Commerce Data Service
  • Star Ying, Data Scientist, Commerce Data Service


This is a quick introduction to data science and a short example of topic clustering using the National Institute of Standards and Technology (NIST) news feed.
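To give a feel for what topic clustering does before opening the notebook, here is a minimal sketch that uses only the Python standard library. It is not the notebook's code: the four headlines are made up for illustration (they are not real NIST news items), the helper names `tokenize`, `vectorize`, and `kmeans` are hypothetical, and seeding the centroids with the first k documents is a simplification chosen so the result is repeatable.

```python
from __future__ import division, print_function

import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a headline and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(docs):
    """Turn each document into a unit-length term-count vector."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vec = [float(counts[term]) for term in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; centroids are seeded with the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each document to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up headlines for illustration only.
headlines = [
    "quantum computing research advances",
    "cybersecurity framework update released",
    "new quantum computing research lab",
    "cybersecurity framework guidance update",
]
labels = kmeans(vectorize(headlines), k=2)
print(labels)  # [0, 1, 0, 1]
```

Headlines that share vocabulary end up with the same label. A real pipeline would typically add stop-word removal, TF-IDF weighting, and a better centroid initialization.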

Getting Started

To follow the example in the workshop, you need Python 2.7 and pip. Follow these steps to get started:

  1. You can use sudo easy_install pip or brew install python to install pip.
  2. Clone or download a copy of this repo to your local machine.
  3. Install the required packages through pip with this command: pip install -r requirements.txt.
  4. Start a local Jupyter Notebook instance with this command: jupyter-notebook <dir_of_cloned_repo>.
  5. Jupyter should open in your default browser. From there, open kMeansClustering.ipynb.
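
If the setup steps above do not work on the first try, a small preflight script can help narrow down the cause. The sketch below is not part of the repo: the function name `check_environment` is hypothetical, and it only checks the two prerequisites stated above (Python 2.7 and an importable pip). It does not inspect the contents of requirements.txt.

```python
from __future__ import print_function

import importlib
import sys


def check_environment(version_info=None, modules=("pip",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    version_info = sys.version_info if version_info is None else version_info
    major_minor = tuple(version_info[:2])
    problems = []
    # The workshop instructions ask for Python 2.7 specifically.
    if major_minor != (2, 7):
        problems.append("Python 2.7 expected, found %d.%d" % major_minor)
    # Try importing each required module by name.
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            problems.append("module not importable: %s" % name)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it with the same interpreter you plan to use for the notebook. No output means both checks passed.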