CSCI E-63: Big Data Analytics course taught at Harvard - Fall semester 2017
Type Name Latest commit message Commit time
hw1 add final hw Dec 20, 2017
hw10 add hw10 Nov 4, 2017
hw11 add final hw Dec 20, 2017
hw12 add final hw Dec 20, 2017
hw2 delete spark logs Sep 19, 2017
hw3 add final hw Dec 20, 2017
hw5 add hw5 and hw6 Oct 9, 2017
hw6 add final hw Dec 20, 2017
hw7 add hw8 Oct 23, 2017
hw8 add final hw Dec 20, 2017
hw9 add final hw Dec 20, 2017
.gitignore delete metastore_db Sep 20, 2017 Update Dec 20, 2017
Syllabus e63 2017 Fall-1.pdf add hw 1-3 Sep 18, 2017

CSCI E-63 Big Data Analytics

Professional Graduate Data Science Coursework - Fall semester 2017

Professor: Zoran B. Djordjević, PhD, Senior Enterprise Architect, NTT Data, Inc.


The emphasis of this course is on mastering two important big data technologies: Spark 2 and TensorFlow. The focus is on Spark Core, Spark ML (machine learning), and Spark Streaming, which allows analysis of data in flight, that is, in near real time. Furthermore, the so-called NoSQL storage solutions, exemplified by Cassandra, are examined. An additional focus lies on memory-resident databases, graph databases (Spark GraphX and Neo4j), and scalable messaging systems like Kafka and Amazon Kinesis.

File Layout

The hw directory structure is as follows:

. Files such as README and .gitignore
./docs/ Different files and presentations
./data/ Folder with all the necessary data
./scripts/ Folder with all the code


You can access all the coursework etc. here.
