Materials for Big Data in Finance (Baruch MFE 9898)

Big Data in Finance

Masters in Financial Engineering

Baruch College, City University of New York, Department of Mathematics, MFE 9898, 2016.


Andrew Peterson (New York University)


This repository contains basic information for the class sessions on analyzing big data in finance and for Assignment C. First, we cover how to use the Message Passing Interface (MPI) for parallel programming in C/C++, and how to access and use the high-performance computing cluster at Baruch. Next, we learn how to use Hadoop and Apache Spark to analyze massive-scale and streaming data, and then apply them to test whether Twitter data can be used to predict stock returns (Assignment C).
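As a rough illustration of the map-and-reduce style of aggregation that Hadoop and Spark apply at scale, the sketch below counts daily ticker mentions in a handful of tweets. It uses only the Python standard library, not the Spark API, and it is not taken from the course code. The sample tweets, the cashtag pattern, and the function names (`map_tweet`, `reduce_counts`, `count_mentions`) are all assumptions made for this example.

```python
from collections import defaultdict
from functools import reduce
import re

# Hypothetical sample records (date, tweet text); made up for illustration.
TWEETS = [
    ("2016-03-01", "Buying $AAPL before earnings"),
    ("2016-03-01", "$AAPL and $MSFT both look strong"),
    ("2016-03-02", "Selling $MSFT today"),
]

# Assumed convention: a ticker is written as a cashtag such as $AAPL.
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")


def map_tweet(record):
    """Map step: emit ((date, ticker), 1) for every cashtag in one tweet."""
    date, text = record
    return [((date, ticker), 1) for ticker in CASHTAG.findall(text)]


def reduce_counts(acc, pair):
    """Reduce step: add one emitted count into the running totals."""
    key, value = pair
    acc[key] += value
    return acc


def count_mentions(tweets):
    """Run the map step over all tweets, then fold the pairs by key."""
    pairs = [pair for record in tweets for pair in map_tweet(record)]
    return dict(reduce(reduce_counts, pairs, defaultdict(int)))


if __name__ == "__main__":
    for key, n in sorted(count_mentions(TWEETS).items()):
        print(key, n)
```

In a distributed framework the map step would run in parallel across partitions of the tweet data and the reduce step would merge the partial counts by key; here both run in a single process. Daily counts like these could then be joined with stock returns for the prediction task in Assignment C.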


| # | Content | Materials |
|---|---------|-----------|
| 1 | MPI Programming in C++ | Slides, code |
| 2 | Apache Spark for Analyzing Tweets and Streaming Data | Slides, code |
| 3 | Launching Spark on Amazon AWS | Slides, code |
| 4 | Assignment C: Predicting Stocks from Twitter Data using Spark | Assignment |
| 5 | Assignment C: Cleaning Tweet Data | code |
| 6 | Assignment C: Gathering Stock Data | code |