Big Data in Finance
Masters in Financial Engineering
Baruch College, City University of New York, Department of Mathematics, MFE 9898, 2016.
Andrew Peterson (New York University)
This repository contains basic information for the class sessions on analyzing big data in finance and for Assignment C. First, we cover how to use the Message Passing Interface (MPI) for parallel programming in C/C++, as well as how to access and use the high-performance computing cluster at Baruch. Next, we learn how to use Hadoop and Apache Spark to analyze data at massive scale, including streaming data, and then apply them to test whether Twitter data can be used to predict stock returns (Assignment C).
| # | Topic | Materials |
|---|---|---|
| 1 | MPI Programming in C++ | Slides, code |
| 2 | Apache Spark for Analyzing Tweets and Streaming Data | Slides, code |
| 3 | Launching Spark on Amazon AWS | Slides, code |
| 4 | Assignment C: Predicting Stocks from Twitter Data using Spark | Assignment |
| 5 | Assignment C: Cleaning Tweet Data | Code |
| 6 | Assignment C: Gathering Stock Data | Code |