Materials for Big Data in Finance (Baruch MFE 9898)
AWS
MPI
Spark
clean_public_tweets
yahoo_stock_data
Assignment_C.pdf
LICENSE
README.md

Big Data in Finance

Masters in Financial Engineering

Baruch College, City University of New York, Department of Mathematics, MFE 9898, 2016.

Instructor

Andrew Peterson (New York University)

Description

This repository contains basic information for the class sessions on analyzing big data in finance and for Assignment C. First, we cover how to use the Message Passing Interface (MPI) for parallel programming in C/C++, as well as how to access and use the high-performance computing cluster at Baruch. Next, we learn how to use Hadoop and Apache Spark to analyze massive-scale and streaming data, and then apply them to see whether Twitter data can be used to predict stock returns (Assignment C).

Outline

# | Content | Materials
1 | MPI Programming in C++ | Slides, code
2 | Apache Spark for Analyzing Tweets and Streaming Data | Slides, code
3 | Launching Spark on Amazon AWS | Slides, code
4 | Assignment C: Predicting Stocks from Twitter Data Using Spark | Assignment
5 | Assignment C: Cleaning Tweet Data | code
6 | Assignment C: Gathering Stock Data | code