adding process notebook and updating readme file
William Chen authored and William Chen committed Dec 13, 2013
1 parent e966d25 commit 5aee35c
Showing 2 changed files with 95 additions and 6 deletions.
82 changes: 82 additions & 0 deletions Process.ipynb
@@ -0,0 +1,82 @@
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Overview and Motivation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Through kaggle.com, we participated in a predictive stock price modeling competition in the Hack/Reduce space at the Boston Data Festival, sponsored by DataRobot. Our challenge was to predict whether a stock would go up or down on a particular day, given the stock's pricing data from the 10 days prior. After experimenting with various models and attempting to blend models, we eventually reached a 94.119% AUC with our predictions. In the end, we won 1st place in the competition. Through this engaging and rewarding experience with predictive stock price modeling, we became very excited about the real-world applications of this kind of model and wanted to explore our predictive analysis in more depth after the hackathon. Our CS109 final project gave us a structured opportunity to extend our work on this model. Overall, our project goals were: \n",
"\n",
"1. Build well-adjusted models for predicting stock price movements in various contexts, such as growth stocks, value stocks, and penny stocks. Quantify how successful our models are. \n",
"2. Glean insight about how stock prices work in various contexts and how predictable they are by our models or predictive models in general.\n",
"3. ???\n",
"4. Profit"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Related Work"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We were inspired by the successes we've seen in algorithmic trading and computer-assisted stock analysis, we \n",
"\n",
"Sebastian's prior work with finance data?\n",
"Any papers or websites that inspired us?"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Initial Questions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Based on information about a stock's opening and closing prices over a series of days, we initially tried to determine the directional movement of a stock's price on a particular day. Over the course of our project, we refined our question to consider differences in our model's prediction potential among different categories of stocks, such as technology, energy, and consumer services. "
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our initial dataset was the one used by Hack/Reduce at the Boston Data Festival Predictive Modeling Hackathon. The training data consisted of the opening, closing, maximum and minimum prices for 94 stocks over 500 days. The hackathon dataset used data from 20 years ago, and we wanted to "
]
}
],
"metadata": {}
}
]
}
19 changes: 13 additions & 6 deletions README.md 100644 → 100755
@@ -1,9 +1,10 @@
################################################################################
# Description
################################################################################
This model was created for the Boston Data Week hackathon hosted at Hack/Reduce.
Information about the competition is available at
https://inclass.kaggle.com/c/boston-data-festival-hackathon

This repository contains our code and process notebook for our analysis and
predictive modeling approaches to understanding directional stock movements. We
have launched a website to showcase our work at
https://sites.google.com/site/predictingstockmovement/

################################################################################
# Objective
@@ -21,11 +22,19 @@ Team Members: William Chen, Sebastian Chiu, Salena Cui, Carl Gao
################################################################################
# Result
################################################################################
We placed 1st out of 21 teams, and were able to achieve a 94.119% AUC
We submitted our Ridge-Random Forest model to the Boston Data Week hackathon
hosted at Hack/Reduce. Information about the competition is available at
https://inclass.kaggle.com/c/boston-data-festival-hackathon

We placed 1st out of 21 teams and achieved a 94.119% AUC on the
private leaderboard.
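
For readers unfamiliar with the metric: AUC can be read as the probability that
a randomly chosen "up" example receives a higher predicted score than a randomly
chosen "down" example. The sketch below is not the competition's scoring code.
It is a minimal pure-Python illustration based on the rank-sum identity, and the
function name roc_auc is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 outcomes (1 = stock moved up).
    scores: iterable of predicted probabilities, same length as labels.
    NOTE: illustrative helper, not the hackathon's official scorer.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)

    # Assign 1-based ranks, giving tied scores the average of their ranks.
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Sum of ranks of the positive examples, shifted and normalised to [0, 1].
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so
the metric rewards ordering examples correctly rather than hitting a threshold.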

################################################################################
# Files
################################################################################
Process notebook
Notebook describing our work and our main contributions

model_tuner.py or model_tuner.ipynb
Find the parameters for the ridge regression and random forest regression
that we used
@@ -49,8 +58,6 @@ test.csv - data to create prediction. Data provided for 25 time segments. Each s

Each line in train.csv and test.csv contains consecutive trading days. Days when the market was closed were excluded. Thus day N may be a Friday and day N+1 may be a Monday, or even a Tuesday if Monday was a holiday.



Value to predict - probability of the stock moving up from the opening of day 10 to the closing of day 10. Predictions should be in the 0-1 range, where 1 means "stock surely will go up" and 0 means "stock surely will go down".

The test set is randomly sampled, without overlap, from the year following the training data time period.
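
As a rough illustration of the target described above, the sketch below derives
the 0/1 outcome from a day's opening and closing prices and keeps a prediction
inside the required 0-1 range. The function names are hypothetical, and treating
an unchanged price (close equal to open) as "not up" is an assumption; the
competition description does not say how that case is labeled.

```python
def moved_up(open_price, close_price):
    """Return 1 if the stock closed above its opening price, else 0.

    ASSUMPTION: a flat day (close == open) is labeled 0.
    """
    return 1 if close_price > open_price else 0


def clip_probability(p):
    """Clamp a raw model output into the 0-1 range expected for predictions."""
    return min(1.0, max(0.0, p))


# Example with made-up prices for day 10 of one stock.
label = moved_up(open_price=25.10, close_price=25.45)
prediction = clip_probability(1.07)  # e.g. a regression output above 1
```

Clipping matters when the underlying model is a regressor, since regression
outputs are not guaranteed to stay within 0-1.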
