Week of 6/7
Week of 6/14
- Change topic from sentiment analysis to stock market clustering and predictions.
- Create informal project proposal.
Week of 6/21
- Determine as a group which stocks to analyze. We currently plan to analyze S&P 500 stocks.
- Determine as a group which time periods to examine so that we avoid outlier years.
- Get all group members familiar with scikit-learn and R through individual exploration.
- Gather data for the selected stocks and convert it into the format needed for analysis.
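
The data conversion step might look roughly like the sketch below. It assumes a long-format CSV with `date,ticker,close` columns, which is a hypothetical layout chosen for illustration; the tickers and prices are made up, and the plan does not yet say which source or file format will be used.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in a hypothetical "date,ticker,close" layout.
RAW_CSV = """date,ticker,close
2015-06-01,AAA,100.0
2015-06-02,AAA,110.0
2015-06-03,AAA,99.0
2015-06-01,BBB,50.0
2015-06-02,BBB,50.0
2015-06-03,BBB,55.0
"""


def load_closes(text):
    """Group closing prices by ticker, ordered by date."""
    rows_by_ticker = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        rows_by_ticker[row["ticker"]].append((row["date"], float(row["close"])))
    # ISO dates sort correctly as plain strings.
    return {
        ticker: [close for _, close in sorted(rows)]
        for ticker, rows in rows_by_ticker.items()
    }


def daily_returns(prices):
    """Convert a price series into day-over-day fractional returns."""
    return [(today - prev) / prev for prev, today in zip(prices, prices[1:])]


closes = load_closes(RAW_CSV)
returns = {ticker: daily_returns(prices) for ticker, prices in closes.items()}

for ticker, series in sorted(returns.items()):
    print(ticker, [round(r, 4) for r in series])
```

Working with returns rather than raw prices puts stocks with very different price levels on a comparable scale, which tends to matter for distance-based clustering.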
Week of 6/28
- Produce visualization graphics using dummy data
- Build and test models using different algorithms
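
As a way to try out a model on dummy data before real data is available, here is a minimal sketch of density-based clustering, one of the algorithm families listed in the task breakdown. It is a plain-Python illustration of the DBSCAN idea, not the project's implementation; the points, `eps`, and `min_pts` values are made up, and each point stands for a hypothetical stock described by two features.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it is in no cluster."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
        cluster += 1
    return labels


# Dummy data: two tight groups and one far-away outlier.
dummy_points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]

labels = dbscan(dummy_points, eps=0.5, min_pts=3)
print(labels)
```

With real data, scikit-learn provides a `DBSCAN` estimator that would likely replace this hand-written loop; the sketch is only meant to show what the algorithm does.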
Week of 7/5
- Create a visualization that demonstrates our results
- If the clustering results are good, experiment with predicting a stock's price from related stocks. This would be a form of supervised learning: restructure the data so that the prices of the other stocks in a cluster are the inputs and the price of the target stock is the value to predict.
- Begin work on the project progress report.
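
The supervised-prediction idea in this week's plan could be sketched as follows. This is only one possible framing: it assumes the input is the average return of the other stocks in the cluster and fits a one-variable least-squares line. The peer names and all numbers are made up for illustration.

```python
# Made-up daily returns for two hypothetical peers in the same cluster.
peer_returns = {
    "PEER_A": [0.01, -0.02, 0.03, 0.00, 0.02],
    "PEER_B": [0.03, 0.00, 0.01, -0.02, 0.00],
}
# Made-up returns for the hypothetical target stock over the same days.
target_returns = [0.011, -0.004, 0.011, -0.004, 0.006]


def peer_average(peers):
    """Average the peers' returns day by day to get a single input feature."""
    series = list(peers.values())
    return [sum(day) / len(day) for day in zip(*series)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted target return for a given average peer return."""
    return slope * x + intercept


features = peer_average(peer_returns)
slope, intercept = fit_line(features, target_returns)
predicted = predict(slope, intercept, 0.015)
print(round(slope, 4), round(intercept, 4), round(predicted, 4))
```

A real experiment would fit on one time period and evaluate on a later, held-out period, since fitting and testing on the same days overstates accuracy.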
Week of 7/12
- Finish project progress report.
- Attempt to use alternative algorithms to cluster the data.
- Begin work on final project report.
Week of 7/19
- Finish final project report.
- Begin working on the project presentation.
Task 1: Gathering data using R or Python (everyone)
- Gather data using R or Python techniques
- Save the data in the correct file formats for future analysis
Task 2: Determining the most important attributes to use and which machine learning techniques to implement (in short, data manipulation)
- Analyze the importance of each attribute
- Add or remove attributes
- Determine which types of algorithms would work best
Task 3: Generating and testing models
- Design and create the optimal models using basic and advanced algorithms
  - Support Vector Machines
  - K-Nearest Neighbor
  - Expectation Maximization
  - Density-Based Clustering
- Test the methods on the data
- Modify and optimize the methods based on the testing
Task 4: Visualizing results
- Find trends in the data results
- Create charts and graphs to visualize the trends
- Create a network structure to represent similarities between different stocks
Task 5: Writing the final report
- Combine the visual results along with the concluding ideas to form a final report
Task 6: Create the presentation
- Use charts and graphs to present the trends found in our data and analysis results
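
The network structure listed under the visualization task could be built along these lines. This sketch assumes similarity is measured by the Pearson correlation of daily returns and that two stocks are linked when the correlation reaches a chosen threshold; both choices, the tickers, and the numbers are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def similarity_edges(returns, threshold):
    """Link every pair of tickers whose correlation is at least threshold."""
    edges = []
    for a, b in combinations(sorted(returns), 2):
        r = pearson(returns[a], returns[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges


def connected_groups(nodes, edges):
    """Group tickers that are reachable from one another through edges."""
    adjacency = {node: set() for node in nodes}
    for a, b, _ in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups


# Made-up daily returns for four hypothetical tickers.
sample_returns = {
    "AAA": [0.01, 0.02, -0.01, 0.03],
    "BBB": [0.02, 0.04, -0.02, 0.06],
    "CCC": [-0.01, -0.02, 0.01, -0.03],
    "DDD": [0.03, -0.01, 0.02, -0.01],
}

edges = similarity_edges(sample_returns, threshold=0.8)
groups = connected_groups(sample_returns, edges)
print([(a, b) for a, b, _ in edges])
print(groups)
```

The edge list can be handed to a graph-drawing library for the final charts; the threshold controls how dense the resulting network is and would need tuning on real data.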