- Using the Reddit API, the project collects the thread IDs of the "Daily Discussion" threads in the subreddit r/doge for each day of the three-month period.
- The third-party PMAW wrapper is then used to batch-extract the 554k comments contained in these "Daily Discussion" threads.
- Each comment is scored with VADER, a sentiment analysis tool specifically attuned to social media content. VADER returns a polarity/compound score on a scale from -1 to 1, plus positive, negative, and neutral scores that each range from 0 to 1.
- Based on the polarity/compound score, each comment is attributed an overall positive, negative, or neutral rating.
- Using the CoinGecko API, the project extracts Dogecoin's price at 5-minute intervals over the same three-month span.
- Comments are grouped by timestamp into 5-minute intervals, and the mean of each score (compound, positive, negative, neutral) is calculated for every interval.
- These interval means are then aligned with the Dogecoin price for the same intervals and plotted together, allowing for correlation analysis.
- The overall rating (positive, negative, neutral) assigned to each comment from its VADER compound score serves as the class label for machine learning.
- Naive Bayes and Random Forest models are trained on these VADER-derived ratings, and their prediction accuracy is measured.
- The project includes detailed analyses of the process and performance results of the machine learning classification.
- Prediction accuracy results are presented for the Naive Bayes and Random Forest models, along with a classification report and a confusion matrix heatmap for the Random Forest model.
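
The rating and interval-averaging steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes each comment is a dict carrying Reddit's `created_utc` timestamp and VADER's `compound`, `pos`, `neg`, and `neu` scores, and it assumes the commonly used cutoff of ±0.05 on the compound score, which this README does not state. The helper names and sample values are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

INTERVAL_SECONDS = 300  # 5-minute intervals, as described above


def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a rating.

    The 0.05 threshold is an assumption, not taken from this project.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def bucket_start(unix_ts, interval=INTERVAL_SECONDS):
    """Floor a Unix timestamp to the start of its interval."""
    return unix_ts - (unix_ts % interval)


def mean_scores_by_interval(comments):
    """Return {interval_start: {score_name: mean score}} for all comments."""
    buckets = defaultdict(list)
    for comment in comments:
        buckets[bucket_start(comment["created_utc"])].append(comment)
    keys = ("compound", "pos", "neg", "neu")
    return {
        start: {k: mean(c[k] for c in group) for k in keys}
        for start, group in sorted(buckets.items())
    }


def pearson(xs, ys):
    """Pearson correlation of two equal-length series with non-zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical sample data: two comments in one interval, one in the next.
comments = [
    {"created_utc": 1609459200, "compound": 0.6, "pos": 0.5, "neg": 0.0, "neu": 0.5},
    {"created_utc": 1609459260, "compound": -0.2, "pos": 0.0, "neg": 0.25, "neu": 0.75},
    {"created_utc": 1609459500, "compound": 0.0, "pos": 0.0, "neg": 0.0, "neu": 1.0},
]

interval_means = mean_scores_by_interval(comments)
for start, scores in interval_means.items():
    print(start, label_from_compound(scores["compound"]), scores)
```

With the interval means in hand, `pearson` could be applied to the series of mean compound scores and the series of prices for the same interval starts; intervals with no comments would need to be dropped or filled first.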
Crypto Sentiment Analysis provides insight into the social sentiment surrounding Dogecoin and into how that sentiment fluctuates in correlation with the coin's price. This repository serves as a resource for those interested in sentiment analysis and machine learning applications within the cryptocurrency domain.
Application deployment (the Heroku deployment is down; run the app locally):
git clone https://github.com/cspence001/crypto_sentiment_analysis.git
cd crypto_sentiment_analysis
python3 app.py