Perform a regression task to predict the number of comments on a post after a certain period, applying it to Reddit

TNguyen1228/Comments_Prediction_Data_Mining

 
 


Blog comment prediction

Introduction

This project trains regression models to predict how many comments a post will receive after a given period, and applies them to Reddit posts.

Demo: https://comment-reddits.streamlit.app/

Data

The BlogFeedback data can be downloaded from here

Setup

1. Clone this repository

git clone https://github.com/TranMinhDuc190103/Data_mining_finals.git

Alternatively, download the repository directly.

2. Install Dependencies

Create and activate a virtual environment (for example with python -m venv), then install the required packages:

pip install -r requirements.txt

Run Jupyter notebooks

  1. Train the model

The Models folder contains three pre-trained models saved as .joblib files, along with the three Jupyter notebooks used to train them. You can run a notebook to retrain its model, or use the provided pre-trained model directly.
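As a rough illustration of using one of the pre-trained files, the sketch below loads a .joblib model and turns its raw predictions into comment counts. It assumes the saved object exposes a scikit-learn style predict method; the function names and the rounding step are hypothetical and not taken from the notebooks, and the feature layout each model expects is defined there, not here.

```python
from pathlib import Path
from typing import Any, List, Sequence


def load_model(path: str) -> Any:
    """Load a model that was saved with joblib (the path is chosen by the caller)."""
    # Imported lazily so the helper below can be used without joblib installed.
    import joblib

    if not Path(path).is_file():
        raise FileNotFoundError(f"No model file at {path}")
    return joblib.load(path)


def predict_comment_counts(model: Any, rows: Sequence[Sequence[float]]) -> List[int]:
    """Return one non-negative integer prediction per feature row.

    Assumption: `model` has a scikit-learn style predict(rows) method.
    """
    raw = model.predict(rows)
    # A regressor may return negative or fractional values; a comment count cannot be either.
    return [max(0, round(float(value))) for value in raw]
```

The feature rows passed to predict_comment_counts must match the columns the model was trained on, so check the corresponding training notebook before calling it.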

  2. Crawl data from Reddit

You can crawl your own data from Reddit by running reddit-crawler.ipynb in the crawl folder. To do so, you need API keys from a Reddit app.
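To give an idea of what a crawled record might look like, here is a minimal sketch that flattens one submission into a row of basic fields. The chosen fields and the function name are assumptions for illustration; reddit-crawler.ipynb may collect different ones. The attribute names (id, title, score, created_utc, num_comments) follow those exposed by PRAW submission objects, and a stand-in object is used so the sketch runs offline.

```python
from types import SimpleNamespace
from typing import Any, Dict


def submission_to_row(submission: Any, now_utc: float) -> Dict[str, Any]:
    """Flatten one Reddit submission into a dict of basic fields (hypothetical schema)."""
    # Age of the post in hours at crawl time, never negative.
    age_hours = max(0.0, (now_utc - submission.created_utc) / 3600.0)
    return {
        "id": submission.id,
        "title_length": len(submission.title),
        "score": submission.score,
        "age_hours": age_hours,
        "num_comments": submission.num_comments,
    }


# Stand-in object for an offline check; real submissions come from the Reddit API.
fake_post = SimpleNamespace(
    id="abc123",
    title="Hello Reddit",
    score=42,
    created_utc=1_700_000_000.0,
    num_comments=7,
)
row = submission_to_row(fake_post, now_utc=1_700_007_200.0)
print(row)
```

If PRAW is the client in use, an iterator such as reddit.subreddit("python").new(limit=100) yields objects that can be passed to submission_to_row one by one.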

The credentials.py file holds the information needed to interact with the Reddit API. For security reasons, we cannot provide our complete credentials. Please contact us for further details.
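One way to keep secrets out of the repository is to have credentials.py read them from environment variables. The sketch below shows that approach under stated assumptions: the variable names and the function are hypothetical, and the actual credentials.py in this project may be organized differently. The three returned fields are the ones a Reddit script app typically requires.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names; the project's credentials.py may use different ones.
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def load_reddit_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read Reddit app credentials from a mapping, by default the process environment."""
    # Treat unset and empty variables the same way: both count as missing.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing Reddit credentials: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": env["REDDIT_USER_AGENT"],
    }
```

If the crawler uses PRAW, the returned dictionary can be unpacked into the client constructor as praw.Reddit(**load_reddit_credentials()).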

Run the app

After installing the required libraries and storing your Reddit app information in credentials.py, you can run the app with the following command:

streamlit run app_T.py

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements.

Contact

If you have any questions, please contact us by phone or email:

Trần Minh Đức, 0344794259, tranminhduc5_t66@hus.edu.vn

Nguyễn Mạnh Tuấn, 0349292753, nguyenmanhtuan_t66@hus.edu.vn

Lê Quốc Lâm, 0337213192, lequoclam_t66@hus.edu.vn

Lê Gia Huy, 0984588603, legiahuy_t66@hus.edu.vn
