This project contains a regression model and a classification model that use a dataset with predefined parameters to determine whether a piece of information is likely to be popular. The dataset can be built by scraping various news websites. In this case, however, the data file that was created and the one that was used are different.

pratham0203/news-popularity-model

news-popularity-model

Contents of the Project:

1) Integrated scraped data and ML model to predict virality (main project):

A regression model is trained on the dataset OnlineNewsPopularityClassification.csv and then used to check the virality of the scraped data. Virality is assessed from the information scraped from the Times of India website.

The file includes the following data for evaluation: (image: Capture3)

Next, sentiment analysis is applied and the relevant words are weighted against those found in popular news, and a model is built from the result.

For this, data such as the number of tokens and the number of shares are taken from the respective website. (image: Capture4)
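
As a rough illustration of how token-based features could be derived from a scraped article, here is a minimal sketch. The function name `extract_token_features` is hypothetical, and the feature names (`n_tokens_title`, `n_tokens_content`, `n_unique_tokens`) are borrowed from the public UCI Online News Popularity dataset on the assumption that the training CSV follows a similar layout; the project's actual feature set may differ. The number of shares cannot be computed from text and would have to come from the site itself.

```python
import re


def extract_token_features(title: str, body: str) -> dict:
    """Compute simple token-count features for one scraped article."""
    # Lowercase, then treat runs of letters, digits, or apostrophes as tokens.
    title_tokens = re.findall(r"[a-z0-9']+", title.lower())
    body_tokens = re.findall(r"[a-z0-9']+", body.lower())
    n_body = len(body_tokens)
    return {
        "n_tokens_title": len(title_tokens),
        "n_tokens_content": n_body,
        # Share of distinct words in the body; 0.0 for an empty body.
        "n_unique_tokens": len(set(body_tokens)) / n_body if n_body else 0.0,
    }


features = extract_token_features(
    "Markets rally as rates hold",
    "Markets rose today. Markets had fallen last week.",
)
print(features)
```

The resulting dictionary can be turned into one row of the feature table that is passed to the trained model.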

Finally, a virality (popularity) score is assigned: (image: Capture55)

The score lies between 0 and 1 (0 corresponds to unpopular news and 1 corresponds to popular news).
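
If a yes/no label is needed rather than a raw score, one simple option is to apply a cutoff. The sketch below assumes a threshold of 0.5, which is an illustrative choice and not something the project specifies; the function name `label_from_score` is hypothetical as well.

```python
def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Map a popularity score in [0, 1] to a readable label.

    The 0.5 default is an assumed cutoff, not a value taken from the project.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    return "popular" if score >= threshold else "not popular"


for s in (0.12, 0.5, 0.93):
    print(s, label_from_score(s))
```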

2) Classification Model:

This model includes several algorithms, such as Logistic Regression, Random Forest Classifier, and SVM, but the one actively used is RandomForestClassifier because it gave the best results.
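
A comparison of this kind might look like the sketch below, which uses scikit-learn. It runs on synthetic data from `make_classification` rather than on the project's CSV, and the hyperparameters are assumptions, so the ranking it prints says nothing about the results obtained in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real feature table (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForestClassifier": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
    "SVC": SVC(),
}

# Fit each candidate and record its accuracy on the held-out split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

best = max(scores, key=scores.get)
print(scores)
print("best on this split:", best)
```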

The output shows the essentials of the model using RandomForestClassifier after it has been trained and tested. (image: Capture)

The output shows the labels before and after standardization, along with the accuracy of this model. (image: Capture1)
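
For readers unfamiliar with the two terms, the sketch below shows what standardization and accuracy compute, assuming that "standardization" here means z-scoring (subtract the mean, divide by the population standard deviation, which is also what scikit-learn's `StandardScaler` does). The helper names are hypothetical and the numbers are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


scaled = standardize([2.0, 4.0, 6.0])
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(scaled)
print(acc)
```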

3) Regression Model:

This model uses Bayesian Linear Regression to solve the problem and report the accuracy of the model.
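
One readily available implementation of Bayesian linear regression is scikit-learn's `BayesianRidge`; whether this project uses that class or a different implementation is not stated, so treat the following as a sketch under that assumption. The data is synthetic (a noisy line), chosen only to show that the model returns both a prediction and an uncertainty estimate.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Synthetic data: y = 2x + 1 plus a little Gaussian noise (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.1, size=100)

model = BayesianRidge()
model.fit(X, y)

# return_std=True also yields the standard deviation of the predictive
# distribution, which is the main benefit of the Bayesian treatment.
mean_pred, std_pred = model.predict(np.array([[5.0]]), return_std=True)
print(mean_pred[0], std_pred[0])
```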

4) News Aggregator (made with Django by scraping popular websites such as Times of India):

I also built a website using Django that can later be used to display popular news on a single site. (image: website)

Web scraping of Times of India: (image: webscraptoi)

Web scraping of Hindustan Times: (image: webscrapht)

Web scraping of The Economist: (image: webscrapet)
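
The scraping step can be pictured roughly as follows. This sketch parses a small hard-coded HTML snippet with Python's standard `html.parser` module; the `headline` class name, the links, and the class `HeadlineParser` are all hypothetical, since the real markup of each news site differs and changes over time, and the project may well use a different scraping library.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect (text, href) pairs from <a class="headline"> links."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr_map = dict(attrs)
            classes = (attr_map.get("class") or "").split()
            if "headline" in classes:
                self._in_headline = True
                self._href = attr_map.get("href")
                self._text = []

    def handle_data(self, data):
        if self._in_headline:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_headline:
            self.headlines.append(("".join(self._text).strip(), self._href))
            self._in_headline = False


# Made-up markup standing in for a downloaded news page.
SAMPLE_HTML = """
<ul>
  <li><a class="headline" href="/news/1">Budget announced</a></li>
  <li><a class="nav" href="/sports">Sports</a></li>
  <li><a class="headline" href="/news/2">Monsoon arrives early</a></li>
</ul>
"""

parser = HeadlineParser()
parser.feed(SAMPLE_HTML)
headlines = parser.headlines
print(headlines)
```

In a Django view, a list like `headlines` could be passed to a template to render the aggregated links.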

Future of the Project:

A tighter integration of the model and the News Aggregator, so that the virality of news can be predicted and links to the articles displayed on my website as soon as possible.

License

The project is available as open source under the terms of the MIT License.
