News categorizer is a Docker-based web service that:
- provides the category of a given news URL
- provides the category of a given news body
Edit docker-compose.yml to match your server setup, then run the following command:
docker-compose up
The dataset used for the predictive model is BBC News. We split BBC News Train.csv
into 20% validation, 10% test, and the remaining 70% training data, using random seed 42.
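The split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual script: the function name `split_dataset` and the shuffle-then-slice approach are assumptions, and a different shuffling method with the same seed would produce a different partition.

```python
import random


def split_dataset(rows, val_frac=0.2, test_frac=0.1, seed=42):
    """Shuffle rows with a fixed seed and return (train, val, test) lists.

    Hypothetical helper: the fractions and seed follow the README,
    the implementation itself is an assumption.
    """
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)

    n_val = round(len(shuffled) * val_frac)
    n_test = round(len(shuffled) * test_frac)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Stand-in for the rows of BBC News Train.csv.
rows = list(range(100))
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

Because the seed is fixed, calling the function again on the same rows returns the same three subsets, which keeps validation and test results comparable between training runs.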
The service uses BERT to predict the category from the content. BERT is fine-tuned with the ktrain library in Colab. You may use the same Colab scripts to train your own model. Make sure the resulting files are named model and model.preproc and are placed under the directory model in the source code.
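Before starting the service, it may help to confirm that the trained files are where the README says they should be. The check below is a small sketch, assuming it is run from the root of the source code; the helper name `missing_model_files` is hypothetical and is not part of the project.

```python
from pathlib import Path

# File names taken from the README; the directory is also called "model".
REQUIRED_FILES = ("model", "model.preproc")


def missing_model_files(model_dir="model"):
    """Return the required file names that are absent from model_dir."""
    base = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from model/:", ", ".join(missing))
    else:
        print("Model files are in place.")
```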
For URL-based detection, it uses a rule-based approach.
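The README does not list the actual rules, so the following is only an illustration of what a rule-based URL categorizer could look like. The keyword table `CATEGORY_KEYWORDS` and the function `categorize_url` are hypothetical; the five category names are those of the BBC News dataset.

```python
from urllib.parse import urlparse

# Hypothetical keyword rules, one set per BBC News category.
# The project's real rules may differ.
CATEGORY_KEYWORDS = {
    "business": {"business", "economy", "markets", "finance"},
    "entertainment": {"entertainment", "arts", "film", "music"},
    "politics": {"politics", "election", "elections"},
    "sport": {"sport", "sports", "football"},
    "tech": {"tech", "technology"},
}


def categorize_url(url):
    """Return the first category whose keyword appears in the URL, else None."""
    parsed = urlparse(url.lower())

    # Collect tokens from the path, splitting segments on "-" and "_".
    tokens = set()
    for segment in parsed.path.split("/"):
        tokens.update(part for part in segment.replace("_", "-").split("-") if part)

    # Host labels can also carry a hint, e.g. sport.example.com.
    tokens.update((parsed.hostname or "").split("."))

    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens & keywords:
            return category
    return None


print(categorize_url("https://www.bbc.com/sport/football/12345"))
```

A URL that matches no rule returns `None`, which leaves room for a fallback such as fetching the page and classifying its body with the BERT model.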
MIT