
Uses a CNN model to categorize Arabic news articles into Economy, Sports, Politics, and Entertainment. Trained on the SANAD dataset.


JoeFarag-00/Arabic-News-Categorizer-v1.0

Arabic News Categorizer

A simple Arabic news classification model built with TensorFlow. It takes user input through a Tkinter GUI.



TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow provides stable Python and C++ APIs, as well as APIs for other languages that carry no backward-compatibility guarantee.

Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.

Install

See the TensorFlow install guide for instructions on installing the pip package, enabling GPU support, using a Docker container, and building from source.

To install the current release, which includes support for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow

Other devices (DirectX and macOS Metal) are supported through device plugins.

A smaller CPU-only package is also available:

$ pip install tensorflow-cpu

To update TensorFlow to the latest version, add the --upgrade flag to the above commands.

Nightly binaries are available for testing using the tf-nightly and tf-nightly-cpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Credits

Original dataset: SANAD

Running the project...

The model has an accuracy of 0.98. Categories include:

  1. Politics

  2. Entertainment

  3. Economy

  4. Sports

To run the project:

  1. First download the modified dataset.

  2. Train the model by running "train.py". Make sure the script points to the correct paths for your dataset and stopwords list.

  3. After training the CNN model, run "classify.py" and enter test articles.
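As a rough illustration of the classification step, a model with four output units produces one score per category, and the predicted label is the category with the highest score. The sketch below is not taken from classify.py: the name label_for is hypothetical, and the label order in CATEGORIES is an assumption, since the real index-to-label mapping depends on how train.py encodes the labels.

```python
# Assumed label order for illustration only; the real order depends on
# how train.py encodes the category labels.
CATEGORIES = ["Politics", "Entertainment", "Economy", "Sports"]


def label_for(scores):
    """Return the category name whose score is highest (hypothetical helper)."""
    if len(scores) != len(CATEGORIES):
        raise ValueError("expected one score per category")
    # Index of the largest score; ties resolve to the first such index.
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return CATEGORIES[best_index]


# Example with made-up scores for a single article.
print(label_for([0.05, 0.10, 0.80, 0.05]))
```

With the assumed order above, the made-up scores in the example select Economy, because the third score is the largest.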

*Make sure the paths to the generated cnn_model.h5 and word2vec.model files are set correctly in classify.py.
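Since wrong paths are the most likely source of errors here, a quick check before launching the scripts can help. The following is a minimal sketch using only the standard library; the helper name missing_files is hypothetical, and the file names assume the generated files sit in the current working directory, which may not match your setup.

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths that do not point to an existing file (hypothetical helper)."""
    return [str(p) for p in paths if not Path(p).is_file()]


# File names as given in this README; adjust them to where your copies live.
required = ["cnn_model.h5", "word2vec.model"]

missing = missing_files(required)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files found.")
```

The same helper can be pointed at the dataset and stopwords list before running train.py.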

