Skip to content

aljbri/garn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Grep Arabic News

Using news-please to download the news articles form the arabic news websit. This is a pre-configration set for that.

How it working

To run the collecter, you need to run the following code in the terminal.

news-please -resume -c config

-c flag used to refer to the configuration folder path

-resume flag used to allow resuming the download

Working Path

On file /config/config.cfg, you can change the working path that will be used to safe the new article. by looking for the "working_path = " and add the path that your are going to use for saving the crawled pages

working_path =  arc-repo

News website config

if you want to add new news websit you need to change the file config/sitelist.hjson you have to add a json object

{
   "url":"here add the news websit url",
   "overwrite_heuristics":{
		"is_not_from_subdomain":true
   }
}

if you need more information you can get from news-please User Guide

About

Grep Arabic News: a pre-configuration set for news-please to download news articles from the Arabic news website

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published