Skip to content
This repository has been archived by the owner. It is now read-only.
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
src
 
 
 
 
 
 
 
 

What is this

This project is dedicated to scraping the North Korean http://kcna.co.jp website for analysis for a friends masters thesis. The KCNA website requires a Japanese IP in order to view its content.

#Installing

Linux and Mac

npm install

Windows

npm install --no-bin-links

Running

Go to the directory you extracted the project to:

cd ~/kcna_scraper

From that directory, run the index file of src

node /src/index.js [dates|content|body|all|help]

Dates

Dates will go through the calendar listing and pull all available dates were content is reported to have been published

Content

Content will go through all dates discovered and find available articles

Body

Body will go through all available articles and parse the contents

All

All will run everything in order

Examining data

Data is stored in the ./cache directory.

About

A KCNA.jp scraper for a friends master thesis on Korean propaganda

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published