This repository has been archived by the owner. It is now read-only.
A KCNA.jp scraper for a friend's master's thesis on Korean propaganda
JavaScript
.idea
cache
src
.gitignore
package-lock.json
package.json
readme.md

What is this

This project scrapes the North Korean http://kcna.co.jp website so its content can be analysed for a friend's master's thesis. The KCNA website requires a Japanese IP address in order to view its content.

Installing

Linux and Mac

npm install

Windows

npm install --no-bin-links

Running

Go to the directory you extracted the project to:

cd ~/kcna_scraper

From that directory, run the index file in src with one of the listed commands:

node src/index.js [dates|content|body|all|help]

Dates

Dates goes through the calendar listing and pulls all available dates on which content is reported to have been published.

Content

Content goes through all discovered dates and finds the available articles.

Body

Body goes through all available articles and parses their contents.

All

All runs dates, content, and body in that order, since each step depends on the output of the previous one.
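
As a rough illustration of how the commands relate, the sketch below maps a command name to the stages it would run. Only the command names come from this readme; `resolveStages`, `run`, and the `handlers` object are hypothetical names, and the actual code in src/index.js may be organised differently.

```javascript
// Hypothetical sketch: the stage names come from the readme, everything else is assumed.
const STAGES = ['dates', 'content', 'body'];

// Resolve a command-line argument into the ordered list of stages to run.
function resolveStages(command) {
  if (command === 'all') return [...STAGES];
  if (STAGES.includes(command)) return [command];
  return []; // 'help' or anything unrecognised runs no stage
}

// Run the stages one after another; `handlers` maps a stage name to an async function.
async function run(command, handlers) {
  const completed = [];
  for (const stage of resolveStages(command)) {
    await handlers[stage](); // wait for each stage before starting the next
    completed.push(stage);
  }
  return completed;
}
```

Running the stages sequentially matters here because each stage reads what the previous one discovered.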

Examining data

Data is stored in the ./cache directory.
