Easy Namuwiki Extractor

Easy NamuWiki Extractor

A simple Namuwiki extractor, built as an extension of Namu Wiki Extractor.

This module strips the namu mark (Namuwiki's markup syntax) from a Namuwiki document and extracts only its plain text.
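To illustrate what "stripping the namu mark" means, here is a minimal sketch that is not taken from this repo. The function name `strip_namu_mark` and the handful of patterns are assumptions chosen for illustration; real namu mark has many more constructs (tables, macros, nested syntax) that a few regular expressions will not cover.

```python
import re

# Hypothetical, simplified patterns (an assumption, not the module's actual rules).
_PATTERNS = [
    # [[target|label]] -> label, [[target]] -> target
    (re.compile(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]"), r"\1"),
    # '''bold''' -> bold (must run before the italic rule)
    (re.compile(r"'''(.*?)'''"), r"\1"),
    # ''italic'' -> italic
    (re.compile(r"''(.*?)''"), r"\1"),
    # == Heading == -> Heading
    (re.compile(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", re.MULTILINE), r"\1"),
    # [* footnote text] -> removed
    (re.compile(r"\[\*[^\]]*\]"), ""),
]


def strip_namu_mark(text):
    """Apply each substitution in order and return the remaining plain text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


marked_up = "== Title ==\n'''Bold''' and [[Seoul|the capital]] and [[Korea]].[* note]"
plain = strip_namu_mark(marked_up)
```

The rules run in a fixed order so that the three-quote bold pattern is consumed before the two-quote italic pattern can match part of it.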



  • Clone this repo : git clone

  • Download the Namuwiki JSON dump into the repo directory: wget

  • You can find the latest dumps here

  • Run extractor: python -i input_json_file -o outputfile_name

  • Options:

--input (-i) : input filename
--output (-o) : output filename
--multiprocess (-m) : run the extraction with the multiprocessing module
--title (-t) : include titles of documents while extracting
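The options above could be declared with `argparse` roughly as follows. This is a sketch rather than the repo's actual parser: treating `-m` and `-t` as boolean switches, marking `-i` and `-o` as required, and the helper name `build_parser` are all assumptions. The filenames in the example call are placeholders.

```python
import argparse


def build_parser():
    """Build a parser mirroring the four documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract plain text from a Namuwiki JSON dump."
    )
    parser.add_argument("-i", "--input", required=True, help="input filename")
    parser.add_argument("-o", "--output", required=True, help="output filename")
    # Assumed to be on/off switches that take no value.
    parser.add_argument("-m", "--multiprocess", action="store_true",
                        help="run with the multiprocessing module")
    parser.add_argument("-t", "--title", action="store_true",
                        help="include document titles in the output")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-i", "input.json", "-o", "output.txt", "-t"])
```

Passing an explicit list to `parse_args` keeps the example independent of how the script is invoked.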

What the Namuwiki JSON looks like

[Image: sample of the Namuwiki JSON dump]
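Since the screenshot is not reproduced here, the following sketch shows one plausible shape for the dump and how it might be read. It assumes the dump is a JSON array of objects with `namespace`, `title`, `text`, and `contributors` keys; those field names, the sample record, and the helper `iter_documents` are assumptions for illustration and should be checked against the actual dump.

```python
import json

# A tiny stand-in for the dump, built in memory (assumed layout, invented record).
sample_dump = json.dumps([
    {
        "namespace": "0",
        "title": "Example",
        "text": "== Heading ==\nBody text",
        "contributors": ["someone"],
    }
])

documents = json.loads(sample_dump)


def iter_documents(documents, with_title=False):
    """Yield each document's raw text, optionally prefixed by its title line."""
    for doc in documents:
        text = doc.get("text", "")
        if with_title:
            yield doc.get("title", "") + "\n" + text
        else:
            yield text


texts = list(iter_documents(documents, with_title=True))
```

The `with_title` switch plays the role that the `--title` option is described as having: when set, the title is emitted alongside the document body.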

Sample Output

[Image: sample extractor output]
