Simple command-line script to clean up data copy-pasted from Rikaikun Chrome Plug-in (https://code.google.com/p/rikaikun/)

Rikaikun Copy-Paste File cleaner

A simple command-line utility to clean up files created by copy-pasting data from Rikaikun.

Rikaikun (https://code.google.com/p/rikaikun/) is a great Chrome extension that does Japanese kanji lookups on the fly when you hover over Japanese words with your cursor. After looking up a word, you can copy its entry to your clipboard by pressing "c", and then paste it into a text file.

Rikaikun provides you with a great deal of information for each lookup. A sample file generated from two lookup-copy-pastes is shown below (lines snipped for display):

通報	つうほう	(n,vs) report; tip; bulletin; ...
通	つう	(adj-na,n,ctr) connoisseur; ...
中西部	ちゅうせいぶ	(n) Mid-west
中	うち	(n,adj-no,pn,arch) inside; within; ...
中	なか	(n) inside; in; among; within; ...
中	じゅう	(suf) through; throughout; ...
中	ちゅう	(suf,abbr,n-suf) medium; average; ...

The above was generated by looking up "通報" and "中西部", and hitting "c" to copy the data from Rikaikun.

Rikaiklean is a simple script that removes the extra entries Rikaikun copies along with the word you looked up, such as 通 and the four readings of 中 above (presumably, I am not interested in things that I did not look up), and prints a condensed version of the same file to the console. The file above would be condensed to the following:

通報	つうほう	(n,vs) report; tip; bulletin; ...
中西部	ちゅうせいぶ	(n) Mid-west

This can be redirected to a file, and then imported into an SRS (Spaced Repetition Software, such as Anki).
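
The condensing step described above could be sketched roughly as follows. This is only an illustration and is not taken from main.rb: the helper name condense is hypothetical, and it assumes that each line is tab-separated with the headword first, and that an extra entry is one whose headword is a shorter prefix of the most recently kept headword (which matches the sample above, but may not cover every case Rikaikun produces).

```ruby
# Hypothetical sketch, not the actual main.rb logic.
# Assumption: fields are tab-separated and the headword comes first.
# Assumption: an "extra" entry is one whose headword is a shorter prefix
# of the most recently kept headword (e.g. 通 after 通報).
def condense(lines)
  kept = []
  last_headword = nil

  lines.each do |raw|
    line = raw.chomp
    next if line.strip.empty?

    headword = line.split("\t", 2).first

    # Skip shorter matches that follow the word that was actually looked up.
    if last_headword &&
       headword.length < last_headword.length &&
       last_headword.start_with?(headword)
      next
    end

    kept << line
    last_headword = headword
  end

  kept
end

sample = [
  "通報\tつうほう\t(n,vs) report; tip; bulletin; ...",
  "通\tつう\t(adj-na,n,ctr) connoisseur; ...",
  "中西部\tちゅうせいぶ\t(n) Mid-west",
  "中\tうち\t(n,adj-no,pn,arch) inside; within; ...",
  "中\tなか\t(n) inside; in; among; within; ..."
]

puts condense(sample)
```

Run on the sample lines, this prints only the 通報 and 中西部 entries. Comparing each line against the last kept headword, rather than against every earlier line, keeps the sketch to a single pass and avoids dropping a short word that was looked up on its own earlier in the file.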

Rikaiklean does a few other small operations, such as combining spelling variations and pronunciations. See test/test_input.txt for a test file and the output.

Usage

Installation and Set Up

Other than setting up Ruby, and perhaps making main.rb executable, there shouldn't be anything to set up.

Usage

A sample run:

$ ruby main.rb test/test_input.txt

This has been written and tested on a Mac (ruby 2.0.0p481 (2014-05-08 revision 45883) [universal.x86_64-darwin13]). It does not use any additional Ruby Gems.
