Compile Pinyinbase glossaries to JSON files. Export to CEDICT text files. Requires Node.js.
Latest commit 2e68ba2 Apr 24, 2018
Type Name Latest commit message Commit time
dist readme-edit Oct 16, 2015
obj Update node-pinyinbase.js Oct 12, 2015
pinyinbase @ 8b60e98 submodule update Oct 19, 2015
.gitignore dotfile-edit Oct 19, 2015
.gitmodules submodule update Aug 31, 2015
LICENSE Initial commit Aug 31, 2015
README.md Update README.md Apr 24, 2018
pod.js fixes #2 Oct 19, 2015

README.md

pinyinpod

  • Requires Node.js.
  • Compiles Pinyinbase glossaries to a single JSON file.
  • Sorts all entries by pinyin, alphabetically.
  • Optionally exports raw JSON or CEDICT text files.
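
As a rough illustration of the sorting step, the sketch below orders entries by their pinyin string. The `pinyin` and `definition` field names and the sample entries are assumptions made for this example; the actual Pinyinbase schema and the comparison that pod.js uses may differ.

```javascript
// Hypothetical entry shape: { pinyin: string, definition: string }.
// These field names are assumptions for illustration, not the real schema.
function sortByPinyin(entries) {
  // Copy first so the caller's array is left untouched.
  return [...entries].sort((a, b) => {
    if (a.pinyin < b.pinyin) return -1;
    if (a.pinyin > b.pinyin) return 1;
    return 0;
  });
}

const sample = [
  { pinyin: 'ni3 hao3', definition: 'hello' },
  { pinyin: 'ai4', definition: 'to love' },
  { pinyin: 'ma1', definition: 'mother' },
];

const sorted = sortByPinyin(sample);
console.log(sorted.map((entry) => entry.pinyin));
```

A plain string comparison keeps the order predictable across machines; a locale-aware comparison would be a reasonable alternative.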

Quick Start

Install Node.js with a package manager.

Clone this repo.

  • $ git clone --recursive https://github.com/pffy/pinyinpod

Need to update repo?

  • In pinyinpod repo root, type: $ git submodule update --remote --recursive

Oops! Need to remove a submodule?

Build a new Pinyinbase.

  • From the pinyinpod repo root directory, type $ node pod.js
  • The outfile (the compiled Pinyinbase) appears in the dist folder.

Command Line Parameters

pod.js

  • $ node pod.js
  • Performs DEFAULT ACTIONS.
    • Entries sorted alphabetically by pinyin.
    • Outputs a JavaScript file with a Pinyinbase object.
    • Outfile name is pinyinbase-outfile-{unix-timestamp}.js.
    • Pinyinbase object (an array of Pinyinbase objects) is called IdxCustomPinyinBase.
    • You can search this object.
    • Outfile created in dist folder.
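
One possible way to search such an array is sketched below. The `entries` array stands in for IdxCustomPinyinBase, and its contents and field names are invented for the example; `searchEntries` is a hypothetical helper, not part of pinyinpod.

```javascript
// Stand-in for the generated IdxCustomPinyinBase array.
// The entries and their field names are made up for this example.
const entries = [
  { pinyin: 'ai4', definition: 'to love' },
  { pinyin: 'ma1', definition: 'mother' },
  { pinyin: 'ni3 hao3', definition: 'hello' },
];

// Return every entry where any string-valued field contains the query.
// Scanning all fields avoids depending on specific property names.
function searchEntries(list, query) {
  const needle = query.toLowerCase();
  return list.filter((entry) =>
    Object.values(entry).some(
      (value) => typeof value === 'string' && value.toLowerCase().includes(needle)
    )
  );
}

const hits = searchEntries(entries, 'hao');
console.log(hits);
```

A linear scan like this is probably fine for small glossaries; larger data sets may call for an index or a database.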

--clean

  • $ node pod.js --clean
  • Cleans dist outfile folder.
    • Deletes all files in dist folder.
    • Deletes dist folder.
    • Creates fresh dist folder with a new README file.
  • Performs default actions.
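
The cleaning steps above could be approximated as follows. This is a sketch only: `cleanDist` is a hypothetical helper, the README filename and text are placeholders, and the demo runs against a temporary folder instead of a real dist folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Delete the folder and everything in it, then recreate it with a README.
// The README filename and text are placeholders, not what pod.js writes.
function cleanDist(distDir) {
  fs.rmSync(distDir, { recursive: true, force: true });
  fs.mkdirSync(distDir, { recursive: true });
  fs.writeFileSync(path.join(distDir, 'README.md'), 'Outfiles appear here.\n');
}

// Demonstrate on a throwaway folder rather than a real dist folder.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'pinyinpod-demo-'));
const demoDist = path.join(demoRoot, 'dist');
fs.mkdirSync(demoDist);
fs.writeFileSync(path.join(demoDist, 'old-outfile.js'), '// stale');

cleanDist(demoDist);
console.log(fs.readdirSync(demoDist));
```

Note that `fs.rmSync` requires a reasonably recent Node.js release (14.14 or later).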

--clean-only

  • $ node pod.js --clean-only
  • Toggles clean-only mode.
  • Performs cleaning actions.
  • Promptly exits when cleaning is done.
    • Does not perform default actions.
    • Exports nothing.
    • When used, all other flags (except --verbose) are ignored.
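
The precedence rule for --clean-only might be expressed like this. `resolveFlags` and the option names it returns are hypothetical; pod.js may parse its arguments quite differently.

```javascript
// Turn a list of command line arguments into a set of options.
// The option names in the returned object are hypothetical.
function resolveFlags(argv) {
  const has = (flag) => argv.includes(flag);
  const options = {
    verbose: has('--verbose'),
    clean: has('--clean') || has('--clean-only'),
    cleanOnly: has('--clean-only'),
    cedictfile: has('--cedictfile'),
    jsonfile: has('--jsonfile'),
    pbjs: has('--pbjs'),
  };
  if (options.cleanOnly) {
    // Clean-only mode: every export flag is ignored; --verbose is kept.
    options.cedictfile = false;
    options.jsonfile = false;
    options.pbjs = false;
  }
  return options;
}

const flags = resolveFlags(['--jsonfile', '--clean-only', '--verbose']);
console.log(flags);
```

In a real script, the argument list would typically come from `process.argv.slice(2)`.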

--cedictfile

  • $ node pod.js --cedictfile
  • Performs default actions.
  • Outputs a CEDICT-formatted dictionary file.
    • Works with legacy Chinese dictionary and learning software.
    • You can distribute your own custom-branded dictionary in a single file.
    • Outfile name is pb-cedict-ts-u8.txt. Overwrites the file if it exists.
    • Outfile created in dist folder.
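
For readers unfamiliar with the CEDICT text format, each line has the shape `Traditional Simplified [pin1 yin1] /definition 1/definition 2/`. The sketch below builds one such line; `toCedictLine` and the property names on the entry are assumptions for illustration and do not reflect how pod.js stores or exports its data.

```javascript
// Format one entry as a CEDICT-style line:
//   Traditional Simplified [pin1 yin1] /definition 1/definition 2/
// The entry's property names are hypothetical.
function toCedictLine(entry) {
  const glosses = entry.definitions.join('/');
  return `${entry.traditional} ${entry.simplified} [${entry.pinyin}] /${glosses}/`;
}

const line = toCedictLine({
  traditional: '媽',
  simplified: '妈',
  pinyin: 'ma1',
  definitions: ['mother', 'mom'],
});
console.log(line);
```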

--jsonfile

  • $ node pod.js --jsonfile
  • Performs default actions.
  • Generates a JSON export file.

--pbjs

  • $ node pod.js --pbjs
  • Performs default actions.
  • Outputs a copy of the default outfile under a convenience filename.
    • Outfile name is pb.js. Overwrites the file if it exists.
    • Outfile created in dist folder.
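
A minimal sketch of the naming and copy steps, under stated assumptions: the timestamp is treated as whole seconds, `defaultOutfileName` and `copyToPbJs` are hypothetical helpers, and the demo writes placeholder content into a temporary folder rather than the real dist folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build the default outfile name. Treating the timestamp as whole seconds
// is an assumption; pod.js may use a different resolution.
function defaultOutfileName(now = Date.now()) {
  return `pinyinbase-outfile-${Math.floor(now / 1000)}.js`;
}

// Copy the default outfile to the convenience name, overwriting any old copy.
function copyToPbJs(distDir, outfileName) {
  const target = path.join(distDir, 'pb.js');
  fs.copyFileSync(path.join(distDir, outfileName), target);
  return target;
}

// Demonstrate in a throwaway folder with placeholder file contents.
const distDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pinyinpod-dist-'));
const outfileName = defaultOutfileName(1445212800000);
fs.writeFileSync(path.join(distDir, outfileName), '// placeholder outfile\n');
const pbPath = copyToPbJs(distDir, outfileName);
console.log(outfileName, '->', pbPath);
```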

--verbose

  • $ node pod.js --verbose
  • Toggles verbose mode.
    • Prints compile information to screen.
  • Performs default actions.

Command Line Examples

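  • $ node pod.js --cedictfile --clean --verbose --jsonfile

    • Cleans the dist folder.
    • Performs default actions.
    • Generates CEDICT export file.
    • Generates JSON export file.
    • Prints program info to screen.
  • $ node pod.js --jsonfile --clean-only

    • Cleans the dist folder, then exits.
    • Ignores --jsonfile. Exports nothing.
    • No output to screen.
  • $ node pod.js --verbose --clean-only

    • Cleans the dist folder, then exits.
    • Exports nothing.
    • Prints program info to screen.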
  • $ node pod.js --verbose --pbjs

    • Performs default actions.
    • Exports JavaScript library file named pb.js.
    • Prints program info to screen.
  • $ node pod.js --cedictfile --jsonfile

    • Performs default actions.
    • Generates CEDICT export file.
    • Generates JSON export file.
    • No output to screen.
  • $ node pod.js --verbose --jsonfile >> foo.txt

    • Performs default actions.
    • Generates JSON export file.
    • Appends compile info to a file named foo.txt.
    • No output to screen.
  • $ node pod.js --verbose --cedictfile > log.txt

    • Performs default actions.
    • Generates CEDICT export file.
    • Writes compile info to a file named log.txt, creating or overwriting it.
    • No output to screen.

Using Pinyinpod

NOTE: Please keep in mind that Pinyinpod is not the product. The product is Pinyinbase. Pinyinpod is an example of the many ways you may compile Pinyinbase glossaries into a single, searchable Pinyinbase JSON document. While this JSON document can be used in solutions, such as Firebase, MongoDB, Cassandra, or HBase (just to name a few NoSQL options), Pinyinpod and its output files are simply reference implementations. If you have better solutions, feel free to add them to your version of Pinyinpod.

Git.io Short URL

Compiling CC-CEDICT

Using a CC-CEDICT dictionary source file is possible, but NEVER recommended for several reasons.

If you really need to use CC-CEDICT dictionary data (e.g., as a customer deliverable), you should rename the dictionary file from cedict_ts.u8 to something like vocab-cmn-cedict-data.txt (anything with the vocab-cmn- prefix will do); then, move it to the pinyinbase folder.
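
The rename-and-move step could be scripted along these lines. This sketch copies the file rather than moving it, so the original download is kept; `stageCedict` is a hypothetical helper, and the demo uses a temporary folder with placeholder content instead of a real CC-CEDICT file or the real pinyinbase folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Copy a CC-CEDICT source file into the glossary folder under a
// vocab-cmn- prefixed name. The default target name comes from the text above.
function stageCedict(sourceFile, glossaryDir, targetName = 'vocab-cmn-cedict-data.txt') {
  if (!targetName.startsWith('vocab-cmn-')) {
    throw new Error('Target name must start with vocab-cmn-');
  }
  const target = path.join(glossaryDir, targetName);
  fs.copyFileSync(sourceFile, target);
  return target;
}

// Demonstrate with placeholder files in a throwaway folder.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pinyinpod-cedict-'));
const glossaryDir = path.join(workDir, 'pinyinbase');
fs.mkdirSync(glossaryDir);
const sourceFile = path.join(workDir, 'cedict_ts.u8');
fs.writeFileSync(sourceFile, '# placeholder dictionary data\n');

const staged = stageCedict(sourceFile, glossaryDir);
console.log(staged);
```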

The Pinyinbase schema is pinyin-optimized for JSON or NoSQL search. However, you may be looking for something else.

An alternative to the Pinyinbase schema is PFFYDICT schema, which is pinyin-optimized for relational database search. So, are you using SQLite? MySQL? MariaDB?

Then, try out PFFYDICT: