Transliterate

A small JavaScript library for transliterating and/or sanitizing strings. Works on Node (LTS) or in the browser. Tested against a wide variety of edge cases and unusual inputs.

View the complete documentation for this library here.

Overview

This library is useful for linguists and data analysts working with language data. It can be used to convert a string from one writing system to another (a process known as "transliteration"), or to remove unwanted characters or sequences of characters from a string (a process known as "sanitization"). This library handles common problems that arise during transliteration and sanitization, including bleeding and feeding issues, where one substitution removes or creates the input for another.
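
Bleeding and feeding can be illustrated without the library. The sketch below is plain JavaScript and is not this library's implementation; `escapeRegExp`, `naiveTransliterate`, and `singlePassTransliterate` are hypothetical helper names introduced here. Applying substitutions one after another lets the output of one rule become the input of the next (feeding), or lets a short rule destroy a sequence that a longer rule needs (bleeding). Matching all rules in a single pass, longest first, is one way to avoid both.

```javascript
// Hypothetical helpers for illustration only; not part of this library's API.

// Escape characters that have special meaning in a regular expression.
const escapeRegExp = str => str.replace(/[.*+?^${}()|[\]\\]/g, `\\$&`);

// Naive approach: apply each substitution in turn over the whole string.
function naiveTransliterate(input, substitutions) {
  return Object.entries(substitutions).reduce(
    (str, [from, to]) => str.split(from).join(to),
    input,
  );
}

// Single-pass approach: match every input sequence at once, longest first,
// so that text which has already been replaced is never scanned again.
function singlePassTransliterate(input, substitutions) {
  const keys = Object.keys(substitutions)
    .filter(key => key.length > 0)
    .sort((a, b) => b.length - a.length);
  if (!keys.length) return input;
  const pattern = new RegExp(keys.map(escapeRegExp).join(`|`), `gu`);
  return input.replace(pattern, match => substitutions[match]);
}

// Feeding: "a" becomes "b", and that new "b" is then rewritten to "c".
const feeding = { a: `b`, b: `c` };
console.log(naiveTransliterate(`ab`, feeding));      // --> "cc"
console.log(singlePassTransliterate(`ab`, feeding)); // --> "bc"

// Bleeding: replacing "s" first destroys the "sh" that the second rule needs.
const bleeding = { s: `z`, sh: `ʃ` };
console.log(naiveTransliterate(`shas`, bleeding));      // --> "zhaz"
console.log(singlePassTransliterate(`shas`, bleeding)); // --> "ʃaz"
```

The single-pass version trades a little setup (building one combined pattern) for results that do not depend on the order in which the substitutions are listed.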

Demo

Check out the Transliterator tool to see this library in use.

Issues & Feature Requests

Click here to open an issue or make a feature request.

Citation & Attribution

This library is maintained by Daniel W. Hieber. To cite this library, please see the citation information on this repository's Zenodo page.

Installation

Install with npm or yarn:

npm install @digitallinguistics/transliterate # npm
yarn add @digitallinguistics/transliterate    # yarn

Or link to the library using the DLx CDN. The library is available, versioned or unversioned, as ES modules or a UMD library, at the following URLs (where X.X.X represents the version number):

  • https://cdn.digitallinguistics.io/scripts/transliterate.js (latest, ES modules)
  • https://cdn.digitallinguistics.io/scripts/transliterate-latest.js (latest, ES modules)
  • https://cdn.digitallinguistics.io/scripts/transliterate-X.X.X.js (versioned, ES modules)
  • https://cdn.digitallinguistics.io/scripts/transliterate.bundle.js (latest, UMD)
  • https://cdn.digitallinguistics.io/scripts/transliterate.bundle-latest.js (latest, UMD)
  • https://cdn.digitallinguistics.io/scripts/transliterate.bundle-X.X.X.js (versioned, UMD)

Importing the Library

In the browser, include the library in your HTML (adjust the src to point to the location of the transliterate.js file in your project):

<!-- Using ES6 modules -->
<script src="transliterate.js" type="module"></script>

<!-- As a global variable -->
<script src="transliterate.bundle.js"></script>

In Node, simply require the library:

const { transliterate } = require(`@digitallinguistics/transliterate`);

Basic Usage

The transliterate library exports four members, two functions and two classes:

  • transliterate
  • Transliterator
  • sanitize
  • Sanitizer

The sanitize and Sanitizer exports are essentially just aliases for transliterate and Transliterator respectively.

To transliterate a string, use the transliterate method:

// Import just the "transliterate" method from the library (Node shown here)
const { transliterate } = require(`@digitallinguistics/transliterate`);

// The list of substitutions to make
const substitutions = {
  p: `b`,
  t: `d`,
  k: `g`,
};

// The string to transliterate
const input = `patak`;

// Transliterate the string
const output = transliterate(input, substitutions);

console.log(output); // --> "badag"

To save a set of transliteration rules for reuse on more than one string, use the Transliterator class:

// Import just the Transliterator class (Node shown here)
const { Transliterator } = require(`@digitallinguistics/transliterate`);

// The list of substitutions to use for transliteration
const substitutions = {
  p: `b`,
  t: `d`,
  k: `g`,
};

// Create a transliterate function that always
// applies the same substitutions
const transliterate = new Transliterator(substitutions);

// The string to transliterate
const input = `patak`;

// Transliterate the string
const output = transliterate(input);

console.log(output); // --> "badag"
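
The pattern above, where `new Transliterator(substitutions)` yields a callable function, is possible because of a JavaScript feature: when a constructor explicitly returns an object (functions included), that object becomes the result of `new`. The following is a minimal sketch of that pattern and not this library's source; `ReusableReplacer` is a hypothetical name, and its simple single-pass replacement is an assumption made for illustration.

```javascript
// Hypothetical class for illustration only; not this library's implementation.
class ReusableReplacer {
  constructor(substitutions) {
    // Copy the rules so later changes to the caller's object have no effect.
    const rules = { ...substitutions };

    // Build one pattern that matches any input sequence, longest first.
    const keys = Object.keys(rules)
      .filter(key => key.length > 0)
      .sort((a, b) => b.length - a.length)
      .map(key => key.replace(/[.*+?^${}()|[\]\\]/g, `\\$&`));
    const pattern = keys.length ? new RegExp(keys.join(`|`), `gu`) : null;

    // Returning a function from the constructor makes it the result of `new`.
    return input => (pattern ? input.replace(pattern, match => rules[match]) : input);
  }
}

// The rules are stored once and reused for every string.
const voice = new ReusableReplacer({ p: `b`, t: `d`, k: `g` });

console.log(typeof voice);   // --> "function"
console.log(voice(`patak`)); // --> "badag"
console.log(voice(`kapa`));  // --> "gaba"
```

Capturing the rules in a closure means the pattern is compiled once, which is the main benefit of a reusable transliterator over passing the same substitutions on every call.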

Contributing

Check out the Contributing Guide.
