fastLangID

Stand-alone Language Identification for Node.js JavaScript based on FastText.js

fastLangID APIs

This version of fastLangID provides the following JavaScript APIs:

fastLangID.new({})
fastLangID.load()
fastLangID.unload()
fastLangID.detectDocument(String)
fastLangID.detectSentences(String)

How to Install

git clone https://github.com/loretoparisi/fastLangID.git
cd fastLangID
npm install

Install via NPM

fastLangID is available as an npm module. To add the package to your project:

npm install --save fastLangID

How to Use

To load the fastLangID language detector, call the load API:

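// Create a detector instance, then load it before running any detection.
var langId = new FastLangID({});
langId.load()
.then(done => {
    // The detector is ready to use.
})
.catch(error => {
    // Loading failed.
    console.error(error);
});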
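The load/unload pair can also be wrapped with async/await. The sketch below is only an illustration: `withDetector` is a hypothetical helper, not part of fastLangID. It assumes that load() returns a Promise (as the .then() chain above suggests) and that unload() can be called once the work is done; whether unload() is synchronous is not documented here, so the result is awaited either way.

```javascript
// Hypothetical helper, not part of fastLangID.
// Assumes load() returns a Promise and unload() releases the detector.
async function withDetector(langId, task) {
  await langId.load();
  try {
    // Run the caller's work while the detector is loaded.
    return await task(langId);
  } finally {
    // `await` is harmless if unload() turns out to be synchronous.
    await langId.unload();
  }
}
```

With a real detector this could be called as `withDetector(new FastLangID({}), d => d.detectDocument(text))`, which guarantees that unload() runs even if detection throws.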

Detect Document

To detect the language of a document, call the detectDocument API after fastLangID has been loaded:

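var document = "I caught a glimpse of him from the bus.";
var langId = new FastLangID({});
langId.load()
.then(done => {
    // Run detection only after the detector has loaded.
    return langId.detectDocument(document);
})
.then(detection => {
    // detection holds the predicted label and the per-language scores.
})
.catch(error => {
    // Loading or detection failed.
    console.error(error);
});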

This returns a JSON object with the label predictions, where label is the most likely language and scores contains the probability for each language code. Language codes follow ISO 639-1 (two-letter codes, such as IT).

{
  "label": "EN",
  "scores": {
    "EN": 0.989217,
    "ES": 0.00151357
  }
}
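As an illustration of consuming this result, the following sketch ranks the scores and applies a confidence threshold. `rankLanguages`, `confidentLabel` and the default threshold of 0.9 are hypothetical and not part of fastLangID; the code relies only on the `{ label, scores }` shape shown above.

```javascript
// Hypothetical helpers; they rely only on the { label, scores } shape shown above.

// Return [code, probability] pairs sorted from most to least likely.
function rankLanguages(detection) {
  return Object.entries(detection.scores).sort((a, b) => b[1] - a[1]);
}

// Return the top language code, or null when its probability is below minScore.
function confidentLabel(detection, minScore = 0.9) {
  const ranked = rankLanguages(detection);
  if (ranked.length === 0) return null;
  const [code, score] = ranked[0];
  return score >= minScore ? code : null;
}

const detection = { label: "EN", scores: { EN: 0.989217, ES: 0.00151357 } };
console.log(confidentLabel(detection));       // EN
console.log(confidentLabel(detection, 0.99)); // null
```

Returning null instead of a low-confidence label lets the caller decide how to treat short or ambiguous input.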

Detect Sentences

To detect the language of each sentence in a document, call the detectSentences API after fastLangID has been loaded. Sentences in the input text must be separated by line terminators (\n or \r\n); consecutive terminators such as \n\n are also supported.

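var document = "I caught a glimpse of him from the bus.\nIch habe gewusst, dass ihr Tom nicht vergessen würdet.\nVoi avete una famiglia numerosa?";
var langId = new FastLangID({});
langId.load()
.then(done => {
    // Each line of the input is detected separately.
    return langId.detectSentences(document);
})
.then(detections => {
    // detections is an array with one entry per input line.
})
.catch(error => {
    // Loading or detection failed.
    console.error(error);
});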

This returns a JSON array of objects, one per detected sentence. In each object, line contains the input sentence, while detection contains the label and scores as described above.

[
  {
    "line": "I caught a glimpse of him from the bus.",
    "detection": {
      "label": "EN",
      "scores": {
        "EN": 0.989217,
        "ES": 0.00151357
      }
    }
  },
  {
    "line": "Ich habe gewusst, dass ihr Tom nicht vergessen würdet.",
    "detection": {
      "label": "DE",
      "scores": {
        "DE": 0.999884,
        "FR": 0.0000696995
      }
    }
  },
  {
    "line": "Voi avete una famiglia numerosa?",
    "detection": {
      "label": "IT",
      "scores": {
        "IT": 0.995784,
        "ES": 0.0011519
      }
    }
  }
]
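One possible way to post-process this array is to group the input lines by their detected language. `groupByLanguage` below is a hypothetical helper, not part of fastLangID, and it assumes only the `[{ line, detection: { label, scores } }]` shape shown above.

```javascript
// Hypothetical helper; assumes the [{ line, detection: { label } }] shape shown above.
function groupByLanguage(detections) {
  const groups = {};
  for (const { line, detection } of detections) {
    // Create the bucket for this language code on first use.
    if (!groups[detection.label]) {
      groups[detection.label] = [];
    }
    groups[detection.label].push(line);
  }
  return groups;
}

const detections = [
  { line: "I caught a glimpse of him from the bus.", detection: { label: "EN", scores: { EN: 0.989217 } } },
  { line: "Voi avete una famiglia numerosa?", detection: { label: "IT", scores: { IT: 0.995784 } } }
];
console.log(Object.keys(groupByLanguage(detections))); // [ 'EN', 'IT' ]
```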
