YujiroTakahashi/fastText-php

fastText-php

fastText-php provides PHP bindings for fastText.

fastText is a library for efficient learning of word representations and sentence classification.

Requirements

PHP 7.x
fastText shared object

$ curl -fSL "https://github.com/facebookresearch/fastText/archive/v0.9.1.tar.gz" -o "./fastText-0.9.1.tgz"
$ tar xf fastText-0.9.1.tgz
$ cd fastText-0.9.1
$ mkdir build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release
$ make -j $(nproc)
$ sudo make install

Building fastText for PHP

$ cd fastText-php
$ phpize
$ ./configure
$ make -j $(nproc)
$ sudo make install

Edit your php.ini and add:

extension=fasttext.so
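
After restarting PHP, it can be worth confirming that the extension was picked up. The following is a minimal sketch, assuming the extension registers under the name `fasttext` (inferred from `fasttext.so`); `fastTextAvailable` is a hypothetical helper name, not part of the bindings.

```php
<?php
/**
 * Hypothetical helper: report whether the fastText bindings are usable.
 * Assumption: the extension name is "fasttext" and it defines the
 * fastText class listed in the class synopsis.
 */
function fastTextAvailable(): bool
{
    return extension_loaded('fasttext') && class_exists('fastText');
}

if (fastTextAvailable()) {
    echo 'fastText extension is available', PHP_EOL;
} else {
    // php_ini_loaded_file() returns false when no php.ini was loaded.
    $ini = php_ini_loaded_file() ?: '(no php.ini loaded)';
    echo 'fastText extension is not loaded; php.ini in use: ', $ini, PHP_EOL;
}
```

Printing the active php.ini path helps when the CLI and the web server read different configuration files.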

Class synopsis

fastText {
    public __construct ( void )
    public int load ( string filename )
    public int getWordRows ( void )
    public int getLabelRows ( void )
    public int getWordId ( string word )
    public string getWord ( int word_id )
    public string getLabel ( int label_id )
    public array getWordVectors ( string word )
    public array getSentenceVectors ( string sentence )
    public mixed getPredict ( string word [, int k] )
    public mixed getNN ( string word [, int k] )
    public mixed getAnalogies ( string word [, int k] )
    public mixed getNgramVectors ( string word )
}

Table of Contents

fastText::__construct
fastText::load
fastText::getWordRows
fastText::getLabelRows
fastText::getWordId
fastText::getWord
fastText::getLabel
fastText::getWordVectors
fastText::getSentenceVectors
fastText::getPredict
fastText::getNN
fastText::getAnalogies
fastText::getNgramVectors

return value format


fastText::__construct

Instantiates a fastText object.

$ftext = new fastText();

fastText::load

Loads a model.

$model = 'result/model.bin';
$ftext->load($model);
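
The meaning of the integer returned by `load` is not described here, so one defensive option is to check the model file before calling it. This is a sketch only; `loadModelOrFail` is a hypothetical helper, and it accepts any object that exposes a `load(string)` method.

```php
<?php
/**
 * Hypothetical helper: verify the model file is readable, then load it.
 * $ftext is any object exposing load(string), such as a fastText instance.
 */
function loadModelOrFail($ftext, string $path)
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Model file not readable: {$path}");
    }
    // The int returned by load() is passed back unchanged because its
    // meaning is not documented in this README.
    return $ftext->load($path);
}

// Only exercised when the extension is actually installed.
if (class_exists('fastText')) {
    $ftext = new fastText();
    try {
        loadModelOrFail($ftext, 'result/model.bin');
    } catch (RuntimeException $e) {
        echo $e->getMessage(), PHP_EOL;
    }
}
```

Failing early with a clear message avoids passing a bad path into the native library.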

fastText::getWordRows

Returns the number of words in the vocabulary.

$rows = $ftext->getWordRows();
$words = [];
for ($idx = 0; $idx < $rows; $idx++) {
    $words[$idx] = $ftext->getWord($idx);
}

fastText::getLabelRows

Returns the number of labels.

$rows = $ftext->getLabelRows();
$labels = [];
for ($idx = 0; $idx < $rows; $idx++) {
    $labels[$idx] = $ftext->getLabel($idx);
}

fastText::getWordId

Returns the ID of a word within the dictionary.

$word = 'Bern';
$rowId = $ftext->getWordId($word);

fastText::getWord

Converts an ID into a word.

$rows = $ftext->getWordRows();
$words = [];
for ($idx = 0; $idx < $rows; $idx++) {
    $words[$idx] = $ftext->getWord($idx);
}
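
The loop above can be wrapped into a reverse lookup table from word to ID. This is a sketch under the assumption that IDs run from 0 to `getWordRows() - 1`, as the loop suggests; `buildWordIndex` is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: build a word => ID lookup table from a loaded model.
 * $ftext is any object exposing getWordRows() and getWord(int).
 */
function buildWordIndex($ftext): array
{
    $index = [];
    $rows = $ftext->getWordRows();
    for ($id = 0; $id < $rows; $id++) {
        // Later IDs overwrite earlier ones if a word were ever repeated.
        $index[$ftext->getWord($id)] = $id;
    }
    return $index;
}
```

With a loaded model, `isset(buildWordIndex($ftext)['Bern'])` then answers membership queries without a call into the extension per lookup. Note that PHP casts numeric-string array keys such as '42' to integers.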

fastText::getLabel

Converts an ID into a label.

$rows = $ftext->getLabelRows();
$labels = [];
for ($idx = 0; $idx < $rows; $idx++) {
    $labels[$idx] = $ftext->getLabel($idx);
}

fastText::getWordVectors

Returns the vector representation of a word.

$vectors = $ftext->getWordVectors('Beijing');
print_r($vectors);
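
Word vectors are commonly compared with cosine similarity. The sketch below assumes that `getWordVectors` returns a flat array of numbers, which is not confirmed here; `cosineSimilarity` is a hypothetical helper written in plain PHP.

```php
<?php
/**
 * Cosine similarity of two equal-length numeric vectors.
 * Returns 0.0 when either vector has zero norm.
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same dimension');
    }
    // Normalise keys so both arrays are indexed 0..n-1.
    $a = array_values($a);
    $b = array_values($b);

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $y = $b[$i];
        $dot += $x * $y;
        $normA += $x * $x;
        $normB += $y * $y;
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}
```

With a loaded model, `cosineSimilarity($ftext->getWordVectors('Beijing'), $ftext->getWordVectors('Bern'))` would yield a score between -1 and 1, provided the assumption about the return shape holds.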

fastText::getSentenceVectors

Returns the vector representation of a sentence.

$sentence = "It's fine day";

$vectors = $ftext->getSentenceVectors($sentence);
print_r($vectors);

fastText::getPredict

  • array fastText::getPredict(string word [, int k])
  • FALSE fastText::getPredict(string word [, int k])

Predicts the most likely labels with their probabilities.

$probs = $ftext->getPredict('Berlin');
foreach ($probs as $row) {
    echo $row['label'].'  '.$row['prob'].PHP_EOL;
}
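
Because `getPredict` may return FALSE instead of an array, iterating over the result directly can raise a warning. A minimal sketch of a guard that also drops low-probability rows follows; `filterPredictions` and the threshold parameter are hypothetical, and the row keys `label` and `prob` are taken from the example above.

```php
<?php
/**
 * Hypothetical helper: keep predictions whose probability is at least $minProb.
 * Accepts the array-or-FALSE result of getPredict() and always returns an array.
 */
function filterPredictions($rows, float $minProb): array
{
    if (!is_array($rows)) {
        return []; // getPredict() returned FALSE
    }
    $kept = array_filter($rows, function ($row) use ($minProb) {
        return $row['prob'] >= $minProb;
    });
    // Re-index so the result is a plain list starting at 0.
    return array_values($kept);
}
```

A call such as `filterPredictions($ftext->getPredict('Berlin'), 0.2)` can then be passed to `foreach` without a separate FALSE check.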

fastText::getNN

  • array fastText::getNN(string word [, int k])
  • FALSE fastText::getNN(string word [, int k])

Queries for nearest neighbors.

$probs = $ftext->getNN('Washington, D.C.');
foreach ($probs as $row) {
    echo $row['label'].'  '.$row['prob'].PHP_EOL;
}

fastText::getAnalogies

  • array fastText::getAnalogies(string word [, int k])
  • FALSE fastText::getAnalogies(string word [, int k])

Queries for analogies.

$probs = $ftext->getAnalogies('Paris + France - Spain');
foreach ($probs as $row) {
    echo $row['label'].'  '.$row['prob'].PHP_EOL;
}

fastText::getNgramVectors

  • array fastText::getNgramVectors(string word)
  • FALSE fastText::getNgramVectors(string word)

Returns the n-gram vectors.

$res = $ftext->getNgramVectors('London');
print_r($res);

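return value format

Each row returned by getPredict, getNN, and getAnalogies has the following format:

$probs = [
    ['label' => '__label__1', 'prob' => 0.4234],
    ['label' => '__label__2', 'prob' => 0.2345],
    ['label' => '__label__3', 'prob' => 0.1456],
    // ...
];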
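
A list of rows in this format can be turned into a label => probability map with PHP's built-in `array_column`. The sketch below uses the sample values shown above as stand-in data; `indexByLabel` is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: convert rows of ['label' => ..., 'prob' => ...]
 * into an associative array keyed by label.
 */
function indexByLabel(array $rows): array
{
    // Third argument selects the column used as the key.
    return array_column($rows, 'prob', 'label');
}

// Stand-in data copied from the format example, not real model output.
$probs = [
    ['label' => '__label__1', 'prob' => 0.4234],
    ['label' => '__label__2', 'prob' => 0.2345],
    ['label' => '__label__3', 'prob' => 0.1456],
];

$byLabel = indexByLabel($probs);
foreach ($byLabel as $label => $prob) {
    echo $label, '  ', $prob, PHP_EOL;
}
```

Keying by label makes lookups such as `$byLabel['__label__2']` direct, at the cost of losing the original ordering guarantee if labels were ever duplicated.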