Load into Memory #1

myoldusername · 2017-10-18T01:29:39Z

Dear @loretoparisi,
I installed your fasttext.js to solve the memory problem that we discussed in facebookresearch/fastText#276 (comment).

When I run:
node fasttext_predict.js
it takes about 5 seconds to load the module:

"use strict";

(function() {

var DATA_ROOT='./data';

var FastText = require('./fasttext.js/lib/index');
var fastText = new FastText({
    loadModel: DATA_ROOT + '/model_gender.bin' // must specify filename and extension
});

var sample="Bashar Al Masri";
fastText.load()
.then(done => {
    return fastText.predict(sample);
})
.then(labels=> {
    console.log("TEXT:", sample, "\nPREDICT:",labels );
    sample="Hisahm al mjude";
    return fastText.predict(sample);
})
.then(labels=> {
    console.log("TEXT:", sample, "\nPREDICT:",labels );
    fastText.unload();
})
.catch(error => {
    console.error("predict error",error);
});

}).call(this);

It prints the prediction to stdout and then exits, because of fastText.unload().
What I need is to call this file as "node fasttext_predict.js UserName" from anywhere, passing an argument [UserName], and get the result directly on stdout so that my PHP web server can read it. You said the model would stay loaded in memory.

It is the same problem as with loading from the C++ binary: I need it to keep running in the background!

loretoparisi · 2017-10-18T08:54:53Z

@myoldusername I have just updated the library with several improvements for the child process run. I have also added a server example that will help your needs. Please check the README.
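To illustrate the idea of loading the model once and answering many requests, here is a minimal sketch of such a server. It is not the example shipped with the library (see the README for that). The names `textFromUrl`, `createPredictServer` and `start` are hypothetical, and `predictor` is assumed to be any object with `load()` and `predict(text)` methods that return Promises, like the FastText instance in the script above.

```javascript
const http = require('http');
const { URL } = require('url');

// Extract the "text" query parameter from a request URL such as "/?text=bader".
function textFromUrl(requestUrl) {
  return new URL(requestUrl, 'http://localhost').searchParams.get('text');
}

// Build an HTTP server around an already loaded predictor (assumed interface:
// predict(text) returns a Promise that resolves to the labels).
function createPredictServer(predictor) {
  return http.createServer((req, res) => {
    const text = textFromUrl(req.url);
    if (!text) {
      res.writeHead(400, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'missing "text" query parameter' }));
      return;
    }
    predictor.predict(text)
      .then(labels => {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ predict: labels }));
      })
      .catch(error => {
        res.writeHead(500, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: String(error) }));
      });
  });
}

// Load the model a single time, then start listening. The process stays
// alive, so later requests do not pay the loading cost again.
function start(predictor, port) {
  return predictor.load().then(() => new Promise(resolve => {
    const server = createPredictServer(predictor);
    server.listen(port, () => resolve(server));
  }));
}

// Usage (assumes `fastText` was created as in the script above):
// start(fastText, 3030).then(() => console.log('server is listening on 3030'));
```

With a process like this running in the background, a PHP page only needs to issue an HTTP GET to `http://localhost:3030/?text=...` and decode the JSON reply, instead of spawning `node` for every prediction.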

myoldusername · 2017-10-18T10:57:45Z

Understood. I am out of town, but I will test it later today and give you feedback.

You are awesome......

myoldusername · 2017-10-19T00:40:28Z

It is working as expected, THANK YOU SO MUCH.
You made my day!

Can I send you a donation?

myoldusername · 2017-10-19T15:29:45Z

Sometimes it crashes when I pass a Unicode string, such as Chinese text.

I advise you to add a normalization method that removes all non-letter characters, e.g. special characters and emoji.

loretoparisi · 2017-10-19T16:34:50Z

@myoldusername yes, this is a good point. There are some minor functions in utils, like Util.removeDiacritics, here: https://github.com/loretoparisi/fasttext.js/blob/master/lib/util.js#L238

and the dataset is normalized in FastText.normalize: https://github.com/loretoparisi/fasttext.js/blob/master/lib/index.js#L438

But of course it is different for languages written in non-Latin scripts, since those must be handled with Unicode, i.e. Unicode conversion and normalization before prediction.
Be aware that this normalization must be applied to the training set too, i.e. you have to apply the same normalization to the training/test sets and to the sample at inference time.

In my backend I do Unicode normalization in Java, but here I would prefer a Node solution. I will look into it!
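As a rough idea of what a Node-only normalization could look like, here is a sketch that relies on the built-in `String.prototype.normalize` and on Unicode property escapes in regular expressions (available in recent Node versions). The function name `normalizeText` and the choice to keep only letters, marks, digits and whitespace are assumptions for illustration. This is not what `FastText.normalize` does.

```javascript
// Hypothetical helper: Unicode-aware cleanup of a sample before prediction.
function normalizeText(text) {
  return String(text)
    // NFKC folds compatibility forms (e.g. full-width letters) into canonical ones.
    .normalize('NFKC')
    // Replace anything that is not a letter, combining mark, digit or whitespace
    // (punctuation, symbols, emoji, control characters) with a space.
    .replace(/[^\p{L}\p{M}\p{N}\s]/gu, ' ')
    // Collapse runs of whitespace, including line breaks, into one space.
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(normalizeText('Hello 😀, World!!'));
console.log(normalizeText('랄랄라\n차차차'));
```

Whatever rules are chosen, the same function would have to be run over the training and test files before training, so that the model and the inference samples see the same kind of text.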

myoldusername · 2017-10-19T18:28:00Z

Well, I am working with the language classification training set provided by fastText, with all due respect to them.

I pass paragraphs in various languages to the localhost URL and it usually works, but sometimes it suddenly crashes, even with normalized strings. I am not sure why; I will run further tests to see whether my copy-pasted strings contain hidden characters, since Unicode has some nasty stuff, lol.

Regarding the Node solution, I think it would be an awesome idea.

With respect.

Yours

loretoparisi · 2017-10-19T21:57:06Z

Yes, this can be very tricky when dealing with languages that need Unicode. By the way, I'm using the same model too, so I have added the compressed version of the model to the example, along with some environment variables, so that you can run:

cd examples/
export MODEL=./data/lid.176.ftz 
export PORT=9001
node server

and then

http://localhost:9001/?text=%EB%9E%84%EB%9E%84%EB%9D%BC%20%EC%B0%A8%EC%B0%A8%EC%B0%A8%20%EB%9E%84%EB%9E%84%EB%9D%BC\n%EB%9E%84%EB%9E%84%EB%9D%BC%20%EC%B0%A8%EC%B0%A8%EC%B0%A8%20%EC%9E%A5%EC%9C%A4%EC%A0%95%20%ED%8A%B8%EC%9C%84%EC%8A%A4%ED%8A%B8%20%EC%B6%A4%EC%9D%84%20%EC%B6%A5%EC%8B%9C%EB%8B%A4

which will be correctly detected as KO:

{
	"response_time": 0.001,
	"predict": [{
			"label": "KO",
			"score": "1"
		},
		{
			"label": "TR",
			"score": "1.95313E-08"
		}
	]
}

NOTE
My input text was 랄랄라%20차차차%20랄랄라\n랄랄라%20차차차%20장윤정%20트위스트%20춤을%20춥시다, but when you put it in a URL it is automatically percent-encoded, as the encodeURIComponent function would do.
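When the request is built from code rather than typed into a browser, the encoding has to be done explicitly. A small sketch, where `buildPredictUrl` is a hypothetical helper and the host and port are the ones from the example above:

```javascript
// Hypothetical helper: build the query URL for the example server.
// encodeURIComponent percent-encodes the UTF-8 bytes of every character that
// is not safe in a query string, including spaces and line breaks.
function buildPredictUrl(baseUrl, text) {
  return baseUrl + '/?text=' + encodeURIComponent(text);
}

console.log(buildPredictUrl('http://localhost:9001', '랄랄라 차차차'));
```

Note that a real line break in the input is encoded as `%0A`, which is different from the two literal characters `\n`.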

myoldusername · 2017-10-20T12:52:50Z

Well, I would like to bring to your attention that sometimes when I pass a regular string, the node server freezes for unknown reasons and I have to kill it and restart it.

loretoparisi · 2017-10-20T13:04:54Z

@myoldusername please paste here the text and the URL, copied from the browser.

myoldusername · 2017-10-20T13:10:41Z

http://localhost:3030/?text=bader

loretoparisi · 2017-10-20T13:31:26Z

Hmm, I guess you have some issue in your environment:

$ export PORT=3030
$ export MODEL=./data/lid.176.ftz 
$ node server.js 
model loaded
server is listening on 3030

You can then call http://localhost:3030/?text=bader and you get:

{
	response_time: 0.002,
	predict: [{
			label: "EN",
			score: "0.125931"
		},
		{
			label: "CA",
			score: "0.0847617"
		}
	]
}

This should work without any issues:

$ time curl -s "http://localhost:3030/?text=bader"
{
  "response_time": 0,
  "predict": [
    {
      "label": "EN",
      "score": "0.125931"
    },
    {
      "label": "CA",
      "score": "0.0847617"
    }
  ]
}
real	0m0.027s
user	0m0.005s
sys	0m0.006s

And now some benchmarking as well, calling it 1, 10 and 100 times in sequence:

$ ab -n 1 "http://localhost:3030/?text=bader"
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done


Server Software:        
Server Hostname:        localhost
Server Port:            3030

Document Path:          /?text=bader
Document Length:        164 bytes

Concurrency Level:      1
Time taken for tests:   0.001 seconds
Complete requests:      1
Failed requests:        0
Total transferred:      271 bytes
HTML transferred:       164 bytes
Requests per second:    712.76 [#/sec] (mean)
Time per request:       1.403 [ms] (mean)
Time per request:       1.403 [ms] (mean, across all concurrent requests)
Transfer rate:          188.63 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1    1   0.0      1       1
Waiting:        1    1   0.0      1       1
Total:          1    1   0.0      1       1
[loretoparisi@:mbploreto task]$ ab -n 10 "http://localhost:3030/?text=bader"
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done


Server Software:        
Server Hostname:        localhost
Server Port:            3030

Document Path:          /?text=bader
Document Length:        164 bytes

Concurrency Level:      1
Time taken for tests:   0.011 seconds
Complete requests:      10
Failed requests:        4
   (Connect: 0, Receive: 0, Length: 4, Exceptions: 0)
Total transferred:      2726 bytes
HTML transferred:       1656 bytes
Requests per second:    941.00 [#/sec] (mean)
Time per request:       1.063 [ms] (mean)
Time per request:       1.063 [ms] (mean, across all concurrent requests)
Transfer rate:          250.50 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    1   0.5      1       2
Waiting:        0    1   0.3      1       1
Total:          1    1   0.5      1       2

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      2
  95%      2
  98%      2
  99%      2
 100%      2 (longest request)
[loretoparisi@:mbploreto task]$ ab -n 100 "http://localhost:3030/?text=bader"
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done


Server Software:        
Server Hostname:        localhost
Server Port:            3030

Document Path:          /?text=bader
Document Length:        168 bytes

Concurrency Level:      1
Time taken for tests:   0.095 seconds
Complete requests:      100
Failed requests:        73
   (Connect: 0, Receive: 0, Length: 73, Exceptions: 0)
Total transferred:      27208 bytes
HTML transferred:       16508 bytes
Requests per second:    1054.37 [#/sec] (mean)
Time per request:       0.948 [ms] (mean)
Time per request:       0.948 [ms] (mean, across all concurrent requests)
Transfer rate:          280.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    1   1.2      0       9
Waiting:        0    1   1.2      0       9
Total:          0    1   1.2      1       9

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      2
  95%      3
  98%      7
  99%      9
 100%      9 (longest request)

loretoparisi · 2017-11-03T20:27:48Z

I have added some benchmarks here, so I'm closing this issue. Feel free to re-open it if you have any problems.

loretoparisi closed this as completed Nov 3, 2017

loretoparisi pushed a commit that referenced this issue Mar 2, 2019

Merge pull request #1 from goroakimoto/goroakimoto-patch-1

ce3c371

bug fix FastText.prototype.nn()
