
Training a model fails #552

Closed

campoy opened this issue Jan 26, 2019 · 13 comments
@campoy

campoy commented Jan 26, 2019

Hi there,

Today I tried to use style-analyzer on a new repository, starting from scratch, as part of a demo and to better understand the whole project.

I failed, and this is a report on how.

Following the quickstart guide

I tried to follow the steps in this list.
Unfortunately, I don't understand how this could work, since there's no trained model yet. Is there?

I tried it anyway:

$ python3 -m lookout run lookout.style.format -c config.yml
INFO:d823:run:Created SQLAlchemyModelRepository(db=sqlite:////tmp/lookout.sqlite, fs=/tmp)
INFO:d823:run:Created DataService(0.0.0.0:10301)
INFO:d823:run:Created AnalyzerManager(style.format.analyzer.FormatAnalyzer/1)
INFO:d823:run:Created EventListener(0.0.0.0:9930, 1 workers)
INFO:d823:run:Listening 0.0.0.0:9930
INFO:9d89:EventListener:new ReviewEvent
INFO:9d89:code-format:Reading /tmp/github.com/campoy/node/style.format.analyzer.FormatAnalyzer_1.asdf...
INFO:9d89:AnalyzerManager:cache miss: style.format.analyzer.FormatAnalyzer
INFO:9d89:DataService:Opened <grpc._channel.Channel object at 0x7f312284f5c0>
WARNING:9d89:FeaturesExtractor:could not parse file benchmark/cluster/echo.js with error 'Couldn't find the token in the specified position:
Node role: Operator
Parsed form: “=”
Raw form: “”
Start position: 0, 0, 0
End position: 0, 0, 0', skipping
INFO:9d89:AnalyzerManager:style.format.analyzer.FormatAnalyzer: 0 comments
INFO:9d89:EventListener:OK 0.870

Ok, so I do need a trained model first. Searching for "train" in the README doesn't help, so I search the filenames instead and find a train.py file under lookout/style/format/research.

Giving up on the docs, let's train this thing!

Ok, so the docs don't really tell me much ... I'll read the python code.
It seems like I need to create input and output directories:

$ mkdir ~/training_dir
$ cd ~/training_dir && git clone https://github.com/nodejs/node
$ mkdir ~/output_path

And finally, once everything is set up, I start training the model!

$ python3 train.py ~/training_dir/node/ ~/output_path/

It seems like it's working but it's quite slow ... I start looking at the logs of the containers started by docker-compose up for lookout and I see something interesting in the logs of bblfsh:

time="2019-01-25T20:40:45Z" level=error msg="request processed content 35313 bytes, status Fatal" elapsed=2.299299ms filename="doc/api/os.md" language=markdown
time="2019-01-25T20:40:45Z" level=error msg="error selecting pool: unexpected error: runtime failure: missing driver for language "markdown""

Wait ... why are we trying to parse markdown? And ... why is it failing? status Fatal is far from being meaningful. Anyway, it seems like we're spending a crazy amount of time parsing markdown, python, and other languages which don't seem relevant to the style-analyzer since it only supports javascript.
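Filtering on the client side would sidestep this entirely; below is a minimal sketch (a hypothetical helper, not part of style-analyzer's API), under the assumption that the format analyzer only consumes JavaScript sources:

```python
from pathlib import PurePosixPath

# Assumption: the format analyzer only handles JavaScript, so anything
# else (markdown, python, ...) can be dropped before it reaches bblfsh.
SUPPORTED_EXTENSIONS = {".js", ".jsx", ".mjs"}

def filter_parseable(filenames):
    """Keep only the files worth sending to the javascript driver."""
    return [name for name in filenames
            if PurePosixPath(name).suffix.lower() in SUPPORTED_EXTENSIONS]
```

This would also avoid the noisy `status Fatal` / "missing driver" errors for languages that were never going to be parsed.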

So I drop all the unnecessary drivers using bblfshctl; now there's only one:

$ docker exec -it lookout_bblfsh_1 bblfshctl driver list
+------------+------------------------------------------+-------------+--------+-----------+--------+-----+----------+
|  LANGUAGE  |                  IMAGE                   |   VERSION   | STATUS |  CREATED  |   OS   | GO  |  NATIVE  |
+------------+------------------------------------------+-------------+--------+-----------+--------+-----+----------+
| javascript | docker://bblfsh/javascript-driver:v1.2.0 | dev-adcd1b4 | beta   | 10 months | alpine | 1.9 | 8.9.3-r0 |
+------------+------------------------------------------+-------------+--------+-----------+--------+-----+----------+
Response time 773.632µs
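For reference, the removal step might look roughly like this, assuming bblfshctl's `driver remove` subcommand (documented for bblfshd) and the container name from the listing above:

```shell
# Assumption: "bblfshctl driver remove <language>" exists in this bblfshd
# version, and the container is named lookout_bblfsh_1 as shown above.
remove_unneeded_drivers() {
    for lang in markdown python go java ruby php bash; do
        docker exec lookout_bblfsh_1 bblfshctl driver remove "$lang"
    done
}
# Afterwards, verify only javascript remains:
# docker exec lookout_bblfsh_1 bblfshctl driver list
```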

After doing this, the training process accelerates and soon I get to 16525 iterations ... where I get this:

16525it [38:35, 46.06it/s]ERROR:2c2a:grpc._common:Exception deserializing message!
Traceback (most recent call last):
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_common.py", line 82, in _transform
    return transformer(message)
google.protobuf.message.DecodeError: Error parsing message
Traceback (most recent call last):
  File "train.py", line 89, in <module>
    main()
  File "train.py", line 85, in main
    train(**vars(args))
  File "train.py", line 69, in train
    FakeDataService(bblfsh_client, prepare_files(filenames, bblfsh_client, language), None)
  File "/home/francesc/style-analyzer/lookout/style/format/utils.py", line 50, in prepare_files
    res = client.parse(file)
  File "/home/francesc/.local/lib/python3.5/site-packages/bblfsh/client.py", line 71, in parse
    return self._stub.Parse(request, timeout=timeout)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 550, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Exception deserializing response!"
        debug_error_string = "None"
>

Oh ... that's bad. Maybe it's bad luck and I should run it again.

16526it [07:05, 49.21it/s]ERROR:1676:grpc._common:Exception deserializing message!
Traceback (most recent call last):
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_common.py", line 82, in _transform
    return transformer(message)
google.protobuf.message.DecodeError: Error parsing message
Traceback (most recent call last):
  File "train.py", line 89, in <module>
    main()
  File "train.py", line 85, in main
    train(**vars(args))
  File "train.py", line 69, in train
    FakeDataService(bblfsh_client, prepare_files(filenames, bblfsh_client, language), None)
  File "/home/francesc/style-analyzer/lookout/style/format/utils.py", line 50, in prepare_files
    res = client.parse(file)
  File "/home/francesc/.local/lib/python3.5/site-packages/bblfsh/client.py", line 71, in parse
    return self._stub.Parse(request, timeout=timeout)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 550, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Exception deserializing response!"
        debug_error_string = "None"
>

Ok, so at least now it fails much faster (7 minutes vs 38), but the error is still pretty cryptic.
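One way a long training run could survive a single bad file is a defensive wrapper around the parse call; a rough sketch (`parse_or_skip` is a hypothetical helper, not style-analyzer API):

```python
def parse_or_skip(parse, filename):
    """Call a bblfsh-style parse function, returning None instead of raising.

    Both grpc's _Rendezvous error and protobuf's DecodeError derive from
    Exception, so a broad catch lets the training loop skip one
    unparseable file rather than die 38 minutes in.
    """
    try:
        return parse(filename)
    except Exception as exc:  # deliberately broad: log, skip, move on
        print("skipping %s: %s" % (filename, exc))
        return None
```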

Time to go to bed.

@campoy
Author

campoy commented Jan 26, 2019

Just to be sure, I tried with a different repo, this time vue (https://github.com/vuejs/vue), and the same error occurred (just much faster).

465it [01:02,  7.39it/s]ERROR:1863:grpc._common:Exception deserializing message!
Traceback (most recent call last):
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_common.py", line 82, in _transform
    return transformer(message)
google.protobuf.message.DecodeError: Error parsing message
Traceback (most recent call last):
  File "train.py", line 89, in <module>
    main()
  File "train.py", line 85, in main
    train(**vars(args))
  File "train.py", line 69, in train
    FakeDataService(bblfsh_client, prepare_files(filenames, bblfsh_client, language), None)
  File "/home/francesc/style-analyzer/lookout/style/format/utils.py", line 50, in prepare_files
    res = client.parse(file)
  File "/home/francesc/.local/lib/python3.5/site-packages/bblfsh/client.py", line 71, in parse
    return self._stub.Parse(request, timeout=timeout)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 550, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/francesc/.local/lib/python3.5/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Exception deserializing response!"
        debug_error_string = "None"
>

@campoy
Author

campoy commented Jan 26, 2019

After a while, something happened: maybe the analyzer that I started running initially actually trained something?

I see these logs:

/home/francesc/.local/lib/python3.5/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.
0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
INFO:9d89:FormatAnalyzer:trained {'__init__': True,
 'created_at': datetime.datetime(2019, 1, 26, 0, 36, 12, 350828),
 'dependencies': [],
 'model': 'code-format',
 'uuid': '5bc4738b-e235-4719-b9c7-1599b5346e6d',
 'version': [1]}
style.format.analyzer.FormatAnalyzer/[1] https://github.com/campoy/node.git 4385240d999708ab6a3904095d9666c5aba5221c
# javascript
23 rules, avg.len. 6.4
DEBUG:code-format:/ruless/thresholds/ -> lz4 compression
DEBUG:code-format:/ruless/features/ -> lz4 compression
DEBUG:code-format:/ruless/cls/ -> lz4 compression
DEBUG:code-format:/ruless/support/ -> lz4 compression
DEBUG:code-format:/ruless/cmps/ -> lz4 compression
DEBUG:code-format:/ruless/conf/ -> lz4 compression
DEBUG:code-format:/ruless/lengths/ -> lz4 compression
DEBUG:code-format:/ruless/artificial/ -> lz4 compression
DEBUG:code-format:/origin_configs/feature_extractor/selected_features/ -> lz4 compression
INFO:9d89:EventListener:OK 464.649

The logs are full of messages of this style:

WARNING:9d89:FeaturesExtractor:could not parse file test/parallel/test-fs-read-stream-fd.js with error 'Couldn't find the token in the specified position:
Node role: Operator
Parsed form: “+=”
Raw form: “”
Start position: 0, 0, 0
End position: 0, 0, 0', skipping

And when I send a PR (https://github.com/campoy/node/pull/4/files) to the analyzer, the parser fails:

INFO:9d89:EventListener:new ReviewEvent
WARNING:9d89:FeaturesExtractor:could not parse file benchmark/cluster/echo.js with error 'Couldn't find the token in the specified position:
Node role: Operator
Parsed form: “=”
Raw form: “”
Start position: 0, 0, 0
End position: 0, 0, 0', skipping
INFO:9d89:AnalyzerManager:style.format.analyzer.FormatAnalyzer: 0 comments
INFO:9d89:EventListener:OK 0.551

@vmarkovtsev
Collaborator

According to the logs, the babelfish driver has the wrong version. We will add a check, since this is critical.

So we need to update the docs, because everything has changed recently. There are two ways to run the thing; you tried the developer's way, which is trickier to set up.

@vmarkovtsev
Collaborator

Version check is blocked by bblfsh/python-client#141
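The check could be as simple as comparing version tuples before training starts; a sketch under the assumption that driver tags look like `v1.2.0` (all names here are illustrative, not the actual style-analyzer API):

```python
# Hypothetical version gate: refuse to start training when the installed
# javascript driver is older than the version the analyzer needs.
REQUIRED = (1, 2, 0)

def parse_version(tag):
    """Turn a tag like 'v1.2.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))

def check_driver(installed_tag, required=REQUIRED):
    installed = parse_version(installed_tag)
    if installed < required:
        raise RuntimeError(
            "javascript driver %s is older than required %s"
            % (installed_tag, ".".join(map(str, required))))
    return installed
```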

@campoy
Author

campoy commented Jan 29, 2019

The babelfish driver for javascript is docker://bblfsh/javascript-driver:v1.2.0 ... isn't that the correct one?

@vmarkovtsev
Collaborator

In my experience, sometimes you are sure that the driver is correct, but it's actually not. I had exactly the same problem before the Eng demo.

@campoy
Author

campoy commented Jan 29, 2019

This is a serious problem, then.

Does this mean there's an issue with babelfish not pulling the right version? If so, the @src-d/language-analysis team should be aware of this.

@vmarkovtsev
Collaborator

It pulls but it is still tricky because a tiny mistake ruins everything.

@creachadair

It pulls but it is still tricky because a tiny mistake ruins everything.

Sorry, I may be missing some context here: What kind of mistake can cause this to happen? If we can do something to make such errors less likely, I'm interested to know.

@vmarkovtsev
Collaborator

I was mainly talking about docker: a container restart kills the driver if there is no volume.
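A named volume over bblfshd's driver storage would keep installed drivers across container restarts; a minimal docker-compose sketch, assuming the service is called bblfsh (matching the lookout_bblfsh_1 container above) and that bblfshd stores its drivers under /var/lib/bblfshd:

```yaml
services:
  bblfsh:
    image: bblfsh/bblfshd
    privileged: true          # bblfshd manages driver containers itself
    volumes:
      - bblfshd-drivers:/var/lib/bblfshd
volumes:
  bblfshd-drivers:
```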

@m09
Contributor

m09 commented Jan 30, 2019

In my experience it happens when you first install the recommended driver and then install the correct version.

@zurk
Contributor

zurk commented Jan 30, 2019

My experience is the same as Hugo's. And there is an issue for it: bblfsh/bblfshd#184.

@campoy
Author

campoy commented Feb 4, 2019

Ok, I'll close this as a duplicate of bblfsh/bblfshd#184, in that case.
