Two identical report runs can give different results #557

Closed
zurk opened this issue Jan 29, 2019 · 5 comments
Assignees: zurk
Labels: bug (Something isn't working), format (Issues related to format analyzer), medium (Medium size)

Comments

zurk (Contributor) commented Jan 29, 2019

Ancestor issue: #511

Commit: fdf2a03

Report 1: quality_report_20190125_fdf2a03.zip
Report 2: quality_report_2_fdf2a03.zip

The main differences appear for several random repositories and affect precision as well as full_support, which is supposed to be constant anyway.

Main suspects:

  1. bblfsh. Something can break randomly while style-analyzer is running, which would explain why we get different full_support numbers (see the sketch below).
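A minimal way to probe this suspect, assuming the Python bblfsh client of that era (bblfsh.BblfshClient, whose parse() response carries a status field) and a placeholder file path; this is only a sketch, not code from the report pipeline:

import bblfsh  # Python client for the Babelfish daemon

client = bblfsh.BblfshClient("0.0.0.0:9432")

# Parse the same file repeatedly; if bblfsh were deterministic, every attempt
# would return the same status and the same UAST.
statuses = set()
for _ in range(100):
    response = client.parse("sample.js")  # placeholder path
    statuses.add(response.status)

print("distinct parse statuses seen:", statuses)  # more than one value means flaky parsing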
zurk added the bug, format, and medium labels on Jan 29, 2019
zurk self-assigned this on Jan 29, 2019
zurk (Contributor, Author) commented Jan 29, 2019

So, I was able to confirm that bblfsh behaves differently.
There is a message "request processed content 3563 bytes, status Fatal\" elapsed=5.000225686s filename=\"resources/js/routes.js\" language=javascript\n" in the logs of the second run and no such message in the first one.

Both log files: bblfsh_logs.zip

@vmarkovtsev suggests hashing the bblfsh output and saving it so that it can be compared between the reports. That is what I am going to do next.
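A minimal sketch of that idea, assuming the Python bblfsh client and a hypothetical output file name; note that protobuf serialization is not guaranteed to be canonical, so the hashes are only a rough equality check between two runs:

import hashlib

import bblfsh  # Python client for the Babelfish daemon

client = bblfsh.BblfshClient("0.0.0.0:9432")

def uast_digest(path):
    # Hash the serialized parse result so two report runs can be compared file by file.
    response = client.parse(path)
    return hashlib.sha256(response.uast.SerializeToString()).hexdigest()

# Hypothetical usage: dump one digest per file and diff the dumps of the two runs.
with open("uast_hashes.txt", "w") as out:
    for path in ["resources/js/routes.js"]:  # replace with the files of the analyzed repo
        out.write("%s %s\n" % (uast_digest(path), path))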

vmarkovtsev (Collaborator) commented

@dennwc @creachadair this makes our life much more complex ^
Also CC-ing @smacker - maybe something misbehaves on the Lookout side...

zurk (Contributor, Author) commented Jan 29, 2019

Unfortunately, I don't have a good way to reproduce the weird bblfsh behavior on demand, in case you are wondering how to reproduce the bug. But I can share the experiment scripts that produce these logs.
I run them on science-3:

export NUM=20190129
# clean up containers from any previous attempt
docker rm -f bblfshd_style_analyzer_$NUM report_gen_$NUM
# start bblfshd v2.11.0 and install the JavaScript driver
docker run -d --rm --name bblfshd_style_analyzer_$NUM --privileged bblfsh/bblfshd:v2.11.0
docker exec bblfshd_style_analyzer_$NUM bblfshctl driver install javascript docker://bblfsh/javascript-driver:v1.2.0
# start the style-analyzer container linked to that bblfshd instance
docker run -it --link bblfshd_style_analyzer_$NUM -e LOOKOUT_BBLFSHD=ipv4://bblfshd_style_analyzer_$NUM:9432 --entrypoint bash -v /storage/konstantin/lookout-workdir/reports_release_$NUM:/reports --name report_gen_$NUM -e BBLFSH srcd/style-analyzer:github

Inside the Docker container:

# install the tools needed to clone and build
apt update; apt install -y git make
export BRANCH=fdf2a03976627967fa9422f1c06611f77c433380
export REPORT_NAME=quality
# fetch style-analyzer at the commit under test
rm -rf style-analyzer/
git clone https://github.com/src-d/style-analyzer/
cd style-analyzer
git checkout $BRANCH
pip3 install -r requirements.txt
# generate the quality report into /reports/<commit>
export REPORTS_DIR=/reports/$BRANCH
mkdir $REPORTS_DIR
export JOBLIB_TEMP_FOLDER=/tmp
make report-$REPORT_NAME

So if I run these scripts twice, I can get different results. I see that a different number of UAST nodes is collected for some repos. The bblfsh logs show the same thing: some files fail to parse in one experiment but not in the other.
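The failed files can also be extracted from the attached bblfshd logs and compared between the two runs; a rough sketch, with the log file names left as placeholders:

import re
import sys

# bblfshd reports failed parses with lines like:
#   ... status Fatal ... filename="resources/js/routes.js" language=javascript
FATAL = re.compile(r'status Fatal.*?filename=\\?"([^"\\]+)')

def fatal_files(log_path):
    failed = set()
    with open(log_path, errors="replace") as log:
        for line in log:
            match = FATAL.search(line)
            if match:
                failed.add(match.group(1))
    return failed

run1 = fatal_files(sys.argv[1])  # e.g. the log of the first report run
run2 = fatal_files(sys.argv[2])  # e.g. the log of the second report run
print("failed only in run 1:", sorted(run1 - run2))
print("failed only in run 2:", sorted(run2 - run1))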

I think this does not help a lot, but that is all I have for now.

Any ideas about what could be wrong are appreciated.

zurk (Contributor, Author) commented Jan 30, 2019

My investigation output is mostly described here: https://github.com/bblfsh/bblfshd/issues/236

TL;DR:

  1. I was able to reproduce it in pure Python, so it is a bblfsh problem and not a lookout one.
  2. It looks like a driver problem, not a daemon problem.
  3. The utility I added in Fix Quality Report reproducibility #562 to check the number of vnodes in the bblfsh output is hard to use because bblfsh is much more unstable than I expected. @vmarkovtsev suggests restarting bblfshd for each run, and maybe that is a good solution (sketch below). I continue to work on the PR.
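A sketch of the restart idea, wrapping the same docker commands as in the reproduction script above (the container name and image versions are the same placeholders):

import subprocess

NUM = "20190129"
BBLFSHD = "bblfshd_style_analyzer_" + NUM

def restart_bblfshd():
    # Recreate the bblfshd container so every report run starts from a fresh daemon.
    subprocess.call(["docker", "rm", "-f", BBLFSHD])
    subprocess.check_call([
        "docker", "run", "-d", "--rm", "--name", BBLFSHD,
        "--privileged", "bblfsh/bblfshd:v2.11.0"])
    subprocess.check_call([
        "docker", "exec", BBLFSHD, "bblfshctl", "driver", "install",
        "javascript", "docker://bblfsh/javascript-driver:v1.2.0"])

# Hypothetical usage: call restart_bblfshd() before each `make report-quality` invocation.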

zurk (Contributor, Author) commented Feb 5, 2019

It was accidentally closed
