-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Local/Remote benchmarking tool #810
Conversation
44919a4
to
b46526f
Compare
db7c656
to
4e113b5
Compare
Tooling ready, with TaskCluster execution: https://public-artifacts.taskcluster.net/HTRYR3AhQfy9qMm_-K8Zvw/0/public/benchmark.png Sadly, it's going to be too much tricky to be able to test remote SSH from TaskCluster. |
12f6cbe
to
3855be3
Compare
FTR, basic RPi3 password-based SSH auth with host defined in ~/.ssh/config gets as:
|
fb3e52f
to
8cca1bd
Compare
Looks good so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a bunch of nits and comments, mostly about Python 2/3 compatibility, but one larger comment is that I think you should split benchmark_nc (what's "nc"?) into two separate scripts, one for the benchmark and one for plotting.
I wish we didn't have to do all the SSH stuff, but I can see how it makes life easier for benchmarking on a RPi.
Oh, and remember to remove the "hack" commit and fix the commit messages.
.taskcluster.yml
Outdated
@@ -327,6 +327,7 @@ tasks: | |||
export TASKCLUSTER_TASK_DIR="$(find $(dirname `pwd`) -name "task-*" -type d -mindepth 1 -maxdepth 1)" && | |||
git clone --quiet {{event.head.repo.url}} ${TASKCLUSTER_TASK_DIR}/DeepSpeech/ds/ && | |||
cd ${TASKCLUSTER_TASK_DIR}/DeepSpeech/ds && git checkout --quiet {{event.head.sha}} && | |||
patch -d ${TASKCLUSTER_TASK_DIR}/DeepSpeech/tf -p1 < brew.patch && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just fix this in mozilla/tensorflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's already taken care of by mozilla/tensorflow#26 and #820 :)
# We still need to get model, wav and alphabet | ||
download_data | ||
|
||
# Follow benchmark naming from parameters in bin/run-tc-ldc93s1.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. What benchmark naming? Which parameters in run-tc-ldc93s1.sh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model names are being used to extract dimensions informations. Clearly, your comments later shows it's not made clear enough :)
tc-benchmark-tests.sh
Outdated
|
||
# Follow benchmark naming from parameters in bin/run-tc-ldc93s1.sh | ||
# Okay, it's not really the real LSTM sizes, just a way to verify how things | ||
# actually behaves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: behave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
done; | ||
|
||
# Let's prepare another model for single-model codepath | ||
mv /tmp/${model_name} /tmp/test.frozen.e75.lstm494.ldc93s1.pb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe name it lstmdefault
instead of lstm494
, to avoid tying this code to the n_hidden value from bin/run-tc-ldc93s1.sh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the 494 dimension will be used when plotting later :)
tc-benchmark-tests.sh
Outdated
--dataset "TaskCluster model" ${csv} \ | ||
--title "TaskCluster model benchmark" \ | ||
--plot ${png} \ | ||
--size 1280x720 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confusing that the same script behaves as two entirely different programs here. My suggestion would be to split it in two and change them to read/write from/to stdin/stdout, but I haven't reviewed benchmark_nc.py
yet, so maybe I'm missing something.
util/tc.py
Outdated
@@ -0,0 +1,47 @@ | |||
#!/usr/bin/env python | |||
from __future__ import print_function | |||
from __future__ import absolute_import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: combine these lines.
Import division here as well, so you don't need to explicitly type hint divisions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
util/tc.py
Outdated
import sys | ||
import os | ||
import stat | ||
import urllib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be six.moves.urllib
for Python 2 and 3 compat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pydoc says there is no urlretrieve
in six.moves.urllib
:/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
util/tc.py
Outdated
target_file = os.path.join(target_dir, tc_filename) | ||
if not os.path.isfile(target_file): | ||
print('Downloading %s ...' % tc_url) | ||
urllib.urlretrieve(tc_url, target_file, reporthook=(report_progress if progress else None)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
six.moves.urllib
mimics the Python 3 structure, so this needs to be urllib.request.urlretrieve
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answers above then :)
util/tc.py
Outdated
|
||
def maybe_download_tc(target_dir=None, tc_url=None, progress=True): | ||
def report_progress(count, block_size, total_size): | ||
percent = int((count * block_size * 100) / total_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the int() here with __future__.division
, just use //
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
util/tc.py
Outdated
'https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.%(arch_string)s/artifacts/public/native_client.tar.xz') | ||
|
||
def get_tc_url(arch_string=None): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: too much whitespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
About splitting, I thought about that, but I felt it would not really help that much and might complexify things later. |
bcd066e
to
5bc32e9
Compare
@reuben Previous push with the default arguments cleanup was good, current push is running and it includes splitting into two scripts :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r=me with those two comments addressed.
bin/benchmark_nc.py
Outdated
|
||
sftp = ssh_conn.open_sftp() | ||
if not stat.S_ISDIR(sftp.stat(dir).st_mode): | ||
print('No remote directory: %s' % dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still missing from this call.
util/benchmark.py
Outdated
''' | ||
fs = '' | ||
for c in s: | ||
if ord(c) >= ord('0') and ord(c) <= ord('9'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use c.isdigit()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are done :)
5bc32e9
to
ccecc62
Compare
It's all green except on OSX because a lot of other TaskCluster (big) tasks are pending there. I'm merging anyway, since benchmark code is not exercized there :) |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Fixes #684