Local/Remote benchmarking tool #810

Merged
merged 1 commit into from
Sep 11, 2017

Conversation

lissyx
Collaborator

@lissyx lissyx commented Sep 1, 2017

Fixes #684

@lissyx lissyx self-assigned this Sep 1, 2017
@lissyx lissyx force-pushed the benchmark-tool branch 8 times, most recently from 44919a4 to b46526f Compare September 5, 2017 15:51
@lissyx lissyx force-pushed the benchmark-tool branch 8 times, most recently from db7c656 to 4e113b5 Compare September 6, 2017 09:31
@lissyx
Collaborator Author

lissyx commented Sep 6, 2017

Tooling ready, with TaskCluster execution: https://public-artifacts.taskcluster.net/HTRYR3AhQfy9qMm_-K8Zvw/0/public/benchmark.png

Sadly, it's going to be too tricky to test remote SSH from TaskCluster.

@lissyx lissyx force-pushed the benchmark-tool branch 4 times, most recently from 12f6cbe to 3855be3 Compare September 6, 2017 13:11
@lissyx
Collaborator Author

lissyx commented Sep 6, 2017

FTR, basic RPi3 password-based SSH auth, with the host defined in ~/.ssh/config, goes like this:

python bin/benchmark_nc.py [...] --target MozRPi3-ARMv6-Paris --autotrust --no-allowagent --no-lookforkeys [...]
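
The three SSH flags in that command line up with options that paramiko exposes. The sketch below is only an assumption about how such flags could be mapped, not the code from benchmark_nc.py: the parser, `build_connect_options`, and the dictionary layout are hypothetical, and only the flag names come from the command above.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the flag names are taken from the command above.
    parser = argparse.ArgumentParser(description='SSH options sketch')
    parser.add_argument('--target',
                        help='host alias, e.g. one defined in ~/.ssh/config')
    parser.add_argument('--autotrust', action='store_true',
                        help='accept unknown host keys automatically')
    parser.add_argument('--no-allowagent', dest='allowagent',
                        action='store_false',
                        help='do not ask a running SSH agent for keys')
    parser.add_argument('--no-lookforkeys', dest='lookforkeys',
                        action='store_false',
                        help='do not search ~/.ssh for private keys')
    return parser


def build_connect_options(args):
    # allow_agent and look_for_keys are keyword arguments of
    # paramiko.SSHClient.connect(); with both disabled, password auth remains.
    # The alias in args.target would still need resolving, for example
    # through paramiko.SSHConfig.
    return {
        'hostname': args.target,
        'allow_agent': args.allowagent,
        'look_for_keys': args.lookforkeys,
    }


args = build_parser().parse_args(
    ['--target', 'MozRPi3-ARMv6-Paris', '--autotrust',
     '--no-allowagent', '--no-lookforkeys'])
options = build_connect_options(args)
```

In this reading, `--autotrust` would presumably correspond to installing `paramiko.AutoAddPolicy()` with `set_missing_host_key_policy()` before connecting.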

@lissyx lissyx force-pushed the benchmark-tool branch 6 times, most recently from fb3e52f to 8cca1bd Compare September 6, 2017 17:10
@lissyx
Collaborator Author

lissyx commented Sep 6, 2017

Looks good so far.

Contributor

@reuben reuben left a comment

I have a bunch of nits and comments, mostly about Python 2/3 compatibility, but one larger comment is that I think you should split benchmark_nc (what's "nc"?) into two separate scripts, one for the benchmark and one for plotting.

I wish we didn't have to do all the SSH stuff, but I can see how it makes life easier for benchmarking on an RPi.

Oh, and remember to remove the "hack" commit and fix the commit messages.

.taskcluster.yml Outdated
@@ -327,6 +327,7 @@ tasks:
export TASKCLUSTER_TASK_DIR="$(find $(dirname `pwd`) -name "task-*" -type d -mindepth 1 -maxdepth 1)" &&
git clone --quiet {{event.head.repo.url}} ${TASKCLUSTER_TASK_DIR}/DeepSpeech/ds/ &&
cd ${TASKCLUSTER_TASK_DIR}/DeepSpeech/ds && git checkout --quiet {{event.head.sha}} &&
patch -d ${TASKCLUSTER_TASK_DIR}/DeepSpeech/tf -p1 < brew.patch &&
Contributor

Can't we just fix this in mozilla/tensorflow?

Collaborator Author

That's already taken care of by mozilla/tensorflow#26 and #820 :)

# We still need to get model, wav and alphabet
download_data

# Follow benchmark naming from parameters in bin/run-tc-ldc93s1.sh
Contributor

I don't understand this comment. What benchmark naming? Which parameters in run-tc-ldc93s1.sh?

Collaborator Author

Model names are used to extract dimension information. Clearly, your later comments show that this isn't made clear enough :)


# Follow benchmark naming from parameters in bin/run-tc-ldc93s1.sh
# Okay, it's not really the real LSTM sizes, just a way to verify how things
# actually behaves.
Contributor

nit: behave.

Collaborator Author

Done

done;

# Let's prepare another model for single-model codepath
mv /tmp/${model_name} /tmp/test.frozen.e75.lstm494.ldc93s1.pb
Contributor

Maybe name it lstmdefault instead of lstm494, to avoid tying this code to the n_hidden value from bin/run-tc-ldc93s1.sh?

Collaborator Author

Because the 494 dimension will be used when plotting later :)
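
To illustrate why the number has to stay in the file name, here is a minimal sketch of pulling the LSTM size out of a dotted model name. It assumes the naming scheme visible in the `mv` line above; `lstm_size_from_model_name` is a hypothetical helper, and the parsing in the actual benchmark script may differ.

```python
def lstm_size_from_model_name(model_name):
    """Return the integer after 'lstm' in a dotted model file name, or None."""
    for field in model_name.split('.'):
        # Accept only fields of the form 'lstm<digits>'.
        if field.startswith('lstm') and field[4:].isdigit():
            return int(field[4:])
    return None


size = lstm_size_from_model_name('test.frozen.e75.lstm494.ldc93s1.pb')
```

With this kind of parsing, a name such as `lstmdefault` would yield no dimension to plot against.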

--dataset "TaskCluster model" ${csv} \
--title "TaskCluster model benchmark" \
--plot ${png} \
--size 1280x720
Contributor

It's confusing that the same script behaves as two entirely different programs here. My suggestion would be to split it in two and change them to read/write from/to stdin/stdout, but I haven't reviewed benchmark_nc.py yet, so maybe I'm missing something.
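
A minimal sketch of what that split could look like, with the benchmark side writing CSV to a stream and the plotting side reading it back. Everything here is an assumption for illustration: the column names, the function names, and the sample value are made up, and the real scripts may exchange data differently.

```python
import csv
import io
import sys

FIELDS = ['model', 'mean_seconds']  # hypothetical column names


def write_results(rows, out=sys.stdout):
    """Benchmark side: emit one CSV row per measured model."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator='\n')
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def read_results(inp=sys.stdin):
    """Plotting side: load the rows back, converting timings to float."""
    return [(row['model'], float(row['mean_seconds']))
            for row in csv.DictReader(inp)]


# Stand-in for a shell pipe between the two scripts; the timing is made up.
pipe = io.StringIO()
write_results([{'model': 'lstm494', 'mean_seconds': 1.25}], out=pipe)
pipe.seek(0)
results = read_results(pipe)
```

Keeping the interface to plain CSV on stdin/stdout means either half can be run, or replaced, on its own.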

util/tc.py Outdated
@@ -0,0 +1,47 @@
#!/usr/bin/env python
from __future__ import print_function
from __future__ import absolute_import
Contributor

nit: combine these lines.
Import division here as well, so you don't need to explicitly type hint divisions below.

Collaborator Author

Done

util/tc.py Outdated
import sys
import os
import stat
import urllib
Contributor

This needs to be six.moves.urllib for Python 2 and 3 compat.

Collaborator Author

Pydoc says there is no urlretrieve in six.moves.urllib :/

Collaborator Author

Done

util/tc.py Outdated
target_file = os.path.join(target_dir, tc_filename)
if not os.path.isfile(target_file):
    print('Downloading %s ...' % tc_url)
    urllib.urlretrieve(tc_url, target_file, reporthook=(report_progress if progress else None))
Contributor

six.moves.urllib mimics the Python 3 structure, so this needs to be urllib.request.urlretrieve.

Collaborator Author

Answers above then :)
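
For context, `urlretrieve` lives at `urllib.urlretrieve` on Python 2 and at `urllib.request.urlretrieve` on Python 3, which is the layout `six.moves.urllib` mirrors. The sketch below shows the same idea without depending on six, using a plain import fallback instead; it is not the code that landed in util/tc.py, and `maybe_download` with its parameters is a hypothetical stand-in for the function under review.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3 location
except ImportError:
    from urllib import urlretrieve  # Python 2 location


def maybe_download(target_dir, url, filename, reporthook=None):
    """Download url into target_dir/filename unless the file already exists."""
    target_file = os.path.join(target_dir, filename)
    if not os.path.isfile(target_file):
        print('Downloading %s ...' % url)
        urlretrieve(url, target_file, reporthook=reporthook)
    return target_file
```

With six installed, the import fallback would collapse to a single `from six.moves.urllib.request import urlretrieve`.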

util/tc.py Outdated

def maybe_download_tc(target_dir=None, tc_url=None, progress=True):
    def report_progress(count, block_size, total_size):
        percent = int((count * block_size * 100) / total_size)
Contributor

Don't need the int() here with __future__.division, just use //.

Collaborator Author

Done
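
As an illustration of the suggested change: once `/` means true division (Python 3, or Python 2 with `from __future__ import division`), `//` gives the integer percentage directly. This is a sketch rather than the committed code; `progress_percent` is a hypothetical name, and the cap at 100 and the unknown-size guard are additions that were not part of the review.

```python
def progress_percent(count, block_size, total_size):
    """Integer percentage for a urlretrieve-style reporthook call."""
    if total_size <= 0:
        # The total size can be reported as -1 when it is unknown.
        return None
    # Floor division keeps the result an int; the last block may overshoot,
    # so cap the value at 100.
    return min(100, (count * block_size * 100) // total_size)


percent = progress_percent(3, 8192, 65536)
```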

util/tc.py Outdated
'https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.%(arch_string)s/artifacts/public/native_client.tar.xz')

def get_tc_url(arch_string=None):

Contributor

nit: too much whitespace.

Collaborator Author

Done

@lissyx
Collaborator Author

lissyx commented Sep 8, 2017

About splitting: I thought about it, but I felt it would not help that much and might complicate things later.

@lissyx lissyx force-pushed the benchmark-tool branch 9 times, most recently from bcd066e to 5bc32e9 Compare September 11, 2017 12:53
@lissyx
Collaborator Author

lissyx commented Sep 11, 2017

@reuben The previous push with the default arguments cleanup was good; the current push is running and includes the split into two scripts :)

Contributor

@reuben reuben left a comment

r=me with those two comments addressed.


sftp = ssh_conn.open_sftp()
if not stat.S_ISDIR(sftp.stat(dir).st_mode):
    print('No remote directory: %s' % dir)
Contributor

It's still missing from this call.

'''
fs = ''
for c in s:
    if ord(c) >= ord('0') and ord(c) <= ord('9'):
Contributor

Please use c.isdigit().

Collaborator Author

Both are done :)
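
A minimal sketch of the digit filter with `isdigit()`, assuming the loop's purpose is to keep only the digit characters of `s`. The function's real name and docstring are not visible in the snippet above, so `keep_only_digits` is a hypothetical stand-in.

```python
def keep_only_digits(s):
    """Return the digit characters of s, in order, as one string."""
    return ''.join(c for c in s if c.isdigit())


digits = keep_only_digits('lstm494')
```

One difference from the `ord()` range check: on Python 3 strings, `isdigit()` also accepts some non-ASCII digit characters such as the superscript two, which should not matter for ASCII model names.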

@lissyx
Collaborator Author

lissyx commented Sep 11, 2017

It's all green except on OSX, because a lot of other (big) TaskCluster tasks are pending there. I'm merging anyway, since the benchmark code is not exercised there :)

@lissyx lissyx merged commit b1229c0 into mozilla:master Sep 11, 2017
@lissyx lissyx deleted the benchmark-tool branch September 11, 2017 15:07
@lock

lock bot commented Jan 3, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jan 3, 2019
2 participants