[SYSTEMML-1451][Phase 1] Automate performance suite and report performance numbers #537
Conversation
Can an Admin verify this patch?
Test this Jenkins
add to whitelist
Refer to this link for build results (access rights to CI server needed):
I like most of your variable names; most of them are self-explanatory.
Nonetheless, please document all your functions and parameters (wherever it makes sense). When doing so myself, I have sometimes found a need to redesign the interface, usually resulting in a cleaner API. This may or may not help you the way it helped me, but it will definitely help the next person read through your code.
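As an illustration of the kind of docstring being asked for, here is a minimal sketch of split_rowcol. The parsing shown is an assumption for illustration only; the real utils.py may, for instance, expand size suffixes like '10k'.

```python
def split_rowcol(matrix_shape):
    """Split a matrix-shape token into its row and column parts.

    Args:
        matrix_shape: Shape encoded as '<rows>_<cols>', e.g. '10k_100'.

    Returns:
        A (rows, cols) tuple containing the two shape tokens as strings.
    """
    rows, cols = matrix_shape.split('_')
    return rows, cols
```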
mat_shapes = split_rowcol(matrix_shape)

if job[0] == 1:
Instead of 0 & 1, use either an enum (I know they were added in Python 3.4 and may need some discussion) or a named constant.
Using "magic numbers" is a bad idea.
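A sketch of the named-constant suggestion using Python's enum module (available since 3.4). JobType and its member names are hypothetical, not taken from the PR.

```python
from enum import Enum

class JobType(Enum):  # hypothetical names for illustration
    DATA_GEN = 0
    TRAIN = 1

# Illustrative job tuple; previously this would have been (1, 'Kmeans')
job = (JobType.TRAIN, 'Kmeans')

# Self-documenting, unlike `if job[0] == 1:`
if job[0] is JobType.TRAIN:
    action = 'train'
```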
has_predict = ['GLM', 'Kmeans', 'l2-svm', 'm-svm', 'naive-bayes']

def naive_bayes_datagen(matrix_type, mat_shapes, conf_dir):
I could be missing something obvious here (since I am not very familiar with Python), but it seems like the function naive_bayes_datagen has been defined twice (with the same signature).
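Python does allow this silently: a second def with the same name simply rebinds it, so only the last definition survives, with no error or warning. A minimal demonstration:

```python
def naive_bayes_datagen():
    return 'first definition'

def naive_bayes_datagen():  # silently rebinds the name; no error or warning
    return 'second definition'

print(naive_bayes_datagen())  # the later definition wins
```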
scripts/perftest/python/utils.py
def get_algo(family, ml_algo):
    algo = []
Best to add documentation to all the functions in this file, so that someone who wants to add perf tests in the future knows what to do.
    'regression': ['LinearRegDS', 'LinearRegCG', 'GLM'],
    'stats': ['Univar-Stats', 'bivar-stats', 'stratstats']}
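A sketch of what a documented get_algo might look like, reconstructed from the visible fragment. The 'clustering' entry and the exact flattening behavior are assumptions, not confirmed by the diff.

```python
ML_ALGO = {
    'clustering': ['Kmeans'],  # assumed entry; only the two below are visible in the diff
    'regression': ['LinearRegDS', 'LinearRegCG', 'GLM'],
    'stats': ['Univar-Stats', 'bivar-stats', 'stratstats'],
}

def get_algo(family, ml_algo=ML_ALGO):
    """Return all algorithm names belonging to the requested families.

    Args:
        family: Iterable of family names, e.g. ['regression', 'stats'].
        ml_algo: Mapping from family name to its list of algorithm names.

    Returns:
        A flat list of algorithm names across the requested families.
    """
    algo = []
    for fam in family:
        algo.extend(ml_algo[fam])
    return algo
```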
I am not a fan of using the function name main. Could you call it something else, maybe something like perf_test_entry, or something more appropriate?
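A minimal sketch of the renaming suggestion; the body and parameters here are placeholders for illustration, not the PR's actual driver logic.

```python
def perf_test_entry(families):
    """Top-level driver for the perf suite (a more descriptive name than main)."""
    # Placeholder body: the real driver would dispatch datagen/train/predict runs
    return ['running perf tests for %s' % f for f in families]

if __name__ == '__main__':
    for line in perf_test_entry(['clustering', 'stats']):
        print(line)
```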
from utils import split_rowcol, config_writer
import sys
import logging
A little blurb about the contents/purpose of this file would be great.
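For example, a module-level docstring at the top of the file can serve as that blurb; the wording below is hypothetical, not taken from the PR.

```python
"""Driver script for the SystemML performance test suite.

Generates data and runs the train/predict scripts for the selected
algorithm families, logging timing metrics for each run.
"""
import logging

# help() and editors surface this blurb through the module's __doc__
first_line = __doc__.splitlines()[0]
```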
@nakul02 thanks for the review. I will incorporate these changes to the best of my understanding.
How to test the script: run the line below to see the help message.
…m/krishnakalyan3/systemml into SYSTEMML-1451-automatic-perftests
@niketanpansare could you please review this PR and share your feedback? Some commands:

This command runs Kmeans in the following modes and captures the metrics in the log file.

Another variation: if we just want to generate data for Kmeans.

PS: Please run all these scripts from the root directory ($SYSTEMML_HOME). Right now this performance test suite supports single-node execution with the statistics and clustering families. Please let me know if you have questions. Thanks
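As a sketch of what such a command-line surface might look like, here is a hedged argparse example. The flag names and defaults are assumptions; the PR's actual entry script may differ.

```python
import argparse

# Hypothetical CLI sketch for the perf suite; flag names are illustrative
parser = argparse.ArgumentParser(description='SystemML performance test suite')
parser.add_argument('--family', nargs='*',
                    help='algorithm families to run, e.g. clustering stats')
parser.add_argument('--algo', nargs='*',
                    help='individual algorithms to run, e.g. Kmeans')
parser.add_argument('--mode', nargs='*', default=['singlenode'],
                    help='execution modes: singlenode and/or hybrid_spark')

# e.g. running only Kmeans in single-node mode
args = parser.parse_args(['--algo', 'Kmeans', '--mode', 'singlenode'])
```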
Awesome work. LGTM for tasks completed until now :)
@niketanpansare thank you for the review. :) |
ping @nakul02, could you please review this PR?
This is great work @krishnakalyan3! For the purpose of Phase 1 of GSoC, this code is merge-able as it is. But, either in this PR or as part of your next phase, I'd like you to document the overall design and assumptions in the main entry file. This could include things like example runs. Towards the end, we'd also like to add a User Guide with lots of examples.

In this PR thread, or in your comment where you have the list of tasks completed, could you please indicate which families, algorithms, data sizes, and shapes you have tested for? Also, what machine did you test on (its configuration), any Spark settings (how many executors, memory sizes of driver and executor), any single-node settings (JVM memory), etc.?
@krishnakalyan3 - do you have anything to add? |
@nakul02 please merge. Thanks
- Single entry point to run perf tests in any combination of algorithms, families, matrix shapes & densities
- Reports the time taken by each perf test by parsing the output and grep-ing for the time
- Detects tests that did not run and reports them in the generated log
- Robust error handling and reporting, informative help message

Closes apache#537
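The time-reporting step described above can be sketched as a small regex pass over the captured output. The exact wording of the log line here is an assumption for illustration, not the suite's actual output format.

```python
import re

# Hypothetical captured output fragment; the real SystemML log line may differ
log_text = "...\nTotal elapsed time:\t12.345 sec.\n..."

# Grep-style extraction of the reported time from the output
match = re.search(r'Total elapsed time:\s*([\d.]+)', log_text)
time_sec = float(match.group(1)) if match else None
```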
Please refer to https://issues.apache.org/jira/browse/SYSTEMML-1451 for more details.
Phase 1:

Execution modes:
- singlenode
- spark-hybrid

Error Handling and Reporting:
- Current status of family
To test this script please navigate to the gist below
Performance tests were conducted on the following configurations with all families (this includes all algorithms):
- Local Machine Configuration
- Standalone Configuration
- Spark Configuration