Remove logs from warm-up period before calculating stats and unbreak benchmark-ab.py #1413
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Thanks @mreso, can you please add the logs for the fix as well?
Sure, added logs and a test plan @HamidShojanazeri
@HamidShojanazeri do we need any additional logs or any other changes?
Just a minor nit comment around using None for warmup_lines; otherwise, it's worth mentioning in the description that this PR actually shows that we've been underestimating TorchServe performance with our benchmarks.
Force-pushed from 3bc395b to bde615f
@nskool could you take another look? I removed the global but had to make some more changes due to a commit on master that broke the script. The breakage was due to a missing positional parameter (report_location). I introduced another tmp_dir parameter, so the final report location and the tmp directory can now be in separate locations.
Description
When running a benchmark with the benchmark-ab.py script, we run a warmup phase before benchmarking the model in the main stage. Part of the statistics collected for the report is extracted from the TorchServe log file. During this extraction we currently include the log lines from the warmup phase, which dilutes the statistics (i.e. PredictionTime, HandlerTime, QueueTime, WorkerThreadTime) in the report. As a result, we systematically underestimate TorchServe's performance.
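The filtering idea can be sketched as follows. This is an illustrative snippet, not the PR's actual code: the names `filter_warmup` and `summarize` and the sample latencies are made up, and `None` marks the "no warmup ran" case mentioned in the review comment.

```python
# Illustrative sketch: drop the first `warmup_lines` entries extracted from the
# TorchServe log before computing statistics, so warmup requests do not skew
# the reported latencies. Function and variable names are hypothetical.
import statistics

def filter_warmup(values, warmup_lines):
    # warmup_lines is None when no warmup phase ran, so nothing is dropped
    if warmup_lines is None:
        return values
    return values[warmup_lines:]

def summarize(values):
    return {"mean": statistics.mean(values), "median": statistics.median(values)}

# Warmup batches are typically slower (cold start), so including them
# inflates the mean and makes TorchServe look worse than it is.
latencies = [50.0, 48.0, 12.0, 11.0, 13.0, 12.0]  # first 2 entries are warmup
print(summarize(filter_warmup(latencies, 2))["mean"])  # 12.0
```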
Additionally, this PR unbreaks benchmarks/benchmark-ab.py, which got broken due to a missing positional parameter (report_location) in a recent commit. A new parameter tmp_dir is introduced, which allows the final report location and the tmp directory to be in separate locations.
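The second fix can be pictured with a minimal CLI sketch. This is not the actual benchmark-ab.py code (the real script has its own CLI setup); the parameter names report_location and tmp_dir follow the description above, while the defaults and the parsed example arguments are purely illustrative.

```python
# Hypothetical sketch: expose the final report location and the temporary
# working directory as independent parameters instead of coupling them.
import argparse

parser = argparse.ArgumentParser(description="benchmark-ab.py parameter sketch")
parser.add_argument("--config", help="path to the benchmark config.json")
parser.add_argument("--requests", type=int, default=100)
parser.add_argument("--report_location", default="/tmp/benchmark",
                    help="directory where the final report is written")
parser.add_argument("--tmp_dir", default="/tmp/benchmark",
                    help="scratch directory for intermediate *.txt files")

# Parse a fixed argument list so the sketch is self-contained.
args = parser.parse_args(["--report_location", "/results",
                          "--tmp_dir", "/tmp/bench"])
print(args.report_location, args.tmp_dir)  # /results /tmp/bench
```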
Fixes #(issue)
Feature/Issue validation/testing
To test run:
$ cd serve/benchmarks
$ python benchmark-ab.py --config config.json --requests 1000
Content of config.json:
{
"url":"https://torch-deploy-benchmarks.s3.amazonaws.com/BERTSeqClassification.mar",
"requests": 10000,
"concurrency": 100,
"input": "sample.txt",
"batch_delay": 100,
"batch_size": 8,
"content_type":"application/text",
"workers": 4
}
Content of sample.txt:
Apples are especially bad for your health [SEP] Eating apples is a health risk
Test A: run before the fix
Test B: run after the fix
UT/IT execution results
During generation of the benchmark report we create several .txt files in /tmp/benchmark and filter out the times we're interested in (see serve/benchmarks/benchmark-ab.py, line 287 at commit 65cd16b).
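The extraction step can be sketched roughly like this. The log-line format shown ("PredictionTime.ms:<value>|#...") is an assumption for illustration only and may not match TorchServe's actual metric log format exactly; extract_metrics is a hypothetical helper, not the script's real function.

```python
# Sketch of the extraction described above: pull per-metric values out of
# TorchServe log lines into per-metric lists (which the real script writes
# to *.txt files in the tmp directory). The line format is assumed.
import re

METRICS = ["PredictionTime", "HandlerTime", "QueueTime", "WorkerThreadTime"]

def extract_metrics(log_lines):
    values = {metric: [] for metric in METRICS}
    for line in log_lines:
        for metric in METRICS:
            # Assumed pattern, e.g. "PredictionTime.ms:5.18|#ModelName:bert"
            match = re.search(rf"{metric}\.ms:([\d.]+)", line)
            if match:
                values[metric].append(float(match.group(1)))
    return values

log = ["... PredictionTime.ms:5.18|#ModelName:bert ...",
       "... QueueTime.ms:0.12|#ModelName:bert ..."]
print(extract_metrics(log)["PredictionTime"])  # [5.18]
```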
before fix:
$ wc -l /tmp/benchmark/*.txt
138 /tmp/benchmark/handler_time.txt
138 /tmp/benchmark/predict.txt
46 /tmp/benchmark/result.txt
1100 /tmp/benchmark/waiting_time.txt
146 /tmp/benchmark/worker_thread.txt
(1000 requests + 100 warmup requests) / 8 samples per batch = 138 batches (rounded up)
after fix:
$ wc -l /tmp/benchmark/*.txt
125 /tmp/benchmark/handler_time.txt
125 /tmp/benchmark/predict.txt
46 /tmp/benchmark/result.txt
1000 /tmp/benchmark/waiting_time.txt
125 /tmp/benchmark/worker_thread.txt
1000 requests / 8 samples per batch = 125 batches
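The batch counts in both runs follow from ceiling division, since a partial final batch still counts as a batch:

```python
# Sanity check of the batch counts reported above, with batch_size 8.
import math

batch_size = 8
print(math.ceil((1000 + 100) / batch_size))  # 138 -> before the fix (warmup included)
print(math.ceil(1000 / batch_size))          # 125 -> after the fix
```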
log_before_fix.txt
log_after_fix.txt