feat: advanced benchmarking #25

Draft

jpbourgeon wants to merge 29 commits into sam-goodwin:main from jpbourgeon:advanced-benchmarking

jpbourgeon commented Apr 23, 2023

Hi

I refactored the project architecture. Now the benchmark workspace is @itty-aws/benchmark
I changed the tsconfig setup to exclude the benchmark package from the build process
The fix to package.json proposed in chore: add "main" to package.json to fix compatibility with some frameworks #24 is also included
The benchmark package is documented in /benchmark/README.md
There is a first benchmark report in /benchmark/data/advanced-benchmarking/README.md

To keep things separate, I propose that we discuss the benchmark results in issue #21 (project status)

Closes #22, Closes #25

jpbourgeon added 14 commits

March 30, 2023 00:21


          chore: git ignore .code-workspace

adf1ebd


          chore: folder setup and initial docs

01c9d49


          fix: configuration errors

9e9c658


          chore: additional config tweaks

8a70d2d


          chore: additional config tweaks to the benchmark

cf5192a


          feat: first benchmark

7eb2427


          feat: consolidate

05b9c97


          chore: docs

4ffcd5a


          chore: typo

04a88cb


          chore: dummy chart.js

45b24b9


          feat: nice charts

5549f49


          feat: generate markdown report

13f93b8


          chore: docs

7939a55


          chore: fix typos

075463e

Author

jpbourgeon commented Apr 26, 2023

The scripts are a bit "rough around the edges" and would definitely benefit from a good refactoring. However, I wanted to publish the results as soon as possible.

sam-goodwin requested changes

Owner

sam-goodwin left a comment

Thanks so much for putting this together! It's super thorough and very helpful.

I added some comments but nothing too critical. Just questions and some proposals for simplifying the impl.

benchmark/scripts/consolidate.ts Outdated

Comment on lines 68 to 70

+                const gitBranch = execSync("git rev-parse --abbrev-ref HEAD 2>/dev/null", {
+                  encoding: "utf-8",
+                }).trim();

Owner

sam-goodwin Apr 27, 2023

nit: use async IO
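
A minimal sketch of what the async variant could look like, assuming Node's built-in `child_process.execFile` wrapped with `util.promisify`. The helper name `getGitBranch` and the choice to return `undefined` on failure are assumptions of this sketch, not code from the PR.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: resolves the current branch name without blocking the event loop.
async function getGitBranch(cwd: string = process.cwd()): Promise<string | undefined> {
  try {
    // execFile captures stderr instead of printing it, so no `2>/dev/null` redirect is needed.
    const { stdout } = await execFileAsync(
      "git",
      ["rev-parse", "--abbrev-ref", "HEAD"],
      { cwd }
    );
    return stdout.trim();
  } catch {
    // Assumption of this sketch: a missing git binary, a missing directory,
    // or a folder that is not a repository yields undefined instead of throwing.
    return undefined;
  }
}
```

Because no shell is involved, the arguments are passed as an array and never go through shell parsing.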

Author

jpbourgeon Apr 28, 2023

benchmark/scripts/consolidate.ts Outdated

+                const raw: IResult = JSON.parse(
+                  await fsPromises.readFile(rawFilePath, "utf8")
+                );
+                let result: IResult = {

Owner

sam-goodwin Apr 27, 2023

nit: can it be const?

benchmark/scripts/consolidate.ts Outdated

+                let count = { functionCalls: 0, logEntries: 0 };
+                let consolidatedEntry: Record<string, any> = {};
+                let requestId: string | undefined;
+                for (const entry of raw.log) {

Owner

sam-goodwin Apr 27, 2023

some comments might be helpful - it looks like this is aggregating the results of multiple runs into a single result?

Author

jpbourgeon Apr 28, 2023

benchmark/scripts/consolidate.ts Outdated

+                console.log(`- Build datasets`);
+                const datasetPointers: string[] = [];
+                for (const entry of result.log) {
+                  const label = entry.name.substring(0, entry.name.lastIndexOf("-"));

Owner

sam-goodwin Apr 27, 2023

i'm curious, do we control the format of this log? Is there a reason it's not in an easy to parse JSON format?

Author

jpbourgeon Apr 28, 2023

The stack deploys n instances of the same lambda (10 by default). Each instance name is suffixed with its iteration number, so we need to strip the suffix to consolidate the data for the same kind of function.
And of course CloudWatch logs are text only. You need to find the JSON in the message and extract the logs of the API call before you can parse it.
Only then comes the aggregation part by type and metric.
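
The first two steps described above could be sketched roughly as follows. Both helper names and the sample log line are hypothetical; the real message layout in the benchmark may differ.

```typescript
// Strip the trailing "-<iteration>" suffix from an instance name.
// Unlike a bare substring/lastIndexOf call, a name without "-" is returned unchanged.
function stripIterationSuffix(name: string): string {
  const cut = name.lastIndexOf("-");
  return cut === -1 ? name : name.substring(0, cut);
}

// Take the span from the first "{" to the last "}" of a text-only log message
// and try to parse it. Returns undefined when no parseable JSON is found.
function extractJson(message: string): unknown {
  const start = message.indexOf("{");
  const end = message.lastIndexOf("}");
  if (start === -1 || end <= start) return undefined;
  try {
    return JSON.parse(message.slice(start, end + 1));
  } catch {
    return undefined;
  }
}

// Made-up sample line, for illustration only.
const sample = 'INFO\tbenchmark\t{"apiCall":{"latency":42}}';
const parsed = extractJson(sample) as { apiCall: { latency: number } };
const label = stripIterationSuffix("my-function-3");
```

Keeping the two helpers free of side effects makes them easy to unit test before the aggregation step runs on their output.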

benchmark/scripts/consolidate.ts Outdated

Comment on lines 199 to 219

+                    if (!result.datasets.coldStarts.totalDuration[label]) {
+                      datasetPointers.push(
+                        `datasets.${targetDataset}.totalDuration.${label}`
+                      );
+                      result.datasets.coldStarts.totalDuration["title"] =
+                        "Cold start: total duration";
+                      result.datasets.coldStarts.totalDuration[label] = {
+                        label: label,
+                        order: CONFIG.functions.reduce(
+                          (prev, curr) =>
+                            curr.functionName === label ? curr.chart.order : prev,
+
+                        ),
+                        data: [entry.function.initDuration + entry.function.duration],
+                      };
+                    } else {
+                      result.datasets.coldStarts.totalDuration[label].data.push(
+                        entry.function.initDuration + entry.function.duration
+                      );
+                    }
+                  }

Owner

sam-goodwin Apr 27, 2023

In general, I'd recommend extracting these into dedicated functions and leaning towards pure functions instead of mutating state. I'm not too pure/opinionated, but procedures with a ton of mutation operations can be hard to refactor and read. This could be computeTotalDuration and return the properties you need. Extract each computation into pure functions and then compose them together with map/reduce logic.

Feel free to ignore this feedback as it would require substantial rework which is not necessary at this time.
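
As a rough illustration of this suggestion, here is a sketch that computes the total duration in a pure function and groups the values with `reduce`. The `Execution` shape and `groupTotalDurations` are simplified assumptions; only `computeTotalDuration`, `initDuration` and `duration` come from the discussion above.

```typescript
// Hypothetical, simplified shape of one consolidated log entry.
interface Execution {
  label: string; // function name with the iteration suffix already stripped
  initDuration: number;
  duration: number;
}

// Pure: the result depends only on the argument.
const computeTotalDuration = (e: Execution): number =>
  e.initDuration + e.duration;

// Pure: builds a new record instead of mutating a shared result object.
function groupTotalDurations(
  executions: readonly Execution[]
): Record<string, number[]> {
  return executions.reduce<Record<string, number[]>>((acc, e) => {
    const previous = acc[e.label] ?? [];
    return { ...acc, [e.label]: [...previous, computeTotalDuration(e)] };
  }, {});
}

const grouped = groupTotalDurations([
  { label: "a", initDuration: 100, duration: 10 },
  { label: "b", initDuration: 20, duration: 10 },
  { label: "a", initDuration: 100, duration: 20 },
]);
```

Copying the accumulator on every step trades some speed for readability, which is usually acceptable for a few hundred benchmark entries.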

Author

jpbourgeon Apr 28, 2023

True. This is part of the refactoring I mentioned previously. I posted the results quickly, but I will do some refactoring now to avoid accumulating technical debt.

benchmark/scripts/run.ts Outdated

Comment on lines 76 to 77

		const command = new DeleteLogGroupCommand({ logGroupName });
		console.log(` - log group '${logGroupName}'`);

Owner

sam-goodwin Apr 27, 2023

why are you deleting cloudwatch logs?

Author

jpbourgeon Apr 28, 2023

I flush any logs from potential previous runs to avoid data corruption.
It will also be useful to avoid redeploying the benchmark stack across features/branches if it has not changed.

benchmark/scripts/run.ts Outdated

Comment on lines 79 to 81

+                      const cloudwatchClient = new CloudWatchLogsClient({});
+                      await cloudwatchClient.send(command);
+                      cloudwatchClient.destroy();

Owner

sam-goodwin Apr 27, 2023

why create and destroy? Usually better to create once per node runtime

Author

jpbourgeon Apr 28, 2023

This is mandatory. I ran into random timeouts from the SDK when reusing the same instance. I had no other choice but to create a fresh instance every time ... 😞

Owner

sam-goodwin Apr 28, 2023

Hmm that is odd. Not sure why a separate instance would fix that. I'd be interested in seeing if that's still the behavior
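
For reference, the "create once per node runtime" pattern can be sketched with a lazy, memoized factory. `FakeClient` is a hypothetical stand-in so the sketch runs without AWS dependencies, and `once` is an illustrative helper. This does not explain or address the timeouts reported above.

```typescript
// Hypothetical stand-in for an SDK client; counts how many instances are built.
class FakeClient {
  static created = 0;
  constructor() {
    FakeClient.created += 1;
  }
  send(command: string): string {
    return `sent ${command}`;
  }
}

// Wrap a factory so the instance is built on first use and reused afterwards.
function once<T>(factory: () => T): () => T {
  let instance: T | undefined;
  return () => {
    if (instance === undefined) instance = factory();
    return instance;
  };
}

// One shared client for the whole process.
const getClient = once(() => new FakeClient());

const first = getClient();
const second = getClient();
const reply = second.send("delete-log-group");
```

With a real SDK client, the factory body would construct that client instead, and every call site would go through `getClient()`.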

benchmark/scripts/run.ts Outdated

Comment on lines 96 to 98

+                } catch {
+                  null;
+                }

Owner

sam-goodwin Apr 27, 2023

nit: why is null here

tsconfig.esm.json Outdated

@@ -1,9 +1,11 @@
               {
                 "extends": "./tsconfig.base.json",
-                "include": ["src"],
+                "include": ["src/**/*"],

Owner

sam-goodwin Apr 27, 2023

I don't think this is necessary, is it?

tsconfig.build.json

Comment on lines +1 to +7

+              {
+                "files": [],
+                "references": [
+                  { "path": "./tsconfig.esm.json" },
+                  { "path": "./tsconfig.test.json" }
+                ]
+              }

Owner

sam-goodwin Apr 27, 2023

Should this be the default tsconfig.json?

Author

jpbourgeon Apr 28, 2023

I don't think so.
I didn't want the build command to build the benchmark, since that is handled by the CDK. That's why I removed the benchmark project from tsconfig.build.json.
However, I left every project in the default tsconfig.json for the IDE. I'm not sure about the consequences of not referencing every project in the root tsconfig.json, but I think it may cause unwanted side effects like unnecessary TSServer reloads when switching files.

Owner

sam-goodwin Apr 28, 2023

I usually have the root tsconfig.json reference every sub project because the IDE uses the root by default. The consequence is that it will build everything when you run tsc -b from the root, which is probably what we want. Not high priority - we can figure it out.

Author

jpbourgeon Apr 29, 2023

We agree. That's what is implemented here: tsconfig.json references every sub project including @itty-aws/benchmark for the IDE.

But this is tsconfig.build.json that we are talking about, and I removed benchmark/tsconfig.json from it since the benchmark sub-project is not meant to be built through the main pnpm build command. In this subproject, the CDK takes care of building the lambda functions automatically with esbuild when deploying through pnpm bench:deploy.

jpbourgeon added 2 commits

April 28, 2023 18:15


          chore: refactoring

22886aa


          refactor: run script

083a074

Author

jpbourgeon commented May 2, 2023

> Thanks so much for putting this together! It's super thorough and very helpful.
>
> I added some comments but nothing too critical. Just questions and some proposals for simplifying the impl.

I will do a thorough refactoring of the benchmark. Scripts do too many things at once and I'm going to break them down into simpler functions. I will also reshape the data model and the corresponding types to simplify the implementation.

I will include your reviews and comments in the refactoring, as they all make sense. Thank you for your much appreciated feedback.


          refactor: deploy and run

e3d9c2b

jpbourgeon marked this pull request as draft

May 5, 2023 10:14

jpbourgeon added 9 commits

May 11, 2023 09:47


          chore: republish previous benchmark results


          chore: edit jsdoc in run script

951d363


          chore: edit comments in run script

5882f95


          chore: better types

b19b631


          fix: AWS SDK v3 client unexpected freezes

53257a6


          refactor: consolidate to createFunctionExecutionsList

856207b


          refactor: better types naming

261b068


          refactor: revert useless util extraction

b231148


          refactor: better types and renaming

872d59c

jpbourgeon added 3 commits

May 21, 2023 10:19


          refactor: build charts datasets from function executions list

5b93568


          chore: benchmark typecheck script hoisting

f51796c


          fix: calculate mean and standard deviation for both warm and cold starts

d4150e6

Owner

sam-goodwin commented May 25, 2023

I saw that you recently pushed some changes. Ready for another review? I also need to take some time to fix the build - something is up with that.

Author

jpbourgeon commented May 25, 2023

I'm in the final stretch of my refactoring. I'll let you know as soon as I'm done.
