
log-loss-framework: all test materials developed for AWSLogs benchmark #670

Merged

Conversation

PettitWesley
Contributor

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@PettitWesley PettitWesley requested a review from a team as a code owner June 5, 2023 01:36
@PettitWesley
Contributor Author

Original review and comments here were addressed: #630

12000
"
export BUFFER_SIZES="
1m
Contributor

It would be better to have a comment stating the unit for the buffer size. Is it 1 MB?

Contributor Author

added

Comment on lines 25 to 40
{
"name": "SIZE_IN_KB",
"value": "${SIZE_IN_KB}"
},
{
"name": "TOTAL_SIZE_IN_MB",
"value": "${TOTAL_SIZE_IN_MB}"
},
{
"name": "THROUGHPUT_IN_KB",
"value": "${THROUGHPUT}"
},
{
"name": "CYCLE_TIME_IN_SECONDS",
"value": "1"
}
Contributor

Should we fix the indentation?

Contributor

Also, why are the others environment variables while this one is hardcoded to "1"?

Contributor Author

CYCLE_TIME_IN_SECONDS doesn't change in any of the tests, so I didn't add it as an env var.

Contributor Author

fixed indentation

Comment on lines 69 to 85
{
"name": "TOTAL_SIZE_IN_MB",
"value": "${TOTAL_SIZE_IN_MB}"
},
{
"name": "THROUGHPUT_IN_KB",
"value": "${THROUGHPUT}"
},
{
"name": "TEST_NAME",
"value": "${TEST_NAME}"
},
{
"name": "CW_LOG_GROUP_NAME",
"value": "awslogs-benchmarking-output"
},
{
Contributor

Same indentation issue here.

Contributor Author

fixed indentation

Comment on lines 5 to 28
export THROUGHPUTS="
1000
2000
3000
4000
4500
5000
6000
7000
8000
9000
10000
12000
"
export BUFFER_SIZES="
1m
2m
4m
6m
8m
12m
"
Contributor

Can't we move these into a shared util script? Why do they need to be redefined here, since ecs_launcher.sh has the same variables?

Contributor Author

This is a simple, quick script to make launching easy for the AWSLogs project. It may never be used again, so I don't think it's worth it.

@zwj102030
Contributor

zwj102030 commented Jun 22, 2023

This seems like a fairly large PR. It would be better to break it into smaller ones.

Signed-off-by: Wesley Pettit <wppttt@amazon.com>
matthewfala
Contributor

Looks great! Lots of insightful metrics are collected on the tests. Organizing the result data as a CSV, where each line represents a different test with all metric dimensions, is creative and impactful: it allows drilling down on dimensions like buffer and throughput as well as obtaining aggregate statistics.

I left comments on the portions of the code that I felt may need a second look. There are also some style comments, which I stopped adding halfway through since I'm not sure whether you're concerned about style in these scripts.
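
For readers skimming the thread, a minimal sketch of the kind of drill-down that CSV layout enables. The file name and column names here (results.csv, percent_lost, buffer, throughput_in_kb) are assumptions for illustration, not the benchmark's actual result schema.

# Sketch only: column and file names below are assumptions, not the real schema.
import csv
from collections import defaultdict

def mean_loss_by(rows, dimension):
    # Average percent_lost grouped by one dimension (e.g. buffer or throughput).
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(float(row["percent_lost"]))
    return {k: sum(v) / len(v) for k, v in groups.items()}

with open("results.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(mean_loss_by(rows, "buffer"))            # drill down on buffer size
print(mean_loss_by(rows, "throughput_in_kb"))  # drill down on throughput
print(sum(float(r["percent_lost"]) for r in rows) / len(rows))  # aggregate over all tests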

// perform CW insights query for test name
// output is comma delimited, output insights query result as CSV
// simple python code can parse the CSV
fmt.Printf("%s %s - %s, percent lost, %d, number_lost, %d, total_input_record, %d, duplicates, %d, group=%s stream=%s TOTAL_SIZE_IN_MB=%s, SIZE_IN_KB=%s, THROUGHPUT_IN_KB=%s, %s, %s, %s, %s, %s, first_lost=%d, last_lost=%d",
Contributor

I really like how task metadata is included along with all of the logging statistics. It allows all the data to be dumped into one CW stream making the data easily queryable rather than having a separate stream for each test/result set.

}
}
}
// fmt.Printf("\n")
Contributor

Consider removing this commented-out line.

fmt.Printf(".")
}

/* sleep between GetLogEvents calls to proactively reduce TPS against CW frontend */
Contributor

If you want to be consistent with the comment style elsewhere in the file, use // here.

func exitErrorf(msg string, args ...interface{}) {
    fmt.Fprintf(os.Stderr, msg+"\n", args...)
    os.Exit(1)
}
Contributor

The logic of this file makes sense. I like the use of a map, and storing the duplicated log count as the difference between the counted logs and the number of logs appearing in the map. The histogram idea is also nice. The comments in the file start with a mix of upper and lower case; by convention it may be better to start all of them the same way, though this is really minor and not essential.
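
The duplicate-counting idea is easy to miss on a first read, so here is a minimal sketch of it (in Python for brevity; the actual validator is Go, and the names below are assumptions).

# Sketch only: record_ids is a hypothetical list of sequence numbers read back
# from CloudWatch; expected_total is how many records the generator emitted.
def count_duplicates_and_loss(record_ids, expected_total):
    seen = {}
    for rid in record_ids:
        seen[rid] = seen.get(rid, 0) + 1
    counted = len(record_ids)       # every fetched event, duplicates included
    unique = len(seen)              # distinct records that appear in the map
    duplicates = counted - unique   # the difference is the duplicated-log count
    lost = expected_total - unique  # expected records that never arrived
    return duplicates, lost

print(count_duplicates_and_loss([1, 2, 2, 3, 5], expected_total=5))  # -> (1, 1)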


if len(sys.argv) < 3:
    # Created due to the large number of test cases that passed after triple+ re-validation
    print("Usage: merges at least two result files, taking the more successful run when duplicates are found. Can specify any number of files in any order.")
Contributor

I don't see any mention of duplicates in the code. It looks like it's just taking, per task definition, the sample with the lowest number_lost metric.
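
For context, a minimal sketch of that behaviour as described above, keeping, per test key, the input row with the lowest number_lost. File handling and column names here are assumptions for illustration, not the script's actual interface.

# Sketch only: merge any number of result CSVs, keeping the most successful
# (lowest number_lost) row seen for each test.
import csv
import sys

best = {}
for path in sys.argv[1:]:
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            key = row["test_name"]  # hypothetical key column
            if key not in best or int(row["number_lost"]) < int(best[key]["number_lost"]):
                best[key] = row

for row in best.values():
    print(row)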

if (burst_enabled != NULL) {
    burstSizeInMB = atoi(burst_enabled);
    burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
    if ((burstSizeInMB * 2 * 1000) > totalSizeInKb) {
Contributor

I don't see why this limit needs to be imposed. It seems burstSizeInMB only needs to be less than totalSizeInKb / 1000, rather than less than half of that.
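
As a worked example under a hypothetical run with totalSizeInKb = 10000 (10 MB): the current check rejects any burst above 5 MB, since 6 * 2 * 1000 = 12000 > 10000, whereas reading the limit as burstSizeInMB <= totalSizeInKb / 1000 would allow bursts up to 10 MB.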

Comment on lines +177 to +187
def aggregation_key(data):
    key = ""
    for field in aggregate_on:
        val = data[field]
        key += val
    if data['buffer'] > 1:
        key += 'non-default-buffer'
    data['key'] = key
    return key


Contributor

Function redefinition: aggregation_key is also defined above. I'm guessing this is the one you want to keep? Consider removing one of the implementations.

Comment on lines +128 to +129
end = index - 1
summary = line[:end]
Contributor

I would expect this to be:

end = index
summary = line[:end]

Because end is not included in the slice. I suppose the last value, the test case name from the summary section added by analyze.py, is really garbage and isn't parsed out anyway.

@@ -0,0 +1,17 @@
FROM alpine as build-env
Contributor

Interesting choice. I suppose Alpine can be used instead of Amazon Linux 2 as long as we don't distribute it.

aws ecs --region ${REGION} run-task --cli-input-json "$RUN_TASK" >> "$OUTPUT_FILE"
echo "Started 10 tasks"
sleep 100
aws ecs --region ${REGION} run-task --cli-input-json "$RUN_TASK" >> "$OUTPUT_FILE"
Contributor

So 20 tasks are started for each type of test, right? So 6 * 2 * 12 * 20 = 2880 tasks. That's a lot of tasks!

@PettitWesley PettitWesley changed the base branch from mainline to develop March 13, 2024 21:05
@PettitWesley PettitWesley merged commit d38bb57 into aws:develop Mar 13, 2024
@PettitWesley PettitWesley deleted the log-loss-framework-awslogs-final-code branch March 13, 2024 21:05