fix: updates how the logs are shown from cloudwatch #142
IllyaYalovyy
left a comment
The "change validation" section doesn't mention running unit tests. Please make sure you run `make` from the root of the project.
    logsByStream[*event.LogStreamName] = append(logsByStream[*event.LogStreamName], formatEvent(event))
    }

    var logs []string
Do we risk running out of memory if too many log streams are involved? How was this risk mitigated?
I misread the AWS documentation: I thought the response size was smaller than it could theoretically be. I'll refactor it so that we keep the same memory footprint.
With the new change, the maximum memory increase over the existing AGC implementation is 640 KB, while the response from AWS itself could use up to 2.56 GB of memory.
Calculations:
Max additional memory usage: 64 bytes * 10,000 events = 640 KB
Max CloudWatch response memory usage: 256 KB * 10,000 events = 2.56 GB
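The arithmetic above can be checked with a short Go sketch. The constant and function names are made up for illustration; the figures (64 extra bytes per event, a 256 KB maximum event size, 10,000 events per response) are the ones quoted in this comment, interpreted in decimal units.

```go
package main

import "fmt"

// Figures quoted in the comment above; the names are illustrative only.
const (
	maxEventsPerResponse = 10_000  // events per response
	extraBytesPerEvent   = 64      // additional bookkeeping per event
	maxEventSizeBytes    = 256_000 // 256 KB, decimal
)

// extraMemoryBytes is the worst-case overhead added by the grouping change.
func extraMemoryBytes() int64 { return extraBytesPerEvent * maxEventsPerResponse }

// maxResponseBytes is the theoretical worst-case size of the events themselves.
func maxResponseBytes() int64 { return maxEventSizeBytes * maxEventsPerResponse }

func main() {
	fmt.Printf("extra memory: %d KB\n", extraMemoryBytes()/1_000)
	fmt.Printf("max response: %.2f GB\n", float64(maxResponseBytes())/1e9)
}
```

Under these assumptions the overhead of the change is several orders of magnitude smaller than the worst-case payload it indexes.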
    func parseEventLogs(events []types.FilteredLogEvent, latestTimestamp *int64) []string {
    logs := make([]string, len(events))
    for i, event := range events {
    logsByStream := make(map[string][]string)
Could you please explain this change?
The logs returned from CloudWatch can be interleaved between different log streams, as the API is a "best effort" API that tries not to interleave them. This change forces the logs of each stream to be returned together.
From the description of the PR:
Currently, logs are shown exactly as returned by CloudWatch. This can cause logs from different streams to be interleaved with one another, which can be confusing. This PR forces the logs to be grouped together by log stream.
API Source: https://github.com/aws/aws-sdk-go-v2/blob/service/cloudwatchlogs/v1.8.0/service/cloudwatchlogs/api_op_FilterLogEvents.go#L68
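As a rough illustration of the grouping described above, here is a minimal sketch. It assumes a hypothetical `logEvent` struct standing in for the SDK's `types.FilteredLogEvent` (only two fields are mirrored), and the helper names `groupByStream` and `str` are invented for this example; it is not the code in the PR.

```go
package main

import "fmt"

// logEvent is a hypothetical stand-in for types.FilteredLogEvent;
// only the fields needed by this sketch are mirrored.
type logEvent struct {
	LogStreamName *string
	Message       *string
}

// groupByStream buckets events by log stream name and records the order in
// which streams first appear. Events keep their relative order per stream.
func groupByStream(events []logEvent) (map[string][]*logEvent, []string) {
	byStream := make(map[string][]*logEvent)
	var order []string
	for i := range events {
		event := &events[i] // pointer to the slice element, not a copy
		name := *event.LogStreamName
		if _, seen := byStream[name]; !seen {
			order = append(order, name)
		}
		byStream[name] = append(byStream[name], event)
	}
	return byStream, order
}

// str is a small helper that returns a pointer to its argument.
func str(s string) *string { return &s }

func main() {
	// Interleaved input: task-a, task-b, task-a.
	events := []logEvent{
		{str("task-a"), str("a1")},
		{str("task-b"), str("b1")},
		{str("task-a"), str("a2")},
	}
	byStream, order := groupByStream(events)
	for _, name := range order {
		for _, e := range byStream[name] {
			fmt.Println(name, *e.Message)
		}
	}
}
```

Tracking first-appearance order is a choice made for this sketch so that the output is stable; a plain map iteration in Go visits keys in an unspecified order.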
@IllyaYalovyy The tests pass locally, but for some reason the test that is failing here does not fail locally (I'm not sure why that is the case). P.S. I would respond inline, but GitHub doesn't allow me to.
    logs[i] = formatEvent(event)
    logsByStream := make(map[string][]*types.FilteredLogEvent)
    for index, _ := range events {
    event := events[index]
We index into the slice here because, if we took the address of the range loop variable instead, every stored pointer would refer to the very last event.
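A minimal sketch of the indexing pattern, using a hypothetical `event` struct rather than the SDK type. In Go versions before 1.22 (this project builds with Go 1.17), the range loop variable is a single variable reused on every iteration, so storing its address makes every entry alias the same value; taking the address of the slice element avoids that.

```go
package main

import "fmt"

// event is a hypothetical stand-in for the SDK's event type.
type event struct{ Message string }

// pointersTo returns one pointer per slice element. It indexes into the
// slice and never takes the address of a range loop variable.
func pointersTo(events []event) []*event {
	ptrs := make([]*event, 0, len(events))
	for i := range events {
		ptrs = append(ptrs, &events[i])
	}
	return ptrs
}

func main() {
	events := []event{{"first"}, {"second"}, {"third"}}
	for _, p := range pointersTo(events) {
		fmt.Println(p.Message)
	}
}
```

Because each pointer refers to the backing array of `events`, no event is copied, which keeps the memory footprint close to that of the original slice.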
    func convertStreamLogsToLogs(logsByStream map[string][]*types.FilteredLogEvent, eventSize int) []string {
    logs, index := make([]string, eventSize), 0
    for _, eventList := range logsByStream {
Although we are performing two loops, the total number of iterations is exactly the same as before, and the result is effectively equivalent to using the spread (variadic `...`) operator in Go (see here for more details: https://github.com/golang/go/blob/master/src/reflect/value.go#L2640 ). We can't use the spread operator directly, as we only hold pointers to the events and need to format each event so that it prints correctly.
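The flattening step can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `event` is a hypothetical stand-in for the SDK type, the body of `formatEvent` is invented, and the stream names are sorted only so that this example produces a stable order (Go map iteration order is unspecified).

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical stand-in for the SDK's event type.
type event struct {
	Timestamp int64
	Message   string
}

// formatEvent is an invented formatter used only for this sketch.
func formatEvent(e *event) string {
	return fmt.Sprintf("%d %s", e.Timestamp, e.Message)
}

// flatten visits every event exactly once, so the nested loops perform
// `total` iterations overall, the same as a single pass over all events.
func flatten(byStream map[string][]*event, total int) []string {
	names := make([]string, 0, len(byStream))
	for name := range byStream {
		names = append(names, name)
	}
	sort.Strings(names) // assumption: sorted for deterministic output

	logs := make([]string, 0, total)
	for _, name := range names {
		for _, e := range byStream[name] {
			logs = append(logs, formatEvent(e))
		}
	}
	return logs
}

func main() {
	byStream := map[string][]*event{
		"stream-b": {&event{3, "b1"}},
		"stream-a": {&event{1, "a1"}, &event{2, "a2"}},
	}
	for _, line := range flatten(byStream, 3) {
		fmt.Println(line)
	}
}
```

A direct `append(logs, list...)` is not possible here because each element has to pass through the formatter first, which is why the inner loop is written out.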
d5e2faf to fba2918
* Ahmaalba/Adapter Role Least Privilege Design (#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * CodeQL Security Analysis (#62) * Display message if no contexts deployed (#64) * Removed redundant command "agc version" (#67) * Pull request template (#65) Creates a PR template for github * Update issue templates (#63) Updates the issue templates to provide baseline guidance for customers * update dependencies (#69) Co-authored-by: Pang, Lee <pwyming@.amazon.com> * Updates the documentation for generating minimal permissions (#71) * Add run-id flag to workflow status command (#74) `agc workflow status` command was missing the `-r` flag to indicate that the string we are passing in is the run-id. * Update workflow documentation to include more info about the URL format * Update workflows.md Updates the documentation for workflows since we are parsing them when deploying workflows. * Update workflows.md * add attribution for GATK best practices workflows (#78) * add instructions to use AGC local CDK for bootstrapping (#79) * Passing environment variables to increase tolerance for metadata service endpoint timeouts (#80) * [Bug] Creating a cromwell Spot Context also creates an on demand Compute Env (#66) Avoid creating an on demand compute env for cromwell Spot Context * markjschreiber/better-account-not-activated-message (#75) * Better error message when attempting to deploy a context with no activated account. 
* Ensure context names are unique and sorted so deployment is in consistent order * LPD Full Managed Policy Permission Descope (#61) * LPD Batch permission descope * Ahmaalba/Adapter Role Least Privilege Design (#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * Prettier fix * Removal of BatchFullAccess managed policy * Code deduplication * LPD Full Implementation * Env removal from engine options * Regin and account parameter adjustment * Nextflow onspot instance bug fix * Usage of Arn.Format rather then custom ARN creation * Made roles retrieve account and region through props rather then ArnComponents * Removal of account and region to use default values * Added batch:ListJobs permissions to nextflow adapter * workflow engine documentation (#60) * Adds windows 10 as an OS option (#81) Tested AGC on a windows 10 machine running Ubuntu * Adds amd instance types (#84) * Corrected configuration of read workflow so that no MANIFEST is required (#86) * Fixing acg typo to agc (#91) Moving two instances of `acg` to `agc`. * rnaseq pipeline to use proper inputs.json file (#90) The rnaseq pipeline was referencing the inputs.json file from the atacseq example. This PR switches it to the proper inputs.json file. 
The old inputs.json file: ``` { "input": "s3://healthai-public-assets-us-east-1/agc-demo-data/atacseq/design.csv", "genome": "GRCh38", "single_end": true } ``` The new inputs.json file: ``` { "reads": "s3://1000genomes/phase3/data/HG00243/sequence_read/SRR*_{1,2}.filt.fastq.gz", "genome": "GRCh37", "skip_qc": true } ``` * Add workflow output command (#85) * workflow output command implementation * Workflow and Context autocomplete implementation (#82) * Workflow and Context autocomplete implementation * Support Max vCpu in project contexts (#89) feat: Configurable max vCpu for compute environments Add maxVCpu as a property of Context to control the maximum number of vCpus a compute environment can have at a given time Support a default Context values, set when a Context is unmarshalled, any value set in agc-project.yml will override the default. refs: #31 * Latest Release Link (#92) * Cleaned up Readme (#93) * Added version checker to AGC (#94) * Added version checker to AGC * Addressed feedback on the PR * Simplified version checker code * Go compiler version 1.16.0 -> 1.17.2 (#95) * Tabular text implementation and tests (#88) * Tabular text implementation and tests * Add Stale Issue Handling (#96) * added project validate command (#97) * markjschreiber/engine-in-contex-list (#99) * add engine name to context list command output * markjschreiber/clean-up-codebase (#98) * chore: Update Pull Request Template to Follow Conventional Commits (#100) Co-authored-by: Angela Li <dzl@amazon.com> * ci: Improved ci workflow (#102) * Builds the CDK project and validates eslint, also formats and fails if any formatting changes are detected. 
* Checks for format changes in the CLI project * ci: Add semantics behavior overrides (#106) * fix: Shows the relevant error if the workflow logs can't be retrieved (#103) * fix: workflows from demo-wdl-project should run without errors out of the box (#108) * test: use go 1.17 features to simplify unit tests (#110) * fix: show logs for workflows with more than 100 tasks (#114) * fix: use proper go tags for windows build (#117) * fix: use proper go tags for windows build * use nf-core for this workflow (#123) * feat: context destroy --force flag (#118) * context destroy --force flag * fix: Pass engine endpoint directly the wes adapter (#122) * chore: clean up project init code (#126) * ci: Add standard version, conventional changelog and bump script (#119) * ci: Add standard version, conventional changelog and bump script * fix: Fixes how users interact with the context commands (#115) Fixes how users interact with the context commands by allowing contexts to be passed in without the -c command * fix: invalid AWS Health url (#130) correctly point AWS health link to `aws.amazon.com/health` * build: Revamp build and release process (#127) We are updating our build pipeline to better automate the release process. This requires a few build related changes in our source code. * fix: Use correct context name (#132) the context name in `/examples/demo-wdl-project` is `myContext`, which is used by the examples here. * build: use latest build images (#134) * feat: Initial infrastructure for MiniWdl support (#125) Adds a MiniWdl stack which creates the appropriate batch resources and job definition to run MiniWdl jobs. * test: Added context deploy benchmarking script (#111) * Context deploy benchmark script * fix: Adds a message when new logs aren't shown to the user immediately (#131) * Adds a message when new logs aren't shown to the user immediately * fix: correctly link to core app (#133) * fix: temporary folder potential leak in some error scenarios. 
unit test for cdk command execution (#140) * fix: temporary folder potential leak in some error scenarios. unit test for cdk command execution * fixed typo in method name, updated implementation for channel waiter * fix: updates context describe to be consistent with context destroy (#143) * fix: updates context describe to be consistent with context destroy * Best practice is to avoid mutation of inputs. Therefore, copy instead of move input (#145) * build: move release files one folder down (#147) * fix: miniwdl interpolation workaround The gatk4-rnaseq-germline-snps-indels workflow revealed a possible bug in miniwdl where it doesn't correctly handle string interpolation of optional values used in a calculation. This change to the workflow works around the problem in miniwdl. * fix: updates how the logs are shown from cloudwatch (#142) fix: updates how the logs are shown from cloudwatch * fix: improve contrast in docs (#149) * docs: Add information about example inputs and runtimes (#146) * add information about example inputs and runtimes * fix: Asserts order deterministically (#153) * docs: ongoing cost details (#152) * added ongoing costs section to contexts.md * added cost estimate links * fix: Workflow status now ignores unqueryable stacks (#138) fix: Workflow status now ignores unqueryable stacks * docs: miniwdl engine docs and example project for GATK best practices (#158) * add engine docs * add miniwdl examples * feat: Introducing AWS Lambda based WES Adapter for running the workflows (#155) * Introducint AWS Lambda based WES Adapter for running the workflows * Addressing the comments from PR review * fix: Deregionalize min permissions (#128) * add route53:ListHostedZonesByName * de-regionalize resource arns * split out CDK specific s3 permissions * chore(release): 1.1.0 Co-authored-by: AhmadBassyiouni <30308260+abassyiouni@users.noreply.github.com> Co-authored-by: Guy Hawkins <2242982+ghawk1ns@users.noreply.github.com> Co-authored-by: Taylor 
<tneely@users.noreply.github.com> Co-authored-by: Illya Yalovyy <IllyaYalovyy@users.noreply.github.com> Co-authored-by: elliot-smith <elliotsm@amazon.com> Co-authored-by: W. Lee Pang, PhD <wleepang@gmail.com> Co-authored-by: Pang, Lee <pwyming@.amazon.com> Co-authored-by: Drew Dresser <andrewjdresser@gmail.com> Co-authored-by: Andrey Dovydenko <dovydenk@amazon.com> Co-authored-by: Mark Schreiber <mrschre@amazon.com> Co-authored-by: Sean Smith <seaam@amazon.com> Co-authored-by: a-li <7497012+a-li@users.noreply.github.com> Co-authored-by: Angela Li <dzl@amazon.com> Co-authored-by: nbraid <braidn@amazon.com>
Issue #, if available: NA
Description of Changes
Currently, logs are shown exactly as returned by CloudWatch. This can cause logs from different streams to be interleaved with one another, which can be confusing. This PR forces the logs to be grouped together by log stream.
API Source: https://github.com/aws/aws-sdk-go-v2/blob/service/cloudwatchlogs/v1.8.0/service/cloudwatchlogs/api_op_FilterLogEvents.go#L68
Example
Description of how you validated changes
Ran the command locally several times with several different workflows
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license