-
Notifications
You must be signed in to change notification settings - Fork 11
Add create_profiling_group call in refresh_configuration and report() #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As part of Lambda 1-click integration, we want the agent to create a profiling group for the user if it does not exist. To do this, we add a create_profiling_group API call in places where we could receive a ResourceNotFoundException for the PG.
| self.metadata = environment["agent_metadata"] | ||
| self.agent_config_merger = environment["agent_config_merger"] | ||
| self.is_lambda_one_click_pg_created_during_execution = False | ||
| # self.is_create_pg_called_during_submit_profile = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you want to keep only one of this is_create_pg_called_during_submit_profile and remove the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides having all these flags for the Lambda usecase, how about creating a PG (if it doesn't exist already) in the Lambda decorator only? https://github.com/aws/amazon-codeguru-profiler-python-agent/blob/main/codeguru_profiler_agent/aws_lambda/profiler_decorator.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you want to keep only one of this
is_create_pg_called_during_submit_profileand remove the other.
Good catch! Forgot to remove this. Fixed in the latest commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides having all these flags for the Lambda usecase, how about creating a PG (if it doesn't exist already) in the Lambda decorator only? https://github.com/aws/amazon-codeguru-profiler-python-agent/blob/main/codeguru_profiler_agent/aws_lambda/profiler_decorator.py
We could use that approach but we do not want the initialization of the decorator to depend on the API calls. If let's say we do that, then we will have to check whether a PG has been created already by using DescribeProfilingGroup API and then conditionally call CreateProfilingGroup API. Then, in this case the load on control plane API describePG will be huge if the lambda scales to lets say 10000 instances. Another variation of this would be to call CreateProfilingGroup API unconditionally, in which case the load on CreatePG will be huge. Instead if we depend on refresh_configuration() or report() method calls, then we automatically get the functionality where if PG is not present already then we get ResourceNotFoundException, and we get the functionality where we will get this exception only once and we will call CreatePG only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the explanation, good point!
| global is_create_pg_called_during_submit_profile | ||
| is_create_pg_called_during_submit_profile = True | ||
| self.create_pg_when_one_click_integration_is_active() | ||
| self._log_request_failed(operation="post_agent_profile", exception=error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say to not log exception when we create the PG, but log something at INFO level that we are creating one PG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this throwing ClientError? also the operation should be configure_agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say to not log exception when we create the PG, but log something at INFO level that we are creating one PG.
Changed in the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because both ConfigureAgent and PostAgentProfile throw a ResourceNotFoundException. So if we fail to create a PG during the first refresh_configuration() call and the agent defaults to a default configuration, we can still attempt to create a PG during the report() call.
test/unit/test_local_aggregator.py
Outdated
| from codeguru_profiler_agent.utils.time import current_milli_time | ||
| from test.pytestutils import before | ||
| from mock import MagicMock, call | ||
| from mock import MagicMock, call, patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this new "patch" used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope removed it! :)
| try: | ||
| self.client_stubber.assert_no_pending_responses() | ||
| assert False | ||
| except AssertionError: | ||
| assert True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the usecase when the failure of the assertions should not be considered as a failure? I think this is the equivalent of not having any assertions, so if it's not reliable or useful, we can just delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it, client_stubber.assert_no_pending_responses() should throw exception. What's the usecase? If this is really difficult to test, try to assert something more, like the message or something so we do not hide any failures that we do not want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah makes sense. I removed the unnecessary check and just used self.client_stubber.assert_no_pending_responses() like we have in our other test files (ex. test_end_to_end_profiling.py)
| ) | ||
| logger.info("Reported profile successfully") | ||
| return True | ||
| except ClientError as error: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update the documentation of this functions that this creates a PG in the case of using Lamba layers if it throws ResourceNotFoundException? I would be clearer when reading the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| HANDLER_ENV_NAME_FOR_CODEGURU = "HANDLER_ENV_NAME_FOR_CODEGURU" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about ending the variable in KEY, or ENV_KEY to be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
| if fleet_info_as_map['computeType'] == 'aws_lambda': | ||
| handler_name_ = os.environ.get(HANDLER_ENV_NAME_FOR_CODEGURU) | ||
| if not handler_name_: | ||
| logger.error("Env Variables for CodeGuru Profiler Lambda Layer are not set. Cannot create PG.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be printed if the customer is using Lambda, but without the layer integration?
If so, we should not log error then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no this is only when the customer is using Lambda layer. 1-click support is only for the layer I believe.
| """ | ||
| # https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html | ||
| python_runtime_identifier_prefix = "python" | ||
| execution_environment_env_variable = os.environ.get(AWS_EXECUTION_ENV_KEY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the lambda's runtime to be non-python, but this code to run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per above comment! Why do we need to know if the execution environment is python or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah my bad not needed. removed it
| except AssertionError: | ||
| assert True | ||
| assert self.subject.is_profiling_group_created_during_execution is False | ||
| assert self.client_stubber.assert_no_pending_responses() is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: this warns about no empty line at end of file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, added it!
| lambda_one_click_profiling_group_name = "aws-lambda-testLambdaName" | ||
| test_agent_metadata_for_lambda = AgentMetadata( | ||
| fleet_info=AWSLambda("arn:aws:lambda:us-east-1:111111111111:function:testLambdaName", "memory", "env", "agentId")) | ||
| test_lambda_profiling_group_name = "aws-lambda-testLambdaName" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be renamed to something clearer e.g. autocreated_profiling_group_name_lambda or something like it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, done!
| self.reporter.refresh_configuration() | ||
|
|
||
| def _report_profile(self, now): | ||
| current_last_report_attempted_value = self.last_report_attempted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: replace current with previous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
| try: | ||
| self.client_stubber.assert_no_pending_responses() | ||
| assert False | ||
| except AssertionError: | ||
| assert True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it, client_stubber.assert_no_pending_responses() should throw exception. What's the usecase? If this is really difficult to test, try to assert something more, like the message or something so we do not hide any failures that we do not want.
| def test_when_backends_sends_resource_not_found_it_stops_the_profiling_in_non_lambda_case(self): | ||
| self.client_stubber.add_client_error('configure_agent', service_error_code='ResourceNotFoundException', | ||
| service_message='Simulated error in configure_agent call') | ||
| os.environ.__setitem__(AWS_EXECUTION_ENV_KEY, 'AWS_Lambda_java') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is about non-lambda, though this is set to lambda java. Either update the test name, either remove this line? (also related to my questions why are we considering there the java lambda usecase; I think it can't happen, and this should be removed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed this!
| logger.info("ResourceNotFoundException for a Lambda PG in ConfigureAgent call. "+ | ||
| "Attempting to create PG") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change logging message to this :
"Profiling group not found. Will try to create a profiling group with name = %s and compute platform = %s and retry calling configure agent after 5 minutes"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| self._log_request_failed(operation="configure_agent", exception=error) | ||
| self._log_request_failed(operation="configure_agent", exception=error) | ||
|
|
||
| if error.response['Error']['Code'] == 'ResourceNotFoundException': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question here : What configuration will be used in case of ResourceNotFoundException and ValidationException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default configuration defined here
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| HANDLER_ENV_NAME_FOR_CODEGURU_KEY = "HANDLER_ENV_NAME_FOR_CODEGURU" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This env var key is already defined in lambda_handler.py. Can we use that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, fixed!
| Lambda layers are set, it tries to create a Profiling Group whenever a ResourceNotFoundException | ||
| is encountered. | ||
| """ | ||
| global is_create_pg_called_during_submit_profile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you defining it global here? Can't you do that at line #22?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed global altogether. was using a class variable so global was not needed!
| @with_timer("createProfilingGroup", measurement="wall-clock-time") | ||
| def create_profiling_group(self): | ||
| """ | ||
| Create a PG for the Lambda function onboarded with 1-click integration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this comment. This will also be done if the customer doesn't use the code change option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes done
| ) | ||
| self.is_profiling_group_created_during_execution = True | ||
| logger.info("Created Lambda Profiling Group with name " + str(self.profiling_group_name)) | ||
| except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle ConflictException too by logging an appropriate message and setting is_profiling_group_created_during_execution if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| self._log_request_failed(operation="create_profiling_group", exception=e) | ||
|
|
||
| def should_autocreate_profiling_group(self): | ||
| return self.is_runtime_python() and self.is_handler_name_env_variable_set() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you want to know if it is python or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't think checking for environment variable is the correct way. Instead we can have class variable computePlatform and the default value should be Default and when the lambda profile builder is called we should create the SdkReported with AWSLambda compute platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you want to know if it is python or not?
fixed it. not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't think checking for environment variable is the correct way. Instead we can have class variable
computePlatformand the default value should beDefaultand when the lambda profile builder is called we should create the SdkReported withAWSLambdacompute platform.
Whenever we call build_profiler while creating a Lambda profiler, we set should_autocreate_profiling_group to True so that takes care of it.
| profilingGroupName=self.profiling_group_name, | ||
| computePlatform='AWSLambda' | ||
| ) | ||
| self.is_profiling_group_created_during_execution = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any usage of is_profiling_group_created_during_execution . Can you explain why are we having this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah was using it before. Removed it!
| except Exception as e: | ||
| self._log_request_failed(operation="create_profiling_group", exception=e) | ||
|
|
||
| def is_compute_platform_lambda(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a wrapper method should_auto_create_profiling_group which checks if the compute platform is lambda for now. Use that wrapper method while deciding to call create_profiling_group in report and refresh_configuration methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…uld be autocreated
pandpara
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Issue #, if available:
Description of changes:
As part of Lambda 1-click integration, we want the agent to create a profiling group for the user if it does not exist. To do this, we add a create_profiling_group API call in places where we could receive a ResourceNotFoundException for the PG.
Testing: Pytest unit tests.
Tested on a sample app in my AWS account.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.