Skip to content

Conversation

@jonavellecuerdo
Copy link
Contributor

@jonavellecuerdo jonavellecuerdo commented Jul 3, 2024

Purpose and background context

Add an ECSClient module that can run and monitor ECS tasks.

Integrating webutils.aws.ecs.ECSClient with the Flask application is out of scope for this PR. There will be a separate effort for workflow orchestration. That said, the orchestrator (TBD) is likely to use the client like so:

ecs_client = ECSClient() # use default settings

if run_type == "review": 
   ecs_client.execute_review_run()
elif run_type == "final": 
   ecs_client.execute_final_run()

Note: The PR only appears to have a lot of changes due to the updated dependencies. The main items for review are:

How can a reviewer manually see the effects of these changes?

  1. Run make test and verify all tests are passing.

Note: For now, running the preliminary set of unit tests is the best way to confirm that the implemented webapp.utils.aws.ecs.ECSClient module is working as expected.

Includes new or updated dependencies?

YES - Installed libraries: [attrs, boto3-stubs, moto]

Changes expectations for external applications?

NO

What are the relevant tickets?

https://mitlibraries.atlassian.net/browse/IN-1015

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed and verified
  • New dependencies are appropriate or there were no changes

@jonavellecuerdo jonavellecuerdo self-assigned this Jul 3, 2024
@jonavellecuerdo jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 2579b74 to 2eee405 Compare July 3, 2024 13:33
Comment on lines 110 to 138
def ecs_run_task_response_success():
with open("tests/fixtures/ecs_run_task_response_success.json") as file:
return json.loads(file.read())
Copy link
Contributor Author

@jonavellecuerdo jonavellecuerdo Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially tried to use the boto3.ECS.client.run_task mocked response from moto, but a lot of the standard response parameters were excluded from the mocked response. I opted to create a fixture by creating a JSON file of the de-identified output from a test run.

@jonavellecuerdo jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 10a9e0c to e7776c7 Compare July 8, 2024 16:08
@jonavellecuerdo jonavellecuerdo force-pushed the init-app branch 2 times, most recently from 9124648 to d5aee02 Compare July 8, 2024 16:20
Base automatically changed from init-app to main July 8, 2024 16:23
@jonavellecuerdo jonavellecuerdo force-pushed the create-ecs-client branch 3 times, most recently from 9422868 to 298b284 Compare July 8, 2024 17:02
Comment on lines 70 to 101
def monitor_task(self, task: str) -> dict:
"""Polls ECS for task status updates.
This method will periodically (every 5 seconds) check for the status
of the running ECS task that matches the provided task ARN.
When the task run is complete, details of the run are returned.
"""
while True:
response = self.client.describe_tasks(cluster=self.cluster, tasks=[task])
task_run_details = self.parse_run_details(response) # type: ignore[arg-type]
task_status = task_run_details["status"]

message = f"Status for task {task}: {task_status}"
logger.info(message)

if task_status == "STOPPED":
logger.info("Task run has completed.")
break
time.sleep(5)
return task_run_details
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will likely be updated once we have a clearer idea of how to display running tasks on the front end (i.e., loading / status updates). For now, this function demonstrates how boto3.ECS.Client.describe_task can be used to check on the status of a running ECS task.

@jonavellecuerdo jonavellecuerdo marked this pull request as ready for review July 8, 2024 17:34
@jonavellecuerdo jonavellecuerdo marked this pull request as draft July 8, 2024 18:08
Comment on lines 103 to 118
@pytest.fixture
def mock_ecs_task_state_transitions(
mock_ecs_cluster, mock_ecs_task_definition, mock_ecs_network_config
):
"""See https://github.com/getmoto/moto/blob/master/tests/test_ecs/test_ecs_boto3.py."""
state_manager.set_transition(
model_name="ecs::task", transition={"progression": "manual", "times": 1}
)

with mock_aws():
ecs = boto3.client("ecs", region_name=AWS_DEFAULT_REGION)
response = ecs.run_task(
cluster="mock-sapinvoices-ecs-test",
launchType="FARGATE",
networkConfiguration=mock_ecs_network_config,
overrides={},
taskDefinition="mock-sapinvoices-ecs-test:1",
)
yield response["tasks"][0]["taskArn"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In implementing this fixture, I referenced the following moto test example.

@MITLibraries MITLibraries deleted a comment from coveralls Jul 8, 2024
Comment on lines +34 to +40
def _getenv(self, name: str) -> Any:
if value := os.getenv(name):
return value
if name in self.REQUIRED_ENV_VARS:
message = f"'{name}' is a required environment variable."
raise AttributeError(message)
return None
Copy link
Contributor Author

@jonavellecuerdo jonavellecuerdo Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used by the @property env access methods. This differs from __getattr__, which is only called for attributes that are not explicitly set for the class.

The purpose of _getenv() is to implement some error handling so that the @property env access methods can have a return type of str, which circumvents the following mypy error when ECSClient is instantiated:

error: Incompatible types in assignment (expression has type "str | None", variable has type "str") [assignment]

@jonavellecuerdo jonavellecuerdo marked this pull request as ready for review July 8, 2024 19:29
Copy link

@ehanson8 ehanson8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions but this is nice and clean!

class ECSClient:
"""ECS Client for running and monitoring task runs."""

cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious since I haven't used it before, what does the field(factory=lambda: pattern do here? Is this related to the @property decorator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can weigh in here as a proposer of this pattern. Using factory=lambda... allows each instance of this class to get fresh, up-to-date values from Config.

This allows for easier testing where monkeypatching values in conftest will successfully show up in any instance of this class. Otherwise, whenever the class is first imported -- and this is tricky to control, kind of a whack-a-mole -- whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching.

I think it comes down to our team's convention of using attrs, which I do like for the most part.

If we were not using attrs, I would envision this more like this in vanilla python:

class ECSClient:
    
    def __init__(self):
        config = Config()
        self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER
        self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION
        self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG
        self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

That said, we could kind of split the difference using attrs by using __attrs_post_init__:

@define
class ECSClient:

    cluster: str = field(default=None)
    task_definition: str = field(default=None)
    network_configuration: str = field(default=None)
    container: str = field(default=None)

    def __attrs_post_init__(self) -> None:
        config = Config()

        if self.cluster is None:
            self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER

        if self.task_definition is None:
            self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION

        if self.network_configuration is None:
            self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG

        if self.container is None:
            self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

Apologies @jonavellecuerdo for making this a discussion point in this PR, but would be nice to find convention for us all to use.

FWIW, I've used it more sparingly in other projects: HRQB QBClient and DWClient, etc.

I think one argument for the factory pattern would be minimizing use of __attrs_post_init__ which some of attrs functionality kind of avoids needing.

In summary: I'm in support of this pattern, but curious what the group thinks, as I think it will continue to come up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spent some time reading documentation on attrs.Factory but I think @ghukill described it best above in regards to "why field(default=Config().<ENV> doesn't work for our unit tests":

whenever the class (webapp.config.Config is first imported / whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching)

I prefer the current approach over using __attrs_post_init__ (as shown in @ghukill 's example above); the latter, as the doc describes,

runs after attrs is done initializing your instance. This is useful if you want to derive some attribute from others or perform some kind of validation over the whole instance.

I interpret retrieval of the environment
variables as part of / the initialization step. Conceptually, the use of __attrs_post_init__ might be confusing. 🤔

Let me know your final thoughts, @ghukill and @ehanson8 !

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonavellecuerdo I agree that it is a part of initialization

Copy link
Contributor

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, think this is looking real good, and the logic and structure make sense to me.

Requested some changes (e.g. timeout for while loop in monitoring), but most other comments are discussion or optional.

Given that this ECS client is not wired into the app now, I'm feeling good about the structure and functionality at this point. Maybe when all the pieces come together, there will be some slight tweaks, but this seems like a good base.

class ECSClient:
"""ECS Client for running and monitoring task runs."""

cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can weigh in here as a proposer of this pattern. Using factory=lambda... allows each instance of this class to get fresh, up-to-date values from Config.

This allows for easier testing where monkeypatching values in conftest will successfully show up in any instance of this class. Otherwise, whenever the class is first imported -- and this is tricky to control, kind of a whack-a-mole -- whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching.

I think it comes down to our team's convention of using attrs, which I do like for the most part.

If we were not using attrs, I would envision this more like this in vanilla python:

class ECSClient:
    
    def __init__(self):
        config = Config()
        self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER
        self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION
        self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG
        self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

That said, we could kind of split the difference using attrs by using __attrs_post_init__:

@define
class ECSClient:

    cluster: str = field(default=None)
    task_definition: str = field(default=None)
    network_configuration: str = field(default=None)
    container: str = field(default=None)

    def __attrs_post_init__(self) -> None:
        config = Config()

        if self.cluster is None:
            self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER

        if self.task_definition is None:
            self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION

        if self.network_configuration is None:
            self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG

        if self.container is None:
            self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

Apologies @jonavellecuerdo for making this a discussion point in this PR, but would be nice to find convention for us all to use.

FWIW, I've used it more sparingly in other projects: HRQB QBClient and DWClient, etc.

I think one argument for the factory pattern would be minimizing use of __attrs_post_init__ which some of attrs functionality kind of avoids needing.

In summary: I'm in support of this pattern, but curious what the group thinks, as I think it will continue to come up.

)
valid_tasks = [task.split("/")[-1] for task in response["taskDefinitionArns"]]

task_family_revision = self.task_definition.split("/")[-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen this logic in a couple of places for task_family_revision; maybe that could be a method or property as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another @property-decorated method called task_family_revision.

@jonavellecuerdo jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 4c067f1 to 46ab269 Compare July 9, 2024 19:28
Copy link

@ehanson8 ehanson8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but I would remove [wip] from the commit label before merging

class ECSClient:
"""ECS Client for running and monitoring task runs."""

cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonavellecuerdo I agree that it is a part of initialization

Copy link
Contributor

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks for the updates and discussions. Looking forward to seeing this wired up in the flask app.

Why these changes are being introduced:
* Provide a client for running and monitoring ECS tasks

How this addresses that need:
* Add ECSClient module with key commands:
  * execute_review_run
  * execute_final_run
  * monitor_task
* Create custom exceptions
  * ECSTaskDefinitionDoesNotExistError
  * ECSTaskRuntimeExceededTimeoutError
* Add fixtures and unit tests for ECSClient commands

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/IN-1015
@jonavellecuerdo jonavellecuerdo merged commit 1eee166 into main Jul 10, 2024
@jonavellecuerdo jonavellecuerdo deleted the create-ecs-client branch July 10, 2024 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants