Add ECSClient module #2

jonavellecuerdo · 2024-07-03T12:41:52Z

Purpose and background context

Add an ECSClient module that can run and monitor ECS tasks.

Integrating webutils.aws.ecs.ECSClient with the Flask application is out of scope for this PR. There will be a separate effort for workflow orchestration. That said, the orchestrator (TBD) is likely to use the client like so:

ecs_client = ECSClient() # use default settings

if run_type == "review": 
   ecs_client.execute_review_run()
elif run_type == "final": 
   ecs_client.execute_final_run()

Note: The PR only appears to have a lot of changes due to the updated dependencies. The main items for review are:

implementation of ECSClient in webapp/utils/aws/ecs.py;
added fixtures and supporting unit tests in tests/test_ecs.py

How can a reviewer manually see the effects of these changes?

Run make test and verify all tests are passing.

Note: For now, running the preliminary set of unit tests is the best way to confirm that the implemented webapp.utils.aws.ecs.ECSClient module is working as expected.

Includes new or updated dependencies?

YES - Installed libraries: [attrs, boto3-stubs, moto]

Changes expectations for external applications?

NO

What are the relevant tickets?

https://mitlibraries.atlassian.net/browse/IN-1015

Developer

All new ENV is documented in README
All new ENV has been added to staging and production environments
All related Jira tickets are linked in commit message(s)
Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

The commit message is clear and follows our guidelines (not just this PR message)
There are appropriate tests covering any new functionality
The provided documentation is sufficient for understanding any new functionality introduced
Any manual tests have been performed and verified
New dependencies are appropriate or there were no changes

jonavellecuerdo · 2024-07-03T15:23:31Z

tests/conftest.py

+def ecs_run_task_response_success():
+    with open("tests/fixtures/ecs_run_task_response_success.json") as file:
+        return json.loads(file.read())


I initially tried to use the boto3.ECS.client.run_task mocked response from moto, but a lot of the standard response parameters were excluded from the mocked response. I opted to create a fixture by creating a JSON file of the de-identified output from a test run.

jonavellecuerdo · 2024-07-08T17:11:46Z

webapp/utils/aws/ecs.py

+    def monitor_task(self, task: str) -> dict:
+        """Polls ECS for task status updates.
+
+        This method will periodically (every 5 seconds) check for the status
+        of the running ECS task that matches the provided task ARN.
+        When the task run is complete, details of the run are returned.
+        """
+        while True:
+            response = self.client.describe_tasks(cluster=self.cluster, tasks=[task])
+            task_run_details = self.parse_run_details(response)  # type: ignore[arg-type]
+            task_status = task_run_details["status"]
+
+            message = f"Status for task {task}: {task_status}"
+            logger.info(message)
+
+            if task_status == "STOPPED":
+                logger.info("Task run has completed.")
+                break
+            time.sleep(5)
+        return task_run_details


This function will likely be updated once we have a clearer idea of how to display running tasks on the front end (i.e., loading / status updates). For now, this function demonstrates how boto3.ECS.Client.describe_task can be used to check on the status of a running ECS task.

jonavellecuerdo · 2024-07-08T19:09:23Z

tests/conftest.py

+@pytest.fixture
+def mock_ecs_task_state_transitions(
+    mock_ecs_cluster, mock_ecs_task_definition, mock_ecs_network_config
+):
+    """See https://github.com/getmoto/moto/blob/master/tests/test_ecs/test_ecs_boto3.py."""
+    state_manager.set_transition(
+        model_name="ecs::task", transition={"progression": "manual", "times": 1}
+    )
+
+    with mock_aws():
+        ecs = boto3.client("ecs", region_name=AWS_DEFAULT_REGION)
+        response = ecs.run_task(
+            cluster="mock-sapinvoices-ecs-test",
+            launchType="FARGATE",
+            networkConfiguration=mock_ecs_network_config,
+            overrides={},
+            taskDefinition="mock-sapinvoices-ecs-test:1",
+        )
+        yield response["tasks"][0]["taskArn"]


In implementing this fixture, I referenced the following moto test example.

jonavellecuerdo · 2024-07-08T19:24:04Z

webapp/config.py

+    def _getenv(self, name: str) -> Any:
+        if value := os.getenv(name):
+            return value
+        if name in self.REQUIRED_ENV_VARS:
+            message = f"'{name}' is a required environment variable."
+            raise AttributeError(message)
+        return None


This is used by the @property env access methods. This differs from __getattr__, which is only called for attributes that are not explicitly set for the class.

The purpose of _getenv() is to implement some error handling so that the @property env access methods can have a return type of str, which circumvents the following mypy error when ECSClient is instantiated:

error: Incompatible types in assignment (expression has type "str | None", variable has type "str") [assignment]

ehanson8

A few questions but this is nice and clean!

tests/conftest.py

tests/test_aws/test_ecs.py

webapp/config.py

webapp/exceptions.py

ehanson8 · 2024-07-08T20:21:29Z

webapp/utils/aws/ecs.py

+class ECSClient:
+    """ECS Client for running and monitoring task runs."""
+
+    cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)


Just curious since I haven't used it before, what does the field(factory=lambda: pattern do here? Is this related to the @property decorator?

I can weigh in here as a proposer of this pattern. Using factory=lambda... allows each instance of this class to get fresh, up-to-date values from Config.

This allows for easier testing where monkeypatching values in conftest will successfully show up in any instance of this class. Otherwise, whenever the class is first imported -- and this is tricky to control, kind of a whack-a-mole -- whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching.

I think it comes down to our team's convention of using attrs, which I do like for the most part.

If we were not using attrs, I would envision this more like this in vanilla python:

class ECSClient: def __init__(self): config = Config() self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

That said, we could kind of split the difference using attrs by using __attrs_post_init__:

@define class ECSClient: cluster: str = field(default=None) task_definition: str = field(default=None) network_configuration: str = field(default=None) container: str = field(default=None) def __attrs_post_init__(self) -> None: config = Config() if self.cluster is None: self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER if self.task_definition is None: self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION if self.network_configuration is None: self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG if self.container is None: self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

Apologies @jonavellecuerdo for making this a discussion point in this PR, but would be nice to find convention for us all to use.

FWIW, I've used it more sparingly in other projects: HRQB QBClient and DWClient, etc.

I think one argument for the factory pattern would be minimizing use of __attrs_post_init__ which some of attrs functionality kind of avoids needing.

In summary: I'm in support of this pattern, but curious what the group thinks, as I think it will continue to come up.

Spent some time reading documentation on attrs.Factory but I think @ghukill described it best above in regards to "why field(default=Config().<ENV> doesn't work for our unit tests":

whenever the class (webapp.config.Config is first imported / whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching)

I prefer the current approach over using __attrs_post_init__ (as shown in @ghukill 's example above); the latter, as the doc describes,

runs after attrs is done initializing your instance. This is useful if you want to derive some attribute from others or perform some kind of validation over the whole instance.

I interpret retrieval of the environment
variables as part of / the initialization step. Conceptually, the use of __attrs_post_init__ might be confusing. 🤔

Let me know your final thoughts, @ghukill and @ehanson8 !

@jonavellecuerdo I agree that it is a part of initialization

ghukill

Overall, think this is looking real good, and the logic and structure make sense to me.

Requested some changes (e.g. timeout for while loop in monitoring), but most other comments are discussion or optional.

Given that this ECS client is not wired into the app now, I'm feeling good about the structure and functionality at this point. Maybe when all the pieces come together, there will be some slight tweaks, but this seems like a good base.

tests/test_aws/test_ecs.py

ghukill · 2024-07-09T13:39:20Z

webapp/utils/aws/ecs.py

+class ECSClient:
+    """ECS Client for running and monitoring task runs."""
+
+    cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)


I can weigh in here as a proposer of this pattern. Using factory=lambda... allows each instance of this class to get fresh, up-to-date values from Config.

This allows for easier testing where monkeypatching values in conftest will successfully show up in any instance of this class. Otherwise, whenever the class is first imported -- and this is tricky to control, kind of a whack-a-mole -- whatever the env vars are at that moment will be the class defaults for those properties forever after, regardless of monkeypatching.

I think it comes down to our team's convention of using attrs, which I do like for the most part.

If we were not using attrs, I would envision this more like this in vanilla python:

class ECSClient: def __init__(self): config = Config() self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

That said, we could kind of split the difference using attrs by using __attrs_post_init__:

@define class ECSClient: cluster: str = field(default=None) task_definition: str = field(default=None) network_configuration: str = field(default=None) container: str = field(default=None) def __attrs_post_init__(self) -> None: config = Config() if self.cluster is None: self.cluster = config.ALMA_SAP_INVOICES_ECS_CLUSTER if self.task_definition is None: self.task_definition = config.ALMA_SAP_INVOICES_ECS_TASK_DEFINITION if self.network_configuration is None: self.network_configuration = config.ALMA_SAP_INVOICES_ECS_NETWORK_CONFIG if self.container is None: self.container = config.ALMA_SAP_INVOICES_ECR_IMAGE_NAME

Apologies @jonavellecuerdo for making this a discussion point in this PR, but would be nice to find convention for us all to use.

FWIW, I've used it more sparingly in other projects: HRQB QBClient and DWClient, etc.

I think one argument for the factory pattern would be minimizing use of __attrs_post_init__ which some of attrs functionality kind of avoids needing.

In summary: I'm in support of this pattern, but curious what the group thinks, as I think it will continue to come up.

webapp/utils/aws/ecs.py

ghukill · 2024-07-09T13:49:21Z

webapp/utils/aws/ecs.py

+        )
+        valid_tasks = [task.split("/")[-1] for task in response["taskDefinitionArns"]]
+
+        task_family_revision = self.task_definition.split("/")[-1]


I've seen this logic in a couple of places for task_family_revision; maybe that could be a method or property as well?

I added another @property-decorated method called task_family_revision.

tests/conftest.py

ehanson8

Looks good but I would remove [wip] from the commit label before merging

webapp/utils/aws/ecs.py

ehanson8 · 2024-07-10T13:34:28Z

webapp/utils/aws/ecs.py

+class ECSClient:
+    """ECS Client for running and monitoring task runs."""
+
+    cluster: str = field(factory=lambda: Config().ALMA_SAP_INVOICES_ECS_CLUSTER)


@jonavellecuerdo I agree that it is a part of initialization

webapp/utils/aws/ecs.py

ghukill

Looks great. Thanks for the updates and discussions. Looking forward to seeing this wired up in the flask app.

tests/test_aws/test_ecs.py

Why these changes are being introduced: * Provide a client for running and monitoring ECS tasks How this addresses that need: * Add ECSClient module with key commands: * execute_review_run * execute_final_run * monitor_task * Create custom exceptions * ECSTaskDefinitionDoesNotExistError * ECSTaskRuntimeExceededTimeoutError * Add fixtures and unit tests for ECSClient commands Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/IN-1015

jonavellecuerdo self-assigned this Jul 3, 2024

jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 2579b74 to 2eee405 Compare July 3, 2024 13:33

jonavellecuerdo requested review from ehanson8 and ghukill July 3, 2024 13:39

jonavellecuerdo force-pushed the create-ecs-client branch from 2eee405 to 4637d0a Compare July 3, 2024 15:16

jonavellecuerdo commented Jul 3, 2024

View reviewed changes

jonavellecuerdo force-pushed the init-app branch from d15cd2a to 017c350 Compare July 8, 2024 13:31

jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 10a9e0c to e7776c7 Compare July 8, 2024 16:08

jonavellecuerdo force-pushed the init-app branch 2 times, most recently from 9124648 to d5aee02 Compare July 8, 2024 16:20

Base automatically changed from init-app to main July 8, 2024 16:23

jonavellecuerdo force-pushed the create-ecs-client branch 3 times, most recently from 9422868 to 298b284 Compare July 8, 2024 17:02

jonavellecuerdo commented Jul 8, 2024

View reviewed changes

jonavellecuerdo force-pushed the create-ecs-client branch from 298b284 to 813d48a Compare July 8, 2024 17:30

jonavellecuerdo marked this pull request as ready for review July 8, 2024 17:34

jonavellecuerdo marked this pull request as draft July 8, 2024 18:08

jonavellecuerdo force-pushed the create-ecs-client branch from 813d48a to 326c6e4 Compare July 8, 2024 19:06

jonavellecuerdo commented Jul 8, 2024

View reviewed changes

jonavellecuerdo force-pushed the create-ecs-client branch from 326c6e4 to 79e256a Compare July 8, 2024 19:17

MITLibraries deleted a comment from coveralls Jul 8, 2024

jonavellecuerdo commented Jul 8, 2024

View reviewed changes

jonavellecuerdo marked this pull request as ready for review July 8, 2024 19:29

ehanson8 reviewed Jul 8, 2024

View reviewed changes

ghukill requested changes Jul 9, 2024

View reviewed changes

jonavellecuerdo force-pushed the create-ecs-client branch 2 times, most recently from 4c067f1 to 46ab269 Compare July 9, 2024 19:28

jonavellecuerdo force-pushed the create-ecs-client branch from 46ab269 to a67789c Compare July 9, 2024 19:31

jonavellecuerdo requested review from ehanson8 and ghukill July 9, 2024 20:35

ehanson8 approved these changes Jul 10, 2024

View reviewed changes

ghukill approved these changes Jul 10, 2024

View reviewed changes

tests/test_aws/test_ecs.py Show resolved Hide resolved

jonavellecuerdo force-pushed the create-ecs-client branch from a67789c to 84ada58 Compare July 10, 2024 14:42

jonavellecuerdo merged commit 1eee166 into main Jul 10, 2024

jonavellecuerdo deleted the create-ecs-client branch July 10, 2024 14:47

Add ECSClient module #2

Add ECSClient module #2

Uh oh!

Conversation

jonavellecuerdo commented Jul 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose and background context

How can a reviewer manually see the effects of these changes?

Includes new or updated dependencies?

Changes expectations for external applications?

What are the relevant tickets?

Developer

Code Reviewer(s)

Uh oh!

jonavellecuerdo Jul 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jonavellecuerdo Jul 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ehanson8 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ghukill left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ehanson8 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ghukill left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jonavellecuerdo commented Jul 3, 2024 •

edited

Loading

jonavellecuerdo Jul 3, 2024 •

edited

Loading

jonavellecuerdo Jul 8, 2024 •

edited

Loading