Contributor

aditya-jaishankar commented Feb 14, 2024

Summary

The library currently has a bug: a call to ec2.describe_volumes accepts at most 199 items in the Filters argument. On very large clusters, more than 199 instance ids can be passed in, which causes the call to fail.

This PR splits the filter values into chunks of no more than 199 items each.
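The chunking idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the helper name chunked and the constant MAX_FILTER_VALUES are hypothetical, the instance ids are made up, and 199 is simply the limit stated in this description.

```python
from typing import Iterator, List

# Limit taken from the PR description; the constant name is hypothetical.
MAX_FILTER_VALUES = 199


def chunked(items: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start : start + chunk_size]


# 300 made-up instance ids, similar in size to the large-cluster case.
instance_ids = [f"i-{n:04d}" for n in range(300)]
chunks = list(chunked(instance_ids, MAX_FILTER_VALUES))
chunk_sizes = [len(chunk) for chunk in chunks]  # [199, 101]
```

Each chunk can then be passed as the filter values of a separate request, so no single request exceeds the limit.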

Testing Performed

  • All unit tests pass
  • Dummy projects were kicked off on Databricks, and the resulting clusters were inspected with the function calls, both with a small number of workers and with more than 200 workers. In both cases the function calls returned the cluster volume data as expected.

Checklist

Before formally opening this PR, please adhere to the following standards:

  • Branch/PR names begin with the related Jira ticket id (e.g. PROD-31) for Jira integration
  • File names are lower_snake_case
  • Relevant unit tests have been added or not applicable
  • Relevant documentation has been added or not applicable
  • Mark yourself as the assignee (makes it easier to scan the PR list)

Related Jira Ticket

) -> List[dict]:
    """Get all ebs volumes associated with a list of instance reservations"""

    def get_chunk(instance_ids: list, chunk_size: int) -> Iterator[list]:
Contributor

Isn't the return type technically a -> Generator[list, None, None]?

Might want to also type the args like so:

Suggested change
- def get_chunk(instance_ids: list, chunk_size: int) -> Iterator[list]:
+ def get_chunk(instance_ids: List[str], chunk_size: int) -> Generator[List[str], None, None]:

Contributor

Mentioning this because it can help the IDE auto-complete.

For example, if you were to write [item.count('a') for item in self.get_chunk(***)], PyCharm will autocomplete the .count() because it knows each item is a list of strings.
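For reference, a small sketch of the two annotation styles discussed here. Both are valid for a generator function, because every generator object is also an iterator; Generator[...] additionally spells out the send and return types. The function names and sample ids below are hypothetical and only serve the comparison.

```python
from collections.abc import Iterator as AbcIterator
from typing import Generator, Iterator, List


def chunks_as_iterator(instance_ids: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Generator function annotated with the broader Iterator type."""
    for i in range(0, len(instance_ids), chunk_size):
        yield instance_ids[i : i + chunk_size]


def chunks_as_generator(
    instance_ids: List[str], chunk_size: int
) -> Generator[List[str], None, None]:
    """Same body, annotated as Generator[yield type, send type, return type]."""
    for i in range(0, len(instance_ids), chunk_size):
        yield instance_ids[i : i + chunk_size]


sample = ["i-a", "i-b", "i-a"]

# A generator object satisfies the Iterator protocol at runtime.
is_iterator = isinstance(chunks_as_generator(sample, 2), AbcIterator)

# With List[str] as the yield type, a type-aware editor can infer that
# `chunk` is a list of strings and offer list methods such as .count().
counts = [chunk.count("i-a") for chunk in chunks_as_generator(sample, 2)]
```

The typed element (List[str] rather than a bare list) is what gives the editor enough information to complete methods on each chunk and on its items.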

Contributor Author

Ah yes, thanks, good catch.

romainissynced previously approved these changes Feb 15, 2024
Contributor

romainissynced left a comment

Looks good, just left a nitpick on typing that doesn't need to be fixed before merge.

sync/__init__.py Outdated
@@ -1,4 +1,4 @@
"""Library for leveraging the power of Sync"""
__version__ = "1.0.0"
Contributor

Need to update this to 1.0.3 now

Comment on lines +512 to +516
while next_token:
    response = ec2_client.describe_volumes(Filters=filters, NextToken=next_token)
    volumes += response.get("Volumes", [])
    next_token = response.get("NextToken")

Contributor

Have you had a chance to test this out on a big cluster yet?

Contributor Author

Yes, I tested this out by kicking off a big job with 300 .large instances, then wrote a local script to call describe_volumes() on that cluster_id while the cluster was running, and it worked fine (it also worked fine for a cluster with 10 instances).
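For anyone who wants to exercise the same flow without a live cluster, here is a rough sketch against a stub client. It is not the code from this PR: FakeEC2Client, get_volumes, the page size, and the attachment.instance-id filter name are assumptions made for illustration. The stub only mimics the Volumes and NextToken keys used in the loop under review.

```python
from typing import Iterator, List, Optional


class FakeEC2Client:
    """Hypothetical stand-in for an EC2 client: one volume per instance id, two per page."""

    PAGE_SIZE = 2

    def describe_volumes(self, Filters: List[dict], NextToken: Optional[str] = None) -> dict:
        instance_ids = Filters[0]["Values"]
        start = int(NextToken) if NextToken else 0
        end = start + self.PAGE_SIZE
        page = {"Volumes": [{"VolumeId": f"vol-{iid}"} for iid in instance_ids[start:end]]}
        if end < len(instance_ids):
            page["NextToken"] = str(end)  # more results remain for this filter
        return page


def get_chunk(instance_ids: List[str], chunk_size: int) -> Iterator[List[str]]:
    for i in range(0, len(instance_ids), chunk_size):
        yield instance_ids[i : i + chunk_size]


def get_volumes(ec2_client, instance_ids: List[str], chunk_size: int = 199) -> List[dict]:
    """Collect volumes chunk by chunk, following NextToken within each chunk."""
    volumes: List[dict] = []
    for chunk in get_chunk(instance_ids, chunk_size):
        # Assumed filter name; the filter used in the PR is not shown here.
        filters = [{"Name": "attachment.instance-id", "Values": chunk}]
        response = ec2_client.describe_volumes(Filters=filters)
        volumes += response.get("Volumes", [])
        next_token = response.get("NextToken")
        while next_token:
            response = ec2_client.describe_volumes(Filters=filters, NextToken=next_token)
            volumes += response.get("Volumes", [])
            next_token = response.get("NextToken")
    return volumes


ids = [f"i-{n}" for n in range(5)]
volumes = get_volumes(FakeEC2Client(), ids, chunk_size=3)
volume_ids = [v["VolumeId"] for v in volumes]
```

With a chunk size of 3 and a page size of 2, the first chunk needs two pages and the second chunk needs one, so both the chunking and the pagination paths are covered.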

Contributor

gorskysd left a comment

LGTM

aditya-jaishankar merged commit 7a00d68 into main Feb 21, 2024
aditya-jaishankar deleted the bugfix-describe-volumes branch February 21, 2024 14:41
4 participants