Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Framework] Port 7567 bug ocean integration is being terminated due to OOM #528

Merged

Conversation

omby8888
Copy link
Contributor

@omby8888 omby8888 commented Apr 14, 2024

Description

What - ocean integration is being terminated due to OOM whene dealing with large amount of entities to process
Why - With asyncio.gather, all items are gathered into a single list before being processed, potentially causing memory issues if the list is too large, while asyncio.queue doesn't require holding all items in memory at once.

Type of change

Please leave one option from the following and delete the rest:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • New Integration (non-breaking change which adds a new integration)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Non-breaking change (fix of existing functionality that will not change current behavior)
  • Documentation (added/updated documentation)

Screenshots

Include screenshots from your environment showing how the resources of the integration will look.

API Documentation

Provide links to the API documentation used for this integration.

@github-actions github-actions bot added size/M framework A change in the ocean framework files labels Apr 14, 2024
@omby8888 omby8888 force-pushed the PORT-7567-Bug-Ocean-integration-is-being-terminated-due-to-OOM branch from c79920d to 40bd6fc Compare April 15, 2024 09:55
…ntegration-is-being-terminated-due-to-OOM

# Conflicts:
#	port_ocean/core/handlers/entity_processor/jq_entity_processor.py
@omby8888 omby8888 marked this pull request as draft April 30, 2024 13:07
@omby8888 omby8888 marked this pull request as ready for review May 7, 2024 08:45
Copy link
Contributor

@Tankilevitch Tankilevitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump version and add changelog

Comment on lines 62 to 63
elif obj != before_dict[key]:
modified.append(obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using is_same_entity?
using trying to evaluate the full object is much more expensive than just evaluating identifier and blueprint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saw that no need for the elif, could be replace simply with else

Comment on lines +26 to +31
async def process_in_queue(
objects_to_process: list[Any],
func: Callable[..., Coroutine[Any, Any, T]],
*args: Any,
concurrency: int = 50,
) -> list[T]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function will be available for integrations as well ( as its in the port_ocean/utils ), lets make sure we add documentation to the function and an example, super important to add the why & when to use the function.

Comment on lines 46 to 47
for i in range(concurrency):
await queue.put(None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that?

Copy link
Contributor Author

@omby8888 omby8888 May 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the worker thread will exit the loop and stop working, if we wont do that then raw_params = await queue.get() will forever wait for more tasks and leave 50 workers threads open

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then add comment as it is not straight forward

Copy link
Contributor

@Tankilevitch Tankilevitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor

Comment on lines 46 to 47
for i in range(concurrency):
await queue.put(None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then add comment as it is not straight forward

port_ocean/utils/queue_utils.py Outdated Show resolved Hide resolved
@omby8888 omby8888 merged commit 98cf927 into main May 12, 2024
4 checks passed
@omby8888 omby8888 deleted the PORT-7567-Bug-Ocean-integration-is-being-terminated-due-to-OOM branch May 12, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
framework A change in the ocean framework files size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants