Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix external weights loading during runtime in OpenVINO EP #7525

Conversation

agrawalsourav98
Copy link
Contributor

Some models that are large e.g. ViT-H-14-378-quickgelu__dfn5b and have their attributes split into multiple files are loaded at runtime instead of the entire model being loaded during session creation.

This PR attempts to fix that behaviour.

Copy link
Contributor

@mertalev mertalev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concurrency issue aside, I also don't like that we're bolting on temporary code and adding overhead like this.

Comment on lines +109 to +117
cwd = os.getcwd()
if model.mode == "vision":
os.chdir(Path(model.visual_path).parent)
elif model.mode == "text":
os.chdir(Path(model.textual_path).parent)
try:
outputs = await run(model.predict, inputs)
finally:
os.chdir(cwd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make sense with multiple requests in flight.

@mertalev
Copy link
Contributor

FWIW I think it's possible to solve the external data not being loaded eagerly by calling run with dummy data after the session is created. That should force all the weights to be loaded. Still not ideal but it at least keeps the temporary code in one place.

@agrawalsourav98
Copy link
Contributor Author

FWIW I think it's possible to solve the external data not being loaded eagerly by calling run with dummy data after the session is created. That should force all the weights to be loaded. Still not ideal but it at least keeps the temporary code in one place.

Yeah, that is something we can do. I am not a fan of hacky stuff either it's just that I don't think intel cares enough that we would see newer packages anytime soon, so maybe we shouldn't rely on them.

I would try this out and update this PR if it is viable else let's just drop this approach.

@mertalev
Copy link
Contributor

mertalev commented Mar 1, 2024

I looked at their release cadence and they seem to like doing April releases. They probably scheduled it to be some time next month. Granted if we have a different issue with that release, it'll be half a year before they release the next one lol. I guess we can just see if this patch works and wait for April, then maybe look at building it ourselves if it comes to that.

@agrawalsourav98
Copy link
Contributor Author

I looked at their release cadence and they seem to like doing April releases. They probably scheduled it to be some time next month. Granted if we have a different issue with that release, it'll be half a year before they release the next one lol. I guess we can just see if this patch works and wait for April, then maybe look at building it ourselves if it comes to that.

Yes, I building can be a better option in the future. They provide a ready to use DockerFile for OpenVINO here, so I don't think we should be too worried about building it ourselves.

@agrawalsourav98
Copy link
Contributor Author

Closing as it has been fixed by #7854 and #8018

@agrawalsourav98 agrawalsourav98 deleted the fix-openvino-on-large-models branch March 19, 2024 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants