Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored Dockerfiles #625
Refactored Dockerfiles #625
Changes from 5 commits
49ab288
91507a7
24089cc
3dd7b6a
4301ca5
67ba228
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this library to be in /usr/ or /opt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is up for you to decide. Since various parts of TRTorch are going to be pulled to other containers (Triton, Riva SM), it's easier to handle if it's under its own root - but I am fine with either way you choose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a standing convention in DLFW containers to put libraries we add in /opt then I am fine with /opt. My preference is /usr otherwise since it seems more conventional and easier to link against
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we did put things in /opt, like triton and such. Let's use that for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do any sort of clean up for size reduction? Do we conventionally leave the source in container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like things like bazel are probably not needed post testing and install in the container itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bazel is only installed in trtorch-builder image, not in target container. I do leave the source in TRTorch container - won't in Pytorch MR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we want this change in master until we retarget the mainline codebase to PyTorch 1.10 (post PyTorch's 1.10 release). @andi4191 this might be a good time to start working through the DLFW release workflow. We can retarget this branch to be merged into a
DLFW-21.10
branch (lets all agree on a convention for these branch names @borisfom @andi4191 @ptrblck) and then cherry-pick any changes we want in the mainline at a later date (Like the single unified Dockerfile I think will be super useful)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this critical? It just allows TRTorch from master branch to be used in 21.07, 21.08, 21.09 containers, and it breaks nothing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in the typical case (like not for the state of the repo as it is right now) we only guarantee compatibility with the latest released pytorch since we depend on internal APIs. There have been cases in the past where changes needed to support pytorch-next break support for the previous version. We arent checking against pytorch ToT and if some user has a newer build installed, it might not work even though setup.py says the dependency is compatible. I'd prefer if we only make changes specifically targeted for DLFW in their own branches so that we can keep the guarantees we give to users simple to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the concern is understood. However, the effect of user trying to install current ToT TrTorch in 21.07+ container would not be non-working TrTorch - it would be messing user's container (it will uninstall Pytorch 1.10a and will try to install 1.9). So I believe we'll get less users frustrated if we relax the requirement. Especially that I have checked both 21.07 and 21.08 and at least mostly it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@borisfom: I agree with @narendasan on this. We don't have control over the end-user use cases here. It is safe to proceed with what we can guarantee that works.
Also, can you re-target this PR to release/ngc/21.10 instead of DLFW-21.10 ?