Upgrade sklearn to 1.4.2 and numpy to 2.1.0 with modern Python packaging #255
- Upgrade scikit-learn from 1.2.1 to 1.4.2
- Upgrade numpy to 2.1.0
- Upgrade PyArrow from 14.0.1 to 17.0.0 with proper Arrow C++ integration
- Replace Miniconda with system Python 3.10 and uv package manager
- Add pyproject.toml for modern Python packaging standards
- Update MLIO build to support Arrow 17.0.0 compatibility
- Modernize dependency management and remove Node.js dependencies
- Update test fixtures and version checks across all components
- Add cleanup fixture for multi-model endpoint tests
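For anyone validating the image locally, the version checks mentioned above could look roughly like the sketch below. This is not the code from this PR: the helper name `find_version_mismatches` and the `EXPECTED` table are hypothetical, and the pins are simply copied from the description. It only relies on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Pins copied from the PR description; this table is illustrative,
# not a file that exists in the repository.
EXPECTED = {
    "scikit-learn": "1.4.2",
    "numpy": "2.1.0",
    "pyarrow": "17.0.0",
}


def find_version_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `lookup` maps a distribution name to its installed version string and
    defaults to importlib.metadata.version. A missing package is reported
    with found=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per unmet pin; prints nothing when everything matches.
    for name, (wanted, found) in find_version_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

Running the script inside the built container would list any package whose installed version differs from the pins; the `lookup` parameter exists only so the check can be exercised without the packages installed.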
docker/1.4-2/base/Dockerfile.cpu
@@ -1 +1 @@
ARG UBUNTU_VERSION=20.04
I would say we should move to a newer Ubuntu version as part of this effort. Staying on this older release just leaves the image exposed to more security vulnerabilities.
Yeah, I was thinking about doing that, but wasn't sure whether a change that big would be fine to take on here. I'll give it a shot.
RUN cd /tmp && \
    git clone --branch ${MLIO_VERSION} https://github.com/awslabs/ml-io.git mlio

# Patch MLIO for Arrow 17.0.0
Can you remind me why we need to do this? It's been a while for me, but I know there were some vulnerability issues, and MLIO is not actively maintained by anyone right now.
I honestly have no idea, and this is the step that takes 20+ minutes during the CodeBuild run. The package is not actively maintained, and we can't even install prebuilt binaries; we have to build it from source. I tried to replace whatever this package does with a modern library, but it was too much and I don't understand much of it.
# Install remaining packages via pip
COPY requirements.txt /requirements.txt
- RUN python -m pip install -r /requirements.txt && \
+ RUN uv pip install --system -r /requirements.txt && \
I've heard a lot about uv but have never used it. Glad to see it in here. Thanks for making these improvements.
Yeah, working with conda gave me migraines, so I just decided to go with modern package management.
Overall LGTM. I would suggest moving to a newer Ubuntu version as well, to make this image more robust against potential security vulnerabilities. It would also be important to set up a new CodeBuild buildspec file and test these changes through it before merging.
Ack. I already sent out a CFN buildspec update CR. I'll work on the Ubuntu upgrade and raise a new revision.