@Smankusors
Contributor

@Smankusors Smankusors commented Feb 27, 2022

Change Description

This PR optimizes the Dockerfiles to reduce the final image size. Most of the extra size comes from the en_core_web_lg package being cached in /root/.cache. Setting ENV PIP_NO_CACHE_DIR=1 stops pip from caching it, which reduces the image size.

I also merged several RUN commands into a single RUN, so that they produce one layer instead of many.
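
For illustration, the two changes described above might look roughly like the sketch below. This is not the actual Presidio Dockerfile: the base image, the working directory, and the requirements.txt file (assumed here to list spaCy) are hypothetical stand-ins, and only ENV PIP_NO_CACHE_DIR=1, the en_core_web_lg package, and the single merged RUN come from the description.

```dockerfile
# Hypothetical sketch; base image, paths and requirements.txt are assumptions.
FROM python:3.9-slim

# Ask pip not to keep downloaded packages in its cache (/root/.cache/pip),
# so large wheels such as en_core_web_lg are not stored a second time.
ENV PIP_NO_CACHE_DIR=1

WORKDIR /app
COPY requirements.txt .

# A single RUN instruction yields a single image layer. Chaining the steps
# with && also makes the build stop if any one of them fails.
RUN pip install --upgrade pip \
    && pip install -r requirements.txt \
    && python -m spacy download en_core_web_lg
```

Because the variable is set with ENV rather than passed as a flag, it should also apply to pip calls made indirectly, such as the one spaCy's download command runs under the hood.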

                         BEFORE   AFTER
presidio-analyzer        2.1GB    1.26GB
presidio-anonymizer      212MB    196MB
presidio-image-redactor  2.34GB   1.46GB

Checklist

  • I have reviewed the contribution guidelines
  • I have signed the CLA
  • My code includes unit tests
  • All unit tests and lint checks pass locally
  • My PR contains documentation updates / additions if required

  mostly to remove the pip cache from the build image
@ghost

ghost commented Feb 27, 2022

CLA assistant check
All CLA requirements met.

@omri374
Contributor

omri374 commented Feb 27, 2022

Cool thanks! We'll review it soon.

@omri374
Contributor

omri374 commented Feb 27, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@omri374 omri374 requested review from SharonHart, balteravishay and navalev and removed request for SharonHart, balteravishay and navalev February 27, 2022 16:04
Contributor

@SharonHart SharonHart left a comment

Looks great, thanks!

@omri374 omri374 merged commit d9d59e0 into microsoft:main Feb 27, 2022
4 participants