[ECR] [request]: Support regular expression matching for tags in lifecycle policies #1213

Open
ingshtrom opened this issue Jan 5, 2021 · 39 comments
Labels
ECR Amazon Elastic Container Registry Proposed Community submitted issue Work in Progress

Comments

@ingshtrom

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
What do you want us to build?
I want to use regular expressions to match on tags in my lifecycle policies. For example, we have an image tagged <account_id>.dkr.ecr.us-east-1.amazonaws.com/<image_name>:v1.1.0_test-4db22261f6a2ca5de2cb7eae3382dba32b3676da.

Right now, we need to tag the image as <account_id>.dkr.ecr.us-east-1.amazonaws.com/<image_name>:test-4db22261f6a2ca5de2cb7eae3382dba32b3676da_v1.1.0 so that we can use prefix matching in the lifecycle policy.
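
A minimal Python sketch of the gap between prefix matching and the requested regex matching. This is an illustration, not ECR behavior: `TEST_TAG_RE`, `matches_prefix`, and `matches_regex` are hypothetical names, and the pattern is only one assumed reading of the tag format above.

```python
import re

# Hypothetical pattern: a SemVer-like version, "_test-", then a 40-character Git SHA.
TEST_TAG_RE = re.compile(r"v\d+\.\d+\.\d+_test-[0-9a-f]{40}")


def matches_prefix(tag: str, prefix: str) -> bool:
    """Prefix matching, which is all a tagPrefixList entry can express."""
    return tag.startswith(prefix)


def matches_regex(tag: str) -> bool:
    """Whole-tag regex matching, the behavior this request asks for."""
    return TEST_TAG_RE.fullmatch(tag) is not None


desired = "v1.1.0_test-4db22261f6a2ca5de2cb7eae3382dba32b3676da"
workaround = "test-4db22261f6a2ca5de2cb7eae3382dba32b3676da_v1.1.0"

print(matches_prefix(desired, "test-"))     # False: the version comes first
print(matches_prefix(workaround, "test-"))  # True: only the reordered tag matches
print(matches_regex(desired))               # True: a regex handles the desired form
```

With prefix matching alone, the only stable prefix in the desired tag is the version itself, which changes with every release.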

Which service(s) is this request for?
ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
We want lifecycle policies that are more dynamic. Parts of our image tags, such as Git SHAs and SemVer versions, change with every build, and lifecycle policies cannot match on dynamic strings.

Are you currently working around this issue?
We cannot change the tags on existing images. For new images, we can start tagging with a prefix that we can filter on in our lifecycle policies.

Additional context
Anything else we should know?

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@ingshtrom ingshtrom added the Proposed Community submitted issue label Jan 5, 2021
@rpnguyen rpnguyen added the ECR Amazon Elastic Container Registry label Jan 6, 2021
@Malokingi

This would be nice. The ability to use regular expressions instead of being restricted to just a "prefix" string for the tag filter on lifecycle policies, that is. I'm currently trying to maintain thousands of images and the way they're currently named, I need to filter by a prefix and a suffix, and a regex option would solve that problem and, probably, countless others.

In the meantime, I guess I'll just have to use lots and lots of ... uhhh, other commands. I'll figure something else out, I'm sure.

@davidlukac

This would be great. We're using Java-Spring-style versioning of releases (1.0.0.RELEASE) and are currently migrating our Docker registry to ECR, which at the moment forces me to change the versioning to RELEASE-1.0.0 or something like that, so that we can create lifecycle policies for how many versions of a certain release type we want to keep in the registry. Such a version is not SemVer compatible (and looks terrible), which breaks other tooling in our pipeline.

@caladev

caladev commented Aug 29, 2021

I wish I could give this 10 thumbs up. When using AWS SAM with the image type for lambda functions, by default it creates an image tag with a combination of the name of the lambda + the dockertag metadata of the image. If your lambda is something like myfunction-5c9aba82d6c9-mydockertag then there's no way to have a lifecycle policy based on the mydockertag at the end unless we had something like regex to do that.

@morepe

morepe commented Mar 14, 2022

Another thumbs up from me. We have an on-commit deployment with Argo CD. Each commit triggers a build that produces a new image tagged with the commit hash. These images are used only briefly, while the tests run against them, but there are now tons of images in ECR.

@artem-kosenko

It would be great to have "not equal" and/or "does not contain" patterns that are evaluated against all existing tags on the image. I'm OK with adding an extra tag to the images I need and deleting all others that lack this extra tag.

image [1.0.0-dev.123, 1.0.0-dev, dev] <-- keep images with "dev" tag
image [1.0.0-dev.122] <-- delete all images without "dev" tag
etc.
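
A rough Python sketch of that "keep only images carrying a marker tag" idea. It assumes each image is described by a digest and the full list of its tags; the digests and the `images_to_delete` helper are made up for illustration and are not part of any ECR API.

```python
# Each image is represented by a (hypothetical) digest and all of its tags.
images = [
    {"digest": "sha256:aaa", "tags": ["1.0.0-dev.123", "1.0.0-dev", "dev"]},
    {"digest": "sha256:bbb", "tags": ["1.0.0-dev.122"]},
]


def images_to_delete(images, keep_tag="dev"):
    """Return images that do NOT carry the exact marker tag among their tags."""
    return [img for img in images if keep_tag not in img["tags"]]


for img in images_to_delete(images):
    print(img["digest"], img["tags"])  # sha256:bbb ['1.0.0-dev.122']
```

The membership test is an exact match on the marker tag, so `1.0.0-dev.122` alone does not count as carrying `dev`.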

@HenryYanTR

Thumbs up. Does anyone actually put "dev" or "prod" at the front of the tag? The matching should support SemVer-style tags.

@MrMarkW

MrMarkW commented Oct 6, 2022

In order to reduce image bloat, we had to prefix our tags with an environment name like dev, stg, or prd. It is not ideal at all. ECR also doesn't support deleting by age or by last pull time.

@maherrj

maherrj commented Nov 17, 2022

Big thumbs up from me. We have tens of thousands of images across hundreds of repositories, and not having this required us to develop a custom solution. It's been painful, and it still is.

@maerzhase

maerzhase commented Nov 29, 2022

Word! Being able to use only a prefix is pretty limiting, and offering regex seems reasonable. 🐰

@jlbutler

Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.

Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

@vascoalramos

For our use case, wildcards are not enough. We need full regular expressions to distinguish dev and prod tags, since we only want to delete older dev tags and keep all prod tags. So we would need regular expressions like the following:
prod tags like ^\d+.\d+.\d+ and dev tags like dev_\d+_^\d+.\d+.\d+ .
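
A small Python sketch of how such a pair of patterns could separate the two groups. The patterns here are an assumed reading of the ones above (dots escaped, whole-tag anchoring, and the `^` in the middle of the dev pattern dropped), and the example tags and the `classify` and `expirable` helpers are hypothetical.

```python
import re

# Assumed readings of the two patterns; not a confirmed ECR syntax.
PROD_RE = re.compile(r"\d+\.\d+\.\d+")
DEV_RE = re.compile(r"dev_\d+_\d+\.\d+\.\d+")


def classify(tag: str) -> str:
    """Label a tag as 'prod', 'dev', or 'other' using whole-tag matches."""
    if PROD_RE.fullmatch(tag):
        return "prod"
    if DEV_RE.fullmatch(tag):
        return "dev"
    return "other"


def expirable(tags):
    """Only dev tags are candidates for expiry; prod tags are always kept."""
    return [t for t in tags if classify(t) == "dev"]


tags = ["1.4.2", "dev_17_1.4.2", "dev_18_1.4.3", "2.0.0", "latest"]
print(expirable(tags))  # ['dev_17_1.4.2', 'dev_18_1.4.3']
```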

@maherrj

maherrj commented Dec 14, 2022

Wildcards will work for some of our tag formats e.g. we have inflight release candidates which have SNAPSHOT in the name, so *SNAPSHOT* would work.

However, images with semantic version tags are considered released images and therefore fall under separate lifecycle rules.

Elsewhere, we have tooling that pushes images, and we have been able to change it to use prefixes.

So a mixed bag. Bottom line: wildcards won't work for us entirely, and regex would be a better fit.
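
A short Python sketch of why wildcards cover the SNAPSHOT case but not the released-version case. It uses the standard library's `fnmatch` as a stand-in for wildcard matching, which is an assumption: the patterns below use only `*`, and `fnmatch` gives `?` and `[...]` extra meaning that a registry's wildcard syntax may not share. The example tags are made up.

```python
import re
from fnmatch import fnmatchcase

tags = ["1.2.0", "1.2.0-SNAPSHOT", "1.3.0-SNAPSHOT-7", "build-1.3.0"]

# A wildcard is enough to select in-flight release candidates.
snapshots = [t for t in tags if fnmatchcase(t, "*SNAPSHOT*")]
print(snapshots)  # ['1.2.0-SNAPSHOT', '1.3.0-SNAPSHOT-7']

# A wildcard cannot isolate plain semantic versions: "*" also matches
# prefixes and suffixes, so every tag containing two dots is selected.
semver_like = [t for t in tags if fnmatchcase(t, "*.*.*")]
print(semver_like)  # all four tags

# A whole-tag regex can express "exactly MAJOR.MINOR.PATCH".
released = [t for t in tags if re.fullmatch(r"\d+\.\d+\.\d+", t)]
print(released)  # ['1.2.0']
```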

@elatt

elatt commented Dec 14, 2022

I'd prefer a regex if possible. Currently we have to force a prefix into our tags to denote dev images but ideally we just use the git versioning scheme. So a prod version is just <major>.<minor>.<patch> and all dev releases (those that occur between our git tags) end up with a name like <major>.<minor>.<patch>-<number>-g<sha>.

@MrMarkW

MrMarkW commented Dec 16, 2022

Please also support expiry by age and by time since last pull.

@David3Ar

David3Ar commented Dec 20, 2022

I think it is very important to change the way lifecycle policy rules work. They don't offer the same quality as comparable features in other AWS services.

  • With regex, every tagging strategy would be compatible. Even though regex is well known, it still carries a greater risk of accidental data loss. Some advanced deletion protection mechanism like "WILL_EXPIRE (in x days)" should maybe be considered.
  • Simply allowing wildcards would be more intuitive and, in most cases, already very helpful (e.g. when using SemVer).

In general, ECR lifecycle policies should be able to handle complex tagging strategies while staying understandable and easy to set up. It seems the whole functionality could use a generous rework with some good features such as sinceLastPulled and keep image.

By the way, the ability to test rules before applying them is a very good feature that also prevents accidental data loss. This is something I especially want to praise!

@maherrj

maherrj commented Dec 22, 2022

I also second the comment above around pruning based on last pull date.

Although, we have seen some bizarre behaviour of last pull date. Will reach out separately on this issue.

Cheers,
Rich

@maistrotoad

I don't think this has been put forward, but for me a useful regex scheme would have each match be treated individually.

My use case is to keep one tag per pull request for an image. E.g. a tag would have a regex prefix like pr-[0-9]{3,4}
so if I have these 4 images

pr-001-somehash1
pr-001-somehash2
pr-002-otherhash1
pr-002-otherhash2

I then want to keep

pr-001-somehash2
pr-002-otherhash2

That is, keep the latest tag per regex match of the prefix, rather than ending up with only pr-002-otherhash2.
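
A minimal Python sketch of that per-match grouping. It assumes the tags arrive ordered from oldest to newest push (a real implementation would sort by push timestamp), and `latest_per_group` is a hypothetical helper, not an existing ECR feature.

```python
import re

# The regex prefix from the example above.
PR_PREFIX_RE = re.compile(r"pr-[0-9]{3,4}")


def latest_per_group(tags_oldest_first):
    """Keep the newest tag for each distinct regex match of the prefix."""
    latest = {}
    for tag in tags_oldest_first:
        match = PR_PREFIX_RE.match(tag)
        if match:
            # A later push for the same group overwrites the earlier one.
            latest[match.group(0)] = tag
    return list(latest.values())


tags = [
    "pr-001-somehash1",
    "pr-001-somehash2",
    "pr-002-otherhash1",
    "pr-002-otherhash2",
]
print(latest_per_group(tags))  # ['pr-001-somehash2', 'pr-002-otherhash2']
```

Treating the matched text as a grouping key is what separates this from a single "keep the newest 1 matching image" rule, which would retain only `pr-002-otherhash2`.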

@joaocfernandes

> Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.
>
> Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

Hi 👋🏼

Wildcards would be a huge win compared to prefixes. It would give me some additional flexibility.

Putting it in perspective: Assuming that last pulled and image age are supported.
I would prefer to have a wildcard matching in 2 months than a regex matching in more than 1 year.

@bwmills

bwmills commented Feb 2, 2023

This would definitely be great.

Our use case is for suffixes on tagged images. We often use SEMVER_env for applications that require an environment designator. Also some use cases for prefix and suffix logic on a per-image basis for tag matching.

Agree with @joaocfernandes

> Wildcards would be a huge win comparing to prefixes [only] ...

Fwiw, one specific example is a busy frontend ECR repo where images need to be tagged with SEMVER_env. We already have a [cost-driven] single rule to control the max number of images kept over time. It would be great to add a second rule that gets applied first, where we match _prod images to ensure that the three latest production images are always kept, even with the max number of images rule getting applied.

@grbljplat

Hi, the ability to enforce a tag naming convention (vN.N.N-env or whatever) on image upload is likely a fundamental requirement for most CI/CD pipeline-based build systems. The current lack of this feature in AWS ECR is a key differentiator ... please implement this!

@PhoenixRe32

PhoenixRe32 commented Feb 8, 2023

A lot of nice ideas have been mentioned here, but personally I would say that if one were to go for an easy win with minimal effort, regular expressions would be the one.

In my use case suffixes would suffice (I have never come across prefix versioning personally, so this drove me crazy :-) ), but I feel that wildcard matching and pattern matching are not too different effort-wise (this is an assumption on my part), and the second is the more complete solution.

So pattern matching for the win.

@hsejour hsejour self-assigned this Apr 21, 2023
@barryib

barryib commented May 4, 2023

Hello, in our case we use semantic versioning to tag our images. Since this is not yet supported by LCP, we now have a lot of old images to clean up. We'll probably build a Lambda to do that on a regular basis. This is not ideal at all, since it needs extra work and compute, and it forces us to manage image lifecycles in different tools (in ECR and in a custom Lambda).

@hsejour do you know if this issue is planned? If yes, is there any ETA (a quarter or semester timeframe is enough)?

@HaroonSaid

We would love it if AWS could solve the problem for all customers. We have repositories replicated across a lot of AWS regions.

@hobti01

hobti01 commented May 24, 2023

> Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.
>
> Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

The typical tagging schema is semantic versioning. This means that prefix matching is simply inadequate since the distinction between release and build is in the suffix (or specifically the lack of a suffix).
Based on our experience with the Harbor registry, where wildcards are available but regex is not - there are real-world use cases where wildcards are simply not adequate.

Even if use cases could be met with multiple wildcard rules, only 50 rules are allowed in each policy. Using regex would cover more use cases per rule and would help users stay within the rule quota.

@carlosjgp

Please prioritise this ticket. The ECR policies are almost useless for us as they are now.

We have this awful Terraform code to build them, and even with it, when a project goes over v9 its images are deleted (or the v1 images are, if we adjust the loop).

locals {
  semver_lifecycle_policy = {
    # X.Y.Z or vX.Y.Z... since AWS does not support regex here the rule is a good enough approach '1\.(.*)'...'999\.(.*)' and same for 'v1\.(.*)'...'v999\.(.*)'"
    # One lifecycle rule per major version because an image tag needs to match all prefixes in the list to be removed.
    for major in range(0, 10) :
    50 + major => {
      description = "Keep as many images tag with semver as possible",
      selection = {
        tagStatus = "tagged",
        # v.X.Y.Z or X.Y.Z semver
        tagPrefixList = [
          "${var.semver_prefix}${major}."
        ]
        countType   = "imageCountMoreThan",
        countNumber = var.max_image_count # There is a hard limit of 10000 images per repository
      },
      action = {
        type = "expire"
      }
    }
  }

  default_lifecycle_policy = merge(
    local.semver_lifecycle_policy,
    {
      # images tagged with prefix `sha-` will be considered testing images and will disappear after 1 day.
      60 = {
        description = "Keep not released images for 1 day",
        selection = {
          tagStatus     = "tagged",
          tagPrefixList = ["sha-"],
          countType     = "sinceImagePushed",
          countUnit     = "days",
          countNumber   = 1
        },
        action = {
          type = "expire"
        }
      },
      # Any images that have not been marked by higher priority rules will be expired.
      # This includes untagged images, images that have been tagged with a format not expected by any defined lifecycle rules.
      # See https://docs.aws.amazon.com/AmazonECR/latest/userguide/LifecyclePolicies.html#lifecycle-policy-howitworks
      70 = {
        description = "Images without expected tagging will be deleted",
        selection = {
          tagStatus   = "any",
          countType   = "imageCountMoreThan",
          countNumber = 1
        },
        action = {
          type = "expire"
        }
      }
  })
}

At the moment this is a pain, and we will soon be building our own script to manage the image lifecycle.

@bwmills

bwmills commented Jun 16, 2023

@ingshtrom Hi Alex, any updates on this?

Our need for this continues to grow - it would be incredibly helpful in managing a fairly large number of ECR repos in AWS.

Ah excuse me, I see the label changed to in progress a few days ago - great to see and thank you

@OJOMB

OJOMB commented Jun 23, 2023

also need this

@jufemaiz

👀

@arareko

arareko commented Aug 4, 2023

@hsejour Can you provide a status/ETA on this? Thanks!

@hsejour hsejour removed their assignment Sep 20, 2023
@bwmills

bwmills commented Oct 18, 2023

No one is assigned?

Does anyone know the status?

@rafavallina

Hi everyone. Just a quick notice from the ECR team as I notice that we went silent on this issue for quite a bit.

We continue to work on support for wildcards in lifecycle policies, and we plan to release it before the end of the year! As you all are probably used to, I'm not committing to this, but I'm quite confident that it will be out there soon.

I have definitely heard that wildcards are not enough for everyone, and we want to continue working on improving LCPs, including RegEx support, SemVer support, and last date pulled. However, I do not have more to share about what will be done or when. Just note that feedback is being heard and we are using it to adjust our roadmap!

@HaroonSaid

We were expecting a big announcement at re:Invent.
Are there any more updates on feature development?
We're looking for approximate timelines to determine whether it's 2024 or beyond.

@rafavallina

rafavallina commented Dec 19, 2023

Hi all - wildcards for LCP are live. A "just-after-reinvent" launch!

https://aws.amazon.com/about-aws/whats-new/2023/12/amazon-elastic-container-registry-wildcards-lifecycle-policies/

I'm keeping this item open since the original ask is for regular expressions, which is more than wildcards. But I hope this will help some of you make progress! Thanks for your patience

@vchirikov

@rafavallina / @HaroonSaid
It doesn't work. I took the policy JSON from the AWS console, tried to apply it with Terraform provider version 5.33.0, and got: creating ECR Lifecycle Policy (***): InvalidParameterException: Invalid parameter at 'LifecyclePolicyText' failed to satisfy constraint: 'Lifecycle policy validation failure: instance value ("tagged-wildcard") not found in enum (possible values: ["tagged","untagged","any"])

When I tried to import the policy through the AWS console, I got this:
image

So wildcards don't work well (they only work if you add rules one by one, not via JSON import).

@rafavallina

@vchirikov Can you try to use 'tagged' in the 'tagStatus' field? That field only specifies whether the images must be tagged. You use the 'tagPatternList' field to indicate wildcard matching.
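
A small Python sketch of a policy document shaped that way, with `tagStatus` set to `tagged` and the wildcard carried in `tagPatternList`. The pattern, description, and count are placeholder values chosen for illustration, so treat this as a sketch rather than a recommended policy.

```python
import json

# Placeholder rule: the pattern and the count of 10 are illustrative only.
policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Expire all but the 10 newest snapshot images",
            "selection": {
                "tagStatus": "tagged",             # not "tagged-wildcard"
                "tagPatternList": ["*SNAPSHOT*"],  # wildcard goes here
                "countType": "imageCountMoreThan",
                "countNumber": 10,
            },
            "action": {"type": "expire"},
        }
    ]
}

# Serialize to the JSON text that is submitted as the lifecycle policy.
policy_text = json.dumps(policy, indent=2)
print(policy_text)
```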

@vchirikov

I checked the network requests from the UI, and it sends tagged as the tagStatus, but when I view the policy as JSON it shows tagged-wildcard.
Proof:
image

I'll try to use tagged tomorrow, thanks.

@celorodovalho

Please, prioritize this issue.

@typeBlkCofe

Please prioritize this issue. This should be like a few lines of code to enable regex filtering, and this has been open since 2021!

@deanb-everc

please prioritize this issue, it's a real pain for us
