Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't download a module on a registry for a provider with a hyphen - in the name #29532

Closed
jeremmfr opened this issue Sep 7, 2021 · 14 comments · Fixed by #30053
Closed

Can't download a module on a registry for a provider with a hyphen - in the name #29532

jeremmfr opened this issue Sep 7, 2021 · 14 comments · Fixed by #30053
Labels
bug explained a Terraform Core team member has described the root cause of this issue in code new new issue not yet triaged v1.0 Issues (primarily bugs) reported against v1.0 releases

Comments

@jeremmfr
Copy link

jeremmfr commented Sep 7, 2021

Terraform Version

Terraform v1.0.6
on darwin_amd64

Terraform Configuration Files

module "test" {
	source = "terraform-google-modules/gcloud/gcp-beta"
	version = "0.1.0"
}

Debug Output

https://gist.github.com/jeremmfr/2dfa73953adb30692106ed0b976549fe

Expected Behavior

Terraform try download the module on the registry.

Actual Behavior

Terraform stop with an error indicating that the module is not a registry URL.

╷
│ Error: Invalid version constraint
│
│ Cannot apply a version constraint to module "test" (at main.tf:1) because it has a non Registry URL.
╵

This only appears when the provider has a hyphen - in the name.

Steps to Reproduce

  1. terraform init

Additional Context

The first problem encountered was on a private registry

module "private" {
  source = "app.terraform.io/example-corp/my-module/gcp-beta"
  version = "1.1.0"
}

but it is also present with public registry

@jeremmfr jeremmfr added bug new new issue not yet triaged labels Sep 7, 2021
@alisdair alisdair added confirmed a Terraform Core team member has reproduced this issue v1.0 Issues (primarily bugs) reported against v1.0 releases new new issue not yet triaged waiting-response An issue/pull request is waiting for a response from the community and removed new new issue not yet triaged confirmed a Terraform Core team member has reproduced this issue labels Sep 7, 2021
@alisdair
Copy link
Member

alisdair commented Sep 7, 2021

Thanks for the report. I'm able to reproduce this, but I'm not sure it's a valid bug.

What is happening here is Terraform is failing to parse the ambiguous source address as a Registry module, because the provider component is invalid. Because it's not a Registry source, we don't support a version constraint.

The provider component of a module source address must consist only of alphanumeric characters—hyphens are not supported. I believe this constraint should also be enforced at the Terraform Registry side (including the Terraform Cloud private registry). The public registry has no such module available.

It's unfortunate that the error message is obscure here, but that's a side effect of trying to parse ambiguous source addresses with best effort.

What led you to try installing this module with provider gcp-beta?

jeremmfr added a commit to jeremmfr/terraform that referenced this issue Sep 7, 2021
Fixes hashicorp#29532
a hyphen can be present in provider name (except first and last character) in module URL to download from registry
@jeremmfr
Copy link
Author

jeremmfr commented Sep 7, 2021

gcp-beta is only for example.

In reality, I tried download a module on a private registry (with a Gitlab instance), with a custom provider which have a hyphen in his name.
The name of this provider is in format <brand>-<type>.

This provider will be made public shortly, but there are also other providers (no beta) which are already present with a hyphen.

The module registry protocol allows to have a hyphen in the name of the provider but they do not exist on the Terraform registry because the procedure to publish a module on this Terraform registry requires to have a git repository in the format terraform-<PROVIDER>-<NAME>. With this format, there is no way to publish module with a provider which contains hyphen.

But with a third-party module registry implementation, a module with hyphen in provider's name is possible.

In addition, it might be good to add a second method to add modules to the Terraform registry, for example with the format terraform_<PROVIDER>_<NAME>.

@apparentlymart
Copy link
Member

apparentlymart commented Sep 8, 2021

I happened to do some refactoring of the address-parsing codepaths recently and so I ended up reading back through the history of registry module address parsing and adding some notes in the code based on what I learned, which I'll summarize here in the hope it helps with thinking about what we might do in response to this issue.

The most important quirk here is that Terraform v0.10 originally introduced the module registry syntax in a way that was ambiguous with previously-supported local file path syntax, and so it employed a model of first attempting to parse as a registry address and then falling back to a local file path if it encountered a syntax error. Terraform v0.12 partially resolved that by refining the rules for local paths to say that they must always start with either ../ or ./ to disambiguate, but we ended up having to still preserve an unfortunate behavior of underlying go-getter where it essentially treats anything it can't otherwise parse as an absolute file path (which Terraform then treats as if it were an external package, to ensure that the source code ends up at a system-agnostic location).

// For historical reasons, whether an address is a registry
// address is defined only by whether it can be successfully
// parsed as one, and anything else must fall through to be
// parsed as a direct remote source, where go-getter might
// then recognize it as a filesystem path. This is odd
// but matches behavior we've had since Terraform v0.10 which
// existing modules may be relying on.
// (Notice that this means that there's never any path where
// the registry source parse error gets returned to the caller,
// which is annoying but has been true for many releases
// without it posing a serious problem in practice.)
if ret, err := parseModuleSourceRegistry(raw); err == nil {
return ret, nil
}
// If we get down here then we treat everything else as a
// remote address. In practice there's very little that
// go-getter doesn't consider invalid input, so even invalid
// nonsense will probably interpreted as _something_ here
// and then fail during installation instead. We can't
// really improve this situation for historical reasons.
remoteAddr, err := parseModuleSourceRemote(raw)
if err != nil {
// This is to make sure we really return a nil ModuleSource in
// this case, rather than an interface containing the zero
// value of ModuleSourceRemote.
return nil, err
}

As I noted in the comments here, this means that there's no path whereby a syntax error in registry address parsing can actually surface to the UI. Instead, it falls through to the external package address parser. If that parser fails then unfortunately we end up reporting its error, which is typically of much lower quality because go-getter was designed under the principle of guessing what the input probably meant, rather than of giving good feedback about ambiguous input.


The above explains why Terraform returns an unhelpful error in this situation, but it doesn't explain why the "remote system" portion of the address (what we previously called "provider", but never actually enforced it being a provider name in practice) has a more restrictive validation rule than the other two parts.

In my earlier research I didn't actually arrive at a clear explanation for that, but I did still replicate the previous rules because my goal was to refactor the existing implementation rather than to change its behavior:

// parseModuleRegistryTargetSystem validates and normalizes a string in the
// "target system" position of a module registry source address. This is
// what we historically called "provider" but never actually enforced as
// being a provider address, and now _cannot_ be a provider address because
// provider addresses have three slash-separated components of their own.
func parseModuleRegistryTargetSystem(given string) (string, error) {
// Similar to the names in provider source addresses, we defined these
// to be compatible with what filesystems and typical remote systems
// like GitHub allow in names. Unfortunately we didn't end up defining
// these exactly equivalently: provider names can only use dashes as
// punctuation, whereas module names can use underscores. So here we're
// using some regular expressions from the original module source
// implementation, rather than using the IDNA rules as we do in
// ParseProviderPart.
if !moduleRegistryTargetSystemPattern.MatchString(given) {
return "", fmt.Errorf("must be between one and 64 ASCII letters or digits")
}
// We also skip normalizing the name to lowercase, because we historically
// didn't do that and so existing module registries might be doing
// case-sensitive matching.
return given, nil
}

Re-reading this I think I actually made an error in the commentary I left here. Initially I had incorrectly implemented this to allow underscores, but tightened the regular expression after I saw it fail some existing unit tests. I guess I forgot to remove the statement about underscores from the comment at the time.

If we were to relax the rules then this function would be the place to do it. I hope that (aside from the error I described above) the commentary here is useful in thinking through some of the implications of doing so. One particular concern is that it might cause existing modules containing legacy-style filesystem paths (which go-getter is currently handling) to be understood instead as containing module registry paths, which would be a breaking change not permitted under the Terraform 1.0 Compatibility Promises.


The final bit of context I want to capture here is that elsewhere in the Terraform language we typically assume providers will have local names consisting entirely of letters and digits (but don't require it, IIRC).

For example, our behavior of automatically using the prefix of a resource type name to select its provider works by taking the position of the first underscore as the separator marking the end of the provider name, and so aws_instance automatically attaches to a provider with the local name aws unless the module author explicitly overrides it by writing a provider argument inside the resource block.

I would speculate (but have no evidence) that the current constraint on "remote system" names in module addresses was originally following the precedent that we required provider names to consist only of letters and digits, even though we've relaxed that into only a default assumption rather than a requirement in subsequent versions of the language.

This of course says nothing specific about dashes, but I think that's just because idiomatic names within Terraform itself always use underscores to separate words, and the Terraform language permits dashes in identifiers only to make it easier to write configuration settings for remote systems that have different naming conventions.

The hashicorp/google-beta provider has emerged as a rather interesting exception to this rule. It's designed to behave as a drop-in replacement for the hashicorp/google provider, and so the way it uses a dash in its source address ends up skirting around all of the assumptions I've mentioned because provider source addresses appear only in the source argument inside provider_requirements, whereas the rest of the module uses a local name chosen entirely by the module author.

In real-world modules I've seen a mix of two seemingly-equally-common patterns:

  • Set the hashicorp/google-beta provider to have a local name of just google, which then makes Terraform automatically associate resource "google_anything" "..." with that provider without explicitly stating it.
  • Set the hashicorp/google-beta provider to have the local name google-beta and then write provider = google-beta in each provider block to explicitly associate it. The use of a dash here would typically be considered unidiomatic per our usual identifier rules, but it's emerged as a pattern in this case I think largely because we failed to give any specific guidance on how to handle this one-off situation. Therefore it would be dishonest to call this particular dash usage "unidiomatic"; it's in common use and has consequently been used prominently in official provider documentation.

Of course, as I noted above we no longer consider the final portion of the registry module address format as directly a provider name, and instead leave it up to module developers to choose a suitable short descriptor for whatever remote system the module primarily works with. "Google beta" is the name of a provider rather than of a remote system, so it creates a rather awkward edge-case to the naming scheme: should a module using that provider just be classified as targeting "google", or is the fact that it's using the beta provider significant enough that it ought to be codified in the source address of the module? I don't have a strong opinion on the answer to this question right now; I think it'd be worth doing some research into how existing modules are being classified and what motivated those classifications.


With all of this said, it does seem like our compatibility promises may be a practical blocker for any change to the registry module address syntax, and so we might find that the best we can do here is just document that requirement more explicitly in all of the relevant places, including in the module registry protocol documentation.

Terraform is working as intended here, even though some of the rationale for those intentions is lost to history, and so I'm torn about whether to classify this as a documentation bug, which we would close by updating the documentation, or as an enhancement request which would represent us trying to find a backward-compatible way to permit more flexibility in "remote system" names. I do lean slightly towards the documentation bug angle here, because I don't really see a clear path to a backward-compatible language change, but I'm curious to hear if others have ideas about how we could approach this as an enhancement.

@apparentlymart
Copy link
Member

apparentlymart commented Sep 8, 2021

A further thought that came to me after some time away for lunch: while we probably can't change the expected syntax of a registry module source address due to compatibility constraints, it's never been valid to use version for any source type other than a registry source address and so we could potentially use the presence of version as a heuristic to detect the intent for this to be a valid source address, and thus prefer to return the error message from the registry address parser in that case, instead of falling back to the go-getter parser.

That would mean that the error message here would've been something like:

invalid target system "gcp-beta": must be between one and 64 ASCII letters or digits

Since I knew that the current implementation would never actually show the error messages from the registry address parser, I didn't spend a lot of time polishing them. If we were going to do something like I've described here then it'd be good to also review the different error return paths to see if they're each generating a true and understandable error message.

@apparentlymart apparentlymart added explained a Terraform Core team member has described the root cause of this issue in code and removed waiting-response An issue/pull request is waiting for a response from the community labels Nov 16, 2021
@BzSpi
Copy link

BzSpi commented Nov 23, 2021

IMHO there's a bug here because the behavior is not the same if we're using the Terraform public registry or a private one. modules names with hypens (or underscores) are allowed on Terraform public registry but not on private ones.

That being said, I understand well that there's no easy way out about this.

@apparentlymart
Copy link
Member

Hi @BzSpi,

The specific constraint about the target system consisting only of letters and digits is inside Terraform CLI itself and not something that should be able to vary between registries. Can you say a little more about what you've experienced, ideally including some reproduction steps, so we can understand how what you saw relates to the problem we've been discussing in this issue?

Each implementation of the module registry protocol could potentially impose additional constraints on names beyond the ones Terraform CLI itself imposes, but if that is causing a problem then that would be something to fix in whatever registry is imposing the undesirable extra rules, rather than something to be fixed in Terraform CLI itself.

@BzSpi
Copy link

BzSpi commented Nov 30, 2021

Hi @apparentlymart

I've implemented a private registry and have the exact same issue as mentioned by @jeremmfr .

Here's an example:

Terraform code

module "private_endpoints" {
  source  = "tf-registry-domain.azurefd.net/tf-namespace/azurerm/private-endpoint"
  version = "1.0.0"

  ....
}

Terraform init sequence (with TF_LOG=trace)

2021-11-30T09:08:13.630+0100 [DEBUG] Adding temp file log sink: /tmp/terraform-log950579303
2021-11-30T09:08:13.630+0100 [INFO]  Terraform version: 1.0.11
2021-11-30T09:08:13.630+0100 [INFO]  Go runtime version: go1.16.4
2021-11-30T09:08:13.630+0100 [INFO]  CLI args: []string{"/home/laurent/.terraform.d/versions/1.0/1.0.11/terraform", "init"}
2021-11-30T09:08:13.630+0100 [TRACE] Stdout is a terminal of width 203
2021-11-30T09:08:13.630+0100 [TRACE] Stderr is a terminal of width 203
2021-11-30T09:08:13.630+0100 [TRACE] Stdin is a terminal
2021-11-30T09:08:13.630+0100 [DEBUG] Attempting to open CLI config file: /home/laurent/.terraformrc
2021-11-30T09:08:13.630+0100 [INFO]  Loading CLI configuration from /home/laurent/.terraformrc
2021-11-30T09:08:13.630+0100 [DEBUG] checking for credentials in "/home/laurent/.terraform.d/plugins"
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory terraform.d/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] will search for provider plugins in /home/laurent/.terraform.d/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory /home/laurent/.local/share/terraform/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory /usr/share/ubuntu/terraform/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory /usr/local/share/terraform/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory /usr/share/terraform/plugins
2021-11-30T09:08:13.630+0100 [DEBUG] ignoring non-existing provider search directory /var/lib/snapd/desktop/terraform/plugins
2021-11-30T09:08:13.630+0100 [INFO]  CLI command args: []string{"init"}
Initializing modules...
2021-11-30T09:08:13.632+0100 [TRACE] ModuleInstaller: installing child modules for . into .terraform/modules
2021-11-30T09:08:13.633+0100 [DEBUG] Module installer: begin private_endpoints
2021-11-30T09:08:13.633+0100 [TRACE] ModuleInstaller: private_endpoints is not yet installed
2021-11-30T09:08:13.633+0100 [TRACE] ModuleInstaller: cleaning directory .terraform/modules/private_endpoints prior to install of private_endpoints
2021-11-30T09:08:13.633+0100 [TRACE] ModuleInstaller: private_endpoints address "tf-registry-domain.azurefd.net/tf-namespace/azurerm/private-endpoint" will be handled by go-getter
Downloading tf-registry-domain.azurefd.net/tf-namespace/azurerm/private-endpoint for private_endpoints...
2021-11-30T09:08:13.633+0100 [DEBUG] Module installer: begin region
2021-11-30T09:08:13.635+0100 [TRACE] ModuleInstaller: Module installer: region 4.2.0 already installed in .terraform/modules/region
2021-11-30T09:08:13.635+0100 [TRACE] modsdir: writing modules manifest to .terraform/modules/modules.json
╷
│ Error: Invalid version constraint
│ 
│ Cannot apply a version constraint to module "private_endpoints" (at r-private_endpoint.tf:23) because it has a non Registry URL.
╵

╷
│ Error: Invalid version constraint
│ 
│ Cannot apply a version constraint to module "private_endpoints" (at r-private_endpoint.tf:23) because it has a non Registry URL.
╵

╷
│ Error: Invalid version constraint
│ 
│ Cannot apply a version constraint to module "private_endpoints" (at r-private_endpoint.tf:23) because it has a non Registry URL.
╵

With this source code (after republishing the module without hyphen in the registry), everything works fine:

module "private_endpoints" {
  source  = "tf-registry-domain.azurefd.net/tf-namespace/azurerm/privateendpoint"
  version = "1.0.0"

  ....
}

I publish modules very often in the Terraform public registry and it' possible to have hyphen in the module names.
Examples: https://registry.terraform.io/modules/claranet/diagnostic-settings/azurerm/latest, https://registry.terraform.io/modules/claranet/app-gateway/azurerm/latest and so on ...

Regarding what you said, it might be due to the fact that private registry has a domain name in their source and so, the parsing is not done the same way as the public registry.

Hope this helps.

@BzSpi
Copy link

BzSpi commented Nov 30, 2021

My colleague @xp-1000 has kindly notified me that I was completely off topic here since my issue is about the name of the module and not the provider 😅

Sorry for the noise, I will look for a similar issue and update it or create a new one.

@jeremmfr
Copy link
Author

@BzSpi, for your case, the source address of your module needs to be tf-registry-domain.azurefd.net/tf-namespace/private-endpoint/azurerm

@BzSpi
Copy link

BzSpi commented Nov 30, 2021

Indeed ... Double fail for me.
Sorry again for the noise and thanks for your replies.

@apparentlymart
Copy link
Member

Over in #30053 I've implemented the partial solution I described above of using the presence of the version argument as a heuristic to show a registry-address-specific error message instead of treating an invalid registry address as if it were a direct remote module source. While I don't consider that an ideal solution, it feels to me like the best compromise to give better feedback here without risk of breaking the v1.0 Compatibility Promises.

Assuming that changeset passes review and we merge it, that'll close out this issue for now. If we later devise a more precise solution then we can discuss that in a different issue, but since there isn't a ready design for that right now I'd like to move forward with this compromise for the foreseeable future, as a "better than nothing" answer.

jeremmfr added a commit to jeremmfr/terraform-website that referenced this issue Dec 1, 2021
Following the discussion in hashicorp/terraform#29532, the documentation for module URLs in public or private registry is ambiguous because it's not the provider in URL but a "target system" and the restriction for characters is not the same as for the name of a provider
@jeremmfr
Copy link
Author

jeremmfr commented Dec 1, 2021

@apparentlymart I opened a PR hashicorp/terraform-website#1988 on terraform-website to clarify the "target system" in documentation. Can you review ?

@apparentlymart
Copy link
Member

Hi @jeremmfr! Thanks for working on that.

The Terraform Cloud documentation is outside of my typical scope, so I'll let the usual documentation authors be the ones to review that change, but for what it's worth it seems like a reasonable set of updates to me! Thanks again.

@github-actions
Copy link

github-actions bot commented Jan 1, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug explained a Terraform Core team member has described the root cause of this issue in code new new issue not yet triaged v1.0 Issues (primarily bugs) reported against v1.0 releases
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants