Provider always crash #88

Closed
solivansantana opened this issue Mar 18, 2024 · 20 comments

Labels: bug Something isn't working
@solivansantana

Hello,
Often when using the Cilium provider, an error appears stating that the plugin has crashed.
I've been using it since version 0.1.6 and the same behavior keeps showing up.
After updating to version 0.1.7 the problem went away temporarily, but about 3 days ago the same error occurred again. I then updated to the new release (0.1.8) and the error remains.
I would like to use the provider in an "online" way so that I don't always need to bundle the provider files with the project (offline provider).
Screenshot of the error: [image]
Thank you!

@littlejo
Owner

Hello,
Thank you for reporting this bug. Could you answer these questions:

  • Which version of Terraform do you use?
  • Could you share the code that triggers the bug? I would like to reproduce it.
  • When does this bug appear: during plan, apply, or init?

@solivansantana
Author

Hello,
Thanks for the quick response.
I was using Terraform 1.6.2, but I updated to 1.7.5 and the error persists.
The error usually appears during plan, preventing it from continuing.

provider.tf

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=3.0.0"
    }
    cilium = {
      source  = "littlejo/cilium"
      version = ">=0.1.8"
    }
    XXXXXXXXXXXXXXXX
  }
}

provider "azurerm" {
  features {}
}

provider "cilium" {
  config_path = "./kubeconfig"
}
```

cilium.tf (note the doubled backslashes: HCL needs `\\.` to pass a literal `\.` through to the Helm-style key parser)

```hcl
resource "cilium" "install" {
  depends_on = [XXXXX.XXXXXX]
  set = [
    "aksbyocni.enabled=true",
    "nodeinit.enabled=true",
    "azure.resourceGroup=XXXXX",
    "hubble.relay.enabled=true",
    "hubble.ui.enabled=true",
    "hubble.relay.nodeSelector.pool=xxx",
    "hubble.ui.nodeSelector.pool=xxx",
    "hubble.ui.service.type=LoadBalancer",
    "hubble.ui.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-internal-subnet=xxxxxxxx",
    "hubble.ui.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-ipv4=xx\\.xx\\.xx\\.xx"
  ]
  version = "1.15.1"
}
```

Thank you!

@littlejo
Owner

The buggy code is the GetCurrentRelease function, called during Terraform's read step. It checks for the Helm release of Cilium: if there is no Cilium release, it means someone uninstalled Cilium outside of Terraform (with the helm CLI or the cilium CLI, for example). This code reconciles the Terraform state with the real state.
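For readers following along, a minimal sketch of the pattern being described: a read-side lookup that treats a missing Helm release as "gone, remove from state" instead of crashing. The function shape and names below are assumptions for illustration, not the provider's actual code; only the Helm SDK calls (action.NewGet, driver.ErrReleaseNotFound) are real APIs.

```go
package provider

import (
	"errors"

	"helm.sh/helm/v3/pkg/action"
	"helm.sh/helm/v3/pkg/release"
	"helm.sh/helm/v3/pkg/storage/driver"
)

// getCurrentRelease is a hypothetical helper: it looks up the Cilium
// Helm release and reports "not found" instead of returning an error,
// so the caller can drop the resource from state when Cilium was
// uninstalled outside Terraform (e.g. with the helm or cilium CLI).
func getCurrentRelease(cfg *action.Configuration, name string) (*release.Release, bool, error) {
	rel, err := action.NewGet(cfg).Run(name)
	if errors.Is(err, driver.ErrReleaseNotFound) {
		return nil, false, nil // release gone: reconcile state, don't crash
	}
	if err != nil {
		return nil, false, err
	}
	return rel, true, nil
}
```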

@solivansantana
Author

Thanks for the feedback.
In my case Cilium was not removed. What happens is that I added some other configuration in another part of the Terraform code, and when it refreshes the Cilium state the error happens. Whenever I try to add or remove any resource, the state refresh runs and the error occurs. To be clear: Cilium itself is never changed. In this case, what can I do to work around or solve this problem?

@littlejo
Owner

@solivansantana Sorry, I'm at KubeCon this week and don't have time for this issue.

@littlejo
Owner

Another provider hits the same error: https://discuss.qovery.com/t/terraform-provider-cluster-creation-or-update-fails/1220

@littlejo
Owner

@solivansantana Could you retry with version 0.1.9? Thank you.

@solivansantana
Author

Thanks for the feedback.
I ran a new test with the released version 0.1.9 and no longer got any crash errors. What I'm seeing now is a connection error to the cluster. Has authentication via config_path been kept the same? I also tried the KUBE_CONFIG_PATH variable, passing the file manually, and the same error occurs.
Refreshing state: [image]
Auth error: [image]
My provider config remains unchanged (except the version):

provider.tf

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=3.0.0"
    }
    cilium = {
      source  = "littlejo/cilium"
      version = ">=0.1.9"
    }
    XXXXXXXXXXXXXXXX
  }
}

provider "azurerm" {
  features {}
}

provider "cilium" {
  config_path = "./kubeconfig"
}
```

@littlejo
Owner

> I ran a new test with the released version 0.1.9 and no longer got any crash errors.

Ok cool.

> Has authentication via config_path been kept the same? I also tried the KUBE_CONFIG_PATH variable, passing the file manually, and the same error occurs.

There are three ways to define the kubeconfig (see the sketch below):

  • nothing: by default the provider uses ~/.kube/config (like kubectl)
  • the KUBECONFIG environment variable (like kubectl)
  • the config_path option in the provider block, which overrides the KUBECONFIG environment variable if it is defined
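To make the precedence concrete, here is a minimal sketch of the three options. These blocks are illustrative assumptions assembled from this thread's configuration (the alias is only there so both variants can coexist in one file):

```hcl
# Option 1: no configuration; the provider reads ~/.kube/config,
# just like kubectl.
provider "cilium" {}

# Option 2: leave config_path unset and export the variable in the
# shell before running Terraform:
#   export KUBECONFIG=./kubeconfig

# Option 3: explicit config_path; this overrides the KUBECONFIG
# environment variable when both are set.
provider "cilium" {
  alias       = "explicit"
  config_path = "./kubeconfig"
}
```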

@solivansantana
Author

solivansantana commented Mar 25, 2024

Thanks for the feedback.
I export the kubeconfig to a local file and use that file to authenticate to the cluster. This is how I've been using it since version 0.1.6, and I haven't had any problems with it (only the plugin crashes).
Below is an excerpt of my code:

main.tf

```hcl
resource "local_file" "kubeconfig" {
  depends_on = [azurerm_kubernetes_cluster_node_pool.XXXXXXXXXX]
  filename   = "./kubeconfig"
  content    = data.azurerm_kubernetes_cluster.XXXXXXXXXX.kube_config_raw
}
```

provider.tf

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=3.0.0"
    }
    cilium = {
      source  = "littlejo/cilium"
      version = ">=0.1.9"
    }
    XXXXXXXXXXXXXXXX
  }
}

provider "azurerm" {
  features {}
}

provider "cilium" {
  config_path = "./kubeconfig"
}
```

Then in the provider I point config_path at the kubeconfig file. That's why I'm wondering whether anything changed, since I didn't change the authentication method.

@littlejo
Owner

> That's why I'm wondering whether anything changed, since I didn't change the authentication method.

I didn't change anything about the authentication method in the recent versions (0.1.x).

@littlejo
Owner

@solivansantana In which context do you use this provider? Dev? Production? And how did you find out about it?

@solivansantana
Author

@littlejo I'm trying to use it in a production environment. I discovered this provider while researching Terraform providers: https://registry.terraform.io/search/providers?q=cilium
Installing Cilium via the Helm provider has its problems, and I was unable to make it work through Helm, hence the use of your provider.

@littlejo
Owner

> Installing Cilium via the Helm provider has its problems, and I was unable to make it work through Helm, hence the use of your provider.

Ok cool. The cilium provider mainly uses the cilium-cli libraries: https://github.com/cilium/cilium-cli

@solivansantana
Author

@littlejo I ran a test creating a new AKS cluster to rule out an authentication problem, and hit the same crash with provider 0.1.9. Image below:
https://i.postimg.cc/qMfBGt1Z/NewAKS.png
The Terraform code is below:

provider.tf

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=3.0.0"
    }
    cilium = {
      source  = "littlejo/cilium"
      version = ">=0.1.9"
    }
  }
}

provider "azurerm" {
  features {}
}

provider "cilium" {
  config_path = "./kubeconfig"
}
```

cilium.tf

```hcl
resource "cilium" "install" {
  depends_on = [local_file.kubeconfig]
  set = [
    "aksbyocni.enabled=true",
    "nodeinit.enabled=true",
    "azure.resourceGroup=XXXXXXXXXXXX"
  ]
  version = "1.15.1"
}
```

@littlejo
Owner

@solivansantana Ok thank you. This PR: #94 should fix the problem. I'm releasing a new version today.

@littlejo
Owner

@solivansantana Could you retry with version 0.1.10? Thank you.

@solivansantana
Author

@littlejo thanks for the feedback.
I used version 0.1.10 in a new installation, but as shown below, the authentication error is still present, using the kubeconfig exported from the cluster and referenced through config_path.
https://i.postimg.cc/bN3VQvfS/cilium-error.png
It's the same setup that was working in previous versions, the one I shared in my previous comment. Can you reproduce the same error?

@littlejo
Owner

@solivansantana It's because the kubeconfig file doesn't exist at the beginning of the apply:

```hcl
provider "cilium" {
  config_path = "./kubeconfig"
}
```

Use this instead, so the dependency is explicit:

```hcl
provider "cilium" {
  config_path = local_file.kubeconfig.filename
}
```
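For completeness, a sketch of the two pieces wired together, using the resource names from this thread (the azurerm data source keeps the reporter's placeholder). Referencing local_file.kubeconfig.filename creates an implicit dependency, so Terraform writes the kubeconfig before the cilium provider first reads the cluster:

```hcl
resource "local_file" "kubeconfig" {
  filename = "./kubeconfig"
  content  = data.azurerm_kubernetes_cluster.XXXXXXXXXX.kube_config_raw
}

provider "cilium" {
  # The attribute reference (not the literal "./kubeconfig" string)
  # is what orders file creation before the provider's first read.
  config_path = local_file.kubeconfig.filename
}
```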

@solivansantana
Author

@littlejo thank you for your help.
After adjusting config_path, everything succeeded with provider 0.1.10.
https://postimg.cc/qhv3LhvX
Thank you very much for your commitment to supporting the community.

@littlejo littlejo added the bug Something isn't working label Mar 28, 2024
@littlejo littlejo self-assigned this Mar 28, 2024