Validate guides for 1.9 release #13627

Closed
20 tasks done
b3a-dev opened this issue Oct 19, 2020 · 17 comments
Labels
area/documentation Impacts the documentation, including textual changes, sphinx, or other doc generation code.

Comments

@b3a-dev
Member

b3a-dev commented Oct 19, 2020

@b3a-dev b3a-dev added the kind/release, area/documentation, and priority/release-blocker labels and removed the kind/release label Oct 19, 2020
@joestringer
Member

joestringer commented Oct 19, 2020

FYI I have disabled the "v1.9" branch docs for now in favor of the "v1.9.0-rc2" docs. Please follow the docs from here:

https://docs.cilium.io/en/v1.9.0-rc2/

EDIT: I updated the links in the guides above.

twpayne added a commit that referenced this issue Oct 20, 2020
Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
@errordeveloper errordeveloper self-assigned this Oct 20, 2020
@errordeveloper errordeveloper removed their assignment Oct 20, 2020
ti-mo added a commit to ti-mo/cilium that referenced this issue Oct 20, 2020
For cilium#13627

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo
Contributor

ti-mo commented Oct 20, 2020

The GKE instructions seem solid; everything works as expected. I addressed some nits in #13645.

(I can't update the issue description.)

joestringer pushed a commit that referenced this issue Oct 20, 2020
Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
@michi-covalent
Contributor

I'm having trouble running openshift-install:

% ./openshift-install create install-config --dir "${CLUSTER_NAME}" --log-level debug
DEBUG OpenShift Installer 4.5.0-0.okd-2020-10-15-235428
DEBUG Built from commit 63200c80c431b8dbaa06c0cc13282d819bd7e5f8
DEBUG Fetching Install Config...
DEBUG Loading Install Config...
DEBUG   Loading SSH Key...
DEBUG   Loading Base Domain...
DEBUG     Loading Platform...
DEBUG   Loading Cluster Name...
DEBUG     Loading Base Domain...
DEBUG     Loading Platform...
DEBUG   Loading Pull Secret...
DEBUG   Loading Platform...
DEBUG   Fetching SSH Key...
DEBUG   Generating SSH Key...
? SSH Public Key /path/to/id_rsa.pub
DEBUG   Fetching Base Domain...
DEBUG     Fetching Platform...
DEBUG     Generating Platform...
? Platform gcp
INFO Credentials loaded from environment variable "GOOGLE_CREDENTIALS", file "/path/to/key.json"
? Region us-central1
DEBUG   Generating Base Domain...
FATAL failed to fetch Install Config: failed to fetch dependency of "Install Config": failed to generate asset "Base Domain": could not retrieve base domains: googleapi: Error 400: Invalid resource field value in the request., invalidParameter

@aditighag
Member

@nebril Please hold off on testing the Local Redirect Policy guide until #13543 is merged.

@nathanjsweet
Member

nathanjsweet commented Oct 21, 2020

EKS is looking good. I had no issues.

@kkourt
Contributor

kkourt commented Oct 21, 2020

Some issues with transparent encryption:

Failure to autodetect interface

Using:

$ helm install cilium cilium/cilium --version 1.9.0-rc2 --namespace kube-system --set encryption.enabled=true --set encryption.nodeEncryption=false

This failed.

The problem seemed to be that Cilium guessed the interface wrong:

...
level=info msg="  --encrypt-interface='eth0'" subsys=daemon
...
level=warning msg="Device \"eth0\" does not exist." subsys=datapath-loader

There was no eth0 interface on the machine, and specifying the interface --set encryption.interface=ens4 seems to solve the issue.

The guide states the following:

If direct routing is being used, an additional argument can be used to identify
the network-facing interface. If no interface is specified, the default route
link is chosen by inspecting the routing tables.

This might be interpreted to mean that the interface can only be set when using direct routing.

An attempt to clarify this can be found in: #13660

Long term, we might want to improve the interface detection.
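A possible stopgap until detection improves is to read the device of the default route from `ip route` output and pass it explicitly. This is only a sketch: `default_route_dev` is a hypothetical helper name, not part of Cilium, and the sample text is copied from the `ip r` output further down in this comment.

```shell
# Print the device of the first default route found on stdin.
# Expects text in the format printed by `ip route` / `ip r`.
default_route_dev() {
  awk '$1 == "default" {
    for (i = 2; i < NF; i++) {
      if ($i == "dev") { print $(i + 1); exit }
    }
  }'
}

# Sample lines taken from the `ip r` output in this comment.
sample='default via 10.172.0.1 dev ens4 proto dhcp src 10.172.15.198 metric 100
10.0.0.0/24 via 10.0.0.26 dev cilium_host src 10.0.0.26 mtu 1333'

iface="$(printf '%s\n' "$sample" | default_route_dev)"
echo "$iface"
```

On a node, `ip route show default | default_route_dev` should print the same kind of name, which can then be passed as `--set encryption.interface=<name>` to the helm install command above.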

nodeEncryption=true does not seem to work

Specifying --set encryption.nodeEncryption=true results in k8s nodes being unreachable (e.g., unable to ping them, rendered unavailable). Not clear what is wrong in this case. No relevant errors/warnings in cilium logs.

Some info that might be useful:

kkourt@t1:~$ ip r
default via 10.172.0.1 dev ens4 proto dhcp src 10.172.15.198 metric 100 
10.0.0.0/24 via 10.0.0.26 dev cilium_host src 10.0.0.26 mtu 1333 
10.0.0.26 dev cilium_host scope link 
10.0.1.0/24 via 10.0.0.26 dev cilium_host src 10.0.0.26 mtu 1333 
10.172.0.1 dev ens4 proto dhcp scope link src 10.172.15.198 metric 100 
10.172.15.199 dev cilium_host proto eigrp mtu 1333 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
kkourt@t1:~$ ip route get 10.172.15.199
10.172.15.199 dev cilium_host src 10.172.15.198 uid 1001 
    cache mtu 1333 
kkourt@t1:~$ ping 10.172.15.199
PING 10.172.15.199 (10.172.15.199) 56(84) bytes of data.
^C
--- 10.172.15.199 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1036ms

@pchaigno
Member

@kkourt Could you open dedicated issues for:

There was no eth0 interface on the machine, and specifying the interface --set encryption.interface=ens4 seems to solve the issue.

and

Specifying --set encryption.nodeEncryption=true results in k8s nodes being unreachable (e.g., unable to ping them, rendered unavailable). Not clear what is wrong in this case. No relevant errors/warnings in cilium logs.

aanm pushed a commit that referenced this issue Oct 21, 2020
For #13627

Signed-off-by: Timo Beckers <timo@isovalent.com>
@kkourt
Contributor

kkourt commented Oct 21, 2020

@kkourt Could you open dedicated issues for:

Sure.

There was no eth0 interface on the machine, and specifying the interface --set encryption.interface=ens4 seems to solve the issue.

#13662

and

Specifying --set encryption.nodeEncryption=true results in k8s nodes being unreachable (e.g., unable to ping them, rendered unavailable). Not clear what is wrong in this case. No relevant errors/warnings in cilium logs.

#13663

@rolinh
Member

rolinh commented Oct 21, 2020

The quick install guide is fine. While validating it, I noticed that we sometimes refer to Slack with the text "Slack channel", which leads to the glossary page when clicked (note: it's not obvious that the element is clickable), and sometimes with a link that has an anchor to the help page. We should probably strive for more consistency.

@nebril
Member

nebril commented Oct 21, 2020

I got some CRD and Ingress API version deprecation warnings when running ./cilium-istioctl manifest apply -y. It looks like Istio deployed correctly, but the process exited with an error:

Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition (repeated 1 times)

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress (repeated 1 times)

I think this is fine, but wanted to put this out there.

As a side note, minikube start uses the docker backend by default. Do we want to add a virtualbox backend option to the docs? I had some issues with image downloads from within the docker node, though I'm not sure if that's just my setup.

Small nitpick: the "Copy All" button copies brackets with a preceding backslash, which causes the script to fail. This happens in zsh only, not in bash or when pasting into vim.

for service in ratings-v1 reviews-v2; do \
      kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.9.0-rc2/examples/kubernetes-istio/bookinfo-${service}.yaml ; done
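If the backslash-escaped brackets are what breaks the pasted loop, one workaround is to filter the clipboard text before running it. The sketch below assumes the affected characters are the curly braces in `${service}`; the exact clipboard content is not shown in this thread, so the sample string is a guess, and `unescape_braces` is a hypothetical helper name.

```shell
# Remove a backslash that directly precedes "{" or "}" on stdin.
unescape_braces() {
  sed 's/\\\([{}]\)/\1/g'
}

# Assumed shape of the text that "Copy All" produces (not confirmed).
pasted='bookinfo-$\{service\}.yaml'

printf '%s\n' "$pasted" | unescape_braces
```

The filter only touches backslashes immediately before a brace, so the trailing line-continuation backslash in the loop is left alone.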

Last one: is this error message expected at the end of step 5?

[2020-10-21 13:09:03,955] ERROR Error processing message, terminating consumer process:  (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'responses': Error reading field 'partition_responses': Error reading field 'record_set': Bytes size -1 cannot be negative
        at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73)
        at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
        at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1031)
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:979)
        at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:100)
        at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:120)
        at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:75)
        at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
        at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)

Overall, I think the guide works well.

twpayne added a commit that referenced this issue Oct 21, 2020
Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
joestringer pushed a commit that referenced this issue Oct 22, 2020
[ upstream commit 992c321 ]

Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
joestringer pushed a commit that referenced this issue Oct 22, 2020
[ upstream commit 0dd00f3 ]

For #13627

Signed-off-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
michi-covalent added a commit that referenced this issue Oct 23, 2020
- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
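A pre-flight check along these lines could catch a missing or unreadable credentials file before `openshift-install` is started. It is a sketch only: it assumes `GOOGLE_CREDENTIALS` holds a file path, as the installer log earlier in this thread suggests, and `check_google_credentials` is a hypothetical helper name.

```shell
# Succeed only if GOOGLE_CREDENTIALS is set and names a readable file.
# Assumption: the variable contains a path, not inline JSON.
check_google_credentials() {
  if [ -z "${GOOGLE_CREDENTIALS:-}" ]; then
    echo "GOOGLE_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$GOOGLE_CREDENTIALS" ]; then
    echo "cannot read credentials file: $GOOGLE_CREDENTIALS" >&2
    return 1
  fi
  return 0
}
```

A typical use would be `check_google_credentials && ./openshift-install create install-config --dir "${CLUSTER_NAME}"`. The check says nothing about whether the service account has the roles described in the linked IAM document.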
@michi-covalent
Contributor

opened #13713 for some minor modifications for openshift gsg. well done @errordeveloper i don't know how you figured out all these steps 💯 🚀

gandro pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit e8bce62 ]

As of Azure Linux kernel 5.4.0-1022 the necessary patches to the
hv_netvsc have been backported, see [1]. This allows to run NodePort XDP
on Azure AKS when using the Ubuntu 18.04 node image which provides said
kernel version.

[1] https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1877654

Update the instructions accordingly. In addition the nodePort device
helm option needs to be set explicitly to eth0, as otherwise bpf_host.o
would erroneously be bound to the azure0 interface.

For #13627

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
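The kernel requirement named in the commit message above can be checked on a node with a version comparison. This is a rough sketch, not taken from the Cilium docs: `kernel_at_least` is a hypothetical helper, it relies on `sort -V` (GNU coreutils), and it assumes the release string carries a flavour suffix such as `-azure` that can be dropped before comparing.

```shell
# Return success if kernel release $1 is at least version $2.
kernel_at_least() {
  local have want lowest
  have="${1%%-[a-z]*}"   # drop a flavour suffix such as "-azure"
  want="$2"
  # With version sort, the lowest of the two comes first.
  lowest="$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}

# Example release strings (made up for illustration).
kernel_at_least "5.4.0-1031-azure" "5.4.0-1022" && echo "new enough"
kernel_at_least "5.4.0-1020-azure" "5.4.0-1022" || echo "too old"
```

On a real node the first argument would be `"$(uname -r)"`.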
gandro pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 1cd20c8 ]

Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
gandro pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 4931f6b ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
gandro pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 2c2b5de ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
gandro pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 1672c81 ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit e8bce62 ]

As of Azure Linux kernel 5.4.0-1022 the necessary patches to the
hv_netvsc have been backported, see [1]. This allows to run NodePort XDP
on Azure AKS when using the Ubuntu 18.04 node image which provides said
kernel version.

[1] https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1877654

Update the instructions accordingly. In addition the nodePort device
helm option needs to be set explicitly to eth0, as otherwise bpf_host.o
would erroneously be bound to the azure0 interface.

For #13627

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 1cd20c8 ]

Refs: #13627

Signed-off-by: Tom Payne <tom@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 4931f6b ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 2c2b5de ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
aanm pushed a commit that referenced this issue Oct 23, 2020
[ upstream commit 1672c81 ]

For #13627

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@nathanjsweet
Member

I did a multi-regional deployment in GKE to test cluster-mesh. One of the things that I ran into is that I needed to turn "global access" on manually for the internal loadbalancers for etcd ala this guide. I filed this issue for us to fix it.

gandro pushed a commit that referenced this issue Oct 26, 2020
[ upstream commit c40dfb0 ]

- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
pchaigno pushed a commit that referenced this issue Oct 26, 2020
[ upstream commit c40dfb0 ]

- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
pchaigno pushed a commit that referenced this issue Oct 27, 2020
[ upstream commit 0dd00f3 ]

For #13627

Signed-off-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
christarazi pushed a commit that referenced this issue Oct 27, 2020
[ upstream commit c40dfb0 ]

- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
jrajahalme pushed a commit that referenced this issue Oct 28, 2020
[ upstream commit c40dfb0 ]

- Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`.
- Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md
  to assign appropriate roles to service account.
- Make it a bit clearer that the firewall rule creation is time-sensitive.

Ref: #13627 (comment)

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
tklauser added a commit that referenced this issue Mar 26, 2021
Noticed while reviewing #15370

The respective guide was removed from documentation in commit
3d0e805 ("doc: Remove Mesos/Marathon guide"), thus the link in the
README is broken already. The example hasn't been tested for the last
few releases (v1.9: #13627, v1.8: #11903), so remove it.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
brb pushed a commit that referenced this issue Mar 26, 2021
Noticed while reviewing #15370

The respective guide was removed from documentation in commit
3d0e805 ("doc: Remove Mesos/Marathon guide"), thus the link in the
README is broken already. The example hasn't been tested for the last
few releases (v1.9: #13627, v1.8: #11903), so remove it.

Signed-off-by: Tobias Klauser <tobias@cilium.io>