GCP deployment results in 502 http error #94

mands · 2021-08-06T14:26:21Z

Bug description

After installing to GCP with the default values from the README.md, I get a 502 when visiting the main URL.

helm command

$ helm --kube-context analytics --namespace posthog1 install posthog posthog/posthog --values ./posthog-values-v2.yaml --set email.password=$SENDGRID_API_KEY --atomic --create-namespace

values

cloud: "gcp"
ingress:
  hostname: xyz.example.com
  nginx:
    enabled: false
certManager:
  enabled: false
metrics:
  enabled: false
email:
  from_email: posthog@example.com
  host: smtp.sendgrid.net
  user: apikey

I created an External IP in the GCP console, as per the docs, which appears to be connected to the K8S node port

I have a Google-managed SSL cert which seems to have been provisioned, hence DNS is working

$ gcloud beta compute ssl-certificates list
NAME                                                TYPE          CREATION_TIMESTAMP             EXPIRE_TIME                    MANAGED_STATUS
k8s2-cr-mi7kcees-a0bsfajor9b9gf2a-19c53f806b9b9baf  SELF_MANAGED  2021-06-20T23:39:15.697-07:00  2021-09-18T22:31:28.000-07:00
mcrt-291b0d5a-888d-4ac4-8d00-6dc563537402           MANAGED       2021-08-06T06:53:02.540-07:00  2021-11-04T06:53:05.000-07:00  ACTIVE
    xyz.example.com: ACTIVE

The posthog NodePort seems to be up

$ kubectl get svc posthog --namespace posthog1
NAME      TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
posthog   NodePort   10.72.0.5    <none>        8000:30984/TCP   19m

Expected behavior

The posthog home page is returned

Environment

Deployment platform (gcp/aws/...): GCP
Chart version/commit: 3.3.0
Posthog version: default for chart

Additional context

I'm using posthog1 as a namespace, as I'm unable to delete the initial posthog namespace due to the ClickHouseInstallation operator not deleting when running helm uninstall

Thank you for your bug report – we love squashing them!

mands · 2021-08-06T17:26:24Z

A bit more information:

It appears that the GCP load balancer believes that the posthog service is unhealthy.

However, port-forwarding indicates that it's all up and running, and the ingress configuration in k8s seems correct.

$ kubectl -n posthog1 port-forward service/posthog 28015:8000
Forwarding from 127.0.0.1:28015 -> 8000
Forwarding from [::1]:28015 -> 8000
Handling connection for 28015

$ http HEAD http://localhost:28015/preflight
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 2708
Content-Type: text/html; charset=utf-8
Date: Fri, 06 Aug 2021 17:25:00 GMT
Referrer-Policy: same-origin
Server: gunicorn
Vary: Cookie
X-Content-Type-Options: nosniff
X-Frame-Options: DENY

Are there any other network fields that require setup in GCP?
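
One way to see what the load balancer itself reports, as opposed to what port-forwarding shows, is to query the backend health directly. The sketch below wraps two existing commands in helper functions; the function names are hypothetical, and the backend service name has to be looked up first with gcloud compute backend-services list. It assumes the ingress is backed by a global (not regional) backend service.

```shell
# Sketch only: function names are hypothetical helpers, not part of any tool.

# Prints the per-backend health as reported by the GCP load balancer.
# $1: backend service name, as listed by `gcloud compute backend-services list`
lb_backend_health() {
  gcloud compute backend-services get-health "$1" --global
}

# Shows the ingress resources in a namespace, including their annotations
# and recent events from the ingress controller.
# $1: namespace
describe_ingress() {
  kubectl --namespace "$1" describe ingress
}
```

For example, describe_ingress posthog1 followed by lb_backend_health with the backend name found in the listing. If the backend shows as unhealthy while port-forwarding works, the health check path or the firewall rules for Google's health-check ranges are the usual places to look next.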

macobo · 2021-08-07T10:52:37Z

Hey @mands, sorry you're having difficulty here.

Some debugging info that might be useful (in the right namespace):

gcloud beta compute ssl-certificates list and gcloud beta compute ssl-certificates describe CERT (or the same with get) - making sure the SSL certificate is being provisioned properly and is not stuck
kubectl get pods
kubectl logs SOME_WEB_POD - perhaps you see an issue there in the logs?

Another thing to try: to check whether this is caused by HTTPS, set web.secureCookies to false and ingress.gcp.forceHttps to false. If things work then (accessing via http), we can debug what's going on with the cert better :)
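
A minimal sketch of that test as a helm upgrade, assuming the release name and chart from the install command earlier in this issue. The wrapper function name is hypothetical; --reuse-values keeps the values from the original install so that only the two keys change.

```shell
# Sketch only: the function name is a hypothetical helper.
# Release name, chart, and the two value keys are the ones named in this thread.
# $1: kube context, $2: namespace
disable_https_for_debugging() {
  helm --kube-context "$1" --namespace "$2" upgrade posthog posthog/posthog \
    --reuse-values \
    --set web.secureCookies=false \
    --set ingress.gcp.forceHttps=false
}
```

With the setup from this issue that would be disable_https_for_debugging analytics posthog1, followed by a request to the hostname over plain http.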

For deleting the old ClickhouseInstallation: It can get stuck on finalizers. One thing I've done in the past is kubectl edit ClickhouseInstallation posthog.
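
The kubectl edit approach presumably means deleting the entries under metadata.finalizers by hand. A non-interactive sketch of the same idea follows, assuming the custom resource can be addressed as clickhouseinstallation; the function name is hypothetical. Clearing finalizers skips whatever cleanup the operator would have done, so this is a last resort for a namespace that is being thrown away anyway.

```shell
# Sketch only: the function name is a hypothetical helper.
# Clears metadata.finalizers with a JSON merge patch so that a pending
# delete of the resource can complete.
# $1: namespace, $2: name of the ClickHouseInstallation (assumed: posthog)
strip_chi_finalizers() {
  kubectl --namespace "$1" patch clickhouseinstallation "$2" \
    --type merge \
    --patch '{"metadata":{"finalizers":null}}'
}
```

For the stuck namespace described in this issue the call would be strip_chi_finalizers posthog posthog.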

cc @tiina303

mands · 2021-08-10T15:09:01Z

Hi,

Thanks for the feedback. Unfortunately I wasn't able to rescue the namespace and have since deleted it (although, as above, it's still stuck deleting due to the ClickHouse finalizer). The gcloud SSL cert was successfully provisioned; it seemed to be an issue with the health check from the load balancer to the pod, however port forwarding locally to the web pod seemed to work fine.

We run a similar setup for our own product, also on GCP/GKE, and the networking side looks pretty much the same. The main differences I could see were the use of the extensions/v1beta1 Ingress API and that the PostHog cluster was using NEG on GKE; perhaps these are related.
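
Whether a Service is using NEGs can usually be read from its annotations: GKE marks NEG-backed Services with a cloud.google.com/neg annotation. A small sketch for checking that, with a hypothetical function name and the namespace and service name passed as arguments:

```shell
# Sketch only: the function name is a hypothetical helper.
# Prints all annotations on a Service; a cloud.google.com/neg entry
# indicates that the Service is exposed through network endpoint groups.
# $1: namespace, $2: service name
show_service_annotations() {
  kubectl --namespace "$1" get service "$2" \
    --output jsonpath='{.metadata.annotations}'
}
```

For the deployment in this issue that would be show_service_annotations posthog1 posthog.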

For now we're moving to PostHog Cloud, but it may be worth testing the instructions from the readme, using the config above, on a clean-room GKE cluster to see if you can repro. However, feel free to close either way.

Cheers!

tiina303 · 2021-08-11T18:56:09Z

> but may be worth testing the instructions from the readme, using the config above, on a clean-room GKE cluster to see if you can repro. However feel free to close either way.

Added to our backlog for this.

tiina303 · 2021-09-14T19:46:24Z

I've tested the GCP install/upgrade a few times now and didn't see the same problem.

mands added the bug (Something isn't working) label Aug 6, 2021

tiina303 self-assigned this Aug 11, 2021

tiina303 added the P2 nice to have label Aug 11, 2021

tiina303 closed this as completed Sep 14, 2021