GCP deployment results in 502 http error #94

Closed
3 tasks
mands opened this issue Aug 6, 2021 · 5 comments
Labels: bug (Something isn't working), P2 (nice to have)

Comments

@mands

mands commented Aug 6, 2021

Bug description

After installing on GCP with the default values from the README.md, I get a 502 when opening the main URL.

image

helm command

$ helm --kube-context analytics --namespace posthog1 install posthog posthog/posthog --values ./posthog-values-v2.yaml --set email.password=$SENDGRID_API_KEY --atomic --create-namespace

values

cloud: "gcp"
ingress:
  hostname: xyz.example.com
  nginx:
    enabled: false
certManager:
  enabled: false
metrics:
  enabled: false
email:
  from_email: posthog@example.com
  host: smtp.sendgrid.net
  user: apikey
  • I created an External IP in the GCP console, as per the docs, which appears to be connected to the K8S node port

image

  • I have a Google-managed SSL cert which seems to have provisioned, hence DNS is working
$ gcloud beta compute ssl-certificates list
NAME                                                TYPE          CREATION_TIMESTAMP             EXPIRE_TIME                    MANAGED_STATUS
k8s2-cr-mi7kcees-a0bsfajor9b9gf2a-19c53f806b9b9baf  SELF_MANAGED  2021-06-20T23:39:15.697-07:00  2021-09-18T22:31:28.000-07:00
mcrt-291b0d5a-888d-4ac4-8d00-6dc563537402           MANAGED       2021-08-06T06:53:02.540-07:00  2021-11-04T06:53:05.000-07:00  ACTIVE
    xyz.example.com: ACTIVE
  • The posthog NodePort seems to be up
$ kubectl get svc posthog --namespace posthog1
NAME      TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
posthog   NodePort   10.72.0.5    <none>        8000:30984/TCP   19m

Expected behavior

  • The posthog home page is returned

Environment

  • Deployment platform (gcp/aws/...): GCP
  • Chart version/commit: 3.3.0
  • Posthog version: default for chart

Additional context

  • I'm using posthog1 as a namespace, as I'm unable to delete the initial posthog namespace due to the ClickHouseInstallation operator not deleting when running helm uninstall

Thank you for your bug report – we love squashing them!

@mands mands added the bug Something isn't working label Aug 6, 2021
@mands
Author

mands commented Aug 6, 2021

Bit more information,

It appears that the GCP load balancer believes that the posthog service is unhealthy

image

However, port-forwarding shows that it's all up and running, and the ingress configuration in k8s seems correct.

$ kubectl -n posthog1 port-forward service/posthog 28015:8000
Forwarding from 127.0.0.1:28015 -> 8000
Forwarding from [::1]:28015 -> 8000
Handling connection for 28015

$ http HEAD http://localhost:28015/preflight
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 2708
Content-Type: text/html; charset=utf-8
Date: Fri, 06 Aug 2021 17:25:00 GMT
Referrer-Policy: same-origin
Server: gunicorn
Vary: Cookie
X-Content-Type-Options: nosniff
X-Frame-Options: DENY

Are there any other network fields that require setup in GCP?
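One way to narrow this down is to ask the load balancer itself which backends it considers unhealthy and which health checks exist. The following is a minimal sketch, assuming the `gcloud` CLI is authenticated against the same project. The function name `lb_backend_health` and its backend-service argument are placeholders for illustration, not values taken from this report.

```shell
# Sketch only: lb_backend_health is a hypothetical helper name.
# Usage: lb_backend_health BACKEND_SERVICE_NAME
lb_backend_health() {
  # List the backend services in the project (the GKE ingress creates these).
  gcloud compute backend-services list
  # Show per-endpoint health as seen by the load balancer.
  gcloud compute backend-services get-health "$1" --global
  # List the configured health checks to compare against the pod's port.
  gcloud compute health-checks list
}
```

The backend service name to pass in would come from the first listing. If the health check targets a path or port that the web pod does not answer with a 200, the backend stays unhealthy even though port-forwarding works.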

@macobo
Contributor

macobo commented Aug 7, 2021

Hey @mands sorry you're having difficulty here.

Some debugging info that might be useful (in the right namespace):

  • gcloud beta compute ssl-certificates list and gcloud beta compute ssl-certificates describe CERT (or the same with get) - to make sure the SSL certificate is being provisioned properly and is not stuck
  • kubectl get pods
  • kubectl logs SOME_WEB_POD - perhaps you see an issue there in the logs?

Another thing to try: to make sure this is caused by https, try setting web.secureCookies to false and ingress.gcp.forceHttps to false. If things work then (accessing via http), we can debug what's going on with the cert better :)
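As a rough sketch of that suggestion, the original install command from this issue could be re-run as an upgrade with both settings overridden. The function name `disable_https_for_debugging` is hypothetical; the context, namespace, values file, and setting names are the ones quoted in this thread.

```shell
# Sketch only: disable_https_for_debugging is a hypothetical helper name.
# Re-runs the release from this report as an upgrade with HTTPS enforcement off.
disable_https_for_debugging() {
  helm --kube-context analytics --namespace posthog1 \
    upgrade posthog posthog/posthog \
    --values ./posthog-values-v2.yaml \
    --set email.password="$SENDGRID_API_KEY" \
    --set web.secureCookies=false \
    --set ingress.gcp.forceHttps=false
}
```

If the site then loads over plain http, the problem is more likely in the certificate or https path than in the health check.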

For deleting the old ClickhouseInstallation: It can get stuck on finalizers. One thing I've done in the past is kubectl edit ClickhouseInstallation posthog.
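A non-interactive alternative to editing the resource by hand is to clear the finalizers with a patch. This is a minimal sketch under assumptions: the function name `clear_chi_finalizers` is hypothetical, and it assumes the resource can be addressed as `clickhouseinstallation` with the name `posthog`, as in the `kubectl edit` command above. Removing finalizers skips whatever cleanup the operator would have done, so it is best treated as a last resort.

```shell
# Sketch only: clear_chi_finalizers is a hypothetical helper name.
# Usage: clear_chi_finalizers NAMESPACE [NAME]   (NAME defaults to "posthog")
clear_chi_finalizers() {
  ns="$1"
  name="${2:-posthog}"
  # Show the finalizers currently blocking deletion.
  kubectl --namespace "$ns" get clickhouseinstallation "$name" \
    -o jsonpath='{.metadata.finalizers}'
  # Remove them with a merge patch so the pending delete can complete.
  kubectl --namespace "$ns" patch clickhouseinstallation "$name" \
    --type merge -p '{"metadata":{"finalizers":null}}'
}
```

For the stuck namespace described in this issue, the call would be `clear_chi_finalizers posthog`.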

cc @tiina303

@mands
Author

mands commented Aug 10, 2021

Hi,

Thanks for the feedback. Unfortunately I wasn't able to rescue the namespace and have since deleted it (although, as above, it's still stuck deleting due to the ClickHouse finalizer). The gcloud SSL cert was successfully provisioned; it seemed to be an issue with the health check from the load balancer to the pod, however port forwarding locally to the web pod seemed to work fine.

We run a similar setup for our own product, also on GCP/GKE, and the networking side looks pretty much the same. The main difference I could see was that we use the extensions/v1beta1 Ingress API while the PostHog cluster was using NEG on GKE - perhaps that is related.
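If NEGs are involved, the load balancer health-checks the pods directly rather than going through the node port, so the probes from Google's health-check ranges (130.211.0.0/22 and 35.191.0.0/16) have to be allowed through to the container port. The sketch below shows one way to check this. The function name `check_neg_setup` is hypothetical, and whether a missing firewall rule was the cause here is an untested assumption.

```shell
# Sketch only: check_neg_setup is a hypothetical helper name.
# Usage: check_neg_setup NAMESPACE
check_neg_setup() {
  # Print the Service annotations; a cloud.google.com/neg entry means NEGs are in use.
  kubectl --namespace "$1" get service posthog -o jsonpath='{.metadata.annotations}'
  # List the network endpoint groups that exist in the project.
  gcloud compute network-endpoint-groups list
  # List firewall rules to confirm the health-check source ranges are allowed.
  gcloud compute firewall-rules list
}
```

For the deployment in this issue the call would be `check_neg_setup posthog1`.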

For now we're moving to using PostHog Cloud, but may be worth testing the instructions from the readme, using the config above, on a clean-room GKE cluster to see if you can repro. However feel free to close either way.

Cheers!

@tiina303 tiina303 self-assigned this Aug 11, 2021
@tiina303 tiina303 added the P2 nice to have label Aug 11, 2021
@tiina303
Contributor

> but may be worth testing the instructions from the readme, using the config above, on a clean-room GKE cluster to see if you can repro. However feel free to close either way.

Added to our backlog for this.

@tiina303
Contributor

I've tested the GCP install/upgrade a few times now and didn't see the same problem.
