Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(docs): respect no_proxy environment variable #23816

Merged
merged 3 commits into from
Jul 21, 2023

Conversation

karsten-wagner
Copy link
Contributor

SUMMARY

The clickhouse-connect library 0.4.1 did not handle a no_proxy environment variable. This is eventually problematic when using the Docker Compose Installation Guide in an environment that requires a forward proxy (e.g., in an enterprise environment) and where Clickhouse should be used as a database. In such case the call to the database should remain inside the docker network to allow addressing the database via its container name. Calls to external data sources should be rerouted via the forward proxy. Version 0.4.1 handles all connections via the proxy and thus misses the local network connection inside docker network and thus eventually fails host name resolution, given the forward proxy sits outside the docker network and is not aware of the inside docker network container names.

The issue has been fixed in the clickhouse-connect library in version 0.5.19 via #PR163, hence bumping version here in the superset docs accordingly.

TESTING INSTRUCTIONS

  • Follow the Docker Compose Installation Guide
  • Follow the existing Clickhouse Installation Guide referencing version 0.4.1
  • Add a clickhouse container named clickhouse to docker-compose-non-dev.yml, e.g.:
    x-clickhouse-volumes:
      &clickhouse-volumes
      - clickhouse_home:/var/lib/clickhouse
    
    clickhouse:
      image: clickhouse/clickhouse-server:23
      container_name: clickhouse
      env_file: .env
      user: "root"
      restart: unless-stopped
      ports:
        - "8123:8123"
        - "9003:9000"
      volumes: *clickhouse-volumes
    
    volumes:
      clickhouse_home:
        external: false
  • Update docker/.env to contain http_proxy, https_proxy and no_proxy environment variables, e.g.:
    # Local Proxy Settings
    http_proxy="http://host.docker.internal:8888"
    https_proxy="http://host.docker.internal:8888"
    no_proxy="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
  • Provide the forward proxy on the host OS (e.g., Fiddler - in above example on port 8888)
  • Start all containers and log in to superset
  • Try to add Clickhouse database via connection string clickhousedb://clickhouse/default

The connection will fail. See eventually logs in the ´superset_app` container. Superset UI shows an error message on the dialog with the connection string, but not on the one where you provide host, port etc. in separate fields.

  • Change docker/requirements-local.txt to contain new version clickhouse-connect>=0.5.19
  • Repeat the test to add the database and all works

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

The `clickhouse-connect` library `0.4.1` did not handle a `no_proxy` environment variable. This is eventually problematic when using the [Docker Compose Installation Guide](https://superset.apache.org/docs/installation/installing-superset-using-docker-compose) in an environment that requires a forward proxy and where Clickhouse should be used as a database. In such case the call to the database should remain inside the docker network to allow addressing the database via its container name. Calls to external data sources should be rerouted via the forward proxy. Version `0.4.1` handles all connections via the proxy and thus misses the local network connection inside docker network and thus eventually fails host name resolution, given the forward proxy sits outside the docker network and is not aware of the inside docker network container names.
@karsten-wagner
Copy link
Contributor Author

Big thanks to @genzgd for all the hard work to respect the no_proxy variable in the clickhouse-connect library. I am just connecting the dots here...

@genzgd
Copy link
Contributor

genzgd commented Apr 25, 2023

I'd recommend using 0.5.20 as the minimum, since it has a somewhat important bug fix concerning HTTP headers.

@genzgd
Copy link
Contributor

genzgd commented Apr 25, 2023

Also big thanks for adding the this PR!

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

@karsten-wagner
Copy link
Contributor Author

@genzgd, updated to ´0.5.20´ as suggested.

@genzgd
Copy link
Contributor

genzgd commented Jul 20, 2023

Just a heads up, at this point the current version is 0.6.8 and 0.6.x is required for Superset 2.1.+

As per feedback from clickhouse-connect developer...
@karsten-wagner
Copy link
Contributor Author

@michael-s-molina, anything missing to merge this to master?

Copy link
Member

@michael-s-molina michael-s-molina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michael-s-molina michael-s-molina merged commit a80ec15 into apache:master Jul 21, 2023
28 checks passed
@michael-s-molina
Copy link
Member

@michael-s-molina, anything missing to merge this to master?

I was just waiting for CI to complete 😄

Thank you for the fix @karsten-wagner and for the awesome PR description!

@karsten-wagner karsten-wagner deleted the patch-1 branch July 21, 2023 12:01
@karsten-wagner
Copy link
Contributor Author

Thanks and welcome! Superset is a great tool!

@michael-s-molina michael-s-molina added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Jul 24, 2023
michael-s-molina pushed a commit that referenced this pull request Jul 26, 2023
@mistercrunch mistercrunch added 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XS v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants