Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly split remote proxy http https #55430

Conversation

arthurpassos
Copy link
Contributor

@arthurpassos arthurpassos commented Oct 9, 2023

This PR does 3 things:

  1. Fixes small remote resolver bug - more info below.
  2. Removes DB::ProxyConfiguration::Protocol::ANY
  3. Adds proxy docs

Proxy servers must be picked based on the request protocol (i.e, https request should grab the https_proxy or whatever https related configuration). This was being done correctly for the environment and list proxy resolvers, but not for the remote resolver.

This is how it was implemented:

  • The environment resolver would query http_proxy or https_proxy environment variables based on the request protocol.
  • The list resolver would round-robin URIs listed in <proxy><http OR https></proxy></> based on the request protocol.
  • The remote resolver, on the other hand, would look into the proxy server protocol, not to what the user configured as a resolver for that protocol (i.e, proxy_scheme) - This is wrong, it's perfectly fine to have a HTTPS request routed through a HTTP proxy. The comparisong that should be made is the same that is being done for list and environment, that is: request protocol x user configuration for that protocol.

This PR addresses this issue by changing the syntax of remote resolvers. It now has to be encapsulated into <http> or <https> tags just like the list resolver.

<http> // new
     <resolver>
           // rest as usual
     </resolver>
</http>
<https> // new
     <resolver>
           // rest as usual
     </resolver>
</https>

Not sure I should classify this as bug or improvement.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Select remote proxy resolver based on request protocol, add proxy feature docs and remove DB::ProxyConfiguration::Protocol::ANY.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@tavplubix tavplubix added the can be tested Allows running workflows for external contributors label Oct 10, 2023
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-improvement Pull request with some product improvements label Oct 10, 2023
@robot-ch-test-poll2
Copy link
Contributor

robot-ch-test-poll2 commented Oct 10, 2023

This is an automated comment for commit 8c7973d with description of existing statuses. It's updated for the latest CI running

⏳ Click here to open a full report in a separate page

Successful checks
Check nameDescriptionStatus
Mergeable CheckChecks if all other necessary checks are successful✅ success
Push to DockerhubThe check for building and pushing the CI related docker images to docker hub✅ success
Check nameDescriptionStatus
CI runningA meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR⏳ pending

@arthurpassos arthurpassos marked this pull request as ready for review October 11, 2023 16:37
@arthurpassos
Copy link
Contributor Author

@tavplubix Could you please take a look at this?

@tavplubix tavplubix self-assigned this Oct 11, 2023
@arthurpassos
Copy link
Contributor Author

@tavplubix kind ping

<endpoint>http://resolver:8080/hostname</endpoint>
<proxy_scheme>http</proxy_scheme>
<proxy_port>80</proxy_port>
<proxy_cache_time>10</proxy_cache_time>
Copy link
Contributor

@Enmk Enmk Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time is too unspecific, please add units like seconds, minutes, hours, etc... into the config node name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old configuration was already like this, changing it will break backwards compatibility. I can modify this for the new version only, but code will be a bit more complex and syntax will be even more different.

I suggest we keep it like this, what do you think?

2. Proxy lists
3. Environment variables

ClickHouse will check the highest priority resolver type for the request protocol. If it is not defined,
Copy link
Contributor

@Enmk Enmk Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will CH check the next proxy configuration? Would it try proxy lists if the resolver doesn't respond?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will check the next proxy configuration if it does not find a configuration for the requests protocol. For instance: ClickHouse is trying to send a HTTPS request and there is a remote resolver configuration section, but only for HTTP. In this case, it will check list and fallback to environment if there is no list for HTTPS.

@arthurpassos
Copy link
Contributor Author

@tavplubix kind ping :)

@arthurpassos
Copy link
Contributor Author

@tavplubix daily ping :)

@tavplubix
Copy link
Member

The changelog entry is too long and detailed. Please add a short user-readable description for the changelog (one line, a few sentences)

@arthurpassos
Copy link
Contributor Author

The changelog entry is too long and detailed. Please add a short user-readable description for the changelog (one line, a few sentences)

I have moved all text above changelog entry and added a small sentence, could you please check if it is ok now?

@tavplubix
Copy link
Member

Stateless tests (release) - #55844
Stress test (tsan) - #55890

@tavplubix tavplubix merged commit f899254 into ClickHouse:master Oct 20, 2023
14 of 19 checks passed
@arthurpassos
Copy link
Contributor Author

Thanks @tavplubix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants