feat(csharp/src/Drivers/BigQuery): Add support for Http proxy #2831

Draft
wants to merge 7 commits into base: main

davidhcoe
Contributor

@davidhcoe davidhcoe commented May 16, 2025

Adds support for routing HTTP calls through a proxy, configured with the adbc.bigquery.client.proxy parameter.

Does not support routing calls to the gRPC v1 storage endpoint through a proxy, due to net472 limitations.
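
As a rough illustration of how a proxy address string can be turned into a proxy-aware handler, the sketch below uses only the standard System.Net types. The names ProxySketch and CreateProxyHandler are hypothetical; this is not the ProxyManager implementation from this PR, whose body is not shown here.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper for illustration only; not the PR's ProxyManager.
public static class ProxySketch
{
    // Builds a handler that sends requests through the proxy at proxyAddress.
    // Assumes the address is an absolute http:// or https:// URI,
    // for example "http://10.0.0.2:8888".
    public static HttpClientHandler CreateProxyHandler(string proxyAddress)
    {
        if (!Uri.TryCreate(proxyAddress, UriKind.Absolute, out var proxyUri) ||
            (proxyUri.Scheme != Uri.UriSchemeHttp && proxyUri.Scheme != Uri.UriSchemeHttps))
        {
            throw new ArgumentException(
                "Proxy address must be an absolute http or https URI.",
                nameof(proxyAddress));
        }

        return new HttpClientHandler
        {
            Proxy = new WebProxy(proxyUri),
            UseProxy = true
        };
    }
}
```

Passing the returned handler to new HttpClient(handler) makes that client send its requests through the proxy.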

@davidhcoe
Contributor Author

I have tried the grpc_proxy, https_proxy, and http_proxy environment variables, but none of them worked.

I have also tried the following for both net472 and net8.0:

        private BigQueryReadClient GetNet472BigQueryReadClient(GoogleCredential? credential)
        {
            // this results in Grpc.Core.RpcException: 'Status(StatusCode="Unimplemented",
            // Detail="Bad gRPC response. HTTP status code: 404")'

            BigQueryReadClientBuilder? readClientBuilder;

            if (!string.IsNullOrEmpty(this.proxyAddress))
            {
                HttpClientHandler proxyHandler = ProxyManager.GetProxyHttpClientHandler(this.proxyAddress!);
                GrpcWebHandler grpcWebHandler = new GrpcWebHandler(GrpcWebMode.GrpcWeb, proxyHandler);
                HttpClient proxyGrpcWebClient = new HttpClient(grpcWebHandler);
                Grpc.Net.Client.GrpcChannelOptions options = new Grpc.Net.Client.GrpcChannelOptions()
                {
                    HttpClient = proxyGrpcWebClient,
                    Credentials = credential?.ToChannelCredentials()
                };
                GrpcChannel proxyChannel = GrpcChannel.ForAddress("https://bigquerystorage.googleapis.com:443", options);
                CallInvoker ci = proxyChannel.CreateCallInvoker();
                readClientBuilder = new BigQueryReadClientBuilder()
                {
                    CallInvoker = ci
                };
            }
            else
            {
                readClientBuilder = new BigQueryReadClientBuilder();
                readClientBuilder.Credential = credential;
            }

            BigQueryReadClient bigQueryReadClient = readClientBuilder.Build();
            return bigQueryReadClient;
        }


        private BigQueryReadClient GetBigQueryReadClient(GoogleCredential? credential)
        {
            // This also results in
            // RpcException: Status(StatusCode="Unimplemented", Detail="Bad gRPC response. HTTP status code: 404")
            // when using .NET 8.

            BigQueryReadClientBuilder? readClientBuilder;

            if (!string.IsNullOrEmpty(this.proxyAddress))
            {
                HttpClientHandler proxyHandler = ProxyManager.GetProxyHttpClientHandler(this.proxyAddress!);
                GrpcWebHandler grpcWebHandler = new GrpcWebHandler(GrpcWebMode.GrpcWeb, proxyHandler);
                HttpClient proxyGrpcWebClient = new HttpClient(grpcWebHandler);
                GrpcNetClientAdapter grpcNetClientAdapter =
                    GrpcNetClientAdapter.Default.WithAdditionalOptions(options =>
                    {
                        options.HttpClient = proxyGrpcWebClient;
                    });

                readClientBuilder = new BigQueryReadClientBuilder()
                {
                    GrpcAdapter = grpcNetClientAdapter,
                };
            }
            else
            {
                readClientBuilder = new BigQueryReadClientBuilder();
            }

            readClientBuilder.Credential = credential;
            BigQueryReadClient bigQueryReadClient = readClientBuilder.Build();
            return bigQueryReadClient;
        }

The only other option appears to be the storage v2 REST API, but it returns the data as JSON rather than Arrow. I am not sure what other options exist for calling the storage backend with a proxy in the middle.

@davidhcoe
Contributor Author

davidhcoe commented May 16, 2025

Proxy setup:

Windows machine A, running the test suite.

Windows machine B, running Fiddler on port 8888.

Set adbc.bigquery.client.proxy to the IP of machine B on port 8888.

A --> B --> BigQuery

You can see the calls to accounts and BigQuery routed through the proxy. When I run
curl -x http://<ip_machine_b>:8888 https://www.google.com
from machine A, the request routes through the proxy and returns data. The same does not happen when http://<ip_machine_b>:8888 is set as the grpc_proxy, https_proxy, or http_proxy environment variable.
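
One way to check what the .NET HTTP stack itself resolves for a given endpoint is to ask HttpClient.DefaultProxy, which is available on .NET Core 3.0 and later (so not on net472). The sketch below is only a diagnostic aid; the names ProxyProbe and Describe are hypothetical, and the result says nothing about what the gRPC client inside the driver actually does.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical diagnostic helper; not part of the driver.
public static class ProxyProbe
{
    // Reports which proxy, if any, the default .NET proxy settings
    // would select for the given target URI.
    public static string Describe(Uri target)
    {
        IWebProxy proxy = HttpClient.DefaultProxy;

        if (proxy.IsBypassed(target))
        {
            return $"{target.Host}: direct (no proxy)";
        }

        var resolved = proxy.GetProxy(target);
        return resolved is null
            ? $"{target.Host}: direct (no proxy)"
            : $"{target.Host}: via {resolved}";
    }
}
```

Calling ProxyProbe.Describe(new Uri("https://bigquerystorage.googleapis.com")) on machine A, once with and once without the environment variables set, would show whether the default proxy resolution changes at all.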
