-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to connect to nodes through proxy #97
Conversation
cloud_config.go
Outdated
ProxyURL string `yaml:"proxyUrl,omitempty"` | ||
} | ||
|
||
type Context struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add Cloud prefix to everything defined in the cloud context or move the code to cloud
package for namespacing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to cloud
pkg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed pkg
dir, cloud
is now in root dir
cloud_config.go
Outdated
func (cc *CloudConnectionConfig) GetDatacenterCAPool(datacenterName string) (*x509.CertPool, error) { | ||
dc, ok := cc.Datacenters[datacenterName] | ||
if !ok { | ||
return nil, fmt.Errorf("datacenter %s not found in cloud connection config", datacenterName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use %q everywhere, if you want to be super nice you can list available DoCs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
3bb45a9
to
03ceff4
Compare
In a library there is no need to add |
pkg/cloud/config.go
Outdated
// Servers may infer this from the endpoint the client submits requests to. | ||
// In CamelCase. | ||
// +optional | ||
Kind string `yaml:"kind,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need it? I find no usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both Kind and APIVersion will be used to distiguish version of configuration. It's not decided yet how we are going to call it (Kind), and from which version (APIVersion) we are going to start with (v1alpha1, v1beta1, v1 etc). Once we progress with API, client code would be able to parse just these two fields to find out which version of API is used and pass right data structure for unmarshalling.
pkg/cloud/config.go
Outdated
// Servers should convert recognized schemas to the latest internal value, and | ||
// may reject unrecognized values. | ||
// +optional | ||
APIVersion string `yaml:"apiVersion,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
pkg/cloud/config.go
Outdated
Contexts map[string]*Context `yaml:"contexts"` | ||
// CurrentContext is the name of the context that you would like to use by default. | ||
CurrentContext string `yaml:"currentContext"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of dynamically changing the context?
It is not even a use case if you thy changing it there is a race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not meant to be dynamic. This is representation of configuration file taken from disk which is read at a time of when connection to cluster is created, usually on application startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha.
See my suggestion on making this package capable of producing gocql.Cluster and being just a helper.
pkg/cloud/config.go
Outdated
Username string `yaml:"username,omitempty"` | ||
// Password is the password for basic authentication to the Scylla cluster. | ||
// +optional ` | ||
Password string `yaml:"password,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the username and password from Authenticator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can overwrite it after calling NewCloudCluster. This is meant to set up a PasswordAuthenticator from configuration file without any code change.
pkg/cloud/config.go
Outdated
type Parameters struct { | ||
// DefaultConsistency is the default consistency level used for queries. | ||
// +optional | ||
DefaultConsistency ConsistencyString `yaml:"defaultConsistency,omitempty"` | ||
} | ||
|
||
type ConsistencyString string | ||
|
||
// just AnyConsistency etc is better, but there's already SerialConsistency defined elsewhere. | ||
const ( | ||
DefaultAnyConsistency ConsistencyString = "ANY" | ||
DefaultOneConsistency ConsistencyString = "ONE" | ||
DefaultTwoConsistency ConsistencyString = "TWO" | ||
DefaultThreeConsistency ConsistencyString = "THREE" | ||
DefaultQuorumConsistency ConsistencyString = "QUORUM" | ||
DefaultAllConsistency ConsistencyString = "ALL" | ||
DefaultLocalQuorumConsistency ConsistencyString = "LOCAL_QUORUM" | ||
DefaultEachQuorumConsistency ConsistencyString = "EACH_QUORUM" | ||
DefaultSerialConsistency ConsistencyString = "SERIAL" | ||
DefaultLocalSerialConsistency ConsistencyString = "LOCAL_SERIAL" | ||
DefaultLocalOneConsistency ConsistencyString = "LOCAL_ONE" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use consistency settings in the driver?
Also SerialCosistency and Consistency are distinct things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Driver consistency settings use uint16
, where in configuration file this option should be human readable. It is unmarshalled to uint16
Consistency
here: https://github.com/zimnx/gocql/blob/e6c71c39c4fba48984118757290a2697cf1105dc/cluster.go#L315-L321
Yeah now i see there're two fields for query consistency, then I guess Config API should reflect that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added second consistency to config and validation
pkg/cloud/config.go
Outdated
return caPool, nil | ||
} | ||
|
||
func (cc *ConnectionConfig) GetInitialContactPoints() []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a naming mismatch here and in Datacenter.Server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial contact point is something that is related to how drivers works, initially they contact a list of endpoints and discover peers from system table. From API point of view, it's just a address to datacenter hence name mismatch.
ring.go
Outdated
@@ -15,7 +14,7 @@ type ring struct { | |||
|
|||
// hosts are the set of all hosts in the cassandra ring that we know of | |||
mu sync.RWMutex | |||
hosts map[string]*HostInfo | |||
hosts map[UUID]*HostInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change could go to a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ring.go
Outdated
for _, hi := range r.hosts { | ||
if hi.connectAddress.String() == ip { | ||
return hi, true | ||
} | ||
} | ||
return nil, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely this code is in a driver hot loop did you profile the changes?
Use something with O(1) runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see now it's in events only so no a biggie I guess. Still it would be nice to preserve the mapping from IP to ID and fallback to getHost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
events.go
Outdated
@@ -159,57 +159,57 @@ func (s *Session) handleNodeEvent(frames []frame) { | |||
} | |||
} | |||
|
|||
for _, f := range events { | |||
for _, event := range events { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remain the orig name? It would be easier to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's no longer a frame, but parsed event hence f
doesn't have anything in common with variable semantic. I stumbled upon it, and fixed to maintain higher quality of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that but it also modifies lines that would not be otherwise modified and thus makes it harder to review / maintain in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brought it back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to submit the renaming in a separate commit/pull request though, we can update it both in gocql/gocql and scylladb/gocql.
events.go
Outdated
if ok && host.IsUp() { | ||
if host.IsUp() { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner if looks like a dead code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bugf/leftover, fixed
// Get host info and apply any filters to the host | ||
hostInfo, err := s.hostSource.getHostInfo(ip, port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not rename variables if we do not have to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cluster.go
Outdated
connConf := &cloud.ConnectionConfig{} | ||
|
||
bundleFile, err := os.Open(bundlePath) | ||
if err != nil { | ||
return nil, fmt.Errorf("can't open bundle path: %w", err) | ||
} | ||
defer bundleFile.Close() | ||
|
||
if err := yaml.NewDecoder(bundleFile).Decode(connConf); err != nil { | ||
return nil, fmt.Errorf("can't decode bundle file at %q: %w", bundlePath, err) | ||
} | ||
|
||
if _, ok := connConf.Contexts[connConf.CurrentContext]; !ok { | ||
return nil, fmt.Errorf("current context points to unknown context") | ||
} | ||
|
||
confContext := connConf.Contexts[connConf.CurrentContext] | ||
|
||
if _, ok := connConf.AuthInfos[confContext.AuthInfoName]; !ok { | ||
return nil, fmt.Errorf("context %q auth info points to unknown authinfo", connConf.CurrentContext) | ||
} | ||
|
||
if _, ok := connConf.Datacenters[confContext.DatacenterName]; !ok { | ||
return nil, fmt.Errorf("context %q datacenter points to unknown datacenter", connConf.CurrentContext) | ||
} | ||
|
||
authInfo := connConf.AuthInfos[confContext.AuthInfoName] | ||
|
||
caPool, err := connConf.GetRootCAPool() | ||
if err != nil { | ||
return nil, fmt.Errorf("can't create root CA pool: %w", err) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the parsing to cloud pkg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't due to cycle in imports - i'm reusing default values from NewCluster here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Yes, I tried cherry-picking it last week, but didn't manage to post a response here. I have an unfinished experimental code to make the HostDialer work with shard-aware ports, will try to finish it and send as a separate PR. |
I've opened gocql#1629 and #98 for the HostDialer. Please let me know in those PRs what do you think about it. |
@zimnx It seems commits
would be useful upstream, would you please submit them to gocql/gocql as well? |
Done: |
Thanks! |
6aa6646
to
337e28e
Compare
Function changed HostInfo.hostname without holding write lock. Added write lock around it.
Currently driver identifies nodes based on their broadcasted IP address. In cloud case, broadcasted IP addresses are private and are not meant to be used as a contact point, and they may change overtime. Hence driver internals were changed to identify nodes based on their host_id which is unique per node and it's persistant throught entire node lifecycle. Because CQL Events are still using broadcasted IP addresses, driver keep mapping between already known IP addresses and host_ids. Prepared statement cache key was also changed to host_id, to not invalidate it upon IP change.
Services deployed in cloud often are hidden behind proxy which dispatches connections based on Server Name Identifier (SNI) taken from TLS Client Hello packet. New method of creating ClusterConfig - NewCloudCluster - allows to connect to nodes behind SNI proxy based on provided configuration file. Because each datacenter may have different TLS configurtion (CA, proxy address etc), more granular method of configuring connection details was needed. CloudCluster use special HostDialer which connect to nodes using information taken from HostInfo (datacenter, host_id) to go through SNI proxy.
LGTM |
So the different drivers can be test their new implementations of sni-proxy for using it, need to start the cluster from the commandline like this: ``` ❯ ccm start --sni-proxy sni_proxy listening on: 127.0.0.1:443 ``` using it from python code, would be a bit diffrent: ```python nodes_info = get_cluster_info(self.cluster.get_path(), address=self.cluster.nodelist()[0].address(), port=9142) docker_id, listen_address, listen_port = \ start_sni_proxy(self.cluster.get_path(), nodes_info=nodes_info) ``` Ref: scylladb/gocql#97
So the different drivers can be test their new implementations of sni-proxy for using it, need to start the cluster from the commandline like this: ``` ❯ ccm start --sni-proxy sni_proxy listening on: 127.0.0.1:443 ``` using it from python code, would be a bit diffrent: ```python nodes_info = get_cluster_info(self.cluster.get_path(), address=self.cluster.nodelist()[0].address(), port=9142) docker_id, listen_address, listen_port = \ start_sni_proxy(self.cluster.get_path(), nodes_info=nodes_info) ``` Ref: scylladb/gocql#97
using code introduce in scylladb/gocql#97 for connect via sni_proxy to the serverless operator Ref: scylladb/gocql@751bff9
using code introduce in scylladb/gocql#97 for connect via sni_proxy to the serverless operator Ref: scylladb/gocql@751bff9
Services deployed in cloud often are hidden behind proxy which
dispatches connections based on Server Name Identifier (SNI)
taken from TLS Client Hello packet.
New method of creating ClusterConfig - NewCloudCluster - allows to
connect to nodes behind SNI proxy based on provided configuration file.
Because each datacenter may have different TLS configurtion (CA, proxy
address etc), more granular method of configuring connection details was
needed. CloudCluster use special HostDialer which connect to nodes using
information taken from HostInfo (datacenter, host_id) to go through SNI
proxy.
Currently driver identifies nodes based on their broadcasted IP
address. In cloud case, broadcasted IP addresses are private and are not
meant to be used as a contact point, and they may change overtime.
Hence driver internals were changed to identify nodes based on their
host_id which is unique per node and it's persistant throught entire
node lifecycle.
Because CQL Events are still using broadcasted IP addresses, driver
keep mapping between already known IP addresses and host_ids.