
Merge pull request #8 from ifesdjeen/patch-1

Updating cassandra.yaml template to conform to 1.2 and enable CQL3 support
commit 8b708c2fbe39478664f66a32385212480bde100c 2 parents ef5775b + 824a69a
Michael Klishin authored
Showing with 144 additions and 61 deletions.
  1. +1 −3 recipes/datastax.rb
  2. +143 −58 templates/default/cassandra.yaml.erb
recipes/datastax.rb (4 changes)
@@ -20,8 +20,6 @@
# This recipe relies on a PPA package and is Ubuntu/Debian specific. Please
# keep this in mind.
-include_recipe "java"
-
apt_repository "datastax" do
uri "http://debian.datastax.com/community"
distribution "stable"
@@ -37,7 +35,7 @@
action :install
end
-package "dsc" do
+package "dsc12" do
action :install
end
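Side note, not part of this commit: hard-coding "dsc12" means the recipe needs editing for each DataStax release line. A hedged sketch of an attribute-driven alternative; node[:cassandra][:package_name] is a hypothetical attribute, not one this cookbook defines:

    # Hypothetical variation: fall back to "dsc12" unless an (assumed)
    # node attribute overrides the package name.
    package(node[:cassandra][:package_name] || "dsc12") do
      action :install
    end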
templates/default/cassandra.yaml.erb (201 changes)
@@ -1,4 +1,4 @@
-# Cassandra storage config YAML
+# Cassandra storage config YAML
# NOTE:
# See http://wiki.apache.org/cassandra/StorageConfiguration for
@@ -9,7 +9,22 @@
# one logical cluster from joining another.
cluster_name: '<%= node[:cassandra][:cluster_name] %>'
-# You should always specify InitialToken when setting up a production
+# This defines the number of tokens randomly assigned to this node on the ring.
+# The more tokens, relative to other nodes, the larger the proportion of data
+# that this node will store. You probably want all nodes to have the same number
+# of tokens assuming they have equal hardware capability.
+#
+# If you leave this unspecified, Cassandra will use the default of 1 token for legacy compatibility,
+# and will use the initial_token as described below.
+#
+# Specifying initial_token will override this setting.
+#
+# If you already have a cluster with 1 token per node, and wish to migrate to
+# multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
+# num_tokens: 256
+
+# If you haven't specified num_tokens, or have set it to the default of 1 then
+# you should always specify InitialToken when setting up a production
# cluster for the first time, and often when adding capacity later.
# The principle is that each node should be given an equal slice of
# the token ring; see http://wiki.apache.org/cassandra/Operations
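A worked example, since balanced token assignment is where clusters commonly go wrong: when initial_token is used (num_tokens unset), each node should get an evenly spaced token. A minimal Ruby sketch for the Murmur3Partitioner range of -2^63 to 2^63 - 1, the partitioner this template configures below; the helper name is illustrative, not part of the cookbook:

    # Evenly spaced initial_token values for an n-node ring under
    # Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
    def murmur3_initial_tokens(n)
      (0...n).map { |i| (2**64 / n) * i - 2**63 }
    end

    murmur3_initial_tokens(3)
    # => [-9223372036854775808, -3074457345618258603, 3074457345618258602]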
@@ -24,10 +39,15 @@ initial_token: <%= node[:cassandra][:initial_token] %>
# See http://wiki.apache.org/cassandra/HintedHandoff
hinted_handoff_enabled: true
# this defines the maximum amount of time a dead host will have hints
-# generated. After it has been dead this long, hints will be dropped.
-max_hint_window_in_ms: 3600000 # one hour
-# Sleep this long after delivering each hint
-hinted_handoff_throttle_delay_in_ms: 1
+# generated. After it has been dead this long, new hints for it will not be
+# created until it has been seen alive and gone down again.
+max_hint_window_in_ms: 10800000 # 3 hours
+# throttle in KB per second, per delivery thread
+hinted_handoff_throttle_in_kb: 1024
+# Number of threads with which to deliver hints;
+# Consider increasing this number when you have multi-dc deployments, since
+# cross-dc handoff tends to be slower
+max_hints_delivery_threads: 2
# The following setting populates the page cache on memtable flush and compaction
# WARNING: Enable this setting only when the whole node's data fits in memory.
@@ -37,20 +57,19 @@ hinted_handoff_throttle_delay_in_ms: 1
# authentication backend, implementing IAuthenticator; used to identify users
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
-# authorization backend, implementing IAuthority; used to limit access/provide permissions
-authority: org.apache.cassandra.auth.AllowAllAuthority
+# authorization backend, implementing IAuthorizer; used to limit access/provide permissions
+authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
# The partitioner is responsible for distributing rows (by key) across
# nodes in the cluster. Any IPartitioner may be used, including your
# own as long as it is on the classpath. Out of the box, Cassandra
-# provides org.apache.cassandra.dht.RandomPartitioner
-# org.apache.cassandra.dht.ByteOrderedPartitioner,
-# org.apache.cassandra.dht.OrderPreservingPartitioner (deprecated),
-# and org.apache.cassandra.dht.CollatingOrderPreservingPartitioner
-# (deprecated).
-#
+# provides org.apache.cassandra.dht.{Murmur3Partitioner, RandomPartitioner,
+# ByteOrderedPartitioner, OrderPreservingPartitioner (deprecated)}.
+#
# - RandomPartitioner distributes rows across the cluster evenly by md5.
-# When in doubt, this is the best option.
+# This is the default prior to 1.2 and is retained for compatibility.
+# - Murmur3Partitioner is similar to RandomPartitioner but uses Murmur3_128
+# Hash Function instead of md5. When in doubt, this is the best option.
# - ByteOrderedPartitioner orders rows lexically by key bytes. BOP allows
# scanning rows in key order, but the ordering can generate hot spots
# for sequential insertion workloads.
@@ -62,7 +81,7 @@ authority: org.apache.cassandra.auth.AllowAllAuthority
#
# See http://wiki.apache.org/cassandra/Operations for more on
# partitioners and token selection.
-partitioner: org.apache.cassandra.dht.RandomPartitioner
+partitioner: org.apache.cassandra.dht.Murmur3Partitioner
# directories where Cassandra should store data on disk.
data_file_directories:
@@ -71,6 +90,15 @@ data_file_directories:
# commit log
commitlog_directory: <%= File.join(node.cassandra.data_root_dir, 'commitlog') %>
+# policy for data disk failures:
+# stop: shut down gossip and Thrift, leaving the node effectively dead, but
+# still inspectable via JMX.
+# best_effort: stop using the failed disk and respond to requests based on
+# remaining available sstables. This means you WILL see obsolete
+# data at CL.ONE!
+# ignore: ignore fatal errors and let requests fail, as in pre-1.2 Cassandra
+disk_failure_policy: stop
+
# Maximum size of the key cache in memory.
#
# Each key cache hit saves 1 seek and each row cache hit saves 2 seeks at the
@@ -140,7 +168,7 @@ row_cache_provider: SerializingCacheProvider
# saved caches
saved_caches_directory: <%= File.join(node.cassandra.data_root_dir, 'saved_caches') %>
-# commitlog_sync may be either "periodic" or "batch."
+# commitlog_sync may be either "periodic" or "batch."
# When in batch mode, Cassandra won't ack writes until the commit log
# has been fsynced to disk. It will wait up to
# commitlog_sync_batch_window_in_ms milliseconds for other writes, before
@@ -157,8 +185,8 @@ commitlog_sync_period_in_ms: 10000
# The size of the individual commitlog file segments. A commitlog
# segment may be archived, deleted, or recycled once all the data
-# in it (potentally from each columnfamily in the system) has been
-# flushed to sstables.
+# in it (potentially from each columnfamily in the system) has been
+# flushed to sstables.
#
# The default size is 32, which is almost always fine, but if you are
# archiving commitlog segments (see commitlog_archiving.properties),
@@ -169,7 +197,7 @@ commitlog_segment_size_in_mb: 32
# any class that implements the SeedProvider interface and has a
# constructor that takes a Map<String, String> of parameters will do.
seed_provider:
- # Addresses of hosts that are deemed contact points.
+ # Addresses of hosts that are deemed contact points.
# Cassandra nodes use this list of hosts to find each other and learn
# the topology of the ring. You must change this if you are running
# multiple nodes!
@@ -181,7 +209,7 @@ seed_provider:
# emergency pressure valve: each time heap usage after a full (CMS)
# garbage collection is above this fraction of the max, Cassandra will
-# flush the largest memtables.
+# flush the largest memtables.
#
# Set to 1.0 to disable. Setting this lower than
# CMSInitiatingOccupancyFraction is not likely to be useful.
@@ -197,8 +225,8 @@ flush_largest_memtables_at: 0.75
# Cassandra will reduce cache maximum _capacity_ to the given fraction
# of the current _size_. Should usually be set substantially above
# flush_largest_memtables_at, since that will have less long-term
-# impact on the system.
-#
+# impact on the system.
+#
# Set to 1.0 to disable. Setting this lower than
# CMSInitiatingOccupancyFraction is not likely to be useful.
reduce_cache_sizes_at: 0.85
@@ -261,7 +289,7 @@ ssl_storage_port: 7001
# Address to bind to and tell other Cassandra nodes to connect to. You
# _must_ change this if you want multiple nodes to be able to
# communicate!
-#
+#
# Leaving it blank leaves it up to InetAddress.getLocalHost(). This
# will always do the Right Thing *if* the node is properly configured
# (hostname, name resolution, etc), and the Right Thing is to use the
@@ -274,10 +302,29 @@ listen_address: <%= node[:cassandra][:listen_address] %>
# Leaving this blank will set it to the same value as listen_address
# broadcast_address: 1.2.3.4
+
+# Whether to start the native transport server.
+# Currently, only the thrift server is started by default because the native
+# transport is considered beta.
+# Please note that the address on which the native transport is bound is the
+# same as the rpc_address. The port however is different and specified below.
+start_native_transport: true
+# port for the CQL native transport to listen for clients on
+native_transport_port: 9042
+# The minimum and maximum threads for handling requests when the native
+# transport is used. Their meaning is similar to that of
+# rpc_min_threads and rpc_max_threads, though the defaults differ slightly and
+# are given below:
+# native_transport_min_threads: 16
+# native_transport_max_threads: 128
+
+
+# Whether to start the thrift rpc server.
+start_rpc: true
# The address to bind the Thrift RPC service to -- clients connect
# here. Unlike ListenAddress above, you *can* specify 0.0.0.0 here if
# you want Thrift to listen on all interfaces.
-#
+#
# Leaving this blank has the same effect it does for ListenAddress,
# (i.e. it will be based on the configured hostname of the node).
rpc_address: <%= node.cassandra.rpc_address %>
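A quick way to confirm the native transport actually came up after convergence (not part of this diff): a small Ruby stdlib probe of the CQL port. The host and port mirror the defaults above; this only checks that something is listening, it does not speak the CQL protocol:

    require "socket"
    require "timeout"

    # True if a TCP listener answers on host:port within `seconds`.
    def port_open?(host, port, seconds = 2)
      Timeout.timeout(seconds) do
        TCPSocket.new(host, port).close
        true
      end
    rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH, Timeout::Error, SocketError
      false
    end

    puts port_open?("127.0.0.1", 9042) ? "native transport up" : "not listening"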
@@ -287,36 +334,34 @@ rpc_port: 9160
# enable or disable keepalive on rpc connections
rpc_keepalive: true
-# Cassandra provides three options for the RPC Server:
-#
-# sync -> One connection per thread in the rpc pool (see below).
-# For a very large number of clients, memory will be your limiting
-# factor; on a 64 bit JVM, 128KB is the minimum stack size per thread.
-# Connection pooling is very, very strongly recommended.
+# Cassandra provides three out-of-the-box options for the RPC Server:
#
-# async -> Nonblocking server implementation with one thread to serve
-# rpc connections. This is not recommended for high throughput use
-# cases. Async has been tested to be about 50% slower than sync
-# or hsha and is deprecated: it will be removed in the next major release.
+# sync -> One thread per thrift connection. For a very large number of clients, memory
+# will be your limiting factor. On a 64 bit JVM, 128KB is the minimum stack size
+# per thread, and that will correspond to your use of virtual memory (but physical memory
+# may be limited depending on use of stack space).
#
-# hsha -> Stands for "half synchronous, half asynchronous." The rpc thread pool
-# (see below) is used to manage requests, but the threads are multiplexed
-# across the different clients.
+# hsha -> Stands for "half synchronous, half asynchronous." All thrift clients are handled
+# asynchronously using a small number of threads that does not vary with the number
+# of thrift clients (and thus scales well to many clients). The rpc requests are still
+# synchronous (one thread per active request).
#
# The default is sync because on Windows hsha is about 30% slower. On Linux,
# sync/hsha performance is about the same, with hsha of course using less memory.
+#
+# Alternatively, you can provide your own RPC server by supplying the fully-qualified class name
+# of an o.a.c.t.TServerFactory that can create an instance of it.
rpc_server_type: sync
-# Uncomment rpc_min|max|thread to set request pool size.
-# You would primarily set max for the sync server to safeguard against
-# misbehaved clients; if you do hit the max, Cassandra will block until one
-# disconnects before accepting more. The defaults for sync are min of 16 and max
-# unlimited.
-#
-# For the Hsha server, the min and max both default to quadruple the number of
-# CPU cores.
+# Uncomment rpc_min|max_thread to set request pool size limits.
+#
+# Regardless of your choice of RPC server (see above), the number of maximum requests in the
+# RPC thread pool dictates how many concurrent requests are possible (but if you are using the sync
+# RPC server, it also dictates the number of clients that can be connected at all).
#
-# This configuration is ignored by the async server.
+# The default is unlimited and thus provides no protection against clients overwhelming the server. You are
+# encouraged to set a maximum that makes sense for you in production, but do keep in mind that
+# rpc_max_threads represents the maximum number of client requests this server may execute concurrently.
#
# rpc_min_threads: 16
# rpc_max_threads: 2048
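If the cookbook later wants to expose these pool limits, a hypothetical ERB fragment in this template's style could emit them only when set; the rpc_min_threads/rpc_max_threads attributes are assumptions, not attributes this cookbook ships:

    <%# Hypothetical fragment: emit the limits only when the (assumed)
        attributes are set, otherwise keep Cassandra's own defaults. %>
    <% if node[:cassandra][:rpc_min_threads] -%>
    rpc_min_threads: <%= node[:cassandra][:rpc_min_threads] %>
    <% end -%>
    <% if node[:cassandra][:rpc_max_threads] -%>
    rpc_max_threads: <%= node[:cassandra][:rpc_max_threads] %>
    <% end -%>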
@@ -326,8 +371,6 @@ rpc_server_type: sync
# rpc_recv_buff_size_in_bytes:
# Frame size for thrift (maximum field length).
-# 0 disables TFramedTransport in favor of TSocket. This option
-# is deprecated; we strongly recommend using Framed mode.
thrift_framed_transport_size_in_mb: 15
# The max length of a thrift message, including all fields and
@@ -347,7 +390,7 @@ incremental_backups: false
snapshot_before_compaction: false
# Whether or not a snapshot is taken of the data before keyspace truncation
-# or dropping of column families. The STRONGLY advised default of true
+# or dropping of column families. The STRONGLY advised default of true
# should be used to provide data safety. If you set this flag to false, you will
# lose data on truncation or drop.
auto_snapshot: true
@@ -375,15 +418,13 @@ in_memory_compaction_limit_in_mb: 64
# slowly or too fast, you should look at
# compaction_throughput_mb_per_sec first.
#
-# This setting has no effect on LeveledCompactionStrategy.
-#
# concurrent_compactors defaults to the number of cores.
# Uncomment to make compaction mono-threaded, the pre-0.8 default.
#concurrent_compactors: 1
# Multi-threaded compaction. When enabled, each compaction will use
# up to one thread per core, plus one thread per sstable being merged.
-# This is usually only useful for SSD-based hardware: otherwise,
+# This is usually only useful for SSD-based hardware: otherwise,
# your concern is usually to get compaction to do LESS i/o (see:
# compaction_throughput_mb_per_sec), not more.
multithreaded_compaction: false
@@ -408,8 +449,26 @@ compaction_preheat_key_cache: true
# When unset, the default is 400 Mbps or 50 MB/s.
# stream_throughput_outbound_megabits_per_sec: 400
-# Time to wait for a reply from other nodes before failing the command
-rpc_timeout_in_ms: 10000
+# How long the coordinator should wait for read operations to complete
+read_request_timeout_in_ms: 10000
+# How long the coordinator should wait for seq or index scans to complete
+range_request_timeout_in_ms: 10000
+# How long the coordinator should wait for writes to complete
+write_request_timeout_in_ms: 10000
+# How long the coordinator should wait for truncates to complete
+# (This can be much longer, because unless auto_snapshot is disabled
+# we need to flush first so we can snapshot before removing the data.)
+truncate_request_timeout_in_ms: 60000
+# The default timeout for other, miscellaneous operations
+request_timeout_in_ms: 10000
+
+# Enable operation timeout information exchange between nodes to accurately
+# measure request timeouts. If disabled, Cassandra will assume the request
+# was forwarded to the replica instantly by the coordinator.
+#
+# Warning: before enabling this property make sure NTP is installed
+# and the times are synchronized between the nodes.
+cross_node_timeout: false
# Enable socket timeout for streaming operation.
# When a timeout occurs during streaming, streaming is retried from the start
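The single rpc_timeout_in_ms is gone in 1.2, replaced by the per-operation knobs above. Should the cookbook template these later, a hypothetical attributes-file sketch mirroring the defaults shown (attribute names assumed, not defined by this cookbook):

    # Hypothetical attributes/default.rb entries mirroring the 1.2
    # defaults above, if these settings ever become template-driven.
    default[:cassandra][:read_request_timeout_in_ms]     = 10_000
    default[:cassandra][:range_request_timeout_in_ms]    = 10_000
    default[:cassandra][:write_request_timeout_in_ms]    = 10_000
    default[:cassandra][:truncate_request_timeout_in_ms] = 60_000
    default[:cassandra][:request_timeout_in_ms]          = 10_000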
@@ -475,7 +534,7 @@ endpoint_snitch: SimpleSnitch
# controls how often to perform the more expensive part of host score
# calculation
-dynamic_snitch_update_interval_in_ms: 100
+dynamic_snitch_update_interval_in_ms: 100
# controls how often to reset all host scores, allowing a bad host to
# possibly recover
dynamic_snitch_reset_interval_in_ms: 600000
@@ -505,7 +564,7 @@ request_scheduler: org.apache.cassandra.scheduler.NoScheduler
# NoScheduler - Has no options
# RoundRobin
# - throttle_limit -- The throttle_limit is the number of in-flight
-# requests per client. Requests beyond
+# requests per client. Requests beyond
# that limit are queued up until
# running requests can complete.
# The value of 80 here is twice the number of
@@ -554,7 +613,7 @@ index_interval: 128
# the keystore and truststore. For instructions on generating these files, see:
# http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore
#
-encryption_options:
+server_encryption_options:
internode_encryption: none
keystore: conf/.keystore
keystore_password: cassandra
@@ -565,3 +624,29 @@ encryption_options:
# algorithm: SunX509
# store_type: JKS
# cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA]
+ # require_client_auth: false
+
+# enable or disable client/server encryption.
+client_encryption_options:
+ enabled: false
+ keystore: conf/.keystore
+ keystore_password: cassandra
+ # More advanced defaults below:
+ # protocol: TLS
+ # algorithm: SunX509
+ # store_type: JKS
+ # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA]
+ # require_client_auth: false
+
+# internode_compression controls whether traffic between nodes is
+# compressed.
+# can be: all - all traffic is compressed
+# dc - traffic between different datacenters is compressed
+# none - nothing is compressed.
+internode_compression: all
+
+# Enable or disable tcp_nodelay for inter-dc communication.
+# Disabling it will result in larger (but fewer) network packets being sent,
+# reducing overhead from the TCP protocol itself, at the cost of increasing
+# latency if you block for cross-datacenter responses.
+inter_dc_tcp_nodelay: true
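For the server and client encryption blocks above, the keystore must exist before enabling them. A hedged Chef sketch of generating one with keytool; the keytool flags are standard, but the resource, paths, and passwords are placeholders, not part of this cookbook:

    # Hypothetical resource: create the keystore the encryption options
    # point at, per the JSSE keystore guide linked in the template.
    execute "generate-cassandra-keystore" do
      command "keytool -genkeypair -alias cassandra -keyalg RSA " \
              "-keystore /etc/cassandra/conf/.keystore " \
              "-storepass cassandra -keypass cassandra " \
              "-dname 'CN=#{node[:fqdn]}'"
      creates "/etc/cassandra/conf/.keystore"
    end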

1 comment on commit 8b708c2

αλεx π

<3 Thank you!
