Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document ccr-stats telemetry device #676

Conversation

dliappis
Copy link
Contributor

@dliappis dliappis commented Apr 9, 2019

Also use fields prefixed by shard. for the values returned by
_ccr/stats and move "name:ccr-stats" to be at the top level of the
document (used to be "meta.name:ccr-stats").

Also use fields prefixed by `shard.` for the values returned by
`_ccr/stats` and move "name:ccr-stats" to be at the top level of the
document (used to be "meta.name:ccr-stats").
@dliappis dliappis added :Docs Changes to the documentation :Telemetry Telemetry Devices that gather additional metrics labels Apr 9, 2019
@dliappis dliappis added this to the 1.1.0 milestone Apr 9, 2019
@dliappis
Copy link
Contributor Author

dliappis commented Apr 9, 2019

An easy way to test this (I've used this workflow) is to place these scripts: https://gist.github.com/dliappis/c4c81d10b846d0b023963b8230a34020 in an empty directory and run ./start.sh. Kibana will run on the standard http://localhost:5601 .

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly fine but now that this is a documented telemetry device I think it should also show up in esrally list telemetry. Furthermore, I think a small example of how to use it would be great given that it requires multiple clusters and the setup is quite complex?

doc = {
"name": "ccr-stats",
"shard": shard_stats
}
shard_metadata = {
"cluster": self.cluster_name,
"index": name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be included in the shard stats already. Does it make sense to duplicate this information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shard metrics, as reported by the CCR stats endpoint are associated with each following index and I wanted to retain the structure in this respect.

For example in the following excerpt, we can see this duplication, a key index and then per shard the more specific leader-index / follower_index keys. I feel that for the sake of maintaining this structure it makes sense to keep this duplication.

   "indices" : [
      {
        "index" : "geonames-copy",
        "shards" : [
          {
            "remote_cluster" : "leader",
            "leader_index" : "geonames",
            "follower_index" : "geonames-copy",
            "shard_id" : 0,
            "leader_global_checkpoint" : -1,
            "leader_max_seq_no" : -1,
            "follower_global_checkpoint" : -1,
            "follower_max_seq_no" : -1,
            "last_requested_seq_no" : -1,
            "outstanding_read_requests" : 1,
            "outstanding_write_requests" : 0,
            "write_buffer_operation_count" : 0,
            "write_buffer_size_in_bytes" : 0,
            "follower_mapping_version" : 1,
            "follower_settings_version" : 1,
            "total_read_time_millis" : 0,
            "total_read_remote_exec_time_millis" : 0,
            "successful_read_requests" : 0,
            "failed_read_requests" : 0,
            "operations_read" : 0,
            "bytes_read" : 0,
            "total_write_time_millis" : 0,
            "successful_write_requests" : 0,
            "failed_write_requests" : 0,
            "operations_written" : 0,
            "read_exceptions" : [ ],
            "time_since_last_read_millis" : 4004
          },
          {
            "remote_cluster" : "leader",
            "leader_index" : "geonames",
            "follower_index" : "geonames-copy",
            "shard_id" : 1,
            "leader_global_checkpoint" : -1,
            "leader_max_seq_no" : -1,
            "follower_global_checkpoint" : -1,
            "follower_max_seq_no" : -1,
            "last_requested_seq_no" : -1,
            "outstanding_read_requests" : 1,
            "outstanding_write_requests" : 0,
            "write_buffer_operation_count" : 0,
            "write_buffer_size_in_bytes" : 0,
            "follower_mapping_version" : 1,
            "follower_settings_version" : 1,
            "total_read_time_millis" : 0,
            "total_read_remote_exec_time_millis" : 0,
            "successful_read_requests" : 0,
            "failed_read_requests" : 0,
            "operations_read" : 0,
            "bytes_read" : 0,
            "total_write_time_millis" : 0,
            "successful_write_requests" : 0,
            "failed_write_requests" : 0,
            "operations_written" : 0,
            "read_exceptions" : [ ],
            "time_since_last_read_millis" : 4087
          },

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm fine with that.

shard_metadata = {
"cluster": self.cluster_name,
"index": name,
"shard": shard_stats["shard_id"],
"name": "ccr-stats"
"shard": shard_stats["shard_id"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is included in the shard stats already. Does it make sense to duplicate this information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whereas here I think it doesn't make sense to duplicate the shard_id; I'll remove.

@@ -31,7 +31,8 @@

def list_telemetry():
console.println("Available telemetry devices:\n")
devices = [[device.command, device.human_name, device.help] for device in [JitCompiler, Gc, FlightRecorder, PerfStat, NodeStats, RecoveryStats]]
devices = [[device.command, device.human_name, device.help] for device in [JitCompiler, Gc, FlightRecorder, PerfStat,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also update the docs where we show an example of the listed telemetry devices?

@dliappis
Copy link
Contributor Author

@danielmitterdorfer Could you please take another pass esp now that I've added the ccr recipe in e592f17 and the corresponding documentation in 5ca587a?

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some, mostly minor, suggestions.

docs/recipes.rst Outdated

.. _recipe_testing_rally_against_ccr_clusters:

Testing Rally against CCR Clusters using a remote metric store
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: clusters (lower case)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 7202a00

docs/recipes.rst Outdated

1. Starts a small single node Elasticsearch cluster locally, to serve as a :ref:`metrics store <configuration_options>`. It also starts Kibana attached to the Elasticsearch metric store cluster.
2. Creates a new configuration file for Rally under ``~/.rally/rally-metricstore.ini`` referencing Elasticsearch from step 1.
3. Starts two additional small local Elasticsearch clusters with 1 node each, (version ``7.0.0`` by default) called ``leader`` and ``follower`` listening at ports 329201 and 32902 respectively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

329201 -> 39201

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 7202a00

docs/recipes.rst Outdated

Rally will push metrics to the metric store configured in 1. and they can be visualized by accessing Kibana at `http://locahost:5601 <http://localhost:5601>`_.

To tear down everything just issue ``./stop.sh``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "just"

docs/recipes.rst Outdated

Testing Rally features (such as the ``ccr-stats`` telemetry device) requiring Elasticsearch clusters configured for `cross-cluster replication <https://www.elastic.co/guide/en/elastic-stack-overview/current/ccr-getting-started.html>`_ can be a time consuming process.

For this reason we've created an example under `recipes/ccr in Rally's repository <https://github.com/elastic/rally/tree/master/recipes/ccr>`_ simplifying it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something along these lines?

Use `recipes/ccr in Rally's repository <https://github.com/elastic/rally/tree/master/recipes/ccr>`_ to test a simple but complete example.

(I'd also move this to the previous paragraph)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So no need to provide links to cross-cluster-replication from the Elastic website i.e. get rid of the entire line 158 completely, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I'd just turn this into a single paragraph. I don't think that we should have two paragraphs here. So line 158 would stay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in 7202a00

docs/recipes.rst Outdated

Running the ``start.sh`` script requires Docker locally installed and performs the following actions:

1. Starts a small single node Elasticsearch cluster locally, to serve as a :ref:`metrics store <configuration_options>`. It also starts Kibana attached to the Elasticsearch metric store cluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe state what "small" means? (i.e. 512MB heap for the metrics store and 1 GB heap for the regular nodes?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 7202a00

recipes/ccr/start.sh Show resolved Hide resolved
Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! LGTM

@dliappis dliappis merged commit 7d5ac72 into elastic:master Apr 12, 2019
@dliappis dliappis deleted the document-ccr-and-make-metric-doc-name-consistent branch April 12, 2019 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Docs Changes to the documentation :Telemetry Telemetry Devices that gather additional metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants