New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cassandra): instrumentation #437

Merged
merged 8 commits into from Jul 27, 2018

Conversation

Projects
None yet
4 participants
@Qard
Member

Qard commented Jul 6, 2018

This adds instrumentation for the cassandra-driver module. It currently lacks support for the client.batch(...) method, and I have some more work to do on supporting older versions of the module.

Fixes #303

Checklist

  • Implement code
  • Add tests
  • Update documentation

@Qard Qard self-assigned this Jul 6, 2018

@Qard Qard requested a review from watson Jul 6, 2018

@@ -12,7 +12,31 @@ var request = require('../request')
var Transaction = require('./transaction')
var shimmer = require('./shimmer')
var MODULES = ['http', 'https', 'http2', 'generic-pool', 'mongodb-core', 'pg', 'mysql', 'mysql2', 'express', 'express-queue', 'hapi', 'redis', 'ioredis', 'bluebird', 'knex', 'koa-router', 'ws', 'graphql', 'express-graphql', 'elasticsearch', 'handlebars', 'mimic-response']
var MODULES = [

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

👍👍👍👍👍👍👍👍👍

.travis.yml Outdated
@@ -72,7 +73,7 @@ jobs:
- stage: dependencies
node_js: '10'
if: type IN (cron, pull_request) AND NOT branch =~ ^greenkeeper/.*
env: TAV=generic-pool,mysql,mysql2,redis,koa-router,handlebars,mongodb-core
env: TAV=generic-pool,mysql,mysql2,redis,koa-router,handlebars,mongodb-core,cassandra-driver

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

We should probably add this to the ioredis,pg build as that's currently the one that takes the shortest. This one seems to on average take the longest.

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

Will do. 👍

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

fixed.

var shimmer = require('../shimmer')
module.exports = function (cassandra, agent, version, enabled) {
if (!enabled) return cassandra

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

Does cassandra have a callback queue we need to patch even if disabled?

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

Nope. There's no pooling or callback queue in the driver.

if (span) {
var query = args[0]
span.type = 'db.cassandra.query'
span.setDbContext({ statement: query, type: 'sql' })

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

I've never used Cassandra, so excuse my ignorance here... The query language looks like SQL, but they seem to call it CQL. But is it just an alias?

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

I just checked with the Python agent and they call it sql as well, but I think that might be wrong. I just checked what done in OpenTracing. I found one semi official Cassandra instrumentation and they use type cassandra:

https://github.com/opentracing-contrib/java-cassandra-driver/blob/a3d068c8601146dfdaf46557422a55379505d9fb/src/main/java/io/opentracing/contrib/cassandra/TracingSession.java#L319

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

I changed it to cassandra.

return cassandra
function wrapInnerExecute (original) {
return function wrappedInnerExecute (...args) {

This comment has been minimized.

@watson

watson Jul 6, 2018

Member

The spread operator isn't isn't supported in Node 4 without a flag.

We can drop support for v4, but should bump major if we do. If not too much of a hassle I'd like to wait bumping major until v2 of the intake API. What do you think?

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

Yeah, I don't need to use the spread operator. I just keep forgetting it didn't exist yet in 4.x. 😅

This comment has been minimized.

@Qard

Qard Jul 6, 2018

Member

fixed.

@Qard

This comment has been minimized.

Member

Qard commented Jul 7, 2018

Seems a ps1 script is needed to install cassandra on appveyor. I'm not too familiar with how that stuff works. Jenkins also seems to be missing cassandra...or it's somewhere else? 🤔

@watson

This comment has been minimized.

Member

watson commented Jul 7, 2018

@Qard I've updated the PR to install Cassandra on both AppVeyor and Jenkins

@watson

What's the difference between _innerExecute and _batchCb? Is the former just a single query?

'use strict'
/**
* TODO: Support batch queries

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

Do you want to release without support for batch queries?

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

Looking further down it seems like support for batch queries have been added. Maybe just old comment?

This comment has been minimized.

@Qard

Qard Jul 10, 2018

Member

Yep. Old comment...removing...

.tav.yml Outdated
@@ -128,3 +128,8 @@ hapi-async-await:
node: '>=8.2'
versions: '^17.0.0'
commands: node test/instrumentation/modules/hapi.js
cassandra-driver:
versions: '>3.1.0 <4'

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

From the tav tests it looks like 3.1.x isn't supported.

You can run ./node_modules/.bin/tav cassandra-driver ^3.1.0 --compat -- node test/instrumentation/modules/cassandra/cassandra.js locally to get an overview of what's supported.

module.exports = function (cassandra, agent, version, enabled) {
if (!enabled) return cassandra
if (!semver.satisfies(version, '^3.0.0')) {

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

This semver range doesn't match the one listed in compatibility.asciidoc

agent.logger.debug('shimming cassandra.Client.prototype._innerExecute')
shimmer.wrap(cassandra.Client.prototype, '_innerExecute', wrapInnerExecute)
shimmer.wrap(cassandra.Client.prototype, '_batchCb', wrapBatch)
}

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

Consider adding an else block with something like:

agent.logger.debug('could not find cassandra.Client in cassandra-driver@%s - aborting...', version)
module.exports = function (cassandra, agent, version, enabled) {
if (!enabled) return cassandra
if (!semver.satisfies(version, '^3.0.0')) {
agent.logger.debug('cassandra version %s not supported - aborting...', version)

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

I would probably write the full module name here (s/cassandra/cassandra-driver/) just to be consistent 😃

This comment has been minimized.

@Qard

Qard Jul 10, 2018

Member

done.

return function wrappedBatch (queries, options, callback) {
const spans = []
for (let query of queries) {

This comment has been minimized.

@watson

watson Jul 10, 2018

Member

Seems kind of wasteful to create an array of queries that are all ended an started at the same time. Does the driver send the queries to the database all in the same request, or is it just an abstraction layer where each query is actually performed independently?

This comment has been minimized.

@Qard

Qard Jul 10, 2018

Member

I'm not really sure how to express the multiple queries in a single span. Ideas?

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

Do you know if the batch of queries are sent to the database in one request? If so why not simply have a single span? Splitting up the spans seem to indicate multiple requests to the database.

I guess your question is how we then record the query string. This is a bit of a hack, but how about joining all query strings with ;\n and reporting the resulting string on the batch span? I don't like that solution, but I think I like it more than creating one span per query in the batch. But I might be viewing this from the wrong angle. So I can probably be convinced otherwise.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

Well, they kind of come back off a stream, internally. So it theoretically "starts" and "finishes" the query at a different time, but that's not really usefully visible to the user in any way. Just joining on ; should be fine. I just wasn't sure if there was any UI implications of doing that, or if we wanted to keep them logically separate for any reason.

@watson

This comment has been minimized.

Member

watson commented Jul 10, 2018

Besides the db.cassandra.query span type that we have here, the Python agent also has a db.cassandra.connect span type which is used to indicate when a new socket is opened before a query can be made. Would it make sense for us to create a similar span?

return ret
function wrapCallback (cb) {
if (!cb) return

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

It might be safest to expand this check to typeof cb !== 'function'

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

Also, it should probably then also return the callback, so it's as if you called original with that argument just as if we hadn't patched it.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

fixed.

// Wrap the callback
const ret = original.call(this, wrapCallback(callback))
if (typeof callback !== 'function') {

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

It might be safest if we checked if ret.then was a function as well. Just in case the API changes. In that case we can't end the span though, so we should consider what to do instead.

The best solution is probably to simply bail in that case and abort the span. I don't think we have a way to do that at the moment, but it should be too hard to add a span.[cancel|abort|destroy] function. That function needs to either remove the span from the transaction._builtSpans array or update the logic that loops over this array when the transaction ends to ensure that it's ignored instead of being truncated.

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

I guess we have the same issue with not ending the span in these edge cases with all the other instrumentations we're doing. So maybe it's out of scope of this instrumentation and we should simply make another issue to improve this.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

fixed.

const ret = original.apply(this, arguments)
if (isPromise && spans.length) {

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

Just as in the wrapConnect function above we should make sure that ret.then is actually a function and maybe find ways to cancel the span in case it's not.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

fixed.

// Wrap the callback
const index = arguments.length - 1
const hasCallback = typeof arguments[index - 1] === 'function'

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

It took me a while to understand this logic. I think it would make it easier to read if you renamed this variable to hasRowCallback.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

fixed.

// Wrap the callback
const index = arguments.length - 1
const hasCallback = typeof arguments[index - 1] === 'function'
const cb = hasCallback ? arguments[index] : () => {}

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

There's check for if arguments[index] actually isn't a function in case people use the cassandra-driver API wrong. Should we check for that you think and if so how can we deal with it?

const ret = original.apply(this, arguments)
if (isPromise) {

This comment has been minimized.

@watson

watson Jul 11, 2018

Member

Just as in the wrapConnect function above we should make sure that ret.then is actually a function and maybe find ways to cancel the span in case it's not.

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

fixed.

}
if (hasRowCallback) {
const cb = arguments[index]

This comment has been minimized.

@watson

watson Jul 13, 2018

Member

It's technically possible for the user to call the batch function with a proper rowCallback in the 2nd to last position and a non-function argument in the last position. I was wondering if we could deal with that in some way... what do you think?

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

What would we do in that situation? It's an invalid input, and I'm not sure how we could recover from that other than just leaving the span un-ended. 🤔

This comment has been minimized.

@watson

watson Jul 14, 2018

Member

No we can't recover, and in we probably need a general way to "destroy" a span if we find that we can't detect when it ends. But adding that ability to spans is probably out of scope for this PR.

In regards to invalid input in general, I try to just forward that invalid input to the module being patched so that it can use its normal error handling.

This comment has been minimized.

@Qard

Qard Jul 18, 2018

Member

fixed

const queryStrings = queries.map(toQueryString)
const querySummaries = queryStrings.map(sqlSummary)
const query = queryStrings.join(';\n')
const querySummary = querySummaries.join('; ')

This comment has been minimized.

@watson

watson Jul 13, 2018

Member

I'm worried this might become a really long name in case there's a lot of queries.

Possible solutions:

  1. Only show the first query summary. This is how we do it for SQL query batches in mysql and postgresql today. But this is actually just out of laziness and should probably be reconsidered anyway
  2. Make sure all query summaries are unique, e.g. querySummaries.filter(unique).join('; ')
  3. Don't show any summary at all and just name the span Batch Query

cc @beniwohli

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

Yeah, I think something like just BATCH is better for the name than joining all the queries. It gets long quickly, even summarized.

This comment has been minimized.

@watson

watson Jul 14, 2018

Member

Any reason why you want to uppercase it?

This comment has been minimized.

@Qard

Qard Jul 17, 2018

Member

Not really. Sometimes people capitalize keywords in SQL, and I was thinking of it kind of like that, but whatever works. 🤷‍♂️

This comment has been minimized.

@watson

watson Jul 17, 2018

Member

Yeah exactly. I just wanted to distinguish this from an actual SQL keyword as - if I understand correctly - there's no such thing as a BATCH keyword in Cassandra right? It's just so that people don't accidentally confuse it with a command you can actually give to Cassandra, that's all 😄

This comment has been minimized.

@Qard

Qard Jul 17, 2018

Member

Makes sense. Maybe just batch query?

This comment has been minimized.

@watson

watson Jul 18, 2018

Member

We have previously prefixed certain span names with the module that created them, e.g. Elasticsearch span names are prefixed Elasticsearch: . For SQL queries we don't normally do this as it's usually pretty obvious where they are coming from, but just as with Elasticsearch we might consider prefixing Cassandra spans as well. If so it would be something like Cassandra: Batch query. What do you think?

This comment has been minimized.

@Qard

Qard Jul 18, 2018

Member

fixed

}
span.type = 'db.cassandra.connect'
span.start()

This comment has been minimized.

@watson

watson Jul 13, 2018

Member

This span isn't named (it will fall back to the default name unnamed). We should coordinate with the Python agent and agree on a name.

Btw, you can just give the name and type as arguments to start() instead of setting them as properties, eg:

span.start('Cassandra: Connect', 'db.cassandra.connect')

cc @beniwohli

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

I wasn't sure what to call it, since there's no query to summarize for the name. I'm good with whatever. 🤷‍♂️

This comment has been minimized.

@watson

watson Jul 17, 2018

Member

@beniwohli what did you call the connect span in the Python agent?

This comment has been minimized.

@beniwohli

beniwohli Jul 18, 2018

Member

in this case, we fall back to the class/method name that we wrap, e.g. Cluster.connect

This comment has been minimized.

@watson

watson Jul 18, 2018

Member

The Python agent prefixes spans with the name of the class and function that was used to start the span. So in the case of here the span would be called Cluster.connect in Python. I think we for now just should call it Cassandra: Connect. That aligns with how we name the Elasticsearch queries. Aligning across agents will be a bigger task and should probably not hold back this PR.

This comment has been minimized.

@Qard

Qard Jul 18, 2018

Member

fixed

}
function assertConnectSpan (t, span) {
t.equal(span.type, 'db.cassandra.connect', 'span type')

This comment has been minimized.

@watson

watson Jul 13, 2018

Member

Add a check for the span.name as well

This comment has been minimized.

@Qard

Qard Jul 13, 2018

Member

Will do, when we decide what the name should be. 👍

This comment has been minimized.

@Qard

Qard Jul 18, 2018

Member

fixed

@codecov-io

This comment has been minimized.

codecov-io commented Jul 18, 2018

Codecov Report

Merging #437 into master will increase coverage by 0.45%.
The diff coverage is 95.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #437      +/-   ##
==========================================
+ Coverage   82.29%   82.74%   +0.45%     
==========================================
  Files          42       43       +1     
  Lines        2270     2307      +37     
==========================================
+ Hits         1868     1909      +41     
+ Misses        402      398       -4
Impacted Files Coverage Δ
lib/instrumentation/index.js 95.19% <100%> (ø) ⬆️
lib/instrumentation/modules/cassandra-driver.js 95.34% <95.34%> (ø)
lib/parsers.js 84.21% <0%> (-0.14%) ⬇️
lib/agent.js 94.35% <0%> (-0.04%) ⬇️
lib/instrumentation/shimmer.js 62.74% <0%> (+1.96%) ⬆️
lib/instrumentation/modules/mimic-response.js 100% <0%> (+41.66%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ceedd43...13d64a5. Read the comment docs.

@Qard Qard changed the title from cassandra: instrumentation to feat(cassandra): instrumentation Jul 20, 2018

@Qard

This comment has been minimized.

Member

Qard commented Jul 26, 2018

Rebased to deal with conflicts from MSSQL PR.

@watson

watson approved these changes Jul 27, 2018

@watson watson merged commit b4a63a3 into elastic:master Jul 27, 2018

7 checks passed

CLA Commit author is a member of Elasticsearch
Details
apm-ci Build finished.
Details
codecov/patch 87.5% of diff hit (target 82.29%)
Details
codecov/project 82.5% (target 80%)
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
security/snyk - package.json (watson) No new issues
Details

@zube zube bot removed the [zube]: In Review label Jul 27, 2018

@zube zube bot added the [zube]: Done label Jul 27, 2018

@Qard Qard deleted the Qard:cassandra branch Aug 9, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment