@bnaecker bnaecker commented Dec 7, 2021

  • Uses the diesel_dtrace::DTraceConnection object for connecting to
    CockroachDB, which contains probes that fire when a connection is made
    and for each query.
  • Adds a Nexus configuration option for handling failures of probe
    registration. If true, the application aborts when the probes cannot be
    registered on startup; otherwise, a log message is printed.
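For readers skimming the description, the behavior of the new option can be sketched roughly as below. This is only an illustration of the abort-or-log decision, not the code in this PR: `RegistrationError`, `StartupOutcome`, `handle_probe_registration`, and the `fail_on_error` flag name are all hypothetical, and `eprintln!` stands in for the real logger.

```rust
use std::fmt;

/// Hypothetical error type standing in for whatever the real probe
/// registration call returns on failure.
#[derive(Debug, Clone, PartialEq)]
pub struct RegistrationError(pub String);

impl fmt::Display for RegistrationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// What startup did after attempting to register the probes.
#[derive(Debug, PartialEq)]
pub enum StartupOutcome {
    /// Probes were registered and startup continues normally.
    Registered,
    /// Registration failed, the failure was logged, and startup continues.
    ContinuedWithWarning(String),
}

/// Applies the configuration flag to the result of probe registration.
///
/// `fail_on_error` is an assumed name for the option described above.
/// When it is `true`, a registration failure is returned to the caller,
/// which is expected to abort startup. When it is `false`, the failure
/// is only logged.
pub fn handle_probe_registration(
    result: Result<(), RegistrationError>,
    fail_on_error: bool,
) -> Result<StartupOutcome, RegistrationError> {
    match result {
        Ok(()) => Ok(StartupOutcome::Registered),
        Err(e) if fail_on_error => Err(e),
        Err(e) => {
            let msg = format!("failed to register DTrace probes: {}", e);
            // Stand-in for the application's structured logger.
            eprintln!("{}", msg);
            Ok(StartupOutcome::ContinuedWithWarning(msg))
        }
    }
}
```

Returning the error instead of aborting inside the helper keeps the decision testable; the caller at the top of startup would be the one to exit.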

bnaecker commented Dec 7, 2021

For reference, here's the output of the query probes when nexus and oximeter start:

bnaecker@shale : ~/omicron $ pfexec dtrace -x strsize=4k -Zqn 'diesel_db*:::query_start { printf("%d: %s\n", arg0, copyinstr(arg1)); }'
115964116993: SELECT "saga"."id", "saga"."creator", "saga"."template_name", "saga"."time_created", "saga"."saga_params", "saga"."saga_state", "saga"."current_sec", "saga"."adopt_generation", "saga"."adopt_time" FROM "saga" WHERE (("saga"."saga_state" != $1) AND ("saga"."current_sec" = $2)) ORDER BY "saga"."id" ASC  LIMIT $3 -- binds: ["done", e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, 100]
115964116994: INSERT INTO "oximeter" ("id", "time_created", "time_modified", "ip", "port") VALUES ($1, $2, $3, $4, $5) ON CONFLICT ("id") DO UPDATE SET "time_modified" = $6, "ip" = $7, "port" = $8 -- binds: [1da65e5b-210c-4859-a7d7-200c1e659972, 2021-12-07T01:50:40.935912638Z, 2021-12-07T01:50:40.935912638Z, V6(Ipv6Network { addr: ::1, prefix: 128 }), 12223, 2021-12-07T01:50:40.936000269Z, V6(Ipv6Network { addr: ::1, prefix: 128 }), 12223]
115964116995: SELECT "oximeter"."id", "oximeter"."time_created", "oximeter"."time_modified", "oximeter"."ip", "oximeter"."port" FROM "oximeter" ORDER BY "oximeter"."id" ASC  LIMIT $1 -- binds: [1]
115964116996: SELECT "saga"."id", "saga"."creator", "saga"."template_name", "saga"."time_created", "saga"."saga_params", "saga"."saga_state", "saga"."current_sec", "saga"."adopt_generation", "saga"."adopt_time" FROM "saga" WHERE (("saga"."saga_state" != $1) AND ("saga"."current_sec" = $2)) ORDER BY "saga"."id" ASC  LIMIT $3 -- binds: ["done", e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, 100]
115964116997: INSERT INTO "oximeter" ("id", "time_created", "time_modified", "ip", "port") VALUES ($1, $2, $3, $4, $5) ON CONFLICT ("id") DO UPDATE SET "time_modified" = $6, "ip" = $7, "port" = $8 -- binds: [1da65e5b-210c-4859-a7d7-200c1e659972, 2021-12-07T01:50:57.035735022Z, 2021-12-07T01:50:57.035735022Z, V6(Ipv6Network { addr: ::1, prefix: 128 }), 12223, 2021-12-07T01:50:57.035753310Z, V6(Ipv6Network { addr: ::1, prefix: 128 }), 12223]
115964116998: SELECT "metric_producer"."id", "metric_producer"."time_created", "metric_producer"."time_modified", "metric_producer"."ip", "metric_producer"."port", "metric_producer"."interval", "metric_producer"."base_route", "metric_producer"."oximeter_id" FROM "metric_producer" WHERE ("metric_producer"."oximeter_id" = $1) ORDER BY "metric_producer"."oximeter_id", "metric_producer"."id" LIMIT $2 -- binds: [1da65e5b-210c-4859-a7d7-200c1e659972, 100]
115964116999: SELECT "oximeter"."id", "oximeter"."time_created", "oximeter"."time_modified", "oximeter"."ip", "oximeter"."port" FROM "oximeter" ORDER BY "oximeter"."id" ASC  LIMIT $1 -- binds: [1]
115964117000: INSERT INTO "metric_producer" ("id", "time_created", "time_modified", "ip", "port", "interval", "base_route", "oximeter_id") VALUES ($1, $2, $3, $4, $5, $6, $7, $8) ON CONFLICT ("id") DO UPDATE SET "time_modified" = $9, "ip" = $10, "port" = $11, "interval" = $12, "base_route" = $13 -- binds: [e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, 2021-12-07T01:50:57.065900961Z, 2021-12-07T01:50:57.065900961Z, V4(Ipv4Network { addr: 127.0.0.1, prefix: 32 }), 12221, 10.0, "/metrics/collect", 1da65e5b-210c-4859-a7d7-200c1e659972, 2021-12-07T01:50:57.066000104Z, V4(Ipv4Network { addr: 127.0.0.1, prefix: 32 }), 12221, 10.0, "/metrics/collect"]
115964117001: SELECT "saga"."id", "saga"."creator", "saga"."template_name", "saga"."time_created", "saga"."saga_params", "saga"."saga_state", "saga"."current_sec", "saga"."adopt_generation", "saga"."adopt_time" FROM "saga" WHERE (("saga"."saga_state" != $1) AND ("saga"."current_sec" = $2)) ORDER BY "saga"."id" ASC  LIMIT $3 -- binds: ["done", e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, 100]
115964117002: SELECT "saga"."id", "saga"."creator", "saga"."template_name", "saga"."time_created", "saga"."saga_params", "saga"."saga_state", "saga"."current_sec", "saga"."adopt_generation", "saga"."adopt_time" FROM "saga" WHERE (("saga"."saga_state" != $1) AND ("saga"."current_sec" = $2)) ORDER BY "saga"."id" ASC  LIMIT $3 -- binds: ["done", e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, 100]

The first argument is a unique identifier that allows correlating the start/end probes for both connections and queries.
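To make the correlation idea concrete, here is a minimal Rust sketch of how a consumer of these events might pair each start with its matching end by that identifier and compute a latency. Everything in it is an assumption for illustration: the `ProbeEvent` and `QueryLatency` types, the `correlate` function, and the `timestamp_ns` field (which would come from the tracing tool, since the probe arguments shown above carry only the ID and the query text).

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a start/end probe firing.
#[derive(Debug, Clone, PartialEq)]
pub enum ProbeEvent {
    /// A query began; `id` is the unique identifier from the first argument.
    Start { id: u64, timestamp_ns: u64, query: String },
    /// The query with the same `id` finished.
    Done { id: u64, timestamp_ns: u64 },
}

/// A start/end pair that was matched by ID.
#[derive(Debug, PartialEq)]
pub struct QueryLatency {
    pub id: u64,
    pub query: String,
    pub elapsed_ns: u64,
}

/// Pairs each end event with the start event that has the same ID.
/// End events with no recorded start are ignored, and starts that never
/// finish are simply left out of the result.
pub fn correlate(events: &[ProbeEvent]) -> Vec<QueryLatency> {
    // In-flight queries: ID -> (start timestamp, query text).
    let mut in_flight: HashMap<u64, (u64, &str)> = HashMap::new();
    let mut latencies = Vec::new();

    for event in events {
        match event {
            ProbeEvent::Start { id, timestamp_ns, query } => {
                in_flight.insert(*id, (*timestamp_ns, query.as_str()));
            }
            ProbeEvent::Done { id, timestamp_ns } => {
                if let Some((start_ns, query)) = in_flight.remove(id) {
                    latencies.push(QueryLatency {
                        id: *id,
                        query: query.to_string(),
                        elapsed_ns: (*timestamp_ns).saturating_sub(start_ns),
                    });
                }
            }
        }
    }
    latencies
}
```

Because the key is the ID rather than arrival order, interleaved queries on different connections are still matched correctly.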

@smklein smklein left a comment

Looks great, thank you for driving this!

@bnaecker bnaecker force-pushed the diesel-dtrace-connection branch from 3248bf2 to f3f50cc December 7, 2021 22:21
@bnaecker bnaecker merged commit fbf79ab into main Dec 7, 2021
@bnaecker bnaecker deleted the diesel-dtrace-connection branch December 7, 2021 23:33
@bnaecker bnaecker mentioned this pull request Feb 16, 2022
leftwo pushed a commit that referenced this pull request Aug 24, 2023
Updated sled-agent to no longer expect a
propolis_client::instance_spec::VolumeConstructionRequest when it is
now just a Crucible VolumeConstructionRequest.

Changes in Crucible:
eliminate spurious metadb-related syncs (#881)
ACK the write after adding it to the work queue (#874)
Use arc4random_buf or getrandom instead of ChaCha20Rng (#878)
Fix crutest in README (#879)
Show client_id in panic messages (#843)
Reduce sqlite page cache size to 64KiB (#876)
Only flush extents that might actually be dirty. (#875)

Changes in Propolis:
Clean up and restructure CQE handling in NVMe
Improve error message when /dev/vmmctl not present (#506)
new API endpoint for propolis-server to replace a crucible downstairs (#495)
Update rustls-webpki for GHSA-fh2r-99q2-6mmg
Add MemAsync block backend
try reenabling PHD jobs (#484)
Define versioned instance specs (#472)
Update toml dependency to 0.7.x
Add more USDT probes to NVMe emulation
Add more to standalone-with-crucible (#490)
Update propolis with new Crucible Volume change (#485)
Minor polish to standalone-crucible doc
Clean up bits for crucible in propolis-standalone
doc iteration for crucible and propolis-standalone
Skeleton docs for using propolis with crucible disks in isolation
propolis-standalone: Update expected crucible opts (#488)
Split up README content for server and standalone
Add crucible config to propolis-standalone
Use libstd-provided OnceCell equivalent
Allow 64-vCPU instances on Helios (stlouis)
Elide test (and doctest) steps where not required
Clean up NVMe PRP parsing and add tests
@leftwo leftwo mentioned this pull request Aug 24, 2023
leftwo added a commit that referenced this pull request Aug 24, 2023
Updated sled-agent to no longer expect a
propolis_client::instance_spec::VolumeConstructionRequest when it is now
just a Crucible VolumeConstructionRequest.

Changes in Crucible:
eliminate spurious metadb-related syncs (#881)
ACK the write after adding it to the work queue (#874)
Use arc4random_buf or getrandom instead of ChaCha20Rng (#878)
Fix crutest in README (#879)
Show client_id in panic messages (#843)
Reduce sqlite page cache size to 64KiB (#876)
Only flush extents that might actually be dirty. (#875)

Changes in Propolis:
Clean up and restructure CQE handling in NVMe
Improve error message when /dev/vmmctl not present (#506)
new API endpoint for propolis-server to replace a crucible downstairs (#495)
Update rustls-webpki for GHSA-fh2r-99q2-6mmg
Add MemAsync block backend
try reenabling PHD jobs (#484)
Define versioned instance specs (#472)
Update toml dependency to 0.7.x
Add more USDT probes to NVMe emulation
Add more to standalone-with-crucible (#490)
Update propolis with new Crucible Volume change (#485)
Minor polish to standalone-crucible doc
Clean up bits for crucible in propolis-standalone
doc iteration for crucible and propolis-standalone
Skeleton docs for using propolis with crucible disks in isolation
propolis-standalone: Update expected crucible opts (#488)
Split up README content for server and standalone
Add crucible config to propolis-standalone
Use libstd-provided OnceCell equivalent
Allow 64-vCPU instances on Helios (stlouis)
Elide test (and doctest) steps where not required
Clean up NVMe PRP parsing and add tests

Co-authored-by: Alan Hanson <alan@oxide.computer>