Utility for diagnose pg locked - Step 1 #15468
Current status: the script can output YAML for a local vmdb_production database. A version that accepts command line options and uses them to connect to remote databases is a work in progress. However, some questions need to be clarified:
The script now supports command line arguments: -p for password, -u for user, -s for server, --port for port, and a trailing filename.yml for the output.
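A minimal sketch of how these arguments could be parsed with Ruby's standard OptionParser. The short flags, --port, and the trailing filename come from the comment above; the other long option names, the `parse_arguments` helper, and the default output filename are assumptions for illustration and may differ from the PR's code.

```ruby
require 'optparse'

# Hypothetical helper: parses the flags described above into an options hash.
def parse_arguments(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: step1.rb [options] output.yml"
    opts.on("-u", "--user USER", "Database user") { |v| options[:user] = v }
    opts.on("-p", "--password PASSWORD", "Database password") { |v| options[:password] = v }
    opts.on("-s", "--server HOST", "Database server") { |v| options[:host] = v }
    opts.on("--port PORT", Integer, "Database port") { |v| options[:port] = v }
  end
  # parse returns the non-option arguments; the first one is taken as the output file.
  rest = parser.parse(argv)
  options[:output] = rest.first || "output.yml" # assumed default, not from the PR
  options
end
```

For example, `parse_arguments(%w[-u root -s localhost --port 5432 out.yml])` yields a hash with `:user`, `:host`, `:port` (as an Integer), and `:output`.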
tools/lock_inspect/evm_connection.rb
end

begin
  res = conn.exec_params('SELECT application_name FROM pg_stat_activity ORDER BY application_name;')
@ailisp we probably want to get a list of columns in pg_stat_activity and include them along with the application_name. We definitely need the pg backend pid/spid (I think that's pid), client_addr, client_hostname, backend_start, xact_start, query_start, waiting, and maybe query. cc @gtanzillo
The idea is that this YAML file can be processed in later steps, so we can try to correlate behavior with the cause of bad behavior.
@jrafanie what does backend_xid mean? I see it's always empty. Also, client_addr/client_hostname are ::1 or empty. So should we get these from application_name and the miq_server table?
I think that's the postgresql transaction id. I'd think the client_addr and hostname should be populated if you're not connecting locally. Let's start with application_name and a few of those other columns from pg_stat_activity. Don't look at miq_server table yet. We'll have that in another step. This step needs to hit only the pg_stat_activity table so we avoid locks.
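A minimal sketch of a query restricted to pg_stat_activity with the columns named in this thread. The constant and helper names are hypothetical, and `conn` is assumed to respond to `exec` and return an enumerable of row hashes, as `PG::Connection#exec` from the pg gem does.

```ruby
# Columns requested in the review comments. `waiting` exists in the PostgreSQL
# version shown in this thread; later releases replaced it with other columns.
ACTIVITY_COLUMNS = %w[
  application_name pid client_addr client_hostname
  backend_start xact_start query_start waiting query
].freeze

# Builds the query text. Only pg_stat_activity is read, no application tables.
def activity_query(columns = ACTIVITY_COLUMNS)
  "SELECT #{columns.join(', ')} FROM pg_stat_activity ORDER BY application_name;"
end

# Assumption: conn.exec(sql) returns an enumerable of hashes keyed by column name.
def fetch_activity(conn)
  conn.exec(activity_query).to_a
end
```

Keeping the column list in one constant makes it easy to add or drop columns as the later steps settle on what they need.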
OK! Thank you.
Why do we need to truncate application_name?
def database_application_name
  zone = MiqServer.my_server.zone
  "MIQ #{Process.pid} #{minimal_class_name}[#{compressed_id}], s[#{miq_server.compressed_id}], #{zone.name}[#{zone.compressed_id}]".truncate(64)
end
There seems to be no restriction on this length.
I believe so; in the default installation the limit is 64 characters: https://www.postgresql.org/docs/9.6/static/runtime-config-logging.html#GUC-APPLICATION-NAME
Oh, right. Currently I raise an error if the application name is truncated (ends with ..).
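A minimal sketch of such a truncation check. It assumes that ActiveSupport's `truncate(64)` appends its default "..." omission and that PostgreSQL keeps fewer than 64 characters of application_name in a standard build, which would explain why a truncated name shows up ending in "..". The constant and method names are hypothetical.

```ruby
# Assumed limit: application_name holds fewer than NAMEDATALEN (64) characters
# in a standard PostgreSQL build.
PG_APPLICATION_NAME_LIMIT = 63

# Hypothetical check: true when the name fills the limit and ends with the
# remains of a "..." omission marker.
def truncated_application_name?(name)
  name.length >= PG_APPLICATION_NAME_LIMIT && name.end_with?("..")
end
```

The length test keeps short names that merely happen to end in ".." from being flagged.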
tools/lock_inspect/step1.rb
  options[:user] = user
end

opts.on("-p", "--password [STRING]",
For later - we may need to provide the password in a more secure way.
Yes. This can be done in a more secure way, since we only ask the customer to run this script once a deadlock happens. There is no need to run it periodically, so we can ask the user to enter the password interactively.
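A minimal sketch of an interactive prompt, using `IO#noecho` from Ruby's standard io/console library so the password is not echoed to the terminal. The `prompt_password` helper is hypothetical and not part of the PR.

```ruby
require 'io/console'

# Hypothetical helper: reads a password without echo when input is a terminal,
# and falls back to a plain read otherwise (for example, piped input).
def prompt_password(input = $stdin, output = $stderr)
  output.print "Password: "
  line = if input.respond_to?(:noecho) && input.tty?
           input.noecho(&:gets)
         else
           input.gets
         end
  output.puts
  line.to_s.chomp
end
```

Prompting keeps the password out of the shell history and the process list, which passing it with -p does not.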
Step 2 has a similar issue. Can we assume it always runs in the same rails console that the ManageIQ application runs in? This will require it to run in rails runner as a rake task, but it won't need a password any more.
A sample output (with the -a option, which outputs all stat activity, including non-MIQ connections):
---
- datid: '16386'
datname: vmdb_production
pid: '8729'
usesysid: '16384'
usename: root
application_name: MIQ|8583|1|-|1|Server|default
client_addr: "::1"
client_hostname:
client_port: '39478'
backend_start: '2017-07-11 10:45:29.793848-04'
xact_start:
query_start: '2017-07-11 17:07:12.789297-04'
state_change: '2017-07-11 17:07:12.789323-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: |-
SELECT id, lock_version, priority, role FROM "miq_queue" WHERE ( state = 'ready'
AND (zone IS NULL OR zone = 'default')
AND queue_name = 'reporting'
AND (role IS NULL OR role IN ('automate','database_operations','database_owner','ems_inventory','ems_operations','event','reporting','scheduler','smartstate','user_interface','web_services','websocket'))
AND (server_guid IS NULL OR server_guid = 'f6860425-5d14-40e6-b564-2d3c2638502d')
AND (deliver_on IS NULL OR deliver_on <= '2017-07-11 21:07:12.788748')
AND (priority <= 200)
) ORDER BY "miq_queue"."priority" ASC, "miq_queue"."id" ASC LIMIT $1
- datid: '16386'
datname: vmdb_production
pid: '8840'
usesysid: '16384'
usename: root
application_name: MIQ|8794|1|98|1|Generic|default
client_addr: "::1"
client_hostname:
client_port: '39492'
backend_start: '2017-07-11 10:45:35.810419-04'
xact_start:
query_start: '2017-07-11 17:07:17.004598-04'
state_change: '2017-07-11 17:07:17.004635-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8839'
usesysid: '16384'
usename: root
application_name: MIQ|8803|1|99|1|Generic|default
client_addr: "::1"
client_hostname:
client_port: '39490'
backend_start: '2017-07-11 10:45:35.779736-04'
xact_start:
query_start: '2017-07-11 17:07:15.347931-04'
state_change: '2017-07-11 17:07:15.34796-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8845'
usesysid: '16384'
usename: root
application_name: MIQ|8812|1|100|1|Priority|default
client_addr: "::1"
client_hostname:
client_port: '39498'
backend_start: '2017-07-11 10:45:35.89027-04'
xact_start:
query_start: '2017-07-11 17:07:16.897365-04'
state_change: '2017-07-11 17:07:16.897401-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8848'
usesysid: '16384'
usename: root
application_name: MIQ|8820|1|101|1|Priority|default
client_addr: "::1"
client_hostname:
client_port: '39502'
backend_start: '2017-07-11 10:45:35.949888-04'
xact_start:
query_start: '2017-07-11 17:07:16.90967-04'
state_change: '2017-07-11 17:07:16.909702-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8851'
usesysid: '16384'
usename: root
application_name: MIQ|8830|1|102|1|Schedule|default
client_addr: "::1"
client_hostname:
client_port: '39506'
backend_start: '2017-07-11 10:45:36.002046-04'
xact_start:
query_start: '2017-07-11 17:07:12.770828-04'
state_change: '2017-07-11 17:07:12.770843-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT "miq_schedules".* FROM "miq_schedules" WHERE (updated_at > '2017-07-11
21:06:57.751393')
- datid: '16386'
datname: vmdb_production
pid: '8881'
usesysid: '16384'
usename: root
application_name: MIQ|8854|1|103|1|EventHandler|default
client_addr: "::1"
client_hostname:
client_port: '39516'
backend_start: '2017-07-11 10:45:46.788029-04'
xact_start:
query_start: '2017-07-11 17:07:15.339799-04'
state_change: '2017-07-11 17:07:15.339833-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8888'
usesysid: '16384'
usename: root
application_name: MIQ|8863|1|104|1|Reporting|default
client_addr: "::1"
client_hostname:
client_port: '39520'
backend_start: '2017-07-11 10:45:46.913674-04'
xact_start:
query_start: '2017-07-11 17:07:15.547073-04'
state_change: '2017-07-11 17:07:15.547108-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '8889'
usesysid: '16384'
usename: root
application_name: MIQ|8871|1|105|1|Reporting|default
client_addr: "::1"
client_hostname:
client_port: '39522'
backend_start: '2017-07-11 10:45:46.919921-04'
xact_start:
query_start: '2017-07-11 17:07:15.698499-04'
state_change: '2017-07-11 17:07:15.698534-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT pg_backend_pid()
- datid: '16386'
datname: vmdb_production
pid: '9043'
usesysid: '16384'
usename: root
application_name: MIQ|8882|1|106|1|Ui|default
client_addr: "::1"
client_hostname:
client_port: '39556'
backend_start: '2017-07-11 10:47:19.242074-04'
xact_start:
query_start: '2017-07-11 16:47:34.075548-04'
state_change: '2017-07-11 16:47:34.075558-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: COMMIT
- datid: '16386'
datname: vmdb_production
pid: '8916'
usesysid: '16384'
usename: root
application_name: MIQ|8882|1|106|1|Ui|default
client_addr: "::1"
client_hostname:
client_port: '39534'
backend_start: '2017-07-11 10:45:47.315264-04'
xact_start:
query_start: '2017-07-11 10:45:47.340997-04'
state_change: '2017-07-11 10:45:47.341019-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT "miq_databases".* FROM "miq_databases" ORDER BY "miq_databases"."id"
ASC LIMIT $1
- datid: '16386'
datname: vmdb_production
pid: '8927'
usesysid: '16384'
usename: root
application_name: MIQ|8882|1|106|1|Ui|default
client_addr: "::1"
client_hostname:
client_port: '39536'
backend_start: '2017-07-11 10:45:47.353158-04'
xact_start:
query_start: '2017-07-11 16:17:21.23533-04'
state_change: '2017-07-11 16:17:21.23534-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: COMMIT
- datid: '16386'
datname: vmdb_production
pid: '8931'
usesysid: '16384'
usename: root
application_name: MIQ|8898|1|107|1|WebService|default
client_addr: "::1"
client_hostname:
client_port: '39542'
backend_start: '2017-07-11 10:45:47.524192-04'
xact_start:
query_start: '2017-07-11 16:47:33.77212-04'
state_change: '2017-07-11 16:47:33.772131-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: COMMIT
- datid: '16386'
datname: vmdb_production
pid: '8929'
usesysid: '16384'
usename: root
application_name: MIQ|8898|1|107|1|WebService|default
client_addr: "::1"
client_hostname:
client_port: '39540'
backend_start: '2017-07-11 10:45:47.48725-04'
xact_start:
query_start: '2017-07-11 10:45:47.508673-04'
state_change: '2017-07-11 10:45:47.508697-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT "miq_databases".* FROM "miq_databases" ORDER BY "miq_databases"."id"
ASC LIMIT $1
- datid: '16386'
datname: vmdb_production
pid: '8942'
usesysid: '16384'
usename: root
application_name: MIQ|8907|1|108|1|Websocket|default
client_addr: "::1"
client_hostname:
client_port: '39546'
backend_start: '2017-07-11 10:45:47.565191-04'
xact_start:
query_start: '2017-07-11 16:47:33.976493-04'
state_change: '2017-07-11 16:47:33.976503-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: COMMIT
- datid: '16386'
datname: vmdb_production
pid: '8944'
usesysid: '16384'
usename: root
application_name: MIQ|8907|1|108|1|Websocket|default
client_addr: "::1"
client_hostname:
client_port: '39548'
backend_start: '2017-07-11 10:45:47.585363-04'
xact_start:
query_start: '2017-07-11 10:45:47.734921-04'
state_change: '2017-07-11 10:45:47.734943-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT "miq_databases".* FROM "miq_databases" ORDER BY "miq_databases"."id"
ASC LIMIT $1
- datid: '13295'
datname: postgres
pid: '6897'
usesysid: '16384'
usename: root
application_name: pgAdmin III - Browser
client_addr: 192.168.122.1
client_hostname:
client_port: '45326'
backend_start: '2017-07-11 10:18:57.285196-04'
xact_start:
query_start: '2017-07-11 10:18:57.308334-04'
state_change: '2017-07-11 10:18:57.308526-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT count(*) FROM pg_attribute WHERE attrelid = 'pg_catalog.pg_proc'::regclass
AND attname = 'proargdefaults'
- datid: '16386'
datname: vmdb_production
pid: '7843'
usesysid: '16384'
usename: root
application_name: pgAdmin III - Browser
client_addr: 192.168.122.1
client_hostname:
client_port: '45336'
backend_start: '2017-07-11 10:20:47.666429-04'
xact_start:
query_start: '2017-07-11 15:57:31.79783-04'
state_change: '2017-07-11 15:57:31.798359-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: "SELECT t.oid, t.xmin, t.*, relname, CASE WHEN relkind = 'r' THEN TRUE ELSE
FALSE END AS parentistable, nspname, des.description, l.lanname, p.prosrc, \n
\ COALESCE(substring(pg_get_triggerdef(t.oid), 'WHEN (.*) EXECUTE PROCEDURE'),
substring(pg_get_triggerdef(t.oid), 'WHEN (.*) \\$trigger')) AS whenclause\n
\ FROM pg_trigger t\n JOIN pg_class cl ON cl.oid=tgrelid\n JOIN pg_namespace
na ON na.oid=relnamespace\n LEFT OUTER JOIN pg_description des ON (des.objoid=t.oid
AND des.classoid='pg_trigger'::regclass)\n LEFT OUTER JOIN pg_proc p ON p.oid=t.tgfoid\n
\ LEFT OUTER JOIN pg_language l ON l.oid=p.prolang\n WHERE NOT tgisinternal\n
\ AND tgrelid = 18132::oid\n ORDER BY tgname"
- datid: '16386'
datname: vmdb_production
pid: '8733'
usesysid: '16384'
usename: root
application_name: pgAdmin III - Edit Grid
client_addr: 192.168.122.1
client_hostname:
client_port: '45418'
backend_start: '2017-07-11 10:45:30.256344-04'
xact_start:
query_start: '2017-07-11 10:45:37.408198-04'
state_change: '2017-07-11 10:45:37.408247-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT format_type(oid,NULL) as typname FROM pg_type WHERE oid = 25
- datid: '16386'
datname: vmdb_production
pid: '15111'
usesysid: '16384'
usename: root
application_name: pgAdmin III - Edit Grid
client_addr: 192.168.122.1
client_hostname:
client_port: '50070'
backend_start: '2017-07-11 15:57:43.299728-04'
xact_start:
query_start: '2017-07-11 15:57:43.337474-04'
state_change: '2017-07-11 15:57:43.337526-04'
waiting: f
state: idle
backend_xid:
backend_xmin:
query: SELECT format_type(oid,NULL) as typname FROM pg_type WHERE oid = 16
- datid: '16386'
datname: vmdb_production
pid: '16420'
usesysid: '16384'
usename: root
application_name: "./step1.rb"
client_addr: 192.168.122.1
client_hostname:
client_port: '51042'
backend_start: '2017-07-11 17:07:17.39739-04'
xact_start: '2017-07-11 17:07:17.399171-04'
query_start: '2017-07-11 17:07:17.399171-04'
state_change: '2017-07-11 17:07:17.399172-04'
waiting: f
state: active
backend_xid:
backend_xmin: '118330'
query: |
SELECT *
FROM pg_stat_activity
ORDER BY application_name;
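A minimal sketch of how a later step might load this YAML and keep only the ManageIQ connections. Filtering on the "MIQ" prefix of application_name is an assumption based on the sample above, and `miq_connections` is a hypothetical helper; the embedded document is a shortened excerpt of the sample.

```ruby
require 'yaml'

# Hypothetical helper: keeps only rows whose application_name starts with "MIQ".
def miq_connections(rows)
  rows.select { |row| row["application_name"].to_s.start_with?("MIQ") }
end

# Shortened excerpt of the sample output, with YAML indentation restored.
sample = YAML.load(<<~DOC)
  ---
  - pid: '8729'
    application_name: MIQ|8583|1|-|1|Server|default
    state: idle
  - pid: '6897'
    application_name: pgAdmin III - Browser
    state: idle
DOC

miq_connections(sample).each do |row|
  puts "#{row['pid']} #{row['application_name']}"
end
```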
tools/lock_inspect/step1.rb
@@ -0,0 +1,159 @@
#!/usr/bin/env ruby
Let's structure the directory and files as:
tools
|- pg_inspector.rb
|- pg_inspector
   |- cli.rb
   |- active_connections_to_yaml.rb (step 1)
   |- servers_to_yaml.rb (step 2)
   |- active_connections_to_human.rb (step 3)
8e8f46c to 550d4e0
@yrudman @jrafanie @gtanzillo
@miq-bot remove-label wip
def connect_pg_server
  conn_options = {
    :dbname => 'vmdb_production',
Maybe better to use postgres here instead of vmdb_production, because that DB should always exist. I ran it locally and it failed because I didn't have that DB.
Thanks. I will set the dbname default to postgres and provide a command line option to set it.
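A minimal sketch of that default with a command line override. Only the postgres default comes from the discussion; the constant and the `connection_options` helper are hypothetical.

```ruby
# Assumed defaults; only :dbname => 'postgres' comes from the review comments.
DEFAULT_CONNECTION_OPTIONS = { :dbname => 'postgres' }.freeze

# Hypothetical helper: command line values win over defaults, and options
# that were not given (nil) are ignored.
def connection_options(cli_options)
  overrides = cli_options.reject { |_key, value| value.nil? }
  DEFAULT_CONNECTION_OPTIONS.merge(overrides)
end
```

The resulting hash could then be handed to `PG.connect`, which accepts a hash of connection parameters such as `:dbname`, `:host`, `:port`, `:user`, and `:password`.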
Checked commits ailisp/manageiq@b660911~...0f439cb with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
This step generates a YAML file from pg_stat_activity.
Story: https://www.pivotaltracker.com/n/projects/1608513/stories/147871949
/cc @gtanzillo @jrafanie @yrudman
@miq-bot add-label wip, tools