This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

Statement load graph and Top statements feature implemented #44

Closed
wants to merge 20 commits

Conversation

s-soroosh
Contributor

First of all, please update the pgobserver.yaml file.
Then you need to update your schema: if the tables already exist, just run sql/patch/00_STAT_STATEMENTS_DATA.sql.

Note: I made some changes in StatStatementsGatherer to gather statement calls incrementally, so we are able to filter them over different time windows. I have also added ssd_user_id to stat_statements_data, so it is possible to filter queries by user_id.

return getTop10Interval(order, "('now'::timestamp - %s::interval)" % ( adapt("%s hours" % ( hours, )), ), hostId, limit)


def getStatLoad(hostId, days='8'):
Contributor

I got an exception from this function:

File "/ssd/code/temp/pgo-psycho-ir/frontend/src/topstats.py", line 115, in getStatLoad
cur.execute(sql)
File "/usr/local/lib/python2.7/dist-packages/psycopg2/extras.py", line 120, in execute
return super(DictCursor, self).execute(query, vars)
ProgrammingError: can't execute an empty query
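One defensive pattern for this failure mode is to check the generated SQL before handing it to the cursor; a minimal sketch with hypothetical names, not code from this PR:

```python
# Hypothetical guard (illustrative only): when settings such as disabled
# aggregations leave the query builder with an empty string, fail with a
# clear message instead of psycopg2's "can't execute an empty query".
def safe_execute(cur, sql, params=None):
    if sql is None or not sql.strip():
        raise ValueError("refusing to execute an empty query")
    return cur.execute(sql, params)
```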

@kmoppel
Contributor

kmoppel commented Feb 17, 2015

After commenting out the "if not tplE._settings['run_aggregations']:" I got:

File "/ssd/code/temp/pgo-psycho-ir/frontend/src/topstats.py", line 115, in getStatLoad
cur.execute(sql)
File "/usr/local/lib/python2.7/dist-packages/psycopg2/extras.py", line 120, in execute
return super(DictCursor, self).execute(query, vars)
DataError: numeric field overflow
DETAIL: A field with precision 6, scale 2 must round to an absolute value less than 10^4

SET client_min_messages = warning;

ALTER TABLE stat_statements_data
ADD COLUMN ssd_user_id integer DEFAULT 0;
Contributor

"ADD COLUMN + DEFAULT 0" in one statement is also problematic with regard to existing DBs. In our company, for example, we have gigabytes of old statement data, and it makes no big sense to overwrite all of that. Doing "SET DEFAULT" in a separate command would alleviate the problem.

Contributor Author

Do you mean I should first add a nullable column, then update the field, and then make it NOT NULL?
If not, please give me a simple code snippet of the implementation.
Thanks.
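For reference, the separate-command approach could be sketched like this (a sketch only, using the column and table names from the patch above; the rationale assumes PostgreSQL versions before 11, where ADD COLUMN with a DEFAULT rewrites the whole table):

```sql
-- Add the column without a default: existing rows stay NULL and the
-- table is not rewritten.
ALTER TABLE stat_statements_data
    ADD COLUMN ssd_user_id integer;

-- Set the default in a separate command: it applies only to new rows,
-- leaving the gigabytes of historical data untouched.
ALTER TABLE stat_statements_data
    ALTER COLUMN ssd_user_id SET DEFAULT 0;
```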

@kmoppel
Contributor

kmoppel commented Feb 17, 2015

The feature of adding a userId to stat_statements_data is currently problematic: it would break the intended use of working frontend functions like getStatStatements() and getStatStatementsGraph(), as they're meant to work across all users, i.e. one would have to change them as well.

And also, configuring the "interested users" in the config.yaml file is not "scalable" in my opinion, as changing them would require an application restart. So all in all I'm not sure it makes sense to add this "userId" to the picture. Also, so far all of PgObserver's data is intentionally unpersonalized. Your comments on the idea are welcome.

@s-soroosh
Contributor Author

Thank you for the code review. I will have the problems solved within a day.

But just about the userId: before I added the user id field, I analyzed pg_stat_statements and noticed that many internal queries are executed, persisted, and shown in pg_stat_statements.
So I thought we might need a field to make sure we are monitoring only the statements executed by a specific user.

About the intention to work across all users, I think there is no problem: with aggregation functions we can still generate aggregated statistics.

I agree with you about the "interested users", but for this version I just wanted feedback on whether this approach is usable or not. If it is OK, I will move them to a more configurable and scalable place (a table with an entry form in the frontend).

Please let me know your thoughts.

@kmoppel
Contributor

kmoppel commented Feb 18, 2015

Thanks for the background!

But OK, I think if this "user" filtering concept is redesigned to use a table (similar to sproc_schemas_monitoring_configuration), the patch would be acceptable. Still, I think it would be best to keep the actual stat_statements_data table as it is, and just filter out the "listed users" in the data-gathering query on the Java side.

@s-soroosh
Contributor Author

OK, I will add a configuration table.
About stat_statements_data: without the change I think we will lose some flexibility. For example, we could not filter queries by user. Assume you see that query1 is called very often, but it may be called by different users.
By clicking on it you could figure out how many calls come from each user (I have not implemented this feature yet, but I think it is feasible with a user id).

I agree about the username: I will change the implementation to persist the username instead of the user id, which is more meaningful.

Please let me know your thoughts.

@s-soroosh
Contributor Author

"""

After commenting out the "if not tplE._settings['run_aggregations']:" I got:

File "/ssd/code/temp/pgo-psycho-ir/frontend/src/topstats.py", line 115, in getStatLoad
cur.execute(sql)
File "/usr/local/lib/python2.7/dist-packages/psycopg2/extras.py", line 120, in execute
return super(DictCursor, self).execute(query, vars)
DataError: numeric field overflow
DETAIL: A field with precision 6, scale 2 must round to an absolute value less than 10^4
"""

I could not reproduce this error.
Could you please let me know how it happens, or give me a unit test that fails under your conditions?

@kmoppel
Contributor

kmoppel commented Feb 19, 2015

The problem seems to be that you're summing absolute values instead of diffs between timestamps, and I get an overflow on real-life data. Something like "COALESCE(ssd_total_time - lag(ssd_total_time) OVER w, 0::bigint) AS delta_total_time" needs to be added to the innermost query.
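In Python terms, the point is that pg_stat_statements counters are cumulative, so per-interval load has to be computed by differencing consecutive snapshots (the role of the lag() snippet above) rather than summing the absolute values; a sketch with hypothetical names:

```python
# Sketch (illustrative, not PgObserver code): turn cumulative counter
# samples into per-interval increments. The first sample, and any drop
# caused by a stats reset, contributes 0 - mirroring COALESCE(..., 0).
def snapshot_deltas(values):
    """values: cumulative ssd_total_time samples ordered by timestamp."""
    deltas = []
    prev = None
    for v in values:
        deltas.append(0 if prev is None or v < prev else v - prev)
        prev = v
    return deltas
```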

@kmoppel
Contributor

kmoppel commented Feb 19, 2015

Also, now that I think about it, this kind of "load" is not too useful either. It's probably better to derive and graph the average query execution time from the pg_stat_statements data.

@s-soroosh
Contributor Author

pg_stat_statements is not time aware, so we cannot find, for example, the number of calls in the last hour.
So if these kinds of graphs are not useful, I will change the gatherer to gather only the interested users and remove user_id from stat_statements_data.

@kmoppel
Contributor

kmoppel commented Feb 24, 2015

ok, that should make it passable :)

But about your comment, "pg_stat_statement is not time aware, so we cannot find for example the number of calls in last 1 hour"... well, true, none of the data provided by PG system views is time aware; that's why we append the "timestamp" columns to all data saved to the PgObserver datastore. We already use the change in the calls count to display the avg. stat_statement runtime on the "/perfstatstatements" screen, for example.
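The avg-runtime derivation described above can be sketched as follows (a hypothetical helper, not the actual /perfstatstatements code):

```python
# Sketch (illustrative only): the average runtime between two timestamped
# snapshots of a statement's cumulative counters is the change in total
# time divided by the change in the number of calls.
def avg_runtime_ms(prev_total_ms, prev_calls, cur_total_ms, cur_calls):
    dcalls = cur_calls - prev_calls
    if dcalls <= 0:  # no new calls (or a counter reset) in the interval
        return 0.0
    return (cur_total_ms - prev_total_ms) / dcalls
```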

@kmoppel
Contributor

kmoppel commented Dec 2, 2015

Hey Soroosh, I resuscitated this feature now and re-implemented it in a simpler form, using also some bits from your PR, so thank you! Please check #62 for details.

@kmoppel kmoppel closed this Dec 2, 2015