Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GV$PROCESS query #14143

Merged
merged 3 commits into from
Mar 21, 2023
Merged

Conversation

jake-condello
Copy link
Contributor

@jake-condello jake-condello commented Mar 13, 2023

Update GV$PROCESS query to better handle multiple records with the same program value

What does this PR do?

This PR fixes the GV$PROCESS query so that the program value in each record returned is unique.

Motivation

The current query returns multiple records with the same program value. And since program is the only tag, only one of the values is reported, and all the other values are omitted within DataDog. Based on my testing, only the values associated with the last record of a given program name is reported. This leads to incorrect calculations of PGA information in the database.

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • [changelog/Fixed ] PR must have changelog/ and integration/ labels attached
  • If the PR doesn't need to be tested during QA, please add a qa/skip-qa label.

Update GV$PROCESS query to better handle multiple records with the same program value
@ghost ghost added the integration/oracle label Mar 13, 2023
@jake-condello jake-condello marked this pull request as ready for review March 14, 2023 01:00
@jake-condello jake-condello requested review from a team as code owners March 14, 2023 01:00
@jake-condello
Copy link
Contributor Author

jake-condello commented Mar 14, 2023

Here is an example of the improvement:

GV$PROCESS
The rough values of PGA
~670MB
~860MB
~85MB
~1GB

Old
The incorrect values reported by the current query
'query': 'SELECT PROGRAM, PGA_USED_MEM, PGA_ALLOC_MEM, PGA_FREEABLE_MEM, PGA_MAX_MEM FROM GV$PROCESS',
~130MB
~240MB
~60MB
~280MB

New
The much more accurate values of the new query
'query': 'SELECT PROGRAM, SUM(PGA_USED_MEM), SUM(PGA_ALLOC_MEM), SUM(PGA_FREEABLE_MEM), SUM(PGA_MAX_MEM) FROM GV$PROCESS GROUP BY PROGRAM',
~670MB
~865MB
~85MB

@jmeunier28
Copy link
Contributor

👋 hey @jake-condello looks like the style tests are failing

Creating environment: lint
Checking dependencies
Syncing dependencies
cmd [1] | flake8 --config=../.flake8 .
./datadog_checks/oracle/queries.py:8:121: E501 line too long (143 > 120 characters)

Failed!

you can run this these locally (assuming your ddev is setup) by doing

ddev test <check> -s

@@ -5,7 +5,7 @@

ProcessMetrics = {
'name': 'process',
'query': 'SELECT PROGRAM, PGA_USED_MEM, PGA_ALLOC_MEM, PGA_FREEABLE_MEM, PGA_MAX_MEM FROM GV$PROCESS',
'query': 'SELECT PROGRAM, SUM(PGA_USED_MEM), SUM(PGA_ALLOC_MEM), SUM(PGA_FREEABLE_MEM), SUM(PGA_MAX_MEM) FROM GV$PROCESS GROUP BY PROGRAM',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of summing up pga_max_mem? pga_max_mem is a high usage watermark for every process. But since every process generally reached its max usage at a different time, their sum will be greater than the total memory ever allocated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point; the sum of all processes' maximums isn't the most useful.

For some context, the main goal for our company is to have a good idea of the PGA state (used, total, etc.). Our current dashboards are, for example, summing up oracle.process.pga_used_memory and reporting that as amount of PGA used. Based on my research that is incorrect because only one value is reported per program when multiple exist:

select program, count(*) from GV$PROCESS group by program order by 2 desc;
oracle@cs12dbnrduva01 356
oracle@cs12dbnrduva01 (SCMN) 2

My first thought was to sum them all, grouped by program so that there is no information missing. Do you have any thoughts on how else we can bring in all that missing information?

Copy link
Contributor Author

@jake-condello jake-condello Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add PID column to query so that the entries with duplicate program names are not overridden. For example:
SELECT PID, PROGRAM, PGA_USED_MEM, PGA_ALLOC_MEM, PGA_FREEABLE_MEM, PGA_MAX_MEM FROM GV$PROCESS

Copy link
Contributor

@nenadnoveljic nenadnoveljic Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, program can't be the key because generally a single program can have multiple sessions/processes.

Summing memory allocations and grouping by program solves this - except for pga_max_mem where summing doesn't produce a meaningful value.

An alternative solution, adding PID as a unique identifier seems correct to me. The aggregations would need to be performed outside of database in that case. It would require a thorough check that adding PID doesn't break something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My latest commit includes the original query with the pid column added. ddev test oracle passes on my machine.

Previously existing monitors seem to be functioning with correct values
dashboard-pid

And all metrics appear to be reporting properly
metrics-with-pid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nenadnoveljic Is there anything else that should be done before this can be merged?

@iliakur iliakur merged commit 230060a into DataDog:master Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants