line count does not seem to be correct #2

Closed
jhutar opened this issue May 25, 2020 · 3 comments · Fixed by #3

@jhutar

jhutar commented May 25, 2020

Hello. I'm running this:

# grep --no-filename duration /var/lib/pgsql/data/pg_log/postgresql-*.log | logmine --pattern-placeholder REPLACED --min-members 1 | sed 's/^\(.\{200\}\).*/\1/'
2479 REPLACED REPLACED EDT LOG: duration: REPLACED ms execute <unnamed>: UPDATE REPLACED SET REPLACED = $1, "updated_at" = $2 WHERE REPLACED = $3
 926 2020-05-22 02:46:52 EDT LOG: duration: 1341.912 ms statement: SELECT * FROM "dynflow_execution_plans" WHERE ("state" = 'scheduled') ORDER BY "started_at"
 179 2020-05-22 02:28:00 EDT LOG: duration: 977.242 ms statement: COMMIT
  13 REPLACED REPLACED EDT LOG: duration: REPLACED ms execute <unnamed>: select this_.id as id1_36_19_, this_.created as created2_36_19_, this_.updated as updated3_36_19_, this_.consumer_id as consume
  10 REPLACED REPLACED EDT LOG: duration: REPLACED ms statement: INSERT INTO "dynflow_actions" ("execution_plan_uuid", "id", "data", "input", "caller_execution_plan_id", "caller_action_id", "class", "
...

So I would expect there to be 926 lines matching a regexp like SELECT \* FROM "dynflow_execution_plans (the second line of the output), but there is only one:

# grep --no-filename duration /var/lib/pgsql/data/pg_log/postgresql-*.log | grep 'SELECT \* FROM "dynflow_execution_plans'
2020-05-22 02:46:52 EDT LOG:  duration: 1341.912 ms  statement: SELECT * FROM "dynflow_execution_plans" WHERE ("state" = 'scheduled') ORDER BY "started_at"

Did I misunderstand the meaning of the number in the first column, or is this a bug?

# python --version
Python 2.7.5
# pip freeze
logmine==0.1.4
@trungdq88
Owner

Hi. Yes, you understand the first column correctly: it should represent the number of occurrences. So in this case it looks like a bug to me.
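As a rough way to cross-check a reported count independently of logmine, the sketch below turns one line of the output into a regular expression and counts the raw log lines that match it. This is not part of logmine: the helper names `pattern_to_regex` and `count_matches` are hypothetical, the `REPLACED` placeholder is assumed to stand for exactly one whitespace-delimited token, runs of whitespace are treated as equivalent, and the third sample line is made up for illustration.

```python
import re


def pattern_to_regex(pattern, placeholder="REPLACED"):
    """Build a regex from a pattern line (hypothetical helper, not logmine API).

    Assumption: each placeholder token stands for exactly one
    whitespace-delimited field; all other tokens must match literally.
    """
    parts = [
        r"\S+" if token == placeholder else re.escape(token)
        for token in pattern.split()
    ]
    # Any run of whitespace in the log line may separate two tokens.
    return re.compile(r"\s+".join(parts))


def count_matches(lines, pattern, placeholder="REPLACED"):
    """Count the lines that start with the given pattern."""
    regex = pattern_to_regex(pattern, placeholder)
    return sum(1 for line in lines if regex.match(line.strip()))


# Two lines taken from the report above, plus one invented COMMIT line.
SAMPLE_LINES = [
    '2020-05-22 02:46:52 EDT LOG:  duration: 1341.912 ms  statement: '
    'SELECT * FROM "dynflow_execution_plans" WHERE ("state" = \'scheduled\') '
    'ORDER BY "started_at"',
    "2020-05-22 02:28:00 EDT LOG:  duration: 977.242 ms  statement: COMMIT",
    "2020-05-22 02:29:00 EDT LOG:  duration: 12.5 ms  statement: COMMIT",
]

COMMIT_PATTERN = "REPLACED REPLACED EDT LOG: duration: REPLACED ms statement: COMMIT"
SELECT_PATTERN = (
    "REPLACED REPLACED EDT LOG: duration: REPLACED ms statement: "
    'SELECT * FROM "dynflow_execution_plans"'
)

if __name__ == "__main__":
    print(count_matches(SAMPLE_LINES, COMMIT_PATTERN))
    print(count_matches(SAMPLE_LINES, SELECT_PATTERN))
```

If the number printed for a pattern differs widely from the first column of the logmine output for the same pattern, that points to a counting or display problem rather than a misreading of the column.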

You can try running logmine with the --single-core flag, which is a bit slower but should eliminate the parallel processing part, where most of the bugs usually live.

In addition, I would be very thankful if you could provide the dataset, or a subset of it, with which I can reproduce the issue and fix it if possible. My email is available in my GitHub profile.

Thanks.

@trungdq88
Owner

I think I found the issue: there is a problem with how the pattern is displayed in some edge cases. Please try again with version 0.1.5 and reopen the issue if the problem persists.

@jhutar
Author

jhutar commented May 27, 2020

Thank you, you are quick! Running with --single-core did not help, but the current git version (27ae6ca) worked as expected. If you are still interested in the data set, please just let me know.
