[bug:1508025] symbol-check.sh is not failing for legitimate reasons #889

Closed
gluster-ant opened this issue Mar 12, 2020 · 10 comments
Labels
Migrated Type:Bug wontfix Managed by stale[bot]

Comments

@gluster-ant
Collaborator

URL: https://bugzilla.redhat.com/1508025
Creator: srangana at redhat
Time: 20171031T17:30:08

Description of problem:

On investigating this mail thread on gluster-devel (http://lists.gluster.org/pipermail/gluster-devel/2017-October/053861.html), it appears that on the 3.10 branch symbol-check.sh should always report a failure, but it does not.

On master the reason for failure is addressed via the commit: https://review.gluster.org/#/c/16820/

However, the commit message also says that the problem was identified on a local run, not through failures in the regression setup.

So, on further investigation of why symbol check is not reporting failures, I found that on my local Fedora26 machine, the test actually fails on the 3.10 branch.

I asked Nigel to look into the regression machines, and he noted that the test passes there because the script is not catching the symbol.

I will let Nigel post his observations.

The issue seems to stem from the fact that the 'nm' symbol list on the regression machines shows lstat64 rather than the __lxstat64 that the script is looking for. This needs correction.
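
To illustrate the mismatch described above, here is a minimal Python sketch. It is not the actual symbol-check.sh logic; the two `nm` listings are hypothetical samples, and `example_fn`, `undefined_symbols`, and `naive_check` are names invented for illustration. It only shows how a lookup for `__lxstat64` alone passes over a listing that contains `lstat64`.

```python
def undefined_symbols(nm_output):
    """Return the set of undefined ('U') symbol names in `nm` text output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address column: just "U name".
        if len(fields) == 2 and fields[0] == "U":
            # Drop any "@VERSION" suffix that some listings carry.
            symbols.add(fields[1].split("@")[0])
    return symbols


# Hypothetical listings for the same object built two different ways.
REGRESSION_MACHINE = """\
0000000000000000 T example_fn
                 U lstat64
                 U memcpy
"""
LOCAL_MACHINE = """\
0000000000000000 T example_fn
                 U __lxstat64
                 U memcpy
"""


def naive_check(nm_output):
    """True if the one name being looked for, __lxstat64, is undefined here."""
    return "__lxstat64" in undefined_symbols(nm_output)


if __name__ == "__main__":
    print("local machine flagged:", naive_check(LOCAL_MACHINE))
    print("regression machine flagged:", naive_check(REGRESSION_MACHINE))
```

With these samples the local listing is flagged and the regression listing is not, even though both come from a direct lstat call.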

@gluster-ant
Collaborator Author

Time: 20171031T20:25:38
mscherer at redhat commented:
So, Nigel and I discussed it while waiting for my cheesecake, and he suggested that this could be caused by --enable-debug. As I had my laptop with me, we tested, and indeed --enable-debug is what matters.

After a few tries, we narrowed it down to using -O0 vs -O2 (--enable-debug switches the flags to -O0).

My understanding is that -O0 does not inline the lstat64 function, whereas -O2 replaces it with __lxstat64.

A quick search on Google shows this is a recent optimisation of glibc and gcc 5:
https://sourceware.org/ml/libc-alpha/2015-08/msg00560.html

@gluster-ant
Collaborator Author

Time: 20171101T13:31:59
nigelb at redhat commented:
The problem seems to be that the symbols change between debug and non-debug builds. If we want our regression machines to catch it, we should look for the symbols of a debug build. And if we want our developer machines to catch it, we should use the optimized symbols. Can we do both?
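
One way to "do both" is to match either spelling of each call. The Python sketch below is an assumption-laden illustration, not a patch for symbol-check.sh: the `WRAPPED` table assumes the glibc naming discussed in this thread (for example `lstat64` at -O0 and `__lxstat64` at -O2), and `build_pattern` and `leaked_calls` are hypothetical helper names.

```python
import re

# Assumed name pairs: the call as it appears in an -O0 object, mapped to the
# name it appears under in an -O2 object on the glibc versions discussed here.
WRAPPED = {
    "stat": "__xstat",
    "lstat": "__lxstat",
    "fstat": "__fxstat",
    "stat64": "__xstat64",
    "lstat64": "__lxstat64",
    "fstat64": "__fxstat64",
}


def build_pattern(names):
    """Compile a regex matching undefined-symbol lines for either spelling."""
    alternatives = set()
    for name in names:
        alternatives.add(name)
        if name in WRAPPED:
            alternatives.add(WRAPPED[name])
    body = "|".join(re.escape(a) for a in sorted(alternatives))
    # Match "U name" lines, tolerating an optional "@VERSION" suffix.
    return re.compile(r"^[ \t]*U[ \t]+(%s)(?:@\S+)?[ \t]*$" % body, re.MULTILINE)


def leaked_calls(nm_output, names):
    """Return the sorted, de-duplicated matches found in `nm` text output."""
    return sorted(set(build_pattern(names).findall(nm_output)))


if __name__ == "__main__":
    watched = ["lstat64", "access"]
    print(leaked_calls("                 U lstat64\n", watched))
    print(leaked_calls("                 U __lxstat64\n", watched))
```

The same list of watched names then works on both a debug build and an optimized build, at the cost of maintaining the name table by hand.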

@gluster-ant
Collaborator Author

Time: 20190417T13:54:45
srangana at redhat commented:
The problem needs to be solved, as otherwise a future symbol leak is not preventable.

If required we may need an additional job that does not enable-debug (or add a task to an existing job) and checks for symbols.

Is there further information required to resolve the problem?

@gluster-ant
Collaborator Author

Time: 20190417T14:39:06
ykaul at redhat commented:
(In reply to Shyamsundar from comment #3)

> The problem needs to be solved, as otherwise a future symbol leak is not preventable.
>
> If required we may need an additional job that does not enable-debug (or add a task to an existing job) and checks for symbols.
>
> Is there further information required to resolve the problem?

Commitment to solve it - it was entered 1.5 years ago, and no one worked on it.

@gluster-ant
Collaborator Author

Time: 20190417T14:53:50
srangana at redhat commented:
(In reply to Yaniv Kaul from comment #4)

> (In reply to Shyamsundar from comment #3)
>
> > The problem needs to be solved, as otherwise a future symbol leak is not preventable.
> >
> > If required we may need an additional job that does not enable-debug (or add a task to an existing job) and checks for symbols.
> >
> > Is there further information required to resolve the problem?
>
> Commitment to solve it - it was entered 1.5 years ago, and no one worked on it.

Are you looking for a commitment from me to resolve this? I am asking because it was marked NEEDINFO against me.

If so, do let me know; I can do what is required and work out how best to provide the details to the infra team to add to the smoke jobs (although I have to add that, with the information provided, I would assume we know what needs to be done).

@gluster-ant
Collaborator Author

Time: 20190417T15:02:00
mscherer at redhat commented:
Yeah, it seems to have been forgotten with all the more urgent fires, sorry. However, I am missing a lot of context on this and can't find the symbol-check.sh script anywhere. I guess our best bet would be to split the symbol check into a separate job, so we can do the debug build and run the test, rather than bundling it with the regular regression tests. This would permit faster feedback on that matter.

@gluster-ant
Collaborator Author

Time: 20190417T15:44:33
ndevos at redhat commented:
The script is part of the glusterfs.git repository: https://github.com/gluster/glusterfs/blob/master/tests/basic/symbol-check.sh

@gluster-ant
Collaborator Author

Time: 20190614T09:12:13
atumball at redhat commented:
An update:

While testing https://review.gluster.org/22364 I noticed that 0symbol-check failed when I used access() instead of sys_access(). But it didn't fail for stat().

So I suspect that only the stat() family of functions is being missed.

@stale

stale bot commented Oct 8, 2020

Thank you for your contributions.
Noticed that this issue is not having any activity in last ~6 months! We are marking this issue as stale because it has not had recent activity.
It will be closed in 2 weeks if no one responds with a comment here.

@stale stale bot added the wontfix Managed by stale[bot] label Oct 8, 2020
@stale

stale bot commented Oct 23, 2020

Closing this issue as there was no update since my last update on issue. If this is an issue which is still valid, feel free to open it.

@stale stale bot closed this as completed Oct 23, 2020