Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pacific: Update fio in pacific branch to fix make check failures #47696

Closed
wants to merge 24 commits into from

Conversation

tserong
Copy link
Contributor

@tserong tserong commented Aug 19, 2022

I'm trying to cherry pick d777100 in order to fix the make check failure mentioned in #47401 (comment). The cleanest way I could find to do this, avoiding needing to resolve any conflicts, seems to be to grab these four commits.

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

tchaikov and others added 4 commits August 19, 2022 17:39
for better readability and maintainability

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit c190fee)
to pick up the clang build fix. we should use the upstream repo, once the
clang build fix gets merged.

bumping up the fio helps to address following error, as this part was rewritten:

src/test/fio/CMakeFiles/fio_ceph_objectstore.dir/fio_ceph_objectstore.cc.o.d -o src/test/fio/CMakeFiles/fio_ceph_objectstore.dir/fio_ceph_objectstore.cc.o -c ../src/test/fio/fio_ceph_objectstore.cc
In file included from ../src/test/fio/fio_ceph_objectstore.cc:26:
In file included from src/fio/fio.h:18:
In file included from src/fio/thread_options.h:6:
In file included from src/fio/options.h:8:
src/fio/parse.h:128:13: error: arithmetic on a pointer to void
        return ret + offset;
               ~~~ ^
1 error generated.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 10baab3)
this change partially reverts 10baab3,
but since the fix for C++ build is not included by any tag or branche so
far. let's just use the sha1 for now.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit b65d074)
to address following warning:

fatal: reference is not a tree: 7f02f21f53ccd5e2448086f8e9015489693dd2dc
CMake Error at /home/jenkins-build/build/workspace/ceph-pull-requests/build/fio_ext-prefix/tmp/fio_ext-gitclone.cmake:40 (message):
  Failed to checkout tag: '7f02f21f53ccd5e2448086f8e9015489693dd2dc'

it seems that the shallow option does not work with a sha1 tag option,
let's continue using the ceph repo with a tag.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d777100)
@tserong tserong requested a review from tchaikov August 19, 2022 07:49
@github-actions github-actions bot added the tests label Aug 19, 2022
@github-actions github-actions bot added this to the pacific milestone Aug 19, 2022
@tserong
Copy link
Contributor Author

tserong commented Aug 19, 2022

Now what?!?

CMake Error at cmake/modules/Distutils.cmake:81 (message):
  Unable to tell python extension's suffix: Traceback (most recent call
  last):

    File "<string>", line 1, in <module>

  ImportError: cannot import name 'sysconfig' from 'distutils'
  (/usr/lib/python3.8/distutils/__init__.py)

Call Stack (most recent call first):
  src/pybind/rados/CMakeLists.txt:1 (distutils_add_cython_module)

@tchaikov
Copy link
Contributor

@tserong sorry for the headaches. might want to backport f0fc04f and 637dd7b as well.

less redirections

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit f0fc04f)
to address following warning from python 3.9:

<string>:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
<string>:1: DeprecationWarning: The distutils.sysconfig module is deprecated, use sysconfig instead

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 637dd7b)
@tserong
Copy link
Contributor Author

tserong commented Aug 19, 2022

Thanks @tchaikov! :-)

@tserong
Copy link
Contributor Author

tserong commented Aug 19, 2022

OK, now I'm getting:

/usr/include/fmt/core.h:1728:7: error: static assertion failed: Cannot format an argument. To make type T formattable provide a formatter<T> specialization: https://fmt.dev/latest/api.html#udt

...which looks suspiciously like bb9d099, but that won't cherry-pick by itself (more conflicts). I'll have to pick this up again on Monday (sorry!)

@tchaikov
Copy link
Contributor

or, we can disable the WITH_SEASTAR option on LTS branches. as crimson is still under active development. it does not help to build or test it on the LTS branches.

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

or, we can disable the WITH_SEASTAR option on LTS branches. as crimson is still under active development. it does not help to build or test it on the LTS branches.

Good idea. How do I do that? I assume that's in the CI infra somewhere and not something I can tweak in the ceph source?

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

...oh, wait, I have an idea...

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

The fun continues...

CMake Error at src/pybind/CMakeLists.txt:74 (message):
  Python and Ceph link to different OpenSSL versions: 1.1.1

   vs 3.0.2

There's no way that's related to anything I've got here, surely?

Update: that failure (https://jenkins.ceph.com/job/ceph-pull-requests/102006/consoleFull#-20391438177641bc0b-bacc-4481-8290-0cf28df0ce68) was a build running with python3.8, whereas the build I've just kicked off now with my request to jenkins below, seems to be using python3.10, and did not fail with the above error... I wonder if python version is significant here?

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

jenkins test make check

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

jenkins test api

@tserong tserong requested a review from a team as a code owner August 22, 2022 05:45
@tserong tserong requested review from nizamial09 and Pegonzal and removed request for a team August 22, 2022 05:45
@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

Those last two commits will hopefully fix ImportError: cannot import name 'Sequence' from 'collections' (/usr/lib/python3.10/collections/__init__.py) running dashboard tests (https://jenkins.ceph.com/job/ceph-pull-requests/102008/)

@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

OK, make check is good. As for the API tests, the mgr log from https://jenkins.ceph.com/job/ceph-api/43074/ has:

2022-08-22T05:52:31.271+0000 7fe6fca7b500 -1 mgr[py] Module not found: 'dashboard'
2022-08-22T05:52:31.271+0000 7fe6fca7b500 -1 mgr[py] Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-api/src/pybind/mgr/dashboard/__init__.py", line 61, in <module>
    from .module import Module, StandbyModule  # noqa: F401
  File "/home/jenkins-build/build/workspace/ceph-api/src/pybind/mgr/dashboard/module.py", line 51, in <module>
    patch_cherrypy(cherrypy.__version__)
  File "/home/jenkins-build/build/workspace/ceph-api/src/pybind/mgr/dashboard/cherrypy_backports.py", line 197, in patch_cherrypy
    accept_socket_error_0(v)
  File "/home/jenkins-build/build/workspace/ceph-api/src/pybind/mgr/dashboard/cherrypy_backports.py", line 124, in accept_socket_error_0
    if v < StrictVersion("9.0.0") or cheroot_version < StrictVersion("6.5.5"):
  File "/usr/lib/python3.10/distutils/version.py", line 64, in __gt__
    c = self._cmp(other)
  File "/usr/lib/python3.10/distutils/version.py", line 168, in _cmp
    other = StrictVersion(other)
  File "/usr/lib/python3.10/distutils/version.py", line 40, in __init__
    self.parse(vstring)
  File "/usr/lib/python3.10/distutils/version.py", line 137, in parse
    raise ValueError("invalid version number '%s'" % vstring)
ValueError: invalid version number '8.5.2+ds1'

2022-08-22T05:52:31.271+0000 7fe6fca7b500 -1 mgr[py] Class not found in module 'dashboard'
2022-08-22T05:52:31.271+0000 7fe6fca7b500 -1 mgr[py] Error loading module 'dashboard': (2) No such file or directory

This looks like https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1967139, which (if I'm reading it correctly) should have been fixed about a week ago.

@github-actions github-actions bot added this to In progress in Dashboard Aug 22, 2022
@tserong
Copy link
Contributor Author

tserong commented Aug 22, 2022

Can anyone assist in getting an updated python3-cheroot installed on braggi*, per the ubuntu bug in my previous comment?

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

Right, done.

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

jenkins test make check

@tchaikov
Copy link
Contributor

tchaikov commented Aug 25, 2022

there're lots of old friends in the failed tests.

the unittest_rgw_dmclock_scheduler failure is tracked by https://tracker.ceph.com/issues/40069

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

That is old. Happily(?) another run got us through make check...

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

jenkins test dashboard

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

I've pushed this to https://github.com/ceph/ceph-ci/tree/wip-tserong-testing to try to get package builds, but I can't see that branch at https://shaman.ceph.com/builds/ceph/ - did I miss a step somewhere?

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

Aha! wip-tserong-testing was identical to wip-pacific-update-fio-kefu-2, including all commit hashes, so I assume something was smart enough to not bother building my branch. So I've just done a reword of the last commit in wip-tserong-testing, without making any actual changes, to force a new hash, and now it seems to be building.

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

(Note that the build I just triggered of wip-tserong-testing is presumably not actually necessary as https://shaman.ceph.com/builds/ceph/wip-pacific-update-fio-kefu-3/ is the same source, and looks pretty good to me)

@tchaikov
Copy link
Contributor

tchaikov commented Aug 25, 2022

https://shaman.ceph.com/builds/ceph/wip-tserong-testing/2864d6a280d94e3a4249df5ca1bd1905609fe3a9/default/315496/

ln -sf libpmemobj.so.1 ../debug/libpmemobj/../libpmemobj.so
make[7]: Leaving directory '/build/ceph-16.2.10-678-g2864d6a2/src/pmdk/src/libpmemobj'
/usr/bin/ld: pmemblk_priv_funcs.o (symbol from plugin): in function `_Malloc':
(.text+0x0): multiple definition of `_Malloc'; ../nondebug/libpmempool/alloc.o (symbol from plugin):(.text+0x0): first defined here
/usr/bin/ld: pmemblk_priv_funcs.o (symbol from plugin): in function `_Malloc':

needs backport of #47449 (its quincy backport is at #47619). i really wish i could say "this must be the last one". but i don't dare ..

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

Thanks for hunting that down @tchaikov. I'll try the quincy backport rather than pushing pacific up to gcc 11.

See-also: https://tracker.ceph.com/issues/54473
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 2a97033)

Conflicts:
	cmake/modules/Buildpmem.cmake: trivial resolution
@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

...and away we go again. This PR really is a gift that keeps on giving ;-P

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=jammy,DIST=jammy,MACHINE_SIZE=gigantic/65064//consoleFull:

/tmp/cc1zIejP.s: Assembler messages:
/tmp/cc1zIejP.s: Error: invalid attempt to declare external version name as default in symbol `rados_create@@'
make[5]: *** [/tmp/ccHQWBW3.mk:2: /tmp/cccSMOhU.ltrans0.ltrans.o] Error 1
make[5]: *** Waiting for unfinished jobs....
lto-wrapper: fatal error: make returned 2 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[4]: *** [src/librados/CMakeFiles/librados.dir/build.make:144: lib/librados.so.2.0.0] Error 1

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

The previous comment was ubuntu jammy failing.

Ubuntu focal and centos 9 are fine.

Centos 8 (https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=centos8,DIST=centos8,MACHINE_SIZE=gigantic/65064//consoleFull) fails with:

[...]
No match for argument: ceph-volume
No match for argument: ceph-exporter
Error: Unable to find a match: ceph-volume ceph-exporter
Error: error building at STEP "RUN yum install -y epel-release && yum install -y jq && bash -c ' [ ..... ]  ceph-volume         ceph-exporter && echo 'Packages verified successfully'": error while running runtime: exit status 1

@tchaikov
Copy link
Contributor

that's caused by LTO. #42602 is the cure.

b-ranto and others added 7 commits August 25, 2022 20:51
The build fails because the function definitions are out of order. This
also reuses the LIBRADOS_C_API_DEFAULT_F macro for function definitions.

Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit d79efb0)
The gcc compiler now supports symver attribute. We should update the
symvers to be able to support LTO.

Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit e761933)
We can now use LTO when building ceph. The symver issue was fixed by
using the gcc __symver__ attribute. The systems that support it can now
re-enable LTO.

Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 381507a)
the __asm__(".asmver ..") is a support provided by the compiler, so
would be better to detect it by either checking the compiler identifer
or just try it out.

in this change, instead of checking the building platform, we check this
feature using check_c_source_compiles().

in future, we could support versioned symbols using function attriubte
or symbol tables or version-script.

on platform where symbol versioning is not supported, we might need to
go with a different approach.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d0858db)
Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit f90e26c)
We should check if -flto-partition=none is defined when the compiler
does not support symver attribute and fail the build if it is not.

Fixes: https://tracker.ceph.com/issues/40060
Co-authored-by: Kefu Chai <tchaikov@gmail.com>
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 5bcfd5c)
We should check if either asm symver or attribute symver is defined and
not assume that attribute symver implies asm symver.

Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit ff1941a)
@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

Jammy build (https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=jammy,DIST=jammy,MACHINE_SIZE=gigantic/65070//consoleFull) is now giving:

/usr/share/cmake-3.22/Modules/CheckFunctionExists.c:17: undefined reference to `pthread_set_name_np'
...
/usr/share/cmake-3.22/Modules/CheckFunctionExists.c:17: undefined reference to `pthread_get_name_np'
...
/usr/share/cmake-3.22/Modules/CheckFunctionExists.c:17: undefined reference to `getprogname'
...

(I'm going to have to leave this for today and pick it up again tomorrow)

@tchaikov
Copy link
Contributor

tchaikov commented Aug 25, 2022

lemme hold the torch for a while. see #47803

@ljflores
Copy link
Contributor

Let's close this one in favor of #47803. @tserong @tchaikov thanks for all your hard work on finding the right commits.

@tserong
Copy link
Contributor Author

tserong commented Aug 25, 2022

Closing per previous comment. Thanks everyone!

@tserong tserong closed this Aug 25, 2022
Dashboard automation moved this from Reviewer approved to Done Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Dashboard
  
Done
10 participants