Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common: Fix 44373 and make a couple cleanups in ceph::timer #33771

Merged
merged 7 commits into from
Mar 10, 2020

Conversation

adamemerson
Copy link
Contributor

Fixes https://tracker.ceph.com/issues/44373

I figured I should add a test, too, so I had a controlled place to run valgrind and since it used promises in the lambdas I switched event to use unique_function. And made a couple other cleanups while I was at it.

Checklist

  • References tracker ticket
  • Includes tests for new functionality or reproducer for bug

@liewegas liewegas added this to the octopus milestone Mar 6, 2020
@tchaikov tchaikov changed the title Fix 44373 and make a couple cleanups in ceph::timer common: Fix 44373 and make a couple cleanups in ceph::timer Mar 6, 2020
src/common/ceph_timer.h Show resolved Hide resolved
@adamemerson adamemerson force-pushed the wip-44373 branch 2 times, most recently from 0cfa489 to 2289aa4 Compare March 6, 2020 19:32
@adamemerson
Copy link
Contributor Author

@tchaikov The conflict should be fixed.

@liewegas
Copy link
Member

liewegas commented Mar 8, 2020

I think this is causing rados/perf runs to fail with

2020-03-08T19:06:28.300 INFO:teuthology.orchestra.run.smithi136:> sudo ceph --cluster ceph osd crush tunables default
2020-03-08T19:56:28.538 INFO:teuthology.orchestra.run.smithi136.stderr:[errno 110] RADOS timed out (error connecting to the cluster)
2020-03-08T19:56:28.571 DEBUG:teuthology.orchestra.run:got remote process result: 1

/a/sage-2020-03-08_17:10:32-rados-wip-sage4-testing-2020-03-08-0938-distro-basic-smithi/4838021

@liewegas liewegas changed the base branch from master to octopus March 9, 2020 14:48
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Stuffing things into 'detail' and using them just makes backtraces and
valgrind illegible.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Take advantage of a couple things in Boost.Intrusive that make the
code less obfuscated.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
std::condition_variable::wait_until takes a const reference to a
time_point. It may access this reference after relinquishing the
mutex, creating a potential use-after-free error if the first event is
shut down.

So, just copy the time onto the stack, so we have a reference that
won't disappear.

https://tracker.ceph.com/issues/44373

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
@liewegas
Copy link
Member

liewegas commented Mar 9, 2020

turns out the perf failure was a different PR, retesting

@liewegas liewegas merged commit fd27be0 into ceph:octopus Mar 10, 2020
@adamemerson adamemerson deleted the wip-44373 branch March 11, 2020 15:18
liewegas added a commit that referenced this pull request Mar 11, 2020
* refs/pull/33885/head:
	Merge pull request #33848 from mchangir/octopus-tests-remove-suprious-whitespace
	Merge PR #33746 into octopus
	Merge PR #33830 into octopus
	Merge PR #33732 into octopus
	Merge PR #33620 into octopus
	Merge pull request #33876 from tchaikov/octopus-cephadm-mypy
	cephadm: add "assert foo is not None" for mypy check
	Merge pull request #33067 from tspmelo/wip-rbd-delete-with-snapshot
	cephadm: add grafana adopt
	Merge PR #33771 into octopus
	Merge PR #33850 into octopus
	Merge PR #33853 into octopus
	Merge PR #33857 into octopus
	Merge PR #32990 into octopus
	Merge PR #33713 into octopus
	Merge PR #33838 into octopus
	qa/tasks/cephadm: no default mon|mgr|crash service specs
	qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
	Merge PR #33832 into octopus
	cephadm: bootstrap: wait for mgr to restart after enabling a module
	mgr: add 'mgr_status' tell command
	Merge pull request #33839 from rhcs-dashboard/44538-fix-rgw-grafana-get-put-latencies
	Merge pull request #33743 from votdev/issue_43869_fix_qa_test
	cephadm: create initial mon and mgr service specs too
	cephadm: no need to pregenerate a crash key for the bootstrap host
	mgr/cephadm: do not complain when we don't have enough hosts
	mgr/cephadm: remove orphan daemons
	mgr/cephadm: report size=0 for fabricated ServiceDescription
	mgr/cephadm: safety check to prevent removing all mon|mgr daemons
	mgr/cephadm: prevent scaling mon|mgr below count=1
	mgr/cephadm: do not remove daemons from remove_service
	Merge pull request #33805 from tchaikov/wip-44500
	spec: Podman (temporarily) requires apparmor-abstractions on suse
	mgr/cephadm: Make sure we don't co-locate the same daemon
	monitoring: fix RGW grafana chart 'Average GET/PUT Latencies'
	tests: remove spurious whitespace
	mgr/cephadm: fix service list filtering
	Merge PR #33825 into octopus
	Merge PR #33811 into octopus
	Revert "Merge pull request #33673 from cbodley/wip-denc-enum"
	mgr/cephadm: fix upgrade order
	Merge PR #33801 into octopus
	Merge PR #33822 into octopus
	cephadm: bootstrap: tolerate error return from -h
	Merge PR #33809 into octopus
	Merge PR #32678 into octopus
	cephadm: use `sh` instead of `bash` during enter
	ceph.in: only shut down rados on clean exit
	common/ceph_timer: Pass reference to waited time on stack
	common/ceph_timer: Add test
	common/ceph_timer: Use unique_function, allowing noncopyable events
	common/ceph_timer: Couple cleanups
	common/ceph_timer: Fix namespaces
	common/ceph_timer: Add missing includes
	common/ceph_timer.h: Don't indent contents of a namespace
	mgr/dashboard: Crush rule modal
	mgr/dashboard: Preserve rule selection on pool type change
	mgr/dashboard: Crush rule is only send during replicated pool creation
	mgr/dashboard: Explicit returns in pool form
	mgr/dashboard: Removes fork join in pool form
	mgr/dashboard: Hide ECP actions during ec pool edit
	mgr/dashboard: Pool form erasure/replicated boolean
	mgr/dashboard: Change pool info API endpoint
	mgr/dashboard: Moves ECP info endpoint to UI-API
	mgr/cephadm: add _remove_osds_bg back to main loop
	mgr/cephadm/osd: update removal report immediately
	qa/tasks/ceph_manager: use StringIO for capturing COT output
	qa/standalone/scrub/osd-scrub-repair: force osdmap prop to osds
	qa/standalone/scrub/osd-scrub-test: wait longer for update
	qa/tasks/ceph_manager: capture stderr for COT
	qa/suites/rados/ceph: drop opensuse for now
	mon/MonClient: send logs to mon on separate schedule than pings
	mgr/dashboard: Fix missing ImageSpec usage
	mgr/dashboard: Allow removing RBD with snapshots
	mgr/dashboard: Refactor and cleanup tasks.mgr.dashboard.test_user
	mgr/dashboard: support multiple DriveGroups when creating OSDs
	mon/MonClient: send logs to mon even if we have no keelalive2
	cephadm: flag dashboard user to change password

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants