
Fix CephFS Status updates #11453

Merged · 1 commit · Jan 24, 2023
Conversation

@aruniiird (Contributor) commented Dec 19, 2022

The CephFS object fails to do an initial status update.
The operator shows the following error message in the logs:

'Error while updation: Operation cannot be fulfilled on
cephfilesystems.ceph.rook.io "myfs": the object has been
modified; please apply your changes to the latest version
and try again'

The 'updateStatus' function now returns the new/updated
CephFilesystem object on success, or nil otherwise.

Signed-off-by: Arun Kumar Mohan <amohan@redhat.com>

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves issue #10767

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@aruniiird force-pushed the fix-cephFS-status branch 3 times, most recently from 6964f59 to e9605ba on December 21, 2022 12:16
@aruniiird force-pushed the fix-cephFS-status branch 2 times, most recently from ac517f1 to c5ad050 on January 18, 2023 11:49
@aruniiird (Contributor, Author)
Ready for the next set of reviews; @travisn please take a look.
I tested the kubectl wait command:
kubectl wait -n rook-ceph cephfilesystem.ceph.rook.io/myfs --for=jsonpath='{.status.phase}'="Ready" --timeout=360s
and it successfully waited for the Ready state instead of returning an error.

@travisn (Member) left a comment
Just a few nits.

I see this is working and it accomplishes the goal of ensuring the status is updated, although I'm not thrilled that it causes a retry of the reconcile. In three filesystem creation tests with this change, the retry was needed twice, while the third test succeeded. So there appears to be a race condition when updating the status immediately after updating the finalizer.

The operator log shows this sequence when the finalizer is updated and the status update follows immediately after:

2023-01-18 23:39:26.581251 I | ceph-spec: adding finalizer "cephfilesystem.ceph.rook.io" on "myfs"
2023-01-18 23:39:26.592879 D | ceph-spec: update event from a CR: "myfs"
2023-01-18 23:39:26.593120 D | ceph-spec: update event on CephFilesystem CR
2023-01-18 23:39:26.593297 D | ceph-spec: skipping resource "myfs" update with unchanged spec
2023-01-18 23:39:26.600239 W | ceph-file-controller: failed to set filesystem "myfs" status to "Progressing". failed to update object "rook-ceph/myfs" status: Operation cannot be fulfilled on cephfilesystems.ceph.rook.io "myfs": the object has been modified; please apply your changes to the latest version and try again
2023-01-18 23:39:26.600350 E | ceph-file-controller: failed to reconcile failed to update ceph filesystem status
2023-01-18 23:39:26.600416 E | ceph-file-controller: failed to reconcile CephFilesystem "rook-ceph/myfs". failed to update ceph filesystem status
2023-01-18 23:39:26.688853 D | ceph-spec: update event from a CR: "myfs"
2023-01-18 23:39:26.688896 D | ceph-spec: update event on CephFilesystem CR
2023-01-18 23:39:26.689881 D | ceph-file-controller: filesystem "myfs" status updated to "Progressing"

Notice that within 1/100th of a second we are updating the status after the finalizer is updated, and the query for the latest object isn't retrieving the latest version of the resource. I'd love to see a solution where the finalizer update actually returns the latest resource so the status can just use it, but that's for a separate PR.

Three review threads on pkg/operator/ceph/file/controller.go (outdated, resolved).