
RGW-NFS: Use rados cluster_stat to report filesystem usage #20093

Merged
merged 1 commit into ceph:master on Feb 20, 2018

Conversation

@supriti supriti commented Jan 24, 2018

Partially fixes: http://tracker.ceph.com/issues/22202

Signed-off-by: Supriti Singh <supriti.singh@suse.com>

supriti commented Jan 24, 2018

@mattbenjamin please review

@mattbenjamin mattbenjamin left a comment

When I ran this change under gdb, I naively checked the value of stats after executing RGWGetClusterStatReq, and found it to be all-0. I have objects in this cluster; it's over a week old. Can you think of a reason for that?

Thread 1 "ganesha.nfsd" hit Breakpoint 2, rgw_statfs (rgw_fs=<optimized out>, parent_fh=<optimized out>, vfs_st=0x7fffffffd490, flags=<optimized out>)
at /home/mbenjamin/ceph-noob/src/rgw/rgw_file.cc:1630
1630 if (rc < 0) {
(gdb) p rc
$1 = <optimized out>
(gdb) list
1625   RGWLibFS *fs = static_cast<RGWLibFS*>(rgw_fs->fs_private);
1626 struct rados_cluster_stat_t stats;
1627
1628 RGWGetClusterStatReq req(fs->get_context(), fs->get_user(),stats);
1629 int rc = rgwlib.get_fe()->execute_req(&req);
1630 if (rc < 0) {
1631 lderr(fs->get_context()) << "ERROR: getting total cluster usage"
1632 << cpp_strerror(-rc) << dendl;
1633 return rc;
1634 }

(gdb) n
1644 vfs_st->f_bsize = 1 << CEPH_BLOCK_SHIFT;
(gdb) p stats
$3 = {kb = 0, kb_used = 0, kb_avail = 0, num_objects = 0}

vfs_st->f_bavail = UINT64_MAX;
vfs_st->f_files = 1024; /* object count, do we have an est? */
vfs_st->f_ffree = UINT64_MAX;
/*

@mattbenjamin commented inline:

It doesn't seem like we should set a blocksize of 4M as of now; 1M is the most common client default for rwsize. Separately, you have reported problems using this value. In the upcoming nfs writeback changeset, I dimension into 4M extents (== chunks), but subdivide each extent into 1M pages.

* blocks. We use 4MB only because it is big enough, and because it
* actually *is* the (ceph) default block size.
*/
const int CEPH_BLOCK_SHIFT = 22;

@mattbenjamin commented inline:

I have a better source for this constant later (in the extent package); good to keep it at local scope. Can we make it constexpr? Can we make it uint32_t?
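
For illustration, here is a minimal sketch of the stat-to-statvfs mapping under review, assuming rados_cluster_stat_t reports sizes in KiB (as its field names suggest) and using the constexpr uint32_t form suggested above. The helper name fill_statvfs and the exact field choices (e.g. using num_objects for f_files, which the inline comment above asks about) are illustrative, not the merged code:

// Hypothetical sketch: map librados cluster stats onto statvfs-style fields.
// stats.kb and stats.kb_avail are KiB counts (assumption); one 4 MiB block is
// 2^12 KiB, so shifting by (CEPH_BLOCK_SHIFT - 10) converts KiB to blocks.
constexpr uint32_t CEPH_BLOCK_SHIFT = 22;  // 1 << 22 == 4 MiB

static void fill_statvfs(const struct rados_cluster_stat_t& stats,
                         struct rgw_statvfs* vfs_st)
{
  vfs_st->f_bsize  = 1 << CEPH_BLOCK_SHIFT;                      // block size
  vfs_st->f_blocks = stats.kb >> (CEPH_BLOCK_SHIFT - 10);        // total blocks
  vfs_st->f_bfree  = stats.kb_avail >> (CEPH_BLOCK_SHIFT - 10);  // free blocks
  vfs_st->f_bavail = stats.kb_avail >> (CEPH_BLOCK_SHIFT - 10);  // avail to unprivileged
  vfs_st->f_files  = stats.num_objects;  // object count stands in for inodes
  vfs_st->f_ffree  = UINT64_MAX;         // no meaningful inode limit
}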

@@ -1622,16 +1622,31 @@ int rgw_statfs(struct rgw_fs *rgw_fs,
struct rgw_statvfs *vfs_st, uint32_t flags)
{
RGWLibFS *fs = static_cast<RGWLibFS*>(rgw_fs->fs_private);
struct rados_cluster_stat_t stats;

RGWGetClusterStatReq req(fs->get_context(), fs->get_user(),stats);

@mattbenjamin commented inline:

space before stats
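
For reference, the corrected call, as it appears in the updated patch later in this thread, reads:

RGWGetClusterStatReq req(fs->get_context(), fs->get_user(), stats);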

@mattbenjamin mattbenjamin self-assigned this Jan 26, 2018

supriti commented Jan 29, 2018

@mattbenjamin
I was testing using vstart and running a ganesha instance pointing at the vstart ceph.conf. Running "df -h" on the mount point shows the usage.

I also ran with gdb: I started the ganesha process, attached gdb to it, and set a breakpoint at rgw_statfs.
I can see the right stats in the stack trace.
(gdb) l
1619     get filesystem attributes
1620  */
1621  int rgw_statfs(struct rgw_fs *rgw_fs,
1622                 struct rgw_file_handle *parent_fh,
1623                 struct rgw_statvfs *vfs_st, uint32_t flags)
1624  {
1625    RGWLibFS *fs = static_cast<RGWLibFS*>(rgw_fs->fs_private);
1626    struct rados_cluster_stat_t stats;
1627
1628    RGWGetClusterStatReq req(fs->get_context(), fs->get_user(), stats);
(gdb) n
1625    RGWLibFS *fs = static_cast<RGWLibFS*>(rgw_fs->fs_private);
(gdb) n
1628    RGWGetClusterStatReq req(fs->get_context(), fs->get_user(), stats);
(gdb) n
1625    RGWLibFS *fs = static_cast<RGWLibFS*>(rgw_fs->fs_private);
(gdb) n
1628    RGWGetClusterStatReq req(fs->get_context(), fs->get_user(), stats);
(gdb) n
1629    int rc = rgwlib.get_fe()->execute_req(&req);
(gdb) n
1628    RGWGetClusterStatReq req(fs->get_context(), fs->get_user(), stats);
(gdb) p stats
$1 = {kb = 140405592657808, kb_used = 4416878, kb_avail = 0, num_objects = 29362600}
(gdb)

Partially fixes: http://tracker.ceph.com/issues/22202

Signed-off-by: Supriti Singh <supriti.singh@suse.com>

supriti commented Jan 31, 2018

@mattbenjamin addressed your comments and submitted a new patch. Please check.

@mattbenjamin mattbenjamin left a comment

lgtm--I'll re-test later today, should be fine

supriti commented Feb 5, 2018

@mattbenjamin ping. Were you able to test this patch?

@mattbenjamin mattbenjamin commented

@supriti I've retested. In my environment, I continue to see 0 values reported from rados::cluster_stat()--having said that, I see nothing wrong with the logic being executed, and I guess I have to assume my test cluster is reporting 0-stats.

@mattbenjamin mattbenjamin commented

@supriti this works beautifully; note to self: don't forget to run ceph-mgr ;)
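
The note above is the key to the earlier all-zero readings: the cluster usage statistics flow through ceph-mgr, so without a running mgr the values come back as zero. A quick way to check what librados itself reports, independent of ganesha, is a small stand-alone client such as the sketch below; the file name, build line, and client id are illustrative assumptions:

// check_cluster_stat.cc -- print raw librados cluster stats (hypothetical helper).
// Build (paths may vary): g++ check_cluster_stat.cc -lrados -o check_cluster_stat
#include <rados/librados.hpp>
#include <iostream>

int main()
{
  librados::Rados cluster;
  if (cluster.init("admin") < 0 ||           // client id; adjust to your keyring
      cluster.conf_read_file(nullptr) < 0 || // default ceph.conf search path
      cluster.connect() < 0) {
    std::cerr << "failed to connect to cluster" << std::endl;
    return 1;
  }

  librados::cluster_stat_t stats;
  int rc = cluster.cluster_stat(stats);
  if (rc < 0) {
    std::cerr << "cluster_stat failed: " << rc << std::endl;
  } else {
    // All-zero output on a cluster that holds data suggests ceph-mgr
    // is not running, matching the behavior observed in this thread.
    std::cout << "kb=" << stats.kb
              << " kb_used=" << stats.kb_used
              << " kb_avail=" << stats.kb_avail
              << " num_objects=" << stats.num_objects << std::endl;
  }
  cluster.shutdown();
  return rc < 0 ? 1 : 0;
}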

@mattbenjamin mattbenjamin merged commit 11526c6 into ceph:master Feb 20, 2018