
mds: [TRACKER-58216] quota.max_files check when prepare_new_inode #49326

Closed

Conversation

@jinmyeonglee (Contributor) commented Dec 8, 2022

https://tracker.ceph.com/issues/58216

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

github-actions bot added the cephfs (Ceph File System) label Dec 8, 2022
Signed-off-by: Jinmyeong Lee <jinmyeong.lee@linecorp.com>
jinmyeonglee force-pushed the 58216_Add-mds-quota-limit-check branch from 09a9250 to c087e2e on December 8, 2022 08:44
jinmyeonglee changed the title from "[cephfs:MDS]58216 MDS quota.max_files check when prepare_new_inode" to "mds: 58216 MDS quota.max_files check when prepare_new_inode" Dec 8, 2022
jinmyeonglee changed the title from "mds: 58216 MDS quota.max_files check when prepare_new_inode" to "mds: [TRACKER-58216] quota.max_files check when prepare_new_inode" Dec 8, 2022
@jinmyeonglee (Contributor, Author)

jenkins test make check

1 similar comment
@tchaikov (Contributor) commented Dec 9, 2022

jenkins test make check

@vshankar (Contributor) commented Dec 9, 2022

The MDS broadcasts the configured quotas to the clients, where the quota is enforced; that enforcement can lag a bit, and that's expected. Does a sync on the client fail?
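(For readers following along: a minimal C++ sketch of the client-side enforcement model being described here, using illustrative stub types rather than Ceph's actual Client/Inode API. Before creating an entry, the client walks its cached inodes toward the root and fails with EDQUOT if the nearest quota limit is already reached; because the counts are the client's cached view of the rstats, enforcement can lag while updates propagate from the MDS.)

#include <cstdint>

struct InodeStub {
  uint64_t quota_max_files = 0;    // 0 means no max_files quota on this dir
  uint64_t rfiles = 0;             // cached recursive file count (rstat)
  uint64_t rsubdirs = 0;           // cached recursive subdir count (rstat)
  const InodeStub *parent = nullptr;  // nullptr at the filesystem root
};

// Returns true if creating one more entry under 'dir' would exceed a quota;
// the caller should then fail the create with -EDQUOT.
bool quota_files_exceeded(const InodeStub *dir) {
  for (const InodeStub *cur = dir; cur; cur = cur->parent) {
    if (cur->quota_max_files &&
        cur->rfiles + cur->rsubdirs >= cur->quota_max_files)
      return true;
  }
  return false;
}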

@jinmyeonglee (Contributor, Author)

@vshankar Hello,
No, the sync does not fail.

But when copying a large directory that contains more files than the quota allows into a mount point that has not yet hit the quota limit, the lag you mentioned lets the copy create more files than the quota permits.
(This test result is briefly written in the tracker issue.)

I want to know why only the client checks the quota, and not the MDS. Is there any background on that decision?

@vshankar (Contributor) commented Dec 9, 2022

> @vshankar Hello, No, the sync does not fail.
>
> But when copying a large directory that contains more files than the quota allows into a mount point that has not yet hit the quota limit, the lag you mentioned lets the copy create more files than the quota permits. (This test result is briefly written in the tracker issue.)

Which client is this and what version?

@jinmyeonglee (Contributor, Author) commented Dec 12, 2022

@vshankar We are using a ceph-fuse mount and Nautilus 14.2.19.

@vshankar (Contributor)

> @vshankar We are using a ceph-fuse mount and Nautilus 14.2.19.

What are the auth caps for clients? Quota is not enforced when the client has restrictive access to a specific path (e.g. /home/foo) and quota is configured on an ancestor directory (/home).

@jinmyeonglee (Contributor, Author)

> @vshankar We are using a ceph-fuse mount and Nautilus 14.2.19.
>
> What are the auth caps for clients? Quota is not enforced when the client has restrictive access to a specific path (e.g. /home/foo) and quota is configured on an ancestor directory (/home).

We do have the auth caps, and we are using a bind mount.

$ cat /etc/auto.master.d/41d7bfe2491b4fe193cb7d9ed8f8f467_46d34f93-5a08-4803-824e-4a05ba80228d.****

/__managed/46d34f93-5a08-4803-824e-4a05ba80228d -fstype=fuse.ceph,noatime id=46d34f93-5a08-4803-824e-4a05ba80228d,mon_host=*************,client_mountpoint=/volumes/_nogroup/5deba5b9-bf65-457b-9d0b-8a093af1a177,key=******
# ceph auth get client.46d34f93-5a08-4803-824e-4a05ba80228d
exported keyring for client.46d34f93-5a08-4803-824e-4a05ba80228d
[client.46d34f93-5a08-4803-824e-4a05ba80228d]
	key = **********
	caps mds = "allow rw path=/volumes/_nogroup/5deba5b9-bf65-457b-9d0b-8a093af1a177"
	caps mon = "allow r"
	caps osd = "allow rw pool=cephfs_data namespace=fsvolumens_5deba5b9-bf65-457b-9d0b-8a093af1a177"
$ getfattr -n ceph.quota.max_files /__managed/46d34f93-5a08-4803-824e-4a05ba80228d --absolute-name
# file: /__managed/46d34f93-5a08-4803-824e-4a05ba80228d
ceph.quota.max_files="5"

@vshankar (Contributor)

> (previous reply and command output quoted above)

Could you share the output of:

getfattr -n ceph.quota.max_files <mntpt>/volumes/_nogroup/5deba5b9-bf65-457b-9d0b-8a093af1a177

@jinmyeonglee (Contributor, Author)

I am using a bind mount: the real mountpoint is /__managed/46d34f93-5a08-4803-824e-4a05ba80228d, and /__managed/46d34f93-5a08-4803-824e-4a05ba80228d/share is bind-mounted to ~/quota_mv.

And when simply creating files with touch, the quota limit is enforced correctly, so I think the caps are okay. The issue is triggered when running cp -R or mv on the directory.

But if you want, I can test this issue again with a plain ceph-fuse mount (not a bind mount) using the client.admin keyring, both before and after the patch.

@jinmyeonglee (Contributor, Author)

@vshankar I would also like to know why the community decided that only the client enforces quota checking, even with the expected delay.
Is there any background?

@vshankar (Contributor)

> I am using a bind mount: the real mountpoint is /__managed/46d34f93-5a08-4803-824e-4a05ba80228d, and /__managed/46d34f93-5a08-4803-824e-4a05ba80228d/share is bind-mounted to ~/quota_mv.
>
> And when simply creating files with touch, the quota limit is enforced correctly, so I think the caps are okay. The issue is triggered when running cp -R or mv on the directory.

Ugh! Have you tried this with one of the recent releases (pacific/quincy)?

> But if you want, I can test this issue again with a plain ceph-fuse mount (not a bind mount) using the client.admin keyring, both before and after the patch.

Since quotas are enforced when using touch, the issue doesn't seem to be related to an incorrect setup. Don't worry about reproducing.

@jinmyeonglee (Contributor, Author)

> Ugh! Have you tried this with one of the recent releases (pacific/quincy)?

No, I didn't. I would need to prepare a new test cluster with CentOS 8-type servers to test with Pacific.
It will take a few days.

Do you think this could be related to any patch in a recent version?

@vshankar (Contributor)

> Ugh! Have you tried this with one of the recent releases (pacific/quincy)?
>
> No, I didn't. I would need to prepare a new test cluster with CentOS 8-type servers to test with Pacific. It will take a few days.
>
> Do you think this could be related to any patch in a recent version?

Basically, quotas were introduced in the Mimic release, so I presume the feature has stabilized over the years. As for your question on why quotas are enforced on the client: the MDS ensures that clients have a view of the quota realm and can reliably enforce quotas (obviously, with some lag).

@vshankar (Contributor)

@jinmyeonglee Do you have any file-system-wide config set? Could you share the output of ceph fs dump?

@jinmyeonglee (Contributor, Author) commented Dec 14, 2022

> @jinmyeonglee Do you have any file-system-wide config set? Could you share the output of ceph fs dump?

# ceph fs dump
dumped fsmap epoch 49
e49
enable_multiple, ever_enabled_multiple: 0,0
compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'cephfs' (1)
fs_name	cephfs
epoch	49
flags	12
created	2022-12-06 09:52:15.442781
modified	2022-12-06 14:30:28.201134
tableserver	0
root	0
session_timeout	60
session_autoclose	300
max_file_size	1099511627776
min_compat_client	-1 (unspecified)
last_failure	0
last_failure_osd_epoch	105
compat	compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds	1
in	0
up	{0=56243}
failed
damaged
stopped
data_pools	[1]
metadata_pool	2
inline_data	disabled
balancer
standby_count_wanted	1
[mds.cluster-rgw001-jinsample-jp2v-dev{0:56243} state up:active seq 36 addr [v2:10.241.55.81:6800/641438653,v1:10.241.55.81:6801/641438653]]


Standby daemons:

[mds.cluster-mds001-jinsample-jp2v-dev{-1:56268} state up:standby seq 2 addr [v2:10.231.191.157:6800/2379847156,v1:10.231.191.157:6801/2379847156]]
# ceph config show mds.cluster-mds001
NAME                       VALUE                                                                                                                                         SOURCE   OVERRIDES IGNORES
cluster_network            ********/16                                                                                                                                 file
daemonize                  false                                                                                                                                         override
keyring                    $mds_data/keyring                                                                                                                             default
mds_cache_memory_limit     34359738368                                                                                                                                   file
mds_log_events_per_segment 1024                                                                                                                                          file
mds_log_max_segments       512                                                                                                                                           file
mon_host                   [************************] file
public_network             0.0.0.0/0                                                                                                                                     file
rbd_default_features       61                                                                                                                                            default
setgroup                   ceph                                                                                                                                          cmdline
setuser                    ceph                                                                                                                                          cmdline

@vshankar (Contributor)

> (fs dump and MDS config output quoted above)

Nothing stands out as unusual (I had suspected inline data was being used). I recommend trying Quincy (or Pacific) to see if you can reproduce it.

@jinmyeonglee (Contributor, Author)

> Nothing stands out as unusual (I had suspected inline data was being used). I recommend trying Quincy (or Pacific) to see if you can reproduce it.

Thanks for checking. I will share the test results with Pacific as soon as possible.

@jinmyeonglee (Contributor, Author)

@vshankar Hello, I tested with Pacific (16.2.9) and confirmed the same issue in this release.

pacific-release branch

[root@bb08c0cbc429 volume_mount]# getfattr -n ceph.quota.max_files ./
# file: .
ceph.quota.max_files="8"


[root@bb08c0cbc429 volume_mount]# cp -R /home/big_dir/ ./

[root@bb08c0cbc429 volume_mount]# tree | wc -l
344

[root@bb08c0cbc429 build]# ./bin/ceph-fuse --version
ceph version 16.0.0-13701-g4c3647a322c (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)

pacific-release + MDS Quota Limit

[root@72e1dc6a8dae volume_mount]# getfattr -n ceph.quota.max_files ./
# file: .
ceph.quota.max_files="8"

[root@72e1dc6a8dae volume_mount]# cp -R /home/big_dir/ ./

[root@72e1dc6a8dae volume_mount]# tree
.
└── big_dir
    ├── file_1
    ├── file_2
    ├── file_3
    ├── file_4
    ├── file_5
    ├── file_6
    └── file_7

1 directory, 7 files

[root@72e1dc6a8dae build]# ./bin/ceph-fuse --version
ceph version 16.0.0-13704-g27ffe06ee11 (27ffe06ee11f27a8c84c64c9bfa38be64fd2a04f) pacific (stable)

  if (cur->inode->get_projected_inode()->quota.is_enable()) {
    return cur;
  }
  cur = cur->get_parent_dir();
@jinmyeonglee (Contributor, Author) commented on the diff hunk above:

@vshankar Additionally, I confirmed that cur can be null at this line, so I am implementing a fix for that.
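(For illustration, a self-contained C++ sketch of what a null-safe version of that walk could look like, with stub types standing in for Ceph's CDir/CInode classes. This is an assumption about the shape of the fix, not the PR's actual code.)

struct CInodeStub;
struct CDirStub {
  CInodeStub *inode;
  CDirStub *parent;
  CDirStub *get_parent_dir() const { return parent; }  // nullptr at the root
};
struct CInodeStub {
  bool quota_enabled;  // stands in for get_projected_inode()->quota.is_enable()
};

// Returns the nearest quota-enabled ancestor, or nullptr if the walk
// reaches the root without finding one (the case that dereferenced null
// in the original loop).
CDirStub *find_quota_root(CDirStub *cur) {
  while (cur) {
    if (cur->inode->quota_enabled)
      return cur;
    cur = cur->get_parent_dir();
  }
  return nullptr;  // caller must handle the no-quota case
}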

@vshankar (Contributor) commented Jan 2, 2023

> @vshankar Hello, I tested with Pacific (16.2.9) and confirmed the same issue in this release.

I was on year-end PTO. Sorry for the delayed reply.

If this is seen in Pacific, then I guess you might be hitting a bug somewhere. As far as this change is concerned, the underlying issue is that the client is (for some reason) not enforcing the quota, which it really should, albeit after some delay. Do you have debug client logs to look at?

@jinmyeonglee (Contributor, Author) commented Jan 9, 2023

> I guess you might be hitting a bug somewhere

Well, as you said in an earlier comment, the delay is allowed, and creating more files than the quota is expected.
(And, if my analysis is correct, the delay comes from the time it takes to get the rstat info from the MDS.)

But I want to suggest limiting the quota more strictly, which is why I added the quota-checking logic in the MDS.

I think this is quite easy to reproduce, so I guess you could confirm the same issue with an unpatched MDS. 😉

Anyway, I will share the raw client-side logs as a file.
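(For context, a rough sketch of the comparison an MDS-side max_files check would make once a quota root has been found, e.g. by an ancestor walk like the one in the diff hunk above. The names are stubs and the shape is assumed; the actual PR hooks into the create path, prepare_new_inode, per the title.)

#include <cstdint>

// Stub for the quota-relevant state on a quota-root directory inode.
struct QuotaRootStub {
  uint64_t max_files;  // ceph.quota.max_files on this directory
  uint64_t rfiles;     // recursive file count from the MDS's own rstat
  uint64_t rsubdirs;   // recursive subdir count from the MDS's own rstat
};

// Would creating one more inode under this quota root exceed max_files?
// If so, the create path should fail with -EDQUOT instead of journaling
// the new inode.
bool would_exceed_max_files(const QuotaRootStub &root) {
  return root.max_files != 0 &&
         root.rfiles + root.rsubdirs + 1 > root.max_files;
}

(Note that the MDS's own rstat also propagates with a delay, per the mds_dirstat_min_interval discussion later in this thread, so even a server-side check narrows the over-quota window rather than closing it entirely.)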

@jinmyeonglee (Contributor, Author) commented Jan 10, 2023

client_log.txt

This is the output of ceph-fuse -d --debug-client=20 --debug-ms=1 2>&1 | tee log.txt against an unpatched MDS, which can create more files than the quota allows.

@vshankar (Contributor)

> client_log.txt
>
> This is the output of ceph-fuse -d --debug-client=20 --debug-ms=1 2>&1 | tee log.txt against an unpatched MDS, which can create more files than the quota allows.

Thanks for sharing the log.

@jinmyeonglee (Contributor, Author)

@vshankar Hello, I do not want to rush you, but is there any update or finding from checking the logs?

@vshankar (Contributor) commented Feb 1, 2023

> @vshankar Hello, I do not want to rush you, but is there any update or finding from checking the logs?

@mchangir is looking into this. I believe we'll have an update on this soon. However, I still don't think we need to introduce checks on the MDS.

@vshankar (Contributor) commented Feb 1, 2023

@jinmyeonglee @mchangir mentioned that the MDS broadcasts quota (tree) information to all clients except the client doing the quota xattr update (since that client would have the relevant caps anyway). There is probably a bug in the client where it is not recording the updated quota setting (max_files in this case), causing it to not enforce the limits.

@mchangir (Contributor) commented Feb 1, 2023

There's an mds_dirstat_min_interval config option that delays the propagation of rstat up the tree for the configured interval, or when nest locks can't be grabbed on the current path.

In my tests, when mds_dirstat_min_interval is set to zero for the entire cluster, the quota is enforced at the exact value of ceph.quota.max_files, without any delay.

So there doesn't seem to be any bug on the client side, as far as I can tell.
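(For anyone wanting to try this: assuming a release with the centralized config store, the option can be set cluster-wide with the standard config command; the trade-off is more frequent rstat propagation work on the MDS.)

$ ceph config set mds mds_dirstat_min_interval 0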

@vshankar (Contributor) commented Feb 1, 2023

> (mchangir's comment above, about mds_dirstat_min_interval)

@jinmyeonglee provided client logs here - #49326 (comment)

Would be interesting to see what's going on there...

@vshankar (Contributor) commented Feb 1, 2023

@jinmyeonglee @mchangir from the client log @jinmyeonglee shared:

2023-01-10T06:29:04.527+0000 7ff4f1ffb700  8 client.4346 _create(0x10000000005 file_193, 0100644)
2023-01-10T06:29:04.527+0000 7ff4f1ffb700 10 client.4346 get_quota_root realm 0x10000000003
2023-01-10T06:29:04.527+0000 7ff4f1ffb700 10 client.4346 get_quota_root 0x10000000005.head -> 0x10000000003.head
   unique: 6880, error: -122 (Disk quota exceeded), outsize: 16

2023-01-10T06:29:04.527+0000 7ff4f1ffb700  8 client.4346 _ll_create 0x10000000005.head file_193 0100644 193 = -122 (0 0)

-EDQUOT is returned for file_193 under big_dir.

@mchangir (Contributor) commented Feb 1, 2023

@jinmyeonglee are you using Nautilus 14.2.19 for both the fuse client and the server?

Because I can see ceph version 16.0.0-13701-g4c3647a322c (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable) as the fuse client version in the client logs.

Just want to double-check the versions.

@jinmyeonglee (Contributor, Author) commented Feb 2, 2023

> @jinmyeonglee are you using Nautilus 14.2.19 for both the fuse client and the server?
>
> Because I can see ceph version 16.0.0-13701-g4c3647a322c (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable) as the fuse client version in the client logs.
>
> Just want to double-check the versions.

We are using Nautilus in our service, but vshankar asked me to test with Pacific (or Quincy), so I provided the logs from a Pacific cluster and client.

I think there is a little misunderstanding.

> So there doesn't seem to be any bug on the client side, as far as I can tell.

I did not say the delay is a bug. (I mentioned that it is not a bug here: #49326 (comment))

As the official documentation (https://docs.ceph.com/en/latest/cephfs/quota/) says, some delay is expected and allowed:

  1. Quotas are imprecise. Processes that are writing to the file system will be stopped a short time after the quota limit is reached. They will inevitably be allowed to write some amount of data over the configured limit. How far over the quota they are able to go depends primarily on the amount of time, not the amount of data. Generally speaking writers will be stopped within 10s of seconds of crossing the configured limit.

So I wanted to suggest limiting the quota more strictly, which is why I added the quota-checking logic in the MDS.

I will test the mds_dirstat_min_interval config. Thanks for sharing!
(It would be great if this were mentioned in the official documentation above.)

Anyway, it would be good if the quota could be enforced more strictly.
But I am a little afraid that mds_dirstat_min_interval = 0 will cause performance degradation.

@vshankar (Contributor) commented Feb 2, 2023

@jinmyeonglee quick question: you did see the EDQUOT errno when creating files, albeit after a delay?

@jinmyeonglee (Contributor, Author)

> @jinmyeonglee quick question: you did see the EDQUOT errno when creating files, albeit after a delay?

Yes, after a delay, I got the EDQUOT. (#49326 (comment))
The number of files in big_dir was more than 1000, so subsequent file-creation requests got the errno after the 344th file was created.

My question was about why the MDS allows the delay, and allows creating more files than the quota; that is why I suggested checking the quota on the MDS side.

@vshankar (Contributor) commented Feb 2, 2023

> @jinmyeonglee quick question: you did see the EDQUOT errno when creating files, albeit after a delay?
>
> Yes, after a delay, I got the EDQUOT. (#49326 (comment)) The number of files in big_dir was more than 1000, so subsequent file-creation requests got the errno after the 344th file was created.

The tracker you raised does not mention that (https://tracker.ceph.com/issues/58216) - there is no mention that EDQUOT is seen.

> My question was about why the MDS allows the delay, and allows creating more files than the quota; that is why I suggested checking the quota on the MDS side.

This is by design, as quota is enforced by the client. I don't fully recall why this path was chosen; it was related to quota trees, IIRC.

@jinmyeonglee (Contributor, Author) commented Feb 2, 2023

Well, first of all, if I confused you, I am really sorry.
#49326 (comment)

As in my very early comment, I did not say it is a bug or wrong behavior.
I just asked why this design was chosen, and suggested enforcing the quota from the MDS side instead.

> I want to know why only the client checks the quota, and not the MDS. Is there any background on that decision?

You did not answer that question, and now you are saying you cannot recall the whole history.
Then, one more time, I just want to kindly ask the community: how about checking quota.max_files from the MDS side? Is it a better approach?

@vshankar (Contributor) commented Feb 2, 2023

> Well, first of all, if I confused you, I am really sorry. #49326 (comment)
>
> As in my very early comment, I did not say it is a bug or wrong behavior. I just asked why this design was chosen, and suggested enforcing the quota from the MDS side instead.

It's a misunderstanding then, not anybody's fault. To me it seemed like the quota was not being enforced no matter how many files you created. But since that's not the case, it's not a bug per the design.

> I want to know why only the client checks the quota, and not the MDS. Is there any background on that decision?
>
> You did not answer that question, and now you are saying you cannot recall the whole history.

That's because I was not involved with CephFS development when quota was designed and implemented. I could only check the commit history and mail archives, or ask other devs who were involved. But since my interpretation of this issue was that quotas were not working at all, my priority was to see if any bugs were lurking that needed immediate attention.

> Then, one more time, I just want to kindly ask the community: how about checking quota.max_files from the MDS side? Is it a better approach?

Check out this thread - https://www.spinics.net/lists/ceph-devel/msg39432.html

Basically, a server-enforced quota restriction was considered, but since such a design would still need the client to do the (quota tree) checks anyway, client-side enforcement was chosen.

@mchangir (Contributor) commented Feb 8, 2023

Looking at the old discussion thread about the quota design and implementation, it looks like client-based quota enforcement was chosen for its distributed nature, which helps reduce contention and load on the server (MDS) side.

@jinmyeonglee Did mds_dirstat_min_interval = 0 work out any better for you?
It does have performance implications, as you said earlier.
If possible, you could benchmark the performance of this PR against simply using mds_dirstat_min_interval = 0.

@jinmyeonglee (Contributor, Author)

@mchangir Thanks for your suggestion.

> If possible, you could benchmark the performance of this PR against simply using mds_dirstat_min_interval = 0.

I will benchmark it and share the results. (I will use the fio and smallfile tools.)
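(As a reference point, a metadata-heavy benchmark for this scenario would typically be a smallfile create run; one possible invocation, with flags per the upstream smallfile tool and an illustrative mount path, is:)

$ python smallfile_cli.py --operation create --threads 4 --file-size 4 --files 10000 --top /mnt/cephfs/bench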

@vshankar (Contributor)

> (previous comment quoted above)

@jinmyeonglee I guess this change can be closed? If you need to continue the discussion, then please use the tracker or the mailing list.

@vshankar (Contributor)

> (previous comment quoted above)

@jinmyeonglee ping?

@jinmyeonglee (Contributor, Author)

> (previous comments quoted above)

@vshankar Oh, sorry for the late response.
Yes, it is okay to close; when I need to discuss this further, I will reopen it. Thanks a lot!

vshankar closed this Mar 14, 2023