ls command gets in infinite loop #578

Closed
rvelasquezsos opened this issue May 14, 2020 · 15 comments

Comments

@rvelasquezsos

Hi, I have a Ganesha gateway with FSAL_CEPH. One export directory holds around 1,000,000 files, about 1.5 TB in total.
When I run 'ls' from the console I get the error "memory exhausted", and in top I can see the ls process reach 100% CPU and memory.
When I run 'ls -f -1' it prints the file names but never finishes; it gets into an infinite loop, repeating the files over and over.

Thanks for your help
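
The repeating listing described above can be confirmed without waiting on an endless 'ls'. The following is a minimal sketch, not part of the original report: it streams the directory entries unsorted (similar in spirit to 'ls -f -1') and stops at the first name it has already seen. The function name `find_first_repeat` is hypothetical.

```python
import os


def find_first_repeat(path):
    """Stream the entries of `path` and stop at the first repeated name.

    Returns (entries_seen, repeated_name). repeated_name is None when the
    listing ends normally without any name appearing twice.
    """
    seen = set()
    count = 0
    # os.scandir yields entries in directory order without sorting them,
    # so names can be checked as they arrive from the server.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.name in seen:
                # A name coming back a second time means the listing loops.
                return count, entry.name
            seen.add(entry.name)
            count += 1
    return count, None
```

Calling `find_first_repeat` on the NFS mount point of the export would return the number of distinct names read before the first repeat. On a healthy mount the second element is `None` and the count matches the number of files.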

@dang
Contributor

dang commented May 14, 2020

What versions of Ganesha and Ceph?

@rvelasquezsos
Author

Ceph Nautilus 14.2.5
Ganesha 2.8.3
The problem occurs in exports with NFS v3 and v4

Thanks

@dang
Contributor

dang commented May 15, 2020

Unfortunately, looping readdir issues on large directories are very annoying to debug. To my knowledge, no one on the dev team has run such a large setup on FSAL_CEPH, although we have on FSAL_GLUSTER, FSAL_VFS, and FSAL_RGW. This makes me think it may be some interaction between Ganesha and CephFS. I'll see if I can set up such a cluster, but it may take me a bit due to my day job.
@jtlayton Have you run large directories on CephFS? Do you have a setup that can test this reasonably easily?

@rvelasquezsos
Author

Thank you @dang
I appreciate whatever you can do. I will wait for your update.

@jtlayton
Contributor

Not at the moment, but I may be able to set something up next week. I'll see what I can do.

@jtlayton
Contributor

jtlayton commented May 18, 2020

Ok, I built ganesha (from next branch) and pointed it at my octopus cluster-in-a-box. From a ceph client, I created a directory with 1,000,000 files in it. I then had an NFS (v4.2) client do a 'ls -1' in that directory. It took a few minutes, but it did eventually return -- no looping.

Is there a way you can test a more recent ganesha? Maybe v3.2-ish? It's possible that this is a bug in an older version that is already fixed upstream. Also, v14.2.5 is pretty old; you may want to see about updating the libcephfs library you're using with FSAL_CEPH to something newer -- I was testing v14.2.9 as it's what I had.
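
The reproduction attempt described above (fill one directory with 1,000,000 files, then list it from an NFS client) could be scripted roughly as follows. This is a sketch under assumptions: the helper name `populate`, the file-name pattern, and the use of empty files are illustrative choices and are not taken from the test that was actually run.

```python
import os


def populate(directory, count, prefix="file"):
    """Create `count` empty files named <prefix>-<index> inside `directory`.

    Returns the number of files created.
    """
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        name = os.path.join(directory, f"{prefix}-{i:07d}")
        # Opening in append mode and closing immediately leaves an empty
        # file behind and does not truncate a file that already exists.
        with open(name, "a"):
            pass
    return count
```

For a test like the one in this thread, `populate` would be run with `count=1_000_000` against a directory on a CephFS mount, and the listing would then be done from a separate NFS client through the Ganesha export.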

@rvelasquezsos
Author

Hi @jtlayton, thanks for your reply. I'll try to update the whole cluster and repeat the test.
It is currently in production, so I need a maintenance window to do the upgrade.

Thanks again

@jtlayton
Contributor

No problem. If you don't want to do all of that, just updating the parts that ganesha needs would be a good first step. In particular, updating libcephfs2 and the ganesha binaries on the host running ganesha is where I'd start. If that doesn't help, then maybe we can look a bit more closely -- it's possible there's something in particular about this directory that is causing the loops.

@rvelasquezsos
Author

Hi, I updated the cluster and the problem is gone.

Thanks, I'm going to close the issue

@jtlayton
Contributor

jtlayton commented Jun 1, 2020

Great news. Thanks for following up!

@jtlayton
Contributor

jtlayton commented Jun 9, 2020

@rvelasquezsos , could you say what versions you updated to? Both the libcephfs and nfs-ganesha versions would be of interest. tia!

@rvelasquezsos
Author

Hi @jtlayton, I updated only nfs-ganesha.

Thanks

@jtlayton
Contributor

jtlayton commented Jun 9, 2020

Do you recall the version? It'd be helpful for us to know which one fixed it for you.

@jtlayton
Contributor

@rvelasquezsos : ping? To what version of ganesha did you update?

@rvelasquezsos
Author

Hi @jtlayton I upgraded to 3.1
