Fails to connect to Amazon ElasticFileSystem (EFS) #30

Open
zachl123 opened this issue Apr 6, 2018 · 8 comments

zachl123 commented Apr 6, 2018

Hi There,
I'm trying to connect to EFS using libnfs. My EC2 instance can mount EFS using the standard NFS tools, but when I test with:
LD_LIBRARY_PATH=/usr/local/lib LD_NFS_DEBUG=9 LD_PRELOAD=./ld_nfs.so cp /etc/hosts nfs://fs-XXXXXXX.efs.us-west-2.amazonaws.com/test.txt
it hangs while trying to connect to the RPC portmapper on port 111 of the EFS host.
(strace: connect(3, {sa_family=AF_INET, sin_port=htons(111), sin_addr=inet_addr("10.0.0.XXX")}, 16) = -1 EINPROGRESS (Operation now in progress) )
EFS only listens on port 2049, which doesn't appear to be a problem for the native NFS tools. Is there a way to tell libnfs to skip the portmap step?
I saw in another ticket that examples/portmap-client might be used to set the portmap manually; perhaps that would help here. Could you give me the specific syntax?
Eventually I'm hoping to access this from a Lambda function without root access, so a solution that doesn't require root is highly desirable.

Thank you in advance for your help!!
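
The symptom described above (a connect to port 111 that never completes, while the server only listens on 2049) can be checked without root before libnfs is involved at all. The following is a minimal sketch using only the Python standard library; it is not part of libnfs, and the helper name `port_is_open` is an assumption made for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        # create_connection raises OSError on refusal, name-resolution failure, or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Given the behaviour reported here, calling `port_is_open(host, 111)` against the EFS mount target would be expected to return False after the timeout, and `port_is_open(host, 2049)` to return True, though that has not been verified as part of this thread.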

Owner

sahlberg commented Apr 8, 2018

Please try the current master branch for libnfs.

I have added support for nfsport= as well as mountport= arguments.
This allows you to bypass portmapper completely and connect directly to the specified port for MOUNT (only used in NFSv3) and directly to the specified NFS port.

The syntax is like this (quoted so that the shell does not interpret the ? and & characters):
./utils/nfs-ls 'nfs://127.0.0.1/data/sahlberg/fuse?version=4&nfsport=2049'

You want to set version=4 since libnfs defaults to version 3.
In version 3, the server needs to listen on at least two ports (four if you use locking),
while in NFSv4 a single port is sufficient, most often 2049.
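
As an illustration of the URL syntax described above, here is a small sketch that assembles such a URL and quotes it for use on a shell command line. The helper `build_nfs_url` is hypothetical and not part of libnfs; only the argument names `version`, `nfsport` and `mountport` come from this comment.

```python
import shlex
from urllib.parse import urlencode


def build_nfs_url(host, path, version=4, nfsport=2049, mountport=None):
    """Build an nfs:// URL carrying the version/nfsport/mountport arguments.

    path is used verbatim and should start with '/'.
    """
    params = {"version": version, "nfsport": nfsport}
    if mountport is not None:
        # MOUNT is a separate protocol only in NFSv3, so this argument is optional.
        params["mountport"] = mountport
    return "nfs://{}{}?{}".format(host, path, urlencode(params))


url = build_nfs_url("127.0.0.1", "/data/sahlberg/fuse")
print(url)
# Quote the URL so that a shell passes '?' and '&' through literally.
print(shlex.quote(url))
```

The quoting step matters in practice: an unquoted `&` would make most shells run the command in the background and treat the rest of the URL as a separate command.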

Author

zachl123 commented Apr 8, 2018 via email

Author

zachl123 commented Apr 9, 2018

I think it's really close - I'm able to ls and cat files, and I can also cp them from the NFS share.
But I can't seem to write to them. Writing does create the file, but the contents are never written.
I'm running:
LD_LIBRARY_PATH=./lib/.libs LD_NFS_DEBUG=9 LD_PRELOAD=./ld_nfs.so cp -f /etc/hosts 'nfs://10.0.1.231//test3?version=4&nfsport=2049'
and I get:
`ld_nfs: __xstat(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1)
ld_nfs: open(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1, 0, 0)
ld_nfs: Failed to open nfs file : open call failed with "NFS4: (path /) failed with NFS4ERR_NOENT(-2)"

ld_nfs: __xstat(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1)
ld_nfs: open(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1, 0, 0)
ld_nfs: Failed to open nfs file : open call failed with "NFS4: (path /) failed with NFS4ERR_NOENT(-2)"

ld_nfs: open(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1, c1, 644)
ld_nfs: open(nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1) == 4
ld_nfs: __fxstat(4)
ld_nfs: __fxstat(4) success
ld_nfs: write(fd:4 count:65536)
cp: error writing ‘nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1’: Operation not permitted
cp: failed to extend ‘nfs://10.0.1.231//test1?version=4&nfsport=2049&autoreconnect=-1’: Operation not permitted
ld_nfs: close(4)`

I've experimented with different uid/gid combinations, but I don't think it is that or an insecure mounting issue.
I've traced the error in the code to:
zdr_COMPOUND4args->libnfs_zdr_array
in the following loop:
for (i = 0; i < (int)*size; i++) {
    if (!proc(zdrs, *arrp + i * elsize)) {
        return FALSE;
    }
}

Thoughts?
Thanks!

Author

zachl123 commented Apr 9, 2018

UPDATE:
I can write files up to 3840 bytes.
The first write fails, but creates the file:

[ec2-user]$ LD_LIBRARY_PATH=./lib/.libs LD_PRELOAD=./ld_nfs.so cp <(head -c3840 /dev/random) 'nfs://10.0.XXX.XXX//test?version=4&nfsport=2049'
cp: error writing ‘nfs://10.0.XXX.XXX//test?version=4&nfsport=2049’: Numerical result out of range
cp: failed to extend ‘nfs://10.0.XXX.XXX//test?version=4&nfsport=2049’: Numerical result out of range
[ec2-user]$ ls -l ../mnt
total 4
-rw------- 1 ec2-user ec2-user 0 Apr 9 07:14 test

The second attempt properly writes:

[ec2-user]$ LD_LIBRARY_PATH=./lib/.libs LD_PRELOAD=./ld_nfs.so cp <(head -c3840 /dev/random) 'nfs://10.0.XXX.XXX//test?version=4&nfsport=2049'
[ec2-user libnfs]$ ls -l ../mnt
total 4
-rw------- 1 ec2-user ec2-user 3840 Apr 9 07:14 test

And here is with more than 3840 bytes:

[ec2-user]$ LD_LIBRARY_PATH=./lib/.libs LD_PRELOAD=./ld_nfs.so cp <(head -c3841 /dev/random) 'nfs://10.0.XXX.XXX//test?version=4&nfsport=2049'
cp: error writing ‘nfs://10.0.XXX.XXX//test?version=4&nfsport=2049’: Operation not permitted
cp: failed to extend ‘nfs://10.0.XXX.XXX//test?version=4&nfsport=2049’: Operation not permitted

Perhaps this is just a problem with the LD_PRELOAD test script. I will verify and comment.

Author

zachl123 commented Apr 9, 2018

Is it possible to pass the nfs version/port to libnfs-python in the context-free mode? (I'm able to get it to work when I create a context).
Using libnfs-python, I am able to create files larger than 3840 bytes, but each individual write must be no larger than 3840 bytes.
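
The workaround described above, keeping every individual write at or below 3840 bytes, can be sketched as a small helper. This is only an illustration: `write_in_chunks` and `RecordingFile` are hypothetical names, and the file handle is assumed to expose a `write` method as ordinary Python file objects do. Whether a particular libnfs-python version behaves identically is not verified here.

```python
MAX_WRITE = 3840  # largest single write reported to succeed in this thread


def write_in_chunks(fh, data, chunk_size=MAX_WRITE):
    """Write bytes to fh in pieces of at most chunk_size; return total bytes written."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fh.write(chunk)
        total += len(chunk)
    return total


class RecordingFile:
    """Stand-in for a writable file handle; records the size of each write."""

    def __init__(self):
        self.sizes = []

    def write(self, chunk):
        self.sizes.append(len(chunk))
        return len(chunk)


fh = RecordingFile()
written = write_in_chunks(fh, b"x" * 10000)
print(written, fh.sizes)
```

With the stand-in handle, a 10000-byte payload is split into writes of 3840, 3840 and 2320 bytes. In real use the stand-in would be replaced by a handle opened on the NFS share.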


vlaero commented Apr 9, 2018

I came across this issue report because I'm interested in the same use case as zachl123: I'd like to be able to mount NFS inside Amazon Linux 1 and Lambda. I currently have a workload that interacts with a GlusterFS share, but I have an eye to moving this to AWS, where EFS might be used instead of Gluster.
We don't currently have nfs-ganesha-gluster installed in our setup, but I will look at configuring it, as it should then allow NFSv4 mounting of the Gluster volumes.
@zachl123 - I'm interested to see your steps for getting this going.
I've compiled libnfs and checked out this repo, so I have all the required client parts.
I suspect that if this can be made to work and is well documented, it may become the de facto way to interact with NFS under AWS Lambda.

Owner

sahlberg commented Apr 12, 2018 via email

Author

zachl123 commented May 4, 2020

After two years, I'm back looking at this.
For the record, the network trace showed nothing: the larger write just never got sent. But I did find where the issue is.
In lib/libnfs-zdr.c, the function libnfs_zdr_bytes checks the size:
if (zdrs->pos + (int)*size > zdrs->size) {
    return FALSE;
}

It looks like zdrs->size is always allocated to be 4096 (ZDR_ENCODEBUF_MINSIZE). There is code in lib/pdu.c (rpc_allocate_pdu2) to provide an alloc_hint, but this is called from rpc_allocate_pdu with the alloc_hint hardcoded to 0.

As a POC, I have changed the ZDR_ENCODEBUF_MINSIZE to 1MB and I can now write 1MB blocks. Changing it to anything much larger fails again, but in a different place. Hopefully this points you in the right direction.

Thanks!
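
To make the failure mode in the comment above concrete, here is a toy model of the quoted bounds check, written in Python rather than C so it runs on its own. It is not libnfs code. The class `EncodeBuffer`, the helper `can_encode_write`, and in particular the 256-byte header figure are assumptions: the header size is simply what a 4096-byte buffer and the observed 3840-byte limit would imply, not a value taken from the library.

```python
ZDR_ENCODEBUF_MINSIZE = 4096  # buffer size reported in this thread
ASSUMED_HEADER_BYTES = 256    # hypothetical: 4096 - 3840, inferred from the observed limit


class EncodeBuffer:
    """Toy model of an encode buffer with a fixed size and a write position."""

    def __init__(self, size=ZDR_ENCODEBUF_MINSIZE):
        self.size = size
        self.pos = 0

    def put_bytes(self, nbytes):
        """Mirror the quoted check: refuse data that would run past the buffer."""
        if self.pos + nbytes > self.size:
            return False
        self.pos += nbytes
        return True


def can_encode_write(payload_len, bufsize=ZDR_ENCODEBUF_MINSIZE):
    """Encode the assumed headers first, then the payload, into one buffer."""
    buf = EncodeBuffer(bufsize)
    return buf.put_bytes(ASSUMED_HEADER_BYTES) and buf.put_bytes(payload_len)


print(can_encode_write(3840))
print(can_encode_write(3841))
print(can_encode_write(3841, bufsize=1024 * 1024))
```

Under these assumptions a 3840-byte payload just fits, a 3841-byte payload is rejected before anything reaches the network, and enlarging the buffer to 1 MB lets the larger payload through. That matches the pattern reported in the thread.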
