can not copy files with sticky bit set #189

Closed
dougbenjamin opened this issue Jan 9, 2015 · 4 comments

@dougbenjamin

Hi,

Not sure if this is a bug or a design feature.
I am using this configuration file on a standalone file server:

all.export /grid/atlas readonly

# The adminpath and pidpath variables indicate where the pid and various
# IPC files should be placed
all.adminpath /var/spool/xrootd
all.pidpath /var/run/xrootd


If the file has these permissions on the file server:

[root@atlasfs02 ~]# ls -l /grid/atlas/test/xrootd/
total 5675160
-rwxrwxr-x. 1 usatlas2 hpcusers 5811355678 Jan 9 16:53 DAOD_TOPQ1.04503416._000074.pool.root.1

then I can copy it. But if the permissions have the sticky bit set:

[root@atlasfs02 ~]# ls -l /grid/atlas/test/xrootd/
total 5675160
-rwxrwxr-t. 1 usatlas2 hpcusers 5811355678 Jan 9 16:53 DAOD_TOPQ1.04503416._000074.pool.root.1

then xrdcp does not work:

[dbenjamin@atlas28 ~]$ xrdcp -f --debug 2 root://atlasfs02.hep.anl.gov//grid/atlas/test/xrootd/DAOD_TOPQ1.04503416._000074.pool.root.1 /dev/null
[2015-01-09 17:33:17.959197 -0600][Debug ][Utility ] CopyProcess: 2 jobs to prepare
[2015-01-09 17:33:17.959295 -0600][Debug ][Utility ] Creating a classic copy job, from root://atlasfs02.hep.anl.gov:1094//grid/atlas/test/xrootd/DAOD_TOPQ1.04503416._000074.pool.root.1 to file:///dev/null
[2015-01-09 17:33:17.959321 -0600][Debug ][Utility ] Monitor library name not set. No monitoring
[2015-01-09 17:33:17.959359 -0600][Debug ][Utility ] Opening root://atlasfs02.hep.anl.gov:1094//grid/atlas/test/xrootd/DAOD_TOPQ1.04503416._000074.pool.root.1 for reading
[2015-01-09 17:33:17.959392 -0600][Debug ][File ] [0x1e48620@root://atlasfs02.hep.anl.gov:1094//grid/atlas/test/xrootd/DAOD_TOPQ1.04503416._000074.pool.root.1] Sending an open command
[2015-01-09 17:33:17.959427 -0600][Debug ][Poller ] Available pollers: built-in
[2015-01-09 17:33:17.959434 -0600][Debug ][Poller ] Attempting to create a poller according to preference: built-in,libevent
[2015-01-09 17:33:17.959441 -0600][Debug ][Poller ] Creating poller: built-in
[2015-01-09 17:33:17.959449 -0600][Debug ][Poller ] Creating and starting the built-in poller...
[2015-01-09 17:33:17.959666 -0600][Debug ][TaskMgr ] Starting the task manager...
[2015-01-09 17:33:17.959731 -0600][Debug ][TaskMgr ] Task manager started
[2015-01-09 17:33:17.959750 -0600][Debug ][JobMgr ] Starting the job manager...
[2015-01-09 17:33:17.959839 -0600][Debug ][JobMgr ] Job manager started, 3 workers
[2015-01-09 17:33:17.959860 -0600][Debug ][TaskMgr ] Registering task: "FileTimer task" to be run at: [2015-01-09 17:33:17 -0600]
[2015-01-09 17:33:17.959916 -0600][Debug ][PostMaster ] Creating new channel to: atlasfs02.hep.anl.gov:1094 1 stream(s)
[2015-01-09 17:33:17.959963 -0600][Debug ][PostMaster ] [atlasfs02.hep.anl.gov:1094 #0] Stream parameters: Network Stack: IPAuto, Connection Window: 120, ConnectionRetry: 5, Stream Error Widnow: 1800
[2015-01-09 17:33:17.960157 -0600][Debug ][TaskMgr ] Registering task: "TickGeneratorTask for: atlasfs02.hep.anl.gov:1094" to be run at: [2015-01-09 17:33:32 -0600]
[2015-01-09 17:33:17.961381 -0600][Debug ][PostMaster ] [atlasfs02.hep.anl.gov:1094] Found 1 address(es): [::ffff:146.139.33.67]:1094
[2015-01-09 17:33:17.961508 -0600][Debug ][AsyncSock ] [atlasfs02.hep.anl.gov:1094 #0.0] Attempting connection to [::ffff:146.139.33.67]:1094
[2015-01-09 17:33:17.961543 -0600][Debug ][Poller ] Adding socket 0x1e4db40 to the poller
[2015-01-09 17:33:17.962311 -0600][Debug ][AsyncSock ] [atlasfs02.hep.anl.gov:1094 #0.0] Async connection call returned
[2015-01-09 17:33:17.962359 -0600][Debug ][XRootDTransport ] [atlasfs02.hep.anl.gov:1094 #0.0] Sending out the initial hand shake + kXR_protocol
[2015-01-09 17:33:17.962949 -0600][Debug ][XRootDTransport ] [atlasfs02.hep.anl.gov:1094 #0.0] Got the server hand shake response (type: server [], protocol version 300)
[2015-01-09 17:33:17.962986 -0600][Debug ][XRootDTransport ] [atlasfs02.hep.anl.gov:1094 #0.0] kXR_protocol successful (type: server [], protocol version 300)
[2015-01-09 17:33:17.963366 -0600][Debug ][XRootDTransport ] [atlasfs02.hep.anl.gov:1094 #0.0] Sending out kXR_login request, username: dbenjami, cgi: ?xrd.cc=us&xrd.tz=-6&xrd.appname=xrdcp&xrd.info=&xrd.hostname=localhost, dual-stack: false, private IPv4: false, private IPv6: true
[2015-01-09 17:33:17.964024 -0600][Debug ][XRootDTransport ] [atlasfs02.hep.anl.gov:1094 #0.0] Logged in, session: 04000000352400001400000004000000
[2015-01-09 17:33:17.964055 -0600][Debug ][PostMaster ] [atlasfs02.hep.anl.gov:1094 #0] Stream 0 connected.
[2015-01-09 17:33:17.965027 -0600][Debug ][TaskMgr ] Registering task: "WaitTask for: 0x0x1e487d0" to be run at: [2015-01-09 17:34:17 -0600]
[2015-01-09 17:34:17.970096 -0600][Debug ][TaskMgr ] Done with task: "WaitTask for: 0x0x1e487d0"
[2015-01-09 17:34:17.971219 -0600][Debug ][TaskMgr ] Registering task: "WaitTask for: 0x0x1e487d0" to be run at: [2015-01-09 17:35:17 -0600]

The WaitTask then just repeats itself.

Here is a snippet of the xrootd server log:

150109 17:31:35 9289 XrootdXeq: dbenjami.8418:7@atlas28 pub IPv4 login
150109 17:31:35 9289 XrootdXeq: dbenjami.8418:7@atlas28 disc 0:00:00
150109 17:31:35 9276 XrootdXeq: dbenjami.8418:21@atlas28 disc 0:00:00
150109 17:32:41 9275 ?:20@atlas28 XrootdProtocol: 0000 req=3007 dlen=71
150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdResponse: 0000 sending 16 data bytes
150109 17:32:41 9275 XrootdXeq: dbenjami.8434:20@atlas28 pub IPv4 login
150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 req=3010 dlen=63
150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 open rat /grid/atlas/test/xrootd/DAOD_TOPQ1.04503416._000074.pool.root.1
150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 stalling client for 60 sec
150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdResponse: 0100 sending 104 data bytes; status=4005
150109 17:32:44 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 request timeout; read 0 of 24 bytes
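
A note on the numeric codes in the server log: in the XRootD protocol, request code 3007 is kXR_login, 3010 is kXR_open, and response status 4005 is kXR_wait, meaning the server asks the client to retry after the stated delay. The following is a minimal Python sketch that annotates such lines. It assumes the log format shown above; the helper name `annotate` and the regular expressions are illustrative and not part of xrootd.

```python
import re

# Small subset of XRootD protocol codes that appear in the log above.
REQUEST_CODES = {3007: "kXR_login", 3010: "kXR_open"}
RESPONSE_CODES = {4005: "kXR_wait"}

# Patterns assume the server log format shown in this issue.
STALL_RE = re.compile(r"stalling client for (\d+) sec")
REQ_RE = re.compile(r"\breq=(\d+)")
STATUS_RE = re.compile(r"\bstatus=(\d+)")


def annotate(line):
    """Return a short note for one server log line, or None if nothing matches."""
    m = STALL_RE.search(line)
    if m:
        return f"server asked the client to wait {m.group(1)} s"
    m = REQ_RE.search(line)
    if m:
        return f"request {REQUEST_CODES.get(int(m.group(1)), 'unknown')}"
    m = STATUS_RE.search(line)
    if m:
        return f"response {RESPONSE_CODES.get(int(m.group(1)), 'unknown')}"
    return None


# Lines copied from the log snippet above.
sample = [
    "150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 req=3010 dlen=63",
    "150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdProtocol: 0100 stalling client for 60 sec",
    "150109 17:32:41 9275 dbenjami.8434:20@atlas28 XrootdResponse: 0100 sending 104 data bytes; status=4005",
]

for line in sample:
    print(annotate(line))
```

Read this way, the log shows the open request being answered with a 60-second wait, which matches the WaitTask that the client re-registers every minute.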

@abh3
Member

abh3 commented Jan 10, 2015

Hi Doug,

Correct, xrootd uses the sticky bit to indicate that a file is incomplete (i.e., pending staging or some other action required to complete the file). So, when you open such a file, xrootd stalls the client until the sticky bit is turned off (as you see in the log). In your case, that will never happen because the bit was set manually. Why did you do that anyway?
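
To see which files under an exported path would trigger this stall, one can look for regular files with the sticky bit set. Below is a minimal Python sketch using only the standard library. The function names are hypothetical, and clearing the bit is only appropriate when it was set by hand rather than by xrootd while a file is genuinely incomplete.

```python
import os
import stat


def has_sticky_bit(path):
    """True if path is a regular file with the sticky bit (S_ISVTX) set."""
    mode = os.lstat(path).st_mode
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISVTX)


def find_sticky_files(root):
    """Walk root and return a sorted list of regular files with the sticky bit set."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if has_sticky_bit(path):
                hits.append(path)
    return sorted(hits)


def clear_sticky_bit(path):
    """Remove only the sticky bit, leaving the other permission bits unchanged."""
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    os.chmod(path, mode & ~stat.S_ISVTX)
```

For example, `find_sticky_files("/grid/atlas")` would list the affected files on the server from this report, and `clear_sticky_bit(path)` has the same effect as `chmod -t` on a single file.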

Andy

@dougbenjamin
Author

Hi,

It was set because the file is in a shared file system that is not managed by xrootd. I wanted the files to have an owner other than xrootd and to have proper ACLs.

Doug


@abh3
Member

abh3 commented Jan 12, 2015

OK, let's talk about this offline so I can get the complete picture.

@abh3 abh3 self-assigned this Jul 7, 2015
@abh3 abh3 added the wontfix label Jul 11, 2015
@abh3
Member

abh3 commented Jul 11, 2015

I am closing this. I suppose this is a strange file system. In most other file systems (even ones with ACLs) the sticky bit on a file has no meaning and is ignored (other than being displayed by ls). So, it's safe to use it to indicate that the file is not complete. If there is still an issue, we can revisit this.

@abh3 abh3 closed this as completed Jul 11, 2015