cms_Open: Unable to connect socket to #154

Closed
marcelkuri opened this issue Oct 28, 2014 · 12 comments

Comments

@marcelkuri

After I updated to xrootd 4.0.4, my FUSE mount stopped working properly.

This is the log file of a data server (/var/log/xrootd/xrootd.log):
cms_Open: Unable to connect socket to /tmp/.olb/olbd.admin; connection refused

I followed this tutorial to create an automount:
http://xrootd.org/doc/man/xrootdfs.1.html

Before the update, my mountpoint /atlas showed up as one large space. Now it shows only the space of the redirector machine.
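The "connection refused" message in the log above suggests that the socket file exists but nothing is listening on it. The following is a minimal sketch, not part of xrootd, that tries the same kind of connection from Python. The function name `probe_admin_socket` is hypothetical, and the use of a stream-type Unix socket is an assumption about how the admin socket is created.

```python
import errno
import os
import socket


def probe_admin_socket(path, timeout=2.0):
    """Try to connect to a Unix-domain socket and report what happened.

    Returns "missing", "refused", "connected", or "error: <reason>".
    """
    if not os.path.exists(path):
        return "missing"
    # Assumption: the admin socket is a stream socket.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "connected"
    except ConnectionRefusedError:
        # The file exists but no process is listening on it,
        # for example a stale file left by a stopped daemon.
        return "refused"
    except OSError as exc:
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the log line quoted in this thread.
    print(probe_admin_socket("/tmp/.olb/olbd.admin"))
```

A result of "refused" or "missing" on a data server would be consistent with cmsd not running there, which is worth ruling out before looking at the FUSE mount.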

@marcelkuri
Author

Ah, cmsd connected. But /atlas still doesn't show all the space.
This is the new log (lipnode01 is the redirector; lipnode02, 03, ... are data servers):

141028 14:27:19 14667 cms_Finder: Connected to cmsd via /tmp/.olb/olbd.admin
Config warning: this hostname, lipnode02, is registered without a domain qualification.
141028 14:31:37 14941 Starting on Linux 2.6.32-431.29.2.el6.x86_64
Copr. 2004-2012 Stanford University, xrd version v4.0.4
++++++ xrootd anon@lipnode02 initialization started.
Config using configuration file /etc/xrootd/xrootd-clustered.cfg
Config maximum number of connections restricted to 65536
Copr. 2012 Stanford University, xrootd protocol 3.0.0 version v4.0.4
++++++ xrootd protocol initialization started.
=====> all.export /data/xrootdfs
Config exporting /data/xrootdfs
Config warning: 'xrootd.seclib' not specified; strong authentication disabled!
++++++ File system initialization started.
=====> all.role server
++++++ Storage system initialization started.
=====> all.export /data/xrootdfs
=====> oss.localroot /data/files
Config effective /etc/xrootd/xrootd-clustered.cfg oss configuration:
oss.alloc 0 0 0
oss.cachescan 600
oss.fdlimit 32768 65536
oss.maxsize 0
oss.localroot /data/files
oss.trace 0
oss.xfr 1 deny 10800 keep 1200
oss.memfile off max 8351152128
oss.defaults r/w nocheck nodread nomig norcreate nopurge nostage xattr
oss.path /data/xrootdfs r/w nocheck nodread nomig norcreate nopurge nostage xattr
------ Storage system initialization completed.
++++++ Configuring server role. . .
=====> all.manager lipnode01:1213
Config effective /etc/xrootd/xrootd-clustered.cfg ofs configuration:
ofs.role server
ofs.maxdelay 60
ofs.persist manual hold 600 logdir /tmp/.ofs/posc.log
ofs.trace 0
------ File system server initialization completed.
141028 14:31:37 14953 cms_Finder: Connected to cmsd via /tmp/.olb/olbd.admin
Config warning: 'xrootd.prepare logdir' not specified; prepare tracking disabled.
------ xrootd protocol initialization completed.
------ xrootd anon@lipnode02:1094 initialization completed.
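In the log above, the lines starting with `=====> ` appear to echo the configuration directives the server processed. A minimal sketch, assuming that reading of the log format, pulls those lines out so they can be compared against the intended configuration. The function name `loaded_directives` is hypothetical and the sample is trimmed from the log in this thread.

```python
def loaded_directives(log_text):
    """Return directives echoed as '=====> ...' in a startup log.

    Order is preserved and repeated directives are listed once.
    """
    seen = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("=====> "):
            directive = line[len("=====> "):].strip()
            if directive not in seen:
                seen.append(directive)
    return seen


# Trimmed copy of the log lines posted above.
sample = """\
=====> all.export /data/xrootdfs
Config exporting /data/xrootdfs
=====> all.role server
=====> all.export /data/xrootdfs
=====> oss.localroot /data/files
=====> all.manager lipnode01:1213
"""

for directive in loaded_directives(sample):
    print(directive)
```

If a directive such as `all.export` or `all.manager` is absent from this list on some node, that node is probably reading a different configuration file than expected.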

@wyang007
Member

Can you try “xrd lipnode01 locate all /data/xrootdfs”? This should list all of your data servers (lipnode02, lipnode03, etc.).

Also when you run ls -l /atlas via xrootdfs, does it just hang or does it return with partial results?

Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)

@marcelkuri
Author

Hi, Wei.
When I try “xrd lipnode01 locate all /data/xrootdfs” the output is:
Command not recognized

So I tried “xrdfs lipnode01:1094 locate -d /data/xrootdfs”. The output was:
[::192.168.0.1]:1094 Server ReadWrite

192.168.0.1 is the IP of lipnode01.
When I run "ls -l /atlas" it doesn't hang, but it doesn't show the content correctly.
Before the update it showed only the users' folders. Now it shows only the files.

And "df -h" shows /atlas as only 1.6 terabytes. But I have 3 more data servers, with 1.6 terabytes each.

@wyang007
Member

Thanks Marcel,

The two commands are equivalent. The output basically says that your redirector doesn’t see any data servers. Do you know why? Was there any change to the configuration files on the redirector or the data servers? Is cmsd running on the data servers?

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)

@marcelkuri
Author

The configuration files were changed by the xrootd installation, as I wrote in this issue:
#152

Basically, my xrootd-clustered.cfg was:
all.export /data/xrootdfs
set xrdr=lipnode01
all.manager $(xrdr):1213
cms.allow host *
if $(xrdr)
all.role manager
xrd.port 1094
else
all.role server
oss.localroot /data/files

Now, it is:
all.role manager
all.manager lipnode01 3121
frm.xfr.copycmd /bin/cp /dev/null $PFN
all.adminpath /var/spool/xrootd
all.pidpath /var/run/xrootd

As this example:
https://github.com/xrootd/xrootd/blob/stable-4.0.x/packaging/common/xrootd-clustered.cfg

And, my /etc/sysconfig/xrootd was
CMSD_DEFAULT_OPTIONS="-l /var/log/xrootd/cmsd.log -c /etc/xrootd/xrootd-clustered.cfg -k 7"

And was changed to:
CMSD_DEFAULT_OPTIONS="-l /var/log/xrootd/cmsd.log -c /etc/xrootd/xrootd-clustered.cfg -k fifo"

@wyang007
Member

Hi Marcel,

If this is your xrootd configuration file (I mean the “Now, it is” version), then nothing will work. It doesn’t even have all.export, so it will export /tmp. Can you put your old configuration file back (with an “fi” at the end)?
Can you also put your old /etc/sysconfig/xrootd back? Then run on every node:

service xrootd setup
service xrootd start
service cmsd start

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)
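The two problems pointed out in this reply, a missing `all.export` and an `if` block without a closing `fi`, can be caught with a quick check before restarting the services. The following is a minimal sketch under stated assumptions: it is not an xrootd configuration parser, the function name `check_config` is hypothetical, and it only looks at the first word of each line. The two sample texts are the configurations posted earlier in this thread.

```python
def check_config(text):
    """Return a list of human-readable problems found in a config text."""
    problems = []
    depth = 0
    has_export = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        word = line.split()[0]
        if word == "if":
            depth += 1
        elif word == "fi":
            if depth == 0:
                problems.append("'fi' without a matching 'if'")
            else:
                depth -= 1
        elif word == "all.export":
            has_export = True
    if depth > 0:
        problems.append("%d 'if' block(s) not closed with 'fi'" % depth)
    if not has_export:
        problems.append("no all.export directive")
    return problems


old_cfg = """\
all.export /data/xrootdfs
set xrdr=lipnode01
all.manager $(xrdr):1213
cms.allow host *
if $(xrdr)
all.role manager
xrd.port 1094
else
all.role server
oss.localroot /data/files
"""

new_cfg = """\
all.role manager
all.manager lipnode01 3121
frm.xfr.copycmd /bin/cp /dev/null $PFN
all.adminpath /var/spool/xrootd
all.pidpath /var/run/xrootd
"""

print(check_config(old_cfg))
print(check_config(new_cfg))
```

Appending a line containing `fi` to the old configuration makes the first check come back empty.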

@marcelkuri
Author

I returned to my old config files.
Now when I run "ls -l /atlas" the output is:
cannot access /atlas/SM_DATA: No such file or directory

The same happens with the other folders that were accessible before xrootd 4.0.4.

@wyang007
Member

Hi Marcel,

There is no need to jump to xrootdfs if your xrootd configuration is not working. Can you rerun "xrdfs lipnode01:1094 locate -d /data/xrootdfs" and see what you get?

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)

@marcelkuri
Author

xrdfs lipnode01:1094 locate -d /data/xrootdfs
[::192.168.0.9]:1094 Server ReadWrite
[::192.168.0.5]:1094 Server ReadWrite
[::192.168.0.4]:1094 Server ReadWrite
[::192.168.0.2]:1094 Server ReadWrite
[::192.168.0.3]:1094 Server ReadWrite

Now, I can see some data servers, but not all.
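To see which data servers are missing, the `locate` output can be compared against the list of servers that should be registered. The following is a minimal sketch with hypothetical helper names (`located_hosts`, `missing_servers`). The expected list is an assumption for illustration: the thread does not say which addresses were missing, so `192.168.0.6` is a placeholder.

```python
import re

IPV4 = re.compile(r"\d+(?:\.\d+){3}")
LINE = re.compile(r"\s*\[([^\]]+)\]:(\d+)\s+Server\b")


def located_hosts(locate_output):
    """Extract server addresses from 'xrdfs ... locate -d' style output."""
    hosts = set()
    for line in locate_output.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        addr = match.group(1)
        # The output above writes IPv4 addresses as [::a.b.c.d]; drop the "::".
        if addr.startswith("::") and IPV4.fullmatch(addr[2:]):
            addr = addr[2:]
        hosts.add(addr)
    return hosts


def missing_servers(expected, locate_output):
    """Return the expected addresses that do not appear in the output."""
    return sorted(set(expected) - located_hosts(locate_output))


# Output copied from the comment above.
output = """\
[::192.168.0.9]:1094 Server ReadWrite
[::192.168.0.5]:1094 Server ReadWrite
[::192.168.0.4]:1094 Server ReadWrite
[::192.168.0.2]:1094 Server ReadWrite
[::192.168.0.3]:1094 Server ReadWrite
"""

# Hypothetical inventory; 192.168.0.6 is a placeholder, not from the thread.
expected = ["192.168.0.2", "192.168.0.3", "192.168.0.4",
            "192.168.0.5", "192.168.0.6", "192.168.0.9"]

print(missing_servers(expected, output))
```

Any address reported as missing is a data server to check first: whether cmsd is running on it and whether its log shows a connection to the redirector.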

@wyang007
Member

Hi Marcel,

Have you figured out why some data servers showed up in the output of the "xrdfs lipnode01:1094 locate -d /data/xrootdfs" command and others did not?

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)

@marcelkuri
Author

Hi, Wei.

I didn't find out why some data servers didn't show up.
But now everything is working fine.

regards,
Marcel Kuriyama.

@ljanyst
Contributor

ljanyst commented Nov 25, 2014

Great! Closing then.

ljanyst closed this as completed on Nov 25, 2014