Huawei OMS
There are two Operation Maintenance Servers (OMS) in the Huawei Universal Distributed Storage. Each server is a 2U Huawei Tecal RH2288 V2 server, equipped with 2x quad-core Intel Xeon E5-2609 CPUs (2.40GHz), 64 GB memory, and 2x 1TB SATA HDDs configured into a RAID-1 group (managed by an LSI SAS2208 RAID controller). There are 4x Gigabit Network Interfaces and 2x 10-Gigabit Network Interfaces on each server.
The 4x Gigabit Network Interfaces, eth0 – eth3, are unused on each OMS.
The 2x 10GbE NICs, eth4 & eth5, are bonded together into a channel bonding interface bond5. The IP addresses of bond5 are 172.16.0.12 & 172.16.0.13, on OMS01 & OMS02 respectively:
    OMS02:~ # cat /etc/sysconfig/network/ifcfg-bond5
    BONDING_MASTER='yes'
    BONDING_MODULE_OPTS='mode=balance-xor miimon=100 xmit_hash_policy=layer3+4'
    BONDING_SLAVE0='eth4'
    BONDING_SLAVE1='eth5'
    BOOTPROTO='static'
    IPADDR='172.16.0.13'
    NETMASK='255.255.255.0'
    STARTMODE='auto'
    USERCONTROL='no'
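To confirm that both slaves of the bond are actually up, one could parse the kernel's bonding status file. The sketch below assumes the usual layout of /proc/net/bonding/bond5 as written by the Linux bonding driver; the function name bond_slaves is made up for illustration and is not part of the Huawei setup.

```shell
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# Assumption: the file follows the standard Linux bonding driver layout, where
# each "Slave Interface:" line is followed by that slave's "MII Status:" line.
bond_slaves() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        # The first MII Status line belongs to the bond itself; skip it.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "${1:-/proc/net/bonding/bond5}"
}
```

Running `bond_slaves` with no argument on an OMS would read /proc/net/bonding/bond5 and should list eth4 and eth5 with their link state.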
Two VLAN interfaces, vlan3076 & vlan3078, are then created on bond5:
    # cat /etc/sysconfig/network/ifcfg-vlan3076
    BOOTPROTO='static'
    ETHERDEVICE='bond0'
    STARTMODE='auto'
    USERCONTROL='no'

    # cat /etc/sysconfig/network/ifcfg-vlan3078
    BOOTPROTO='static'
    ETHERDEVICE='bond0'
    STARTMODE='auto'
    USERCONTROL='no'
A bridge interface br1 is created on vlan3078. The IP addresses of br1 are 128.114.126.242 & 128.114.126.243, on OMS01 & OMS02 respectively:
    OMS02:~ # cat /etc/sysconfig/network/ifcfg-br1
    BOOTPROTO='static'
    BRIDGE='yes'
    BRIDGE_FORWARDDELAY='0'
    BRIDGE_PORTS='vlan3078'
    BRIDGE_STP='off'
    IPADDR='128.114.126.243'
    NETMASK='255.255.255.224'
    STARTMODE='auto'
    USERCONTROL='no'
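The netmask 255.255.255.224 corresponds to a /27, so br1 lives in the 32-address block 128.114.126.224 through 128.114.126.255. The arithmetic can be checked with a small shell sketch; the helper names mask_to_prefix and network_of are illustrative only.

```shell
# Count the set bits of a dotted-quad netmask to get the prefix length.
mask_to_prefix() {
    bits=0
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the mask into four octets
    IFS=$old_ifs
    for octet in "$@"; do
        while [ "$octet" -gt 0 ]; do
            bits=$((bits + (octet & 1)))
            octet=$((octet >> 1))
        done
    done
    echo "$bits"
}

# AND each octet of an IPv4 address ($1) with the netmask ($2).
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2         # $1..$4 = address octets, $5..$8 = mask octets
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

mask_to_prefix 255.255.255.224               # prints 27
network_of 128.114.126.243 255.255.255.224   # prints 128.114.126.224
```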
Another bridge interface br2 is created on vlan3076. No IP address is assigned to br2:
    # cat /etc/sysconfig/network/ifcfg-br2
    BOOTPROTO='static'
    BRIDGE='yes'
    BRIDGE_FORWARDDELAY='0'
    BRIDGE_PORTS='vlan3076'
    BRIDGE_STP='off'
    STARTMODE='auto'
    USERCONTROL='no'
On the active OMS, two sub interfaces (aliases), bond5:25 & bond5:30, are created on bond5, with the IP addresses of 172.16.0.5 & 172.16.0.6, respectively.
Additionally, on the active OMS, a sub interface (alias), br1:25, is created on br1 with the IP address 128.114.126.250, and yet another sub interface, br2:25, is created on br2 with the IP address 172.16.6.3.
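How the Huawei software creates these aliases is not documented here. As a rough sketch, equivalent aliases could be added by hand with iproute2's `ip addr add ... label`. The helper alias_cmd below is hypothetical and only prints the commands rather than running them. The prefix lengths are an assumption: they are taken from the NETMASK of the parent interfaces (bond5 and br1) shown earlier. br2:25 is left out because no netmask for br2 is given.

```shell
# Print (do not execute) the iproute2 command that would add a labelled alias.
# $1 = parent device, $2 = alias suffix, $3 = IPv4 address, $4 = prefix length
alias_cmd() {
    echo "ip addr add $3/$4 dev $1 label $1:$2"
}

# Assumption: each alias shares the netmask of its parent interface.
alias_cmd bond5 25 172.16.0.5 24
alias_cmd bond5 30 172.16.0.6 24
alias_cmd br1 25 128.114.126.250 27
```

The printed lines can be reviewed first and then run as root on a test machine.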
NOTE: the setup (by Huawei) is unnecessarily convoluted!
- bond5:30 is pointless, being a sub interface on the same Channel Bonding Interface as bond5:25
- The 2 bridge interfaces, br1 & br2, are superfluous. One can simply assign an IP address to a VLAN interface. See:
# man ifcfg-vlan
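As an illustration of the simpler alternative, the address currently carried by br1 on OMS02 could sit directly on the VLAN interface. This is an untested sketch, not the configuration in use. It assumes the VLAN is stacked on bond5, as the description above states, and that no other software on the OMS depends on br1 existing.

```shell
# Hypothetical /etc/sysconfig/network/ifcfg-vlan3078 with the IP address
# assigned directly to the VLAN interface (no bridge on top).
BOOTPROTO='static'
ETHERDEVICE='bond5'          # assumption: parent device as described in the text
IPADDR='128.114.126.243'     # address currently configured on br1 (OMS02)
NETMASK='255.255.255.224'
STARTMODE='auto'
USERCONTROL='no'
```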
The two OMS are configured to form a high availability (HA) cluster using DRBD. DRBD stands for Distributed Replicated Block Device, and can be understood as network-based RAID-1.
Curiously, drbd is started with the following lines in /etc/init.d/rc on OMS01 (the default primary):
    modprobe drbd --allow-unsupported
    sleep 5
    service drbd start &
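Once DRBD is running, which OMS is currently primary can be read from /proc/drbd. The sketch below assumes the DRBD 8.3/8.4 style status line (for example ` 0: cs:Connected ro:Primary/Secondary ...`). The DRBD version on the OMS is not stated here, and the function name drbd_roles is made up for illustration.

```shell
# Print "<minor> <local role> <peer role>" for each DRBD device.
# Assumption: /proc/drbd uses the 8.3/8.4 layout with an "ro:" field.
drbd_roles() {
    sed -n 's/^ *\([0-9][0-9]*\): .* ro:\([A-Za-z]*\)\/\([A-Za-z]*\) .*/\1 \2 \3/p' \
        "${1:-/proc/drbd}"
}
```

On the default primary (OMS01) one would expect the local role to read Primary and the peer role Secondary while the cluster is healthy.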
WatchDog is a Java program, and is probably the core service of the Huawei GalaX cloud solution (in /opt/galax/gcs/watchdog):
    java -Ddecryptjar.path=/opt/galax/gcs/watchdog -Dclass.name=com.huawei.galax.gcs.watchdog.Main \
        -classpath /usr/java/jdk1.6.0_24/jre/bin:/opt/galax/gcs/watchdog/lib/Galax-config.jar:/opt/galax/gcs/watchdog/lib/commons-lang-2.6.jar:/opt/galax/gcs/watchdog/lib/dom4j-1.6.1.jar:/opt/galax/gcs/watchdog/lib/galax-decrypt.jar:/opt/galax/gcs/watchdog/lib/init.jar:/opt/galax/gcs/watchdog/lib/jaxen-1.1.1.jar:/opt/galax/gcs/watchdog/lib/jfig-1.5.2.jar:/opt/galax/gcs/watchdog/lib/log4j-1.2.15.jar:/opt/galax/gcs/watchdog/lib/postgresql-8.3-605.jdbc4.jar:/opt/galax/gcs/watchdog/lib/slf4j-api-1.6.1.jar:/opt/galax/gcs/watchdog/lib/slf4j-log4j12-1.6.1-selffix.jar \
        com.huawei.galax.gcs.encryption.decrypt.MainSystem

It is responsible for starting various Huawei services, monitoring their status, and restarting them if necessary; it maintains the high availability of the 2 OMS as well. It works sort of like a mini init process; but one may question the necessity and efficiency of such a scheme.
Given its importance, Huawei makes sure WatchDog is started, by the following line towards the end of /etc/init.d/rc:
    sh /opt/galax/gcs/watchdog/watchdog.sh -check

and by both /etc/rc.d/rc3.d/S13watchdog and cron.
On the primary OMS, the following services are running:
Ganglia Meta Daemon (gmetad)
PostgreSQL server
OpenLDAP slapd
    java -Xmx3G -XX:MaxPermSize=1G -XX:MaxDirectMemorySize=1536M \
        -XX:OnOutOfMemoryError=kill -9 %p -Dorg.eclipse.jetty.io.nio.BUSY_PAUSE=10 \
        -javaagent:/opt/obs/geronimo/bin/jpa.jar \
        -Dorg.apache.geronimo.home.dir=/opt/obs/geronimo \
        -Djava.endorsed.dirs=/opt/obs/geronimo/lib/endorsed:/usr/java/jre1.6.0_24/lib/endorsed \
        -Djava.ext.dirs=/opt/obs/geronimo/lib/ext:/usr/java/jre1.6.0_24/lib/ext \
        -Djava.io.tmpdir=var/temp -Dfile.encoding=UTF8 -Duser.timezone=UTC \
        -jar /opt/obs/geronimo/bin/server.jar --long
    java -Djava.util.logging.config.file=/opt/galax/tomcat/conf/logging.properties \
        -server -XX:PermSize=256M -XX:MaxPermSize=1024m -Xms2048m -Xmx2048m \
        -Dpoe.config=/opt/obs/obsconf/POE.properties \
        -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
        -agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n \
        -Djava.endorsed.dirs=/opt/galax/tomcat/endorsed \
        -classpath /opt/galax/tomcat/bin/bootstrap.jar:/opt/galax/tomcat/bin/tomcat-juli.jar \
        -Dcatalina.base=/opt/galax/tomcat -Dcatalina.home=/opt/galax/tomcat \
        -Djava.io.tmpdir=/opt/galax/tomcat/temp \
        org.apache.catalina.startup.Bootstrap start
On both OMS, the following services are also running:
Ganglia Monitoring Daemon (gmond)
OMS agent (in /usr/galax/oms), which appears to be a Ganglia plugin.
Name Server Daemon (NSD), obsnsmon
NTP
ZooKeeper quorum (in /opt/sod/zookeeper):
    java -Dzookeeper.log.dir=/opt/sod/zookeeper \
        -Dzookeeper.root.logger=INFO,ROLLINGFILE,SYSLOG \
        -Djute.maxbuffer=10485760 -Dzookeeper.snapCount=300 \
        -cp /opt/sod/zookeeper/build/classes:/opt/sod/zookeeper/build/lib/*.jar:/opt/sod/zookeeper/zookeeper-3.3.4.jar:/opt/sod/zookeeper/lib/zookeeper-check-1.0.jar:/opt/sod/zookeeper/lib/log4j-1.2.15.jar:/opt/sod/zookeeper/lib/jline-0.9.94.jar:/opt/sod/zookeeper/lib/commons-lang-2.6.jar:/opt/sod/zookeeper/lib/commons-collections-3.2.jar:/opt/sod/zookeeper/lib/commons-cli-1.1.jar:/opt/sod/zookeeper/lib/apache-rat-tasks-0.6.jar:/opt/sod/zookeeper/lib/apache-rat-core-0.6.jar:/opt/sod/zookeeper/src/java/lib/*.jar \
        org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/sod/zookeeper/conf/zoo.cfg
SoD controller (in /opt/sod/sod-controller):
    java -Djava.util.logging.config.file=config/log.properties -Dlog4j.configuration=config/log4j.xml \
        -server -Xmx200M -Xms200M -Djute.maxbuffer=10485760 -Duser.timezone=UTC \
        -cp :/opt/sod/sod-controller/lib/commons-codec-1.3.jar:/opt/sod/sod-controller/lib/commons-collections-3.2.1.jar:/opt/sod/sod-controller/lib/commons-dbcp-1.2.2.jar:/opt/sod/sod-controller/lib/commons-io-2.1.jar:/opt/sod/sod-controller/lib/commons-lang-2.6.jar:/opt/sod/sod-controller/lib/commons-logging-1.1.1.jar:/opt/sod/sod-controller/lib/commons-pool-1.5.2.jar:/opt/sod/sod-controller/lib/google-collect-1.0.jar:/opt/sod/sod-controller/lib/jackson-core-asl-1.4.0.jar:/opt/sod/sod-controller/lib/jackson-mapper-asl-1.4.0.jar:/opt/sod/sod-controller/lib/javax.ws.rs.jar:/opt/sod/sod-controller/lib/jdom-1.1.jar:/opt/sod/sod-controller/lib/jna.jar:/opt/sod/sod-controller/lib/jopt-simple-3.1.jar:/opt/sod/sod-controller/lib/log4j-1.2.15.jar:/opt/sod/sod-controller/lib/org.restlet.ext.jaxrs.jar:/opt/sod/sod-controller/lib/org.restlet.jar:/opt/sod/sod-controller/lib/pb_agent.jar:/opt/sod/sod-controller/lib/pb_server.jar:/opt/sod/sod-controller/lib/perf4j.jar:/opt/sod/sod-controller/lib/pmf-agent.jar:/opt/sod/sod-controller/lib/protobuf_sod.jar:/opt/sod/sod-controller/lib/scriptExcutor.jar:/opt/sod/sod-controller/lib/sod-agent.jar:/opt/sod/sod-controller/lib/sod-controller-1.00.jar:/opt/sod/sod-controller/lib/sod-lib.jar:/opt/sod/sod-controller/lib/zookeeper-3.3.4.jar:/opt/sod/sod-controller/lib/zookeeper-lock.jar \
        com.huawei.sod.controller.SoDController
IPTables. Curiously, it is started with a line towards the end of /etc/init.d/rc:
sh /opt/uds/installconfig/script/om/setOMMIptablesRule.sh
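For the ZooKeeper quorum listed above, the role of each OMS (leader or follower) can be queried with ZooKeeper's four-letter `stat` command. The sketch below only extracts the `Mode:` line from a `stat` reply read on standard input. The function name zk_mode is illustrative. The client port is an assumption: 2181 is the ZooKeeper default, but the value actually set in /opt/sod/zookeeper/conf/zoo.cfg is not shown here.

```shell
# Read a ZooKeeper "stat" reply on stdin and print the value of its Mode line
# (for example "leader", "follower" or "standalone").
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Possible usage on an OMS (assumes nc is installed and the client port is 2181):
#   echo stat | nc localhost 2181 | zk_mode
```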