Huawei OMS

Shawfeng Dong edited this page Nov 5, 2016 · 2 revisions

There are two Operation Maintenance Servers (OMS) in the Huawei Universal Distributed Storage. Each server is a 2U Huawei Tecal RH2288 V2 server, equipped with 2x quad-core Intel Xeon E5-2609 CPUs (2.40GHz), 64 GB memory, and 2x 1TB SATA HDDs configured into a RAID-1 group (managed by an LSI SAS2208 RAID controller). There are 4x Gigabit Network Interfaces and 2x 10-Gigabit Network Interfaces on each server.

Network Configuration

The 4x Gigabit Network Interfaces, eth0 through eth3, are unused on each OMS.

The 2x 10GbE NICs, eth4 & eth5, are bonded together into a channel bonding interface bond5. The IP addresses of bond5 are 172.16.0.12 & 172.16.0.13, on OMS01 & OMS02 respectively:

OMS02:~ # cat /etc/sysconfig/network/ifcfg-bond5 
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='mode=balance-xor miimon=100 xmit_hash_policy=layer3+4'
BONDING_SLAVE0='eth4'
BONDING_SLAVE1='eth5'
BOOTPROTO='static'
IPADDR='172.16.0.13'
NETMASK='255.255.255.0'
STARTMODE='auto'
USERCONTROL='no'
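
To confirm that both slaves of the bond are actually up, one can read the status file that the Linux bonding driver exposes under /proc/net/bonding. The following is a minimal sketch; the helper name bond_slaves is hypothetical, and it assumes the usual "Slave Interface:" / "MII Status:" layout of that file:

```shell
#!/bin/sh
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# Assumes the layout the Linux bonding driver writes to /proc/net/bonding/<bond>.
bond_slaves() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        # The bond-level "MII Status" line comes before any slave, so it is skipped.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$1"
}

# Query the live bond only if the status file exists on this host.
if [ -r /proc/net/bonding/bond5 ]; then
    bond_slaves /proc/net/bonding/bond5
fi
```

On a healthy OMS one would expect one line each for eth4 and eth5, both reporting "up".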

Two VLAN interfaces, vlan3076 & vlan3078, are then created on bond5:

# cat /etc/sysconfig/network/ifcfg-vlan3076
BOOTPROTO='static'
ETHERDEVICE='bond0'
STARTMODE='auto'
USERCONTROL='no'

# cat /etc/sysconfig/network/ifcfg-vlan3078 
BOOTPROTO='static'
ETHERDEVICE='bond0'
STARTMODE='auto'
USERCONTROL='no'

A bridge interface br1 is created on vlan3078. The IP addresses of br1 are 128.114.126.242 & 128.114.126.243, on OMS01 & OMS02 respectively:

OMS02:~ # cat /etc/sysconfig/network/ifcfg-br1     
BOOTPROTO='static'
BRIDGE='yes'
BRIDGE_FORWARDDELAY='0'
BRIDGE_PORTS='vlan3078'
BRIDGE_STP='off'
IPADDR='128.114.126.243'
NETMASK='255.255.255.224'
STARTMODE='auto'
USERCONTROL='no'
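
The netmask 255.255.255.224 corresponds to a /27, so br1 sits in a block of 32 addresses. As a quick sanity check, the network address can be computed by AND-ing address and netmask octet by octet. This is only an illustrative sketch; network_addr is a hypothetical helper name:

```shell
#!/bin/sh
# Compute the network address for a dotted-quad IPv4 address and netmask
# by AND-ing the two, octet by octet.
network_addr() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2      # split both arguments into eight octets
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

network_addr 128.114.126.243 255.255.255.224   # prints 128.114.126.224
```

The block therefore spans 128.114.126.224 through 128.114.126.255, which contains both br1 addresses, 128.114.126.242 and 128.114.126.243.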

Another bridge interface br2 is created on vlan3076. No IP address is assigned to br2:

# cat /etc/sysconfig/network/ifcfg-br2
BOOTPROTO='static'
BRIDGE='yes'
BRIDGE_FORWARDDELAY='0'
BRIDGE_PORTS='vlan3076'
BRIDGE_STP='off'
STARTMODE='auto'
USERCONTROL='no'

On the active OMS, two sub interfaces (aliases), bond5:25 & bond5:30, are created on bond5, with the IP addresses of 172.16.0.5 & 172.16.0.6, respectively.

Additionally, on the active OMS, a sub interface (alias), br1:25, is created on br1, with the IP address 128.114.126.250; and yet another sub interface, br2:25, is created on br2, with the IP address 172.16.6.3.
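
Because these aliases exist only on the active OMS, checking for one of the floating addresses is a simple way to tell which node is currently active. A minimal sketch, assuming the iproute2 ip command is installed; has_ipv4 is a hypothetical helper name:

```shell
#!/bin/sh
# Succeed if the given IPv4 address appears in `ip -o -4 addr show` output
# supplied on standard input.
has_ipv4() {
    grep -q -F -- "inet $1/"
}

# 128.114.126.250 is the address of the br1:25 alias.
if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show | has_ipv4 128.114.126.250; then
        echo "floating IP present: this node looks like the active OMS"
    else
        echo "floating IP absent: standby OMS, or not an OMS at all"
    fi
fi
```

The trailing slash in the pattern keeps 128.114.126.25 from matching 128.114.126.250.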

NOTE: The setup (by Huawei) is unnecessarily convoluted!

  • bond5:30 is pointless, being a sub interface on the same Channel Bonding Interface as bond5:25
  • The 2 bridge interfaces, br1 & br2, are superfluous. One can simply assign an IP address to a VLAN interface. See:
# man ifcfg-vlan
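
As an illustration of the simpler alternative, the address that OMS02 carries on br1 could sit directly on the VLAN interface. The file below is a hypothetical, untested sketch, not the configuration in use. It reuses only variables that already appear in the listings above, and it assumes ETHERDEVICE should name bond5, as the prose states, rather than the bond0 shown in the VLAN listings:

```shell
# /etc/sysconfig/network/ifcfg-vlan3078 (hypothetical simplified version for OMS02)
BOOTPROTO='static'
ETHERDEVICE='bond5'            # assumption: the VLAN rides on bond5
IPADDR='128.114.126.243'       # address currently assigned to br1 on OMS02
NETMASK='255.255.255.224'
STARTMODE='auto'
USERCONTROL='no'
```

With such a file, ifcfg-br1 would no longer be needed, provided nothing else on the OMS relies on the bridge.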

DRBD

The two OMS are configured to form a high availability (HA) cluster, using DRBD. DRBD stands for Distributed Replicated Block Device, and can be thought of as network-based RAID-1.

Curiously, drbd is started with the following lines in /etc/init.d/rc on OMS01 (the default primary):

modprobe drbd --allow-unsupported
sleep 5
service drbd start &
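
To see which OMS currently holds the primary role, one can inspect /proc/drbd. The sketch below assumes a DRBD 8.x release, which reports device state there in the form "0: cs:... ro:Local/Peer ds:..."; the DRBD version on the OMS was not recorded, and drbd_roles is a hypothetical helper name:

```shell
#!/bin/sh
# Print "<minor> <local role> <peer role>" for each device in a DRBD 8.x
# style status file (the format of /proc/drbd).
drbd_roles() {
    sed -n 's|^ *\([0-9][0-9]*\): .*ro:\([A-Za-z]*\)/\([A-Za-z]*\).*|\1 \2 \3|p' "$1"
}

# Query the live status only if this host exposes /proc/drbd.
if [ -r /proc/drbd ]; then
    drbd_roles /proc/drbd
fi
```

On the default primary (OMS01) one would expect the local role to read Primary and the peer role Secondary.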

WatchDog

WatchDog is a Java program, and is probably the core service of the Huawei GalaX cloud solution (in /opt/galax/gcs/watchdog):

java -Ddecryptjar.path=/opt/galax/gcs/watchdog -Dclass.name=com.huawei.galax.gcs.watchdog.Main \
  -classpath /usr/java/jdk1.6.0_24/jre/bin:/opt/galax/gcs/watchdog/lib/Galax-config.jar:/opt/galax/gcs/watchdog/lib/commons-lang-2.6.jar:/opt/galax/gcs/watchdog/lib/dom4j-1.6.1.jar:/opt/galax/gcs/watchdog/lib/galax-decrypt.jar:/opt/galax/gcs/watchdog/lib/init.jar:/opt/galax/gcs/watchdog/lib/jaxen-1.1.1.jar:/opt/galax/gcs/watchdog/lib/jfig-1.5.2.jar:/opt/galax/gcs/watchdog/lib/log4j-1.2.15.jar:/opt/galax/gcs/watchdog/lib/postgresql-8.3-605.jdbc4.jar:/opt/galax/gcs/watchdog/lib/slf4j-api-1.6.1.jar:/opt/galax/gcs/watchdog/lib/slf4j-log4j12-1.6.1-selffix.jar \
  com.huawei.galax.gcs.encryption.decrypt.MainSystem

It is responsible for starting various Huawei services, monitoring their status, and restarting them if necessary; it also maintains the high availability of the two OMS. It works somewhat like a miniature init process, though one may question the necessity and efficiency of such a scheme.

Given its importance, Huawei makes sure WatchDog is started, by the following line towards the end of /etc/init.d/rc:

sh /opt/galax/gcs/watchdog/watchdog.sh -check

It is also started by both /etc/rc.d/rc3.d/S13watchdog and cron.

Other Services

On the primary OMS, the following services are running:

Ganglia Meta Daemon (gmetad)

PostgreSQL server

OpenLDAP slapd

Apache Geronimo:

java -Xmx3G -XX:MaxPermSize=1G -XX:MaxDirectMemorySize=1536M \
  -XX:OnOutOfMemoryError=kill -9 %p -Dorg.eclipse.jetty.io.nio.BUSY_PAUSE=10 \
  -javaagent:/opt/obs/geronimo/bin/jpa.jar \
  -Dorg.apache.geronimo.home.dir=/opt/obs/geronimo \
  -Djava.endorsed.dirs=/opt/obs/geronimo/lib/endorsed:/usr/java/jre1.6.0_24/lib/endorsed \
  -Djava.ext.dirs=/opt/obs/geronimo/lib/ext:/usr/java/jre1.6.0_24/lib/ext \
  -Djava.io.tmpdir=var/temp -Dfile.encoding=UTF8 -Duser.timezone=UTC \
  -jar /opt/obs/geronimo/bin/server.jar --long

Apache Tomcat:

java -Djava.util.logging.config.file=/opt/galax/tomcat/conf/logging.properties \
  -server -XX:PermSize=256M -XX:MaxPermSize=1024m -Xms2048m -Xmx2048m \
  -Dpoe.config=/opt/obs/obsconf/POE.properties \
  -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
  -agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n \
  -Djava.endorsed.dirs=/opt/galax/tomcat/endorsed \
  -classpath /opt/galax/tomcat/bin/bootstrap.jar:/opt/galax/tomcat/bin/tomcat-juli.jar \
  -Dcatalina.base=/opt/galax/tomcat -Dcatalina.home=/opt/galax/tomcat \
  -Djava.io.tmpdir=/opt/galax/tomcat/temp \
  org.apache.catalina.startup.Bootstrap start

On both OMS, the following services are also running:

Ganglia Monitoring Daemon (gmond)

OMS agent (in /usr/galax/oms), which appears to be a Ganglia plugin.

Name Server Daemon (NSD), obsnsmon

NTP

ZooKeeper quorum (in /opt/sod/zookeeper):

java -Dzookeeper.log.dir=/opt/sod/zookeeper \
  -Dzookeeper.root.logger=INFO,ROLLINGFILE,SYSLOG \
  -Djute.maxbuffer=10485760 -Dzookeeper.snapCount=300 \
  -cp /opt/sod/zookeeper/build/classes:/opt/sod/zookeeper/build/lib/*.jar:/opt/sod/zookeeper/zookeeper-3.3.4.jar:/opt/sod/zookeeper/lib/zookeeper-check-1.0.jar:/opt/sod/zookeeper/lib/log4j-1.2.15.jar:/opt/sod/zookeeper/lib/jline-0.9.94.jar:/opt/sod/zookeeper/lib/commons-lang-2.6.jar:/opt/sod/zookeeper/lib/commons-collections-3.2.jar:/opt/sod/zookeeper/lib/commons-cli-1.1.jar:/opt/sod/zookeeper/lib/apache-rat-tasks-0.6.jar:/opt/sod/zookeeper/lib/apache-rat-core-0.6.jar:/opt/sod/zookeeper/src/java/lib/*.jar \
  org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/sod/zookeeper/conf/zoo.cfg
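
Each ZooKeeper server answers a few four-letter commands on its client port, which gives a quick way to see whether a node is a leader or a follower. The sketch below only parses the reply; zk_mode is a hypothetical helper name, and the port 2181 in the usage comment is ZooKeeper's default, assumed here because the clientPort set in zoo.cfg was not recorded:

```shell
#!/bin/sh
# Extract the server mode (leader, follower or standalone) from the reply to
# ZooKeeper's "srvr" four-letter command, supplied on standard input.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# Usage on an OMS (port 2181 is an assumption, check clientPort in zoo.cfg):
#   echo srvr | nc localhost 2181 | zk_mode
```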

SoD controller (in /opt/sod/sod-controller):

java -Djava.util.logging.config.file=config/log.properties -Dlog4j.configuration=config/log4j.xml \
  -server -Xmx200M -Xms200M -Djute.maxbuffer=10485760 -Duser.timezone=UTC \
  -cp :/opt/sod/sod-controller/lib/commons-codec-1.3.jar:/opt/sod/sod-controller/lib/commons-collections-3.2.1.jar:/opt/sod/sod-controller/lib/commons-dbcp-1.2.2.jar:/opt/sod/sod-controller/lib/commons-io-2.1.jar:/opt/sod/sod-controller/lib/commons-lang-2.6.jar:/opt/sod/sod-controller/lib/commons-logging-1.1.1.jar:/opt/sod/sod-controller/lib/commons-pool-1.5.2.jar:/opt/sod/sod-controller/lib/google-collect-1.0.jar:/opt/sod/sod-controller/lib/jackson-core-asl-1.4.0.jar:/opt/sod/sod-controller/lib/jackson-mapper-asl-1.4.0.jar:/opt/sod/sod-controller/lib/javax.ws.rs.jar:/opt/sod/sod-controller/lib/jdom-1.1.jar:/opt/sod/sod-controller/lib/jna.jar:/opt/sod/sod-controller/lib/jopt-simple-3.1.jar:/opt/sod/sod-controller/lib/log4j-1.2.15.jar:/opt/sod/sod-controller/lib/org.restlet.ext.jaxrs.jar:/opt/sod/sod-controller/lib/org.restlet.jar:/opt/sod/sod-controller/lib/pb_agent.jar:/opt/sod/sod-controller/lib/pb_server.jar:/opt/sod/sod-controller/lib/perf4j.jar:/opt/sod/sod-controller/lib/pmf-agent.jar:/opt/sod/sod-controller/lib/protobuf_sod.jar:/opt/sod/sod-controller/lib/scriptExcutor.jar:/opt/sod/sod-controller/lib/sod-agent.jar:/opt/sod/sod-controller/lib/sod-controller-1.00.jar:/opt/sod/sod-controller/lib/sod-lib.jar:/opt/sod/sod-controller/lib/zookeeper-3.3.4.jar:/opt/sod/sod-controller/lib/zookeeper-lock.jar \
  com.huawei.sod.controller.SoDController

iptables. Curiously, its rules are set by a line towards the end of /etc/init.d/rc:

sh /opt/uds/installconfig/script/om/setOMMIptablesRule.sh
