Install Cisco RPS2300 in Equinix #254

Closed
pnorman opened this Issue Dec 4, 2018 · 2 comments

Collaborator

pnorman commented Dec 4, 2018

We have two switches at Equinix for redundancy, but neither switch has redundant power. A power loss on one feed can bring the network down until it's moved to the other switch.

We plan to get a Cisco RPS2300 to power the switches redundantly, and it needs installing in AM6.

Rack spot 26 is right above the switches and free. If the RPS gets plugged in to AC in addition to the switches, A6/B6 on the switches should be free; if it is plugged in instead of them, A7/B7 can be reused. Or any other cabling that makes sense ;)

Member

jburgess777 commented Dec 6, 2018

> If the RPS gets plugged in to AC in addition to the switches, A6/B6 on the switches should be free; if it is plugged in instead of them, A7/B7 can be reused.

If I understand the RPS2300 documentation correctly: In order to have true redundant power for two switches we must take the "addition" option where the switches are powered by the AC normally, and the RPS is used as a fallback.

If the only source of power was the RPS then I don't think it offers redundancy to two switches.

The docs say that if you only have a single PSU in the RPS then it can actively provide power for only a single switch. Multiple switches can be connected and if only one lost power then the RPS would provide an alternative supply for this one switch. If multiple switches lost power only the one with the highest priority (lowest number) would get the RPS power.

With two PSUs in the RPS, the RPS can actively provide power to at most two switches simultaneously.

What this means is that if the only source of power to the two switches was the RPS then they would both work only when the RPS had power on both its PSUs. If we lost power on one supply then one switch would still go down. We could set the priorities so that the one with the uplink is the one which gets the power but we still lose one switch.
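The allocation rule described above can be sketched in a few lines of Python. This is only a toy model of the behaviour as I read it from the docs, not Cisco's actual logic: the function name `rps_backed_switches`, the switch names `uplink` and `other`, and the "one live PSU backs one switch" capacity rule are all assumptions made for illustration.

```python
def rps_backed_switches(live_rps_psus, failed_switches):
    """Return the switches the RPS would back up, in priority order.

    Toy model (assumption): each live RPS PSU can back one switch, and
    the lowest priority number wins when demand exceeds capacity.

    failed_switches: dict of switch name -> RPS priority (1 = highest).
    """
    # Order the switches that lost their internal supply by priority number.
    by_priority = sorted(failed_switches, key=failed_switches.get)
    # Only as many switches as there are live PSUs can be backed.
    return by_priority[:max(live_rps_psus, 0)]


# Both switches lose internal power and the RPS has one working PSU:
print(rps_backed_switches(1, {"uplink": 1, "other": 2}))  # ['uplink']
# Same failure with both RPS PSUs working:
print(rps_backed_switches(2, {"uplink": 1, "other": 2}))  # ['uplink', 'other']
```

With a single live PSU the second switch is simply left unpowered, which is why running both switches from the RPS alone does not give redundancy.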

To actually get redundant power you need the switch to normally be powered by its internal supply and only rely on the RPS for when the internal supply fails (for either internal or external reasons).

Since we have the switches across the A/B supplies, the RPS should also be across the A/B supplies. If power is lost on (A) what would happen is:

  • The switch powered by B would carry on with its internal supply
  • The RPS PSU on A would drop, leaving the RPS with the ability to power just 1 switch.
  • The internal PSU for the switch on A would drop, and that switch would start taking RPS power from the PSU supplied from B.

Something similar would happen if power were lost on B instead of A.
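The feed-loss walk-through above can be checked with a small simulation. Again this is a hedged sketch rather than a description of the real hardware: `switches_up` is a hypothetical helper, the assignment of the uplink switch to feed A is an assumption, and the model assumes one live RPS PSU can back exactly one switch.

```python
def switches_up(live_feeds, switch_feeds, rps_psu_feeds, priorities):
    """Return the set of switches that still have power.

    Toy model (assumptions): a switch stays up on its own AC feed if that
    feed is live (use None for a switch with no AC connection). Otherwise
    it needs the RPS, which backs one switch per live RPS PSU, lowest
    priority number first.
    """
    # Switches whose internal supply still has AC.
    up = {sw for sw, feed in switch_feeds.items() if feed in live_feeds}
    # Switches that now depend on the RPS, highest priority first.
    needy = sorted((sw for sw in switch_feeds if sw not in up), key=priorities.get)
    # One backed switch per RPS PSU that still has AC.
    capacity = sum(1 for feed in rps_psu_feeds if feed in live_feeds)
    return up | set(needy[:capacity])


priorities = {"uplink": 1, "other": 2}
rps = ["A", "B"]  # one RPS PSU on each feed

# Feed A is lost. Switches on AC *in addition to* the RPS:
print(sorted(switches_up({"B"}, {"uplink": "A", "other": "B"}, rps, priorities)))
# ['other', 'uplink']

# Feed A is lost. Switches powered by the RPS *only*:
print(sorted(switches_up({"B"}, {"uplink": None, "other": None}, rps, priorities)))
# ['uplink']
```

Under these assumptions, the "in addition" cabling keeps both switches up through the loss of either feed, while the RPS-only cabling keeps just the higher-priority switch.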

In addition, we should set the switch with the uplink to priority 1 (highest) and the second switch to 2. That ensures that if both switches demand power and the RPS can only supply one, it chooses the switch with the uplink.

https://www.cisco.com/c/dam/en/us/products/collateral/switches/redundant-power-system-2300/redundant_power_system_2300_qa.pdf

https://www.cisco.com/c/en/us/td/docs/switches/power_supplies/rps2300/hardware/installation/guide/2300hig/rpsoview.html

Member

Firefishy commented Dec 8, 2018

Installed.

Also tested following scenarios:

  1. PASS: RPS2300 disconnected PDU-A-6 AC power source. Result: Unit marks 1 PSU down, but status remains lit. PDU returned to OK when power returned.
  2. PASS: RPS2300 disconnected as above, but PDU-B-6 unplugged.
  3. PASS: SW2, disconnected mains. SW2 switched to DC power and reported this via the UI. Returned to mains once reconnected.
  4. FAIL: SW1, disconnected mains. SW1 locked up and the watchdog rebooted the unit. Mains power was restored during the reboot, but the unit would have come back OK once rebooted.
     %STCK SYSL-A-UNITMSG: UNIT ID 1,Msg:%HAL_config_main-F-WatchdogTimeout: hal_config_main Watchdog timer timeout with op_id 302
     ***** FATAL ERROR *****
     Reporting Task: HCWT. Software Version: 2.4.0.94 (date May 31 2018 time 02:37:28)
     ros(HOSTG_fatal_error+0x14) [0x2e62a4]
     ros(OSSYSG_fatal_error+0x134) [0x8752ec]
     ros() [0x693ec0]
     ros() [0xbc3c80]
     /lib/libp2linux.so.1(+0x3aa4) [0xb6ec4aa4]
     /lib/libpthread.so.0(+0x6e64) [0xb6e9fe64]
     ***** END OF FATAL ERROR *****
     ***** END OF FATAL ERROR *****
  5. PASS: Repeated test 4. SW1, disconnected mains. SW1 switched to DC power and reported this via the UI. Returned to mains once reconnected.

Firefishy closed this Dec 8, 2018
