Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QA] iptables conditional logic can reference undefined var #3555

Closed
conorsch opened this issue Jun 22, 2018 · 11 comments · Fixed by #3560
Closed

[QA] iptables conditional logic can reference undefined var #3555

conorsch opened this issue Jun 22, 2018 · 11 comments · Fixed by #3560

Comments

@conorsch
Copy link
Contributor

Description

The iptables templating logic contains unreliable conditional checks that may result in an undefined variable being referenced, which causes the provisioning run to fail hard.

Steps to Reproduce

Note that I was running on 0.7.0 in Tails, so it's possible the bug described cannot reproduced on release/0.8.0 and/or develop.

  1. Hypothesis: use hardware machines with interface names such as em1 (rather than eth0 / eth1 used in our test VMs).
  2. Run ./securedrop-admin install against hardware machines.
  3. Observe failure on the "Copy IPv4 iptables rules" task (see full error output below).

Expected Behavior

Playbook run completes, machines reboot, everything is copacetic.

Actual Behavior

Ansible errors out with a cryptic message about undefined dict attributes. Machines are incompletely configured.

TASK [restrict-direct-access : Determine admin network - next compute admin network cidr] **************************************************

TASK [restrict-direct-access : Copy IPv4 iptables rules.] **********************************************************************************
fatal: [app]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}
fatal: [mon]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}

Comments

There's a history to the logic referred to here.

  1. Conditional logic first introduced via Allow SSH over local network instead of Tor #2592.
  2. Subsequently patched in develop via ansible: rules_v4 does not need to inspect unused ipv4 details #3466
  3. A forum thread documented the problem happening in the wild: https://forum.securedrop.club/t/unable-to-rerun-install/792

While testing against hardware, I applied this patch against the 0.7.0 source and it resolved the problem:

diff --git a/install_files/ansible-base/roles/restrict-direct-access/templates/rules_v4 b/install_files/ansible-base/roles/restrict-direct-access/templates/rules_v4
index a7b78557..5c2f8082 100644
--- a/install_files/ansible-base/roles/restrict-direct-access/templates/rules_v4
+++ b/install_files/ansible-base/roles/restrict-direct-access/templates/rules_v4
@@ -95,7 +95,7 @@
 # Permit direct SSH access.
 # Allowed for staging and optionally for production (disables ssh over tor)
 {% for interface in ansible_interfaces -%}
-  {%- if  hostvars[inventory_hostname]['ansible_'+interface].ipv4 -%}
+  {%- if 'ipv4' in hostvars[inventory_hostname]['ansible_'+interface] -%}
       {%- set int_details = hostvars[inventory_hostname]['ansible_'+interface].ipv4 -%}
       {%- set net_string = '/'.join([int_details.network,int_details.netmask]) -%}
       {%- if ssh_ip|ipaddr(net_string) -%}

So next step should be to evaluate whether this bug occurs against the same hardware on the 0.8.0 release branch.

@redshiftzero
Copy link
Contributor

If this problem arose with enable_ssh_over_tor = true, then this should be resolved by #3466

@conorsch
Copy link
Contributor Author

Re-ran playbook on 0.8.0~rc2 and encountered no problems. As @redshiftzero mentioned, @dachary's changes in #3466 appear to resolve the problem. Still, I'm leaving this ticket open to be absolutely sure. We'll need to perform a fresh install of 0.8.0~rc2 on these hardware machines prior to release, so if another member of the team agrees the issue is resolved, let's close.

@ghost
Copy link

ghost commented Jun 22, 2018

Yes, this ticket should be left open because it will show in situations where enable_ssh_over_tor = false.

@redshiftzero
Copy link
Contributor

Fair - let's see if anyone can reproduce with enable_ssh_over_tor = false (if we can't I advocate that we do not make late stage changes as this particular code path requires a fair bit of manual testing to ensure no regressions were introduced)

@b-meson
Copy link
Contributor

b-meson commented Jun 22, 2018

I also saw this issue when enable_ssh_over_tor = false was configured. (SSH only over LAN). @conorsch's fix worked for me.

@redshiftzero
Copy link
Contributor

On which branch?

@b-meson
Copy link
Contributor

b-meson commented Jun 22, 2018

@redshiftzero sorry I tested 0.8.0rc2

@redshiftzero
Copy link
Contributor

redshiftzero commented Jun 22, 2018

Confirming: you ran ./securedrop-admin install on the 0.8.0-rc2 tag and hit this bug with enable_ssh_over_tor = false on the HP Proliant? You then made the change suggested in the issue description here, and you were able to successfully install and SSH into both servers?

@b-meson
Copy link
Contributor

b-meson commented Jun 23, 2018

@redshiftzero yes that is absolutely correct. I did a fresh install on the 0.8.2-rc2 tag. I had enable_ssh_over_tor = false configured when I was installing on the Proliant server. It threw this error

TASK [restrict-direct-access : Copy IPv4 iptables rules.] **********************************************************************************
fatal: [app]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}
fatal: [mon]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}

I applied Conor's patch and the install continued and I was able to SSH into both the app and the mon server

@redshiftzero
Copy link
Contributor

OK excellent, thanks for the clarity. We should indeed make this suggested change

conorsch pushed a commit that referenced this issue Jun 23, 2018
The conditional logic checking for whether the 'ipv4' attribute was
defined wasn't properly structured: it assumed the attribute was present
in order to check it. Slight tweak to the syntax to make a more graceful
evaluation to False.

Closes #3555.
redshiftzero pushed a commit that referenced this issue Jun 23, 2018
The conditional logic checking for whether the 'ipv4' attribute was
defined wasn't properly structured: it assumed the attribute was present
in order to check it. Slight tweak to the syntax to make a more graceful
evaluation to False.

Closes #3555.

(cherry picked from commit e2d7fbc)
@b-meson
Copy link
Contributor

b-meson commented Jun 25, 2018

I reran a clean install of the 0.8rc2 debs using the Ansible logic found in the release/0.8 branch. I set SSH_over_tor = no and was able to complete the clean install without an issues. This was using the an HP Proliant DL 368 G7 as the app and one Dell PowerEdge 620 as the mon server. I can confirm that this issue is resolved. On to full QA in #3512

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants