
Feature/orchestration #112

Merged
merged 45 commits into from
Aug 21, 2017
Conversation

hellais
Member

@hellais hellais commented Jun 12, 2017

This is based on top of the monitoring pull request and includes all the roles for deploying probe orchestration backends.

hellais and others added 30 commits May 13, 2017 17:03
Fix calling of the proteus-* service command
* Add admin password to vault
* Write the proteus notify certificates
1) As stated in ooni/orchestra#13, we must not set a topic
for Android because that leads to FCM rejecting the message.

2) Define two different topics for iOS, one for production and
one for testing, because I guess we'll need them soon.
fix,refactor(proteus-events): topics correctness
@@ -3,4 +3,4 @@
 - include: config.yml
   become: yes
   become_user: postgres
-  become_method: su
+  become_method: sudo
Contributor

We do have sudo in dom0, right?

Contributor

Right. Also, that's the default become_method.

---
- name: Add YarnPkg apt package signing key
  apt_key:
    url: "https://dl.yarnpkg.com/debian/pubkey.gpg"
Contributor

It's "better" to import from keyservers using full fingerprint.

Member Author

@hellais hellais Aug 16, 2017

Makes sense.
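A minimal sketch of what importing from a keyserver by full fingerprint could look like (the keyserver choice and the fingerprint below are placeholders, not the real YarnPkg key):

```yaml
# Sketch only: import the signing key by its full 40-character fingerprint
# from a keyserver instead of fetching pubkey.gpg over HTTPS.
# Replace the placeholder id with the real YarnPkg key fingerprint.
- name: Add YarnPkg apt package signing key
  apt_key:
    keyserver: keyserver.ubuntu.com
    id: "0000000000000000000000000000000000000000"
```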


- name: Add YarnPkg apt repository
  apt_repository:
    repo: 'deb https://dl.yarnpkg.com/debian/ stable main'
Contributor

Is stable for stable yarn release or for debian:stable ?

Member Author

It's for debian:stable

  shell: "psql -U {{ proteus_db_user }} -h 127.0.0.1 -d {{ proteus_db_name }} -f {{ proteus_db_schema_path }}/{{ item }}"
  environment:
    PGPASSWORD: "{{ proteus_db_password }}"
  with_items: "{{ proteus_db_files }}"
Contributor

What will happen when one needs to ALTER something?

Member Author

Actually all of this stuff can probably be removed as it's now part of proteus itself. We only need to create the proteus_db_user and proteus will take care of populating the database.

Contributor

IIRC, @bassosimone was struggling with SQL updates embedded in the binary when launching concurrent processes, since the same DB is used by various services. So there was a suggestion to move the SQL files out of the binary and apply them with some other tooling. Is that still the case?

Member Author

No, I think you were the one suggesting that as a solution, but the issue in the migration scripts was solved, so there is no plan to go that route in the near future.
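If proteus now populates its own schema, the role could be reduced to creating just the database user. A hedged sketch using the postgresql_user module (variable names assumed from this role):

```yaml
# Sketch: create only the proteus_db_user and let proteus populate
# the database itself, as discussed above.
- name: Create proteus database user
  become: yes
  become_user: postgres
  postgresql_user:
    name: "{{ proteus_db_user }}"
    password: "{{ proteus_db_password }}"
    state: present
```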

    createhome: no
    shell: /sbin/nologin
    comment: "Proteus User"
    state: present
Contributor

/sbin/nologin is not a valid shell.

UID will clash with superq as soon as dom0 with node_exporter is deployed there:

events.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
notify.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
registry.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
notify.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
registry.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
events.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)

Contributor

UID clash is a blocker to merge it back to master, so I'll have to fix it and redeploy. It'll require some minor proteus downtime.

Member Author

Ok. How much downtime are we talking about?

Contributor

Minutes if everything is OK, couple of hours if something goes seriously wrong :)
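One way to avoid the clash is to pin the UID explicitly in the user task. A sketch (the UID value is an arbitrary placeholder, and the shell path uses the Debian location since /sbin/nologin is not valid there):

```yaml
# Sketch: pin a fixed UID so the proteus account is identical on every
# host, rather than taking whatever the next free UID happens to be.
- name: Create proteus user
  user:
    name: "{{ proteus_user }}"
    uid: 10008                  # placeholder; pick a value reserved for proteus
    createhome: no
    shell: /usr/sbin/nologin
    comment: "Proteus User"
    state: present
```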

[Service]
User={{ proteus_user }}
Group={{ proteus_group }}
ExecStart={{ proteus_path }}/proteus-events --config {{ proteus_config_path }}/proteus-events.toml start
Contributor

Is it the user for proteus-events or for proteus-frontend ?

Member Author

They all use the same user (proteus)

Member Author

In any case this is the user running the proteus-events binary.

Contributor

I mean, is there any reason for these services to share a unix user? As far as I can see, there is none: these services share no objects.

I'm sorry for the form of the question :-)

Member Author

No reason.
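Splitting the shared account into per-service users could be sketched as role defaults like the following (all names are hypothetical, not from this PR):

```yaml
# Sketch: one unix account per service instead of a shared proteus user,
# so a compromise of one daemon doesn't grant access to the others' files.
proteus_events_user: proteus-events
proteus_notify_user: proteus-notify
proteus_registry_user: proteus-registry
gorush_user: gorush
```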


[Service]
User={{ proteus_user }}
Group={{ proteus_group }}
Contributor

Is it the user for proteus-frontend or for gorush ?

Member Author

In this case it's for gorush.

- hosts: orchestration
  gather_facts: true
  roles:
    - docker_py
Contributor

The only use of the orchestration group I see is to install docker, and docker is useless for the db-1 hosts. What's the point of that group?

Member Author

I think I was previously using this group as a "hack" to run dom0 and gather_facts pre-emptively. I guess it can be removed.
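Pre-emptive fact gathering doesn't need a dedicated group; a sketch of a play that only collects facts:

```yaml
# Sketch: gather facts for all hosts up-front without installing anything,
# replacing the orchestration-group "hack" mentioned above.
- hosts: all
  gather_facts: true
  tasks: []
```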

  gather_facts: false
  vars:
    psql_db_name: proteus
    psql_db_user: proteus
Contributor

It's good to have different db_name and db_user values, for grepability.
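For example (the distinct user name is hypothetical):

```yaml
# Sketch: distinct names make it obvious which occurrences refer to the
# database and which to the role when grepping the repository.
psql_db_name: proteus
psql_db_user: proteus_user
```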

proteus_events_https_port: 443
letsencrypt_nginx: yes
letsencrypt_domains: "{{ inventory_hostname }}"
proteus_database_url: "postgres://proteus:{{ proteus_database_password_testing }}@db-1.proteus.test.ooni.io:5432/proteus?sslmode=require"
Contributor

sslmode=require is nice, but it's using /etc/ssl/certs/ssl-cert-snakeoil., so what's the point?

Member Author

It's better to protect against a purely passive attacker, so that leaking the password requires an active attack.
Also, it's probably not a big deal since we are communicating inside our own network.
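If db-1 were later given a real certificate, sslmode=verify-full would defeat the active attack as well; a sketch of the same URL with stricter verification:

```yaml
# Sketch: verify-full makes libpq check the server certificate chain and
# hostname, so an active MITM can no longer capture the password.
proteus_database_url: "postgres://proteus:{{ proteus_database_password_testing }}@db-1.proteus.test.ooni.io:5432/proteus?sslmode=verify-full"
```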

mode: 0755
owner: "{{ proteus_user }}"
group: "{{ proteus_group }}"
copy: no
Contributor

@bassosimone please add a sha256sums file to the proteus binary releases and to the script that generates them.

These checksum files can be used to speed up re-runs of the playbook, since the get_url module accepts a checksum argument.

Member Author

@hellais hellais Aug 16, 2017

Filed a ticket about it: ooni/orchestra#22
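Once the sha256sums files exist, the download task could pass them to get_url; a sketch with a placeholder checksum (proteus_url is an assumed variable name):

```yaml
# Sketch: get_url skips the download when the existing file already
# matches the given checksum, speeding up playbook re-runs.
- name: Download proteus binary
  get_url:
    url: "{{ proteus_url }}"
    dest: "{{ proteus_path }}"
    checksum: "sha256:0000000000000000000000000000000000000000000000000000000000000000"  # placeholder
    mode: 0755
```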

proteus_apn_key_password: "{{ proteus_apn_key_password_testing }}"
proteus_apn_key_content: "{{ proteus_apn_key_content_testing }}"
proteus_environment: "production"
proteus_notify_basic_auth_password: "{{ proteus_notify_basic_auth_password_testing }}"
Contributor

@hellais and @bassosimone, please read the paragraph on variable precedence in Ansible; it explains that you probably don't need indirection like proteus_database_password_testing, you just need proteus_database_password defined in the right places.

Member Author

If you can clean it up a bit, that would be good. Though while deploying the first version of production I accidentally used the testing password in production, and to avoid that risk of confusion I found it clearer to just label them as such.

Contributor

Though while deploying the first version of production I accidentally used the testing password in production, and to avoid that risk of confusion I found it clearer to just label them as such.

Yeah, I guess this is a good argument in favour of a little more redundancy and robustness.

Contributor

robustness

Yep. I just wanted to say that reusing proteus_user across four different roles is probably a bad idea :-)
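The precedence-based layout would look roughly like this (the group names, file paths, and vault variable names are assumptions for illustration):

```yaml
# Sketch: the same variable name resolves to a different value per group,
# so roles only ever reference proteus_database_password.
#
# group_vars/proteus_testing.yml
proteus_database_password: "{{ vault_proteus_database_password_testing }}"

# group_vars/proteus_production.yml
proteus_database_password: "{{ vault_proteus_database_password_production }}"
```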

gorush_url: "{{ gorush_download_url }}/v{{ gorush_version }}/gorush-v{{ gorush_version }}-linux-{{ go_arch }}"
gorush_path: "{{ proteus_install_path }}/gorush-v{{ gorush_version }}-linux-{{ go_arch }}"

notification_backend: "gorush"
Contributor

Should the code for notification_backend: proteus be removed, and the corresponding binaries and services cleaned up?

- name: restart proteus
  systemd: name=proteus-registry.service state=restarted
- name: reload proteus
  systemd: name=proteus-registry.service state=reloaded
Contributor

IMHO, using the same handler name for different actions is a path to subtle bugs.
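Giving each action its own handler name keeps notify targets unambiguous; a sketch:

```yaml
# Sketch: distinct handler names, so a notify statement cannot
# accidentally trigger the wrong action.
- name: restart proteus-registry
  systemd: name=proteus-registry.service state=restarted
- name: reload proteus-registry
  systemd: name=proteus-registry.service state=reloaded
```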

@darkk darkk merged commit df0b67d into master Aug 21, 2017
@darkk darkk mentioned this pull request Aug 21, 2017
@hellais hellais deleted the feature/orchestration branch August 22, 2017 18:27