-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/orchestration #112
Conversation
Fix calling of the proteus-* service command
* Add admin password to vault * Write the proteus notify certificates
* Add sbs' key
1) As stated in ooni/orchestra#13, we must not set a topic for Android because that leads to FCM rejecting the message. 2) Define two different topics for iOS, one for production and one for testing, because I guess we'll need them soon.
fix,refactor(proteus-events): topics correctness
@@ -3,4 +3,4 @@ | |||
- include: config.yml | |||
become: yes | |||
become_user: postgres | |||
become_method: su | |||
become_method: sudo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have sudo
in dom0
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, Also, that' default become_method.
--- | ||
- name: Add YarnPkg apt package signing key | ||
apt_key: | ||
url: "https://dl.yarnpkg.com/debian/pubkey.gpg" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's "better" to import from keyservers using full fingerprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
|
||
- name: Add YarnPkg apt repository | ||
apt_repository: | ||
repo: 'deb https://dl.yarnpkg.com/debian/ stable main' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is stable
for stable yarn release or for debian:stable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for debian:stable
shell: "psql -U {{ proteus_db_user}} -h 127.0.0.1 -d {{ proteus_db_name }} -f {{ proteus_db_schema_path}}/{{ item }}" | ||
environment: | ||
PGPASSWORD: "{{ proteus_db_password }}" | ||
with_items: "{{ proteus_db_files }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen when one needs to ALTER something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually all of this stuff can probably be removed as it's now part of proteus itself. We only need to create the proteus_db_user and proteus will take care of populating the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, @bassosimone was struggling with sql updates embedded into the binary while launching concurrent processes as same DB is used by various services. So there was a suggestion to move SQL files away from the binary and apply them using some other tooling. Is it still the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think you were the one suggesting that as a solution, but the issue in the migration scripts was solved, so there is no plan to go that route in the near future.
createhome: no | ||
shell: /sbin/nologin | ||
comment: "Proteus User" | ||
state: present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/sbin/nologin
is not a valid shell.
UID will clash with superq as soon as dom0 with node_exporter is deployed there:
events.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
notify.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
registry.proteus.test.ooni.io | SUCCESS | rc=0 | (stdout) uid=10007(proteus) gid=10007(proteus) groups=10007(proteus)
notify.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
registry.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
events.proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
proteus.ooni.io | SUCCESS | rc=0 | (stdout) uid=10006(proteus) gid=10006(proteus) groups=10006(proteus)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UID clash is a blocker to merge it back to master, so I'll have to fix it and redeploy. It'll require some minor proteus downtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. How much downtime are we talking about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minutes if everything is OK, couple of hours if something goes seriously wrong :)
[Service] | ||
User={{ proteus_user }} | ||
Group={{ proteus_group }} | ||
ExecStart={{ proteus_path }}/proteus-events --config {{ proteus_config_path }}/proteus-events.toml start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it the user for proteus-events
or for proteus-frontend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They all use the same user (proteus)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case this is the user running the proteus-events binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, is there any reason for these services to share unix user? As far as I see, there is no -- these services share no objects.
I'm sorry for the form of the question :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason.
|
||
[Service] | ||
User={{ proteus_user }} | ||
Group={{ proteus_group }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it the user for proteus-frontend or for gorush ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case it's for gorush.
ansible/deploy-orchestration.yml
Outdated
- hosts: orchestration | ||
gather_facts: true | ||
roles: | ||
- docker_py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single use of orchestration
group I see is to install docker and docker is useless for db-1
hosts. What's the point of that group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was previously using this group as a "hack" to run dom0 and gather_facts pre-emptively. I guess it can be removed.
gather_facts: false | ||
vars: | ||
psql_db_name: proteus | ||
psql_db_user: proteus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to have different db_name and db_user for grepability
proteus_events_https_port: 443 | ||
letsencrypt_nginx: yes | ||
letsencrypt_domains: "{{ inventory_hostname }}" | ||
proteus_database_url: "postgres://proteus:{{ proteus_database_password_testing }}@db-1.proteus.test.ooni.io:5432/proteus?sslmode=require" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sslmode=require
is nice, but it's using /etc/ssl/certs/ssl-cert-snakeoil.
, so what's the point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to protect against a purely passive attack and requiring an active attack to leak the password.
Also it's probably not a big deal since we are communicating inside of our network.
mode: 0755 | ||
owner: "{{ proteus_user }}" | ||
group: "{{ proteus_group }}" | ||
copy: no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bassosimone please, add sha256sums file to releases of proteus binaries and the script that generates releases.
These checksum files may be used to speedup re-run of the playbook as ger_url
module accepts checksum
argument like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed a ticket about it: ooni/orchestra#22
proteus_apn_key_password: "{{ proteus_apn_key_password_testing }}" | ||
proteus_apn_key_content: "{{ proteus_apn_key_content_testing }}" | ||
proteus_environment: "production" | ||
proteus_notify_basic_auth_password: "{{ proteus_notify_basic_auth_password_testing }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hellais and @bassosimone, please, read paragraph on variable precedence in ansible, it explains that you probably don't need inderection like proteus_database_password_testing
, you just need proteus_database_password
defined in right places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can clean it up a bit that would be good. Though while deploying a first version of production I accidentally used the testing password in production and to avoid the risk of confusion I found it more clear to just label them as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though while deploying a first version of production I accidentally used the testing password in production and to avoid the risk of confusion I found it more clear to just label them as such.
Yeah, I guess this is a good argument in favour of a little more redundancy and robustness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
robustness
Yep. I just wanted to say, that reuse of proteus_user
in four different roles is probably a bad idea :-)
gorush_url: "{{ gorush_download_url }}/v{{ gorush_version }}/gorush-v{{ gorush_version }}-linux-{{ go_arch }}" | ||
gorush_path: "{{ proteus_install_path }}/gorush-v{{ gorush_version }}-linux-{{ go_arch }}" | ||
|
||
notification_backend: "gorush" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should code for notification_backend: proteus
be removed and corresponding binaries and services cleaned up?
- name: restart proteus | ||
systemd: name=proteus-registry.service state=restarted | ||
- name: reload proteus | ||
systemd: name=proteus-registry.service state=reloaded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO same handler name for different actions is a way to subtle bugs.
Conflicts: ansible/inventory
This is based on top of the monitoring pull request and includes all the roles for deploying probe orchestration backends.