Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schedule module: creating an excess task #23

Open
akimrx opened this issue Apr 19, 2021 · 15 comments
Open

Schedule module: creating an excess task #23

akimrx opened this issue Apr 19, 2021 · 15 comments

Comments

@akimrx
Copy link
Contributor

akimrx commented Apr 19, 2021

SUMMARY

Hi there. As you can see, I have only one scheduled task in the playbook and I just wanted to turn it off. Instead of updating an existing task, the module created a new one. This, unfortunately, does not have the effect that would be expected from disabling, because the old job still exists.

Also, if I want to change, for example, the interval for a task, the module will simply create a new one, thus generating exactly the same task and now it will be executed even more often.

The task I'm running:

- name: proxysql - configure scheduler
  proxysql_scheduler:
    login_user: "{{ proxysql.admin.user }}"
    login_password: "{{ proxysql.admin.password }}"
    login_port: "{{ proxysql.admin.listen.port.plain | int }}"
    filename: "{{ proxysql.path.lib }}/{{ item.bin }}"
    interval_ms: "{{ item.interval }}"
    arg1: "{{ item.arg1 | default('NULL') }}"
    arg2: "{{ item.arg2 | default('NULL') }}"
    arg3: "{{ item.arg3 | default('NULL') }}"
    arg4: "{{ item.arg4 | default('NULL') }}"
    arg5: "{{ item.arg5 | default('NULL') }}"
    state: present
    active: no
    load_to_runtime: true
    save_to_disk: true
  with_items:
    - bin: replica_check
      interval: 5000  # 5s
      arg1: "127.0.0.1"
      arg2: "{{ proxysql.admin.listen.port.plain }}"
      arg3: "{{ proxysql_replication_hostgroups.product.reader_hostgroup }}"
      arg4: warning
      arg5: "{{ proxysql.path.log }}/replica_check.log"
  tags:
    - proxysql-external-checks

Task result:

TASK [service/proxysql : proxysql - configure scheduler] ********************************************************************************************************************************************
changed: [HIDDEN SERVER NAME] => (item={'bin': 'replica_check', 'interval': 5000, 'arg1': '127.0.0.1', 'arg2': '7030', 'arg3': '2', 'arg4': 'warning', 'arg5': '/opt/log/proxysql/replica_check.log'})

Scheduler table from ProxySQL:

ProxySQL> select * from scheduler;
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| id | active | interval_ms | filename                        | arg1      | arg2 | arg3 | arg4    | arg5                                | comment |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| 1  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | warning | /opt/log/proxysql/replica_check.log |         |
| 2  | 0      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | warning | /opt/log/proxysql/replica_check.log |         |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
ISSUE TYPE
  • Bug Report
COMPONENT NAME

proxysql_scheduler

ANSIBLE VERSION
$ ansible --version
ansible 2.9.18
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/Users/akimrx/<HIDDEN>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/akimrx/<HIDDEN>/ansible/venv/lib/python3.7/site-packages/ansible
  executable location = /Users/akimrx/<HIDDEN>/ansible/venv/bin/ansible
  python version = 3.7.6 (default, Mar 29 2021, 14:13:20) [Clang 11.0.3 (clang-1103.0.32.62)]
CONFIGURATION
ANSIBLE_PIPELINING(/etc/ansible/ansible.cfg) = True
ANSIBLE_SSH_ARGS(/etc/ansible/ansible.cfg) = -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=publickey
COLOR_CHANGED(/etc/ansible/ansible.cfg) = bright blue
COLOR_ERROR(/etc/ansible/ansible.cfg) = bright red
COLOR_HIGHLIGHT(/etc/ansible/ansible.cfg) = white
COLOR_VERBOSE(/etc/ansible/ansible.cfg) = dark gray
COLOR_WARN(/etc/ansible/ansible.cfg) = bright purple
DEFAULT_FORKS(/etc/ansible/ansible.cfg) = 30
DEFAULT_GATHER_SUBSET(/etc/ansible/ansible.cfg) = ['all', '!facter', '!ohai']
DEFAULT_MANAGED_STR(/etc/ansible/ansible.cfg) = This file is managed by ansible, don't make changes here - they will be overwritten.
DEFAULT_POLL_INTERVAL(/etc/ansible/ansible.cfg) = 1
DEFAULT_PRIVATE_KEY_FILE(/etc/ansible/ansible.cfg) = /Users/akimrx/.ssh/id_rsa
DEFAULT_REMOTE_USER(/etc/ansible/ansible.cfg) = root
DEFAULT_ROLES_PATH(env: ANSIBLE_ROLES_PATH) = ['/Users/akimrx/<HIDDEN>/playbooks/roles']
DEFAULT_STDOUT_CALLBACK(/etc/ansible/ansible.cfg) = skippy
DEFAULT_TRANSPORT(/etc/ansible/ansible.cfg) = smart
DISPLAY_SKIPPED_HOSTS(/etc/ansible/ansible.cfg) = False
DUPLICATE_YAML_DICT_KEY(env: ANSIBLE_DUPLICATE_YAML_DICT_KEY) = ignore
HOST_KEY_CHECKING(/etc/ansible/ansible.cfg) = False
INVENTORY_CACHE_ENABLED(/etc/ansible/ansible.cfg) = True
INVENTORY_CACHE_PLUGIN(/etc/ansible/ansible.cfg) = jsonfile
PLAYBOOK_DIR(env: ANSIBLE_PLAYBOOK_DIR) = /Users/akimrx/<HIDDEN>/ansible/playbooks
PLAYBOOK_VARS_ROOT(env: ANSIBLE_PLAYBOOK_VARS_ROOT) = all
RETRY_FILES_ENABLED(/etc/ansible/ansible.cfg) = False
OS / ENVIRONMENT
  • macOS Catalina 10.15.6
  • Darwin macbook-pro 19.6.0 Darwin Kernel Version 19.6.0: Thu Jun 18 20:49:00 PDT 2020; root:xnu-6153.141.1~1/RELEASE_X86_64 x86_64
RECOMMENDATIONS

I think it is necessary to add the id for the scheduled task. That is, just as it is done in the query rules module -- rule_id.
That is:

  proxysql_scheduler:
    ...
    id: 1  <--- This
    filename: "{{ proxysql.path.lib }}/{{ item.bin }}"
    interval_ms: "{{ item.interval }}"
    ...
@akimrx
Copy link
Contributor Author

akimrx commented Apr 22, 2021

So far, I've solved the problem by adding a task that will delete all the schedules before adding them, but this is a temporary crutch, and not a normal solution, as it seems to me.

- name: proxysql - clean proxysql scheduler
  community.mysql.mysql_query:
    login_user: "{{ proxysql.admin.user }}"
    login_password: "{{ proxysql.admin.password }}"
    login_host: 127.0.0.1
    login_port: "{{ proxysql.admin.listen.port.plain | int }}"
    query:
      - "DELETE FROM scheduler"
      - "SAVE SCHEDULER TO DISK"
  tags:
    - proxysql-scheduler

- name: proxysql - configure scheduler
  proxysql_scheduler:
    login_user: "{{ proxysql.admin.user }}"
    login_password: "{{ proxysql.admin.password }}"
    login_port: "{{ proxysql.admin.listen.port.plain | int }}"
    filename: "{{ proxysql.path.lib }}/{{ item.bin }}"
    interval_ms: "{{ item.interval }}"
...

@markuman
Copy link
Member

proxysql_scheduler search for the entry with

SELECT count(*) AS `schedule_count`
               FROM scheduler
               WHERE active = %s
                 AND interval_ms = %s
                 AND filename = %s"""

so when you request the same task-parameters just with active: no, it won't find and create a new one.
the only option to solve this issue is, to treat only the filename column as a unique key. even if it is not. otherwise it is impossible to tell if you want to add a new schedule task or adjust the existing one.

current scheduler table looks like this

CREATE TABLE scheduler (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 1,
    interval_ms INTEGER CHECK (interval_ms>=100 AND interval_ms<=100000000) NOT NULL,
    filename VARCHAR NOT NULL,
    arg1 VARCHAR,
    arg2 VARCHAR,
    arg3 VARCHAR,
    arg4 VARCHAR,
    arg5 VARCHAR,
    comment VARCHAR NOT NULL DEFAULT '')

say you got this

ProxySQL> select * from scheduler;
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| id | active | interval_ms | filename                        | arg1      | arg2 | arg3 | arg4    | arg5                                | comment |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| 1  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | warning | /opt/log/proxysql/replica_check.log |         |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+

and request

- name: some scheduler task
  proxysql_scheduler:
    filename:  /opt/lib/proxysql/replica_check
    interval_ms: 5000
    arg1: "127.0.0.1"
    arg2: 7030
    arg3: 2
    arg4: error
    arg5:  /opt/log/proxysql/replica_check.log
    state: present
    active: yes
    load_to_runtime: true
    save_to_disk: true

it is undefined if you want a)

ProxySQL> select * from scheduler;
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| id | active | interval_ms | filename                        | arg1      | arg2 | arg3 | arg4    | arg5                                | comment |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| 1  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | error | /opt/log/proxysql/replica_check.log |         |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+

or b)

ProxySQL> select * from scheduler;
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| id | active | interval_ms | filename                        | arg1      | arg2 | arg3 | arg4    | arg5                                | comment |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| 1  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | warning | /opt/log/proxysql/replica_check.log |         |
| 2  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | error   | /opt/log/proxysql/replica_check.log |         |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+

the current implementation treats active, interval_ms and filename as one combined private key.

Another possibility is to abuse the comment column with some conventions....find task by comment that must be uniq.

Any other ideas or opinions about this? @akimrx @Andersson007

@Andersson007
Copy link
Contributor

CC @bmildren

@Andersson007
Copy link
Contributor

Can we:

  1. add the id field (optional) as was suggested? I think it will break nothing backwards

@markuman
Copy link
Member

True. And update a scheduler task happens only when an id is provided.
But it would be helpful if we provide a proxysql_scheduler_info module to determine ids.

@Andersson007
Copy link
Contributor

or when a user defined it originally (is it possible to set it when creating the task?), they can manipulate tasks using those ids later

@Andersson007
Copy link
Contributor

True. And update a scheduler task happens only when an id is provided.
But it would be helpful if we provide a proxysql_scheduler_info module to determine ids.

But it's always a good idea to have info modules. It's also possible to create one proxysql_info module like in mysql_info which will return all the information from the cluster. But I'm not against the separate modules as well

@Andersson007
Copy link
Contributor

I'm personally in favor of one module like mysql_info

@markuman
Copy link
Member

And update a scheduler task happens only when an id is provided.

or when a user defined it originally (is it possible to set it when creating the task?), they can manipulate tasks using those ids later

Yes, I think it is possible. I just wanted to point out, if the id is not provided, all we can do is creating a new scheduler task.
So idempotent behavior is only possible when the id is provided.

@Andersson007
Copy link
Contributor

Yes, I think it is possible. I just wanted to point out, if the id is not provided, all we can do is creating a new scheduler task.
So idempotent behavior is only possible when the id is provided.

Yes and, imo, it should be reflected in the documentation

@markuman
Copy link
Member

Do we still need a proxysql_scheduler_info module? Because proxysql_info exists since 1.2.0

    - name: Add a schedule
      community.proxysql.proxysql_scheduler:
        login_user: 'admin'
        login_password: 'admin'
        interval_ms: 1000
        filename: "/opt/maintenance.py"
        state: present
        load_to_runtime: False

    - name: readout
      community.proxysql.proxysql_info:
        login_user: admin
        login_password: admin
      register: proxysql_information

    - debug:
        var: proxysql_information.scheduler

results in


TASK [debug] ********************************************************************************************************************************************************************************************************************
ok: [proxysql] => {
    "proxysql_information.scheduler": [
        {
            "active": "1",
            "arg1": null,
            "arg2": null,
            "arg3": null,
            "arg4": null,
            "arg5": null,
            "comment": "",
            "filename": "/opt/maintenance.py",
            "id": "1",
            "interval_ms": "1000"
        }
    ]
}

So the only task here is to fix the module itself, and not to add a _info module.

@Andersson007
Copy link
Contributor

So the only task here is to fix the module itself, and not to add a _info module.

Sounds very sensible to me

@markuman
Copy link
Member

markuman commented Feb 4, 2022

here we discussed the behaviour of adding or updating schedule tasks and decided to fallback to a new id parameter for updates.
So far so good.

Say now, we ended up here

ProxySQL> select * from scheduler;
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| id | active | interval_ms | filename                        | arg1      | arg2 | arg3 | arg4    | arg5                                | comment |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+
| 1  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | warning | /opt/log/proxysql/replica_check.log |         |
| 2  | 1      | 5000        | /opt/lib/proxysql/replica_check | 127.0.0.1 | 7030 | 2    | error   | /opt/log/proxysql/replica_check.log |         |
+----+--------+-------------+---------------------------------+-----------+------+------+---------+-------------------------------------+---------+

and we'll request (basically from the example section)

- name: remove scheduler task
  proxysql_scheduler:
    filename:  /opt/lib/proxysql/replica_check
    config_file: '~/proxysql.cnf'
    state: absent

Currently the module will fail because it found more than one rule. https://github.com/ansible-collections/community.proxysql/blob/main/plugins/modules/proxysql_scheduler.py#L391

To keep backwards compatiblity, we must create a 2nd absent logic.
id is mutually exclusive with all other parameters when state is absent.

And maybe we should think about to deprecate the current absentlogic, because it is very very limited.

@markuman
Copy link
Member

markuman commented Feb 4, 2022

id is mutually exclusive with all other parameters when state is absent.

is this possible?

@Andersson007
Copy link
Contributor

Andersson007 commented Feb 7, 2022

id is mutually exclusive with all other parameters when state is absent.

is this possible?

  1. yep, just with a condition if id and state != 'absent' then fail or even alternatively we can just mention in the option's doc that it's ignored otherwise and ignore instead of failing
  2. deprecation of logic in case of its limitation sounds good. If we gonna do that, we should
  • announce it now as a breaking change (that it's coming in release blahblah)
  • publish the announcement as a part of changelog fragment of the following release
  • create the breaking PR
  • merge it before the next major release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants