[AIRFLOW-6326] Sort cli commands and arg #6881
Conversation
airflow/bin/cli.py
Outdated
                cls._add_subcommand(sub_subparsers, command)
            else:
-               for arg in sub['args']:
+               for arg in sorted(sub['args'], key=lambda x: cls.args[x].flags[0]):
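For illustration, a minimal sketch of what that key function does; the Arg namedtuple and the registry below are stand-ins, not Airflow's real classes:

from collections import namedtuple

# Stand-in for Airflow's arg definition: each arg exposes a flags tuple.
Arg = namedtuple('Arg', ['flags'])

# Hypothetical registry mirroring cls.args.
args = {
    'dag_id': Arg(flags=('-d', '--dag_id')),
    'subdir': Arg(flags=('-sd', '--subdir')),
    'num_runs': Arg(flags=('-n', '--num_runs')),
    'daemon': Arg(flags=('-D', '--daemon')),
}

# Sort arg names by their first flag, as the diff above does.
print(sorted(args, key=lambda x: args[x].flags[0]))
# ['daemon', 'dag_id', 'num_runs', 'subdir']  (byte-wise: '-D' < '-d' < '-n' < '-sd')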
But I'm not sure if this change is acceptable. Should we keep the args in manual order or sort them alphabetically? Some args are easier for users to spot in the manual order, like DAG_ID or RUN_ID.
# airflow scheduler -h
# before
optional arguments:
-h, --help show this help message and exit
-d DAG_ID, --dag_id DAG_ID
The id of the dag to run
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
-n NUM_RUNS, --num_runs NUM_RUNS
Set the number of runs to execute before exiting
-p, --do_pickle Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code
--pid [PID] PID file location
-D, --daemon Daemonize instead of running in the foreground
--stdout STDOUT Redirect stdout to this file
--stderr STDERR Redirect stderr to this file
-l LOG_FILE, --log-file LOG_FILE
Location of the log file
# after
optional arguments:
-h, --help show this help message and exit
--pid [PID] PID file location
--stderr STDERR Redirect stderr to this file
--stdout STDOUT Redirect stdout to this file
-D, --daemon Daemonize instead of running in the foreground
-d DAG_ID, --dag_id DAG_ID
The id of the dag to run
-l LOG_FILE, --log-file LOG_FILE
Location of the log file
-n NUM_RUNS, --num_runs NUM_RUNS
Set the number of runs to execute before exiting
-p, --do_pickle Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
# airflow dags trigger -h
# before
usage: airflow dags trigger [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF]
[-e EXEC_DATE]
dag_id
positional arguments:
dag_id The id of the dag
optional arguments:
-h, --help show this help message and exit
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
-r RUN_ID, --run_id RUN_ID
Helps to identify this run
-c CONF, --conf CONF JSON string that gets pickled into the DagRun's conf attribute
-e EXEC_DATE, --exec_date EXEC_DATE
The execution date of the DAG
# after
usage: airflow dags trigger [-h] [-c CONF] [-e EXEC_DATE] [-r RUN_ID]
[-sd SUBDIR]
dag_id
positional arguments:
dag_id The id of the dag
optional arguments:
-h, --help show this help message and exit
-c CONF, --conf CONF JSON string that gets pickled into the DagRun's conf attribute
-e EXEC_DATE, --exec_date EXEC_DATE
The execution date of the DAG
-r RUN_ID, --run_id RUN_ID
Helps to identify this run
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
The current order of the elements is random; no one has ever dealt with their order before. I don't think we have to worry about preserving the current order. I prefer alphabetical order.
@@ -1022,10 +1022,10 @@ def _add_subcommand(cls, subparsers, sub):
        if subcommands:
            sub_subparsers = sub_proc.add_subparsers(dest='subcommand')
            sub_subparsers.required = True
-           for command in subcommands:
+           for command in sorted(subcommands, key=lambda x: x['name']):
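For illustration, a tiny stand-alone demo of that key, using plain dicts as stand-ins for Airflow's command definitions:

subcommands = [
    {'name': 'list'}, {'name': 'get'}, {'name': 'set'},
    {'name': 'delete'}, {'name': 'import'}, {'name': 'export'},
]

# Sort the command dicts alphabetically by their 'name' key.
print([c['name'] for c in sorted(subcommands, key=lambda x: x['name'])])
# ['delete', 'export', 'get', 'import', 'list', 'set']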
I think this change is acceptable: it keeps the subcommands in alphabetical order. It changes the CLI output like below:
# airflow pools -h
# before
usage: airflow pools [-h] {list,get,set,delete,import,export} ...
positional arguments:
{list,get,set,delete,import,export}
list List pools
get Get pool size
set Configure pool
delete Delete pool
import Import pools
export Export all pools
optional arguments:
-h, --help show this help message and exit
# after
usage: airflow pools [-h] {delete,export,get,import,list,set} ...
positional arguments:
{delete,export,get,import,list,set}
delete Delete pool
export Export all pools
get Get pool size
import Import pools
list List pools
set Configure pool
optional arguments:
-h, --help show this help message and exit
All Travis-CI builds failed due to a timeout; force-pushing to restart CI.
Force-pushed e884156 to 9b63932
Force-pushed b38649f to 9b63932
I mistakenly included a commit from another patch I submitted; force-pushing to restart Travis-CI.
Force-pushed 6400aa8 to c39aed3
Force-pushed c39aed3 to ea62ebc
@mik-laj Thanks for approving, but Travis is still sad.
Force-pushed ea62ebc to 06629ac
Is there anything I can do to help with this?
I ran it. At line 321 in 3d82e8c, when this line runs, the airflow.dag_run table gets a new dag run record, but the state stays on running, so after 8 minutes it causes a timeout error. The point is I could not find out why the dag run's state doesn't change.
Force-pushed 06629ac to 7a54a9a
Finally I found out what the CI failure was due to. The odd thing is, when I only change the order of the args, like this:

# before
{
'func': lazy_load_command('airflow.cli.commands.task_command.task_run'),
'name': 'run',
'help': "Run a single task instance",
'args': (
'dag_id', 'task_id', 'execution_date', 'subdir', # <---change this line
'mark_success', 'force', 'pool', 'cfg_path',
'local', 'raw', 'ignore_all_dependencies', 'ignore_dependencies',
'ignore_depends_on_past', 'ship_dag', 'pickle', 'job_id', 'interactive',),
},
# after
{
'func': lazy_load_command('airflow.cli.commands.task_command.task_run'),
'name': 'run',
'help': "Run a single task instance",
'args': (
'execution_date', 'dag_id', 'task_id', 'subdir',
'mark_success', 'force', 'pool', 'cfg_path',
'local', 'raw', 'ignore_all_dependencies', 'ignore_dependencies',
'ignore_depends_on_past', 'ship_dag', 'pickle', 'job_id', 'interactive',),
},

the run fails with an error message.
Oh. We should not sort positional arguments. How can we fix it?
I think we can sort using the following function (tee and filterfalse come from itertools):

from itertools import filterfalse, tee

@classmethod
def sort_args(cls, args):
    def partition(pred, iterable):
        """
        Use a predicate to partition entries into false entries and true entries
        """
        t1, t2 = tee(iterable)
        return filterfalse(pred, t1), filter(pred, t2)

    # Positional args (no leading dash) keep their order; optional args are sorted by flag.
    positional, optional = partition(lambda x: cls.args[x].flags[0].startswith("-"), args)
    yield from positional
    yield from sorted(optional, key=lambda x: cls.args[x].flags[0])
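For a self-contained, runnable version of the same idea (the ARGS registry below is a hypothetical stand-in for cls.args):

from collections import namedtuple
from itertools import filterfalse, tee

Arg = namedtuple('Arg', ['flags'])

ARGS = {
    'dag_id': Arg(('dag_id',)),           # positional: no leading dash
    'subdir': Arg(('-sd', '--subdir')),
    'force': Arg(('-f', '--force')),
    'pool': Arg(('--pool',)),
}

def sort_args(args):
    def partition(pred, iterable):
        """Split an iterable into (false entries, true entries)."""
        t1, t2 = tee(iterable)
        return filterfalse(pred, t1), filter(pred, t2)

    positional, optional = partition(lambda x: ARGS[x].flags[0].startswith('-'), args)
    yield from positional                                   # keep positional order
    yield from sorted(optional, key=lambda x: ARGS[x].flags[0])

print(list(sort_args(['subdir', 'dag_id', 'pool', 'force'])))
# ['dag_id', 'pool', 'force', 'subdir']  ('--pool' sorts before '-f' byte-wise)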
What do you think about ignoring case (key=lambda d: d.lower())? So we will have the following method:

from itertools import filterfalse, tee

@classmethod
def sort_args(cls, args):
    def partition(pred, iterable):
        """
        Use a predicate to partition entries into false entries and true entries
        """
        t1, t2 = tee(iterable)
        return filterfalse(pred, t1), filter(pred, t2)

    positional, optional = partition(lambda x: cls.args[x].flags[0].startswith("-"), args)
    yield from positional
    # Strip the dashes and lowercase so that, e.g., -A and -a sort together.
    yield from sorted(optional, key=lambda x: cls.args[x].flags[0].lower().lstrip("-"))

Example output:
I would like to draw your attention to the
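For reference, a quick demo of why the .lower() in the key matters: plain byte-wise sorting groups all uppercase flags ahead of lowercase ones.

flags = ['-sd', '-A', '-f', '-I', '-i', '-int', '-m']

print(sorted(flags))
# ['-A', '-I', '-f', '-i', '-int', '-m', '-sd']   (uppercase first, ASCII order)

print(sorted(flags, key=lambda f: f.lower().lstrip('-')))
# ['-A', '-f', '-I', '-i', '-int', '-m', '-sd']   (case-insensitive)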
I think maybe we should keep the dash. A single dash is usually just a flag that turns something on or off, while a double dash means the option takes a parameter from the command line. IMO they are different, and I think we should sort them separately:

usage: airflow tasks run [-h] [--cfg_path CFG_PATH] [--pool POOL] [--ship_dag]
[-A] [-f] [-i] [-I] [-int] [-l] [-m] [-p PICKLE]
[-sd SUBDIR]
dag_id task_id execution_date
positional arguments:
dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG
optional arguments:
-h, --help show this help message and exit
--cfg_path CFG_PATH Path to config file to use instead of airflow.cfg
--pool POOL Resource pool to use
--ship_dag Pickles (serializes) the DAG and ships it to the worker
-A, --ignore_all_dependencies
Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps
-f, --force Ignore previous task instance state, rerun regardless if task already succeeded/failed
-i, --ignore_dependencies
Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies
-I, --ignore_depends_on_past
Ignore depends_on_past dependencies (but respect upstream dependencies)
-int, --interactive Do not capture standard output and error streams (useful for interactive debugging)
-l, --local Run the task using the LocalExecutor
-m, --mark_success Mark jobs as succeeded without running them
-p PICKLE, --pickle PICKLE
Serialized pickle object of the entire dag (used internally)
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'

I think the above is better than the below:

usage: airflow tasks run [-h] [-A] [--cfg_path CFG_PATH] [-f] [-i] [-I] [-int]
[-l] [-m] [-p PICKLE] [--pool POOL] [-sd SUBDIR]
[--ship_dag]
dag_id task_id execution_date
positional arguments:
dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG
optional arguments:
-h, --help show this help message and exit
-A, --ignore_all_dependencies
Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps
--cfg_path CFG_PATH Path to config file to use instead of airflow.cfg
-f, --force Ignore previous task instance state, rerun regardless if task already succeeded/failed
-i, --ignore_dependencies
Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies
-I, --ignore_depends_on_past
Ignore depends_on_past dependencies (but respect upstream dependencies)
-int, --interactive Do not capture standard output and error streams (useful for interactive debugging)
-l, --local Run the task using the LocalExecutor
-m, --mark_success Mark jobs as succeeded without running them
-p PICKLE, --pickle PICKLE
Serialized pickle object of the entire dag (used internally)
--pool POOL Resource pool to use
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
--ship_dag Pickles (serializes) the DAG and ships it to the worker

What do you think? @mik-laj
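A hypothetical key for that grouping (a sketch of the proposal, not the code that was eventually merged): sort double-dash options first, then single-dash ones, alphabetically within each group.

def dash_aware_key(flag):
    # (group, name): double-dash options sort into group 0, single-dash into group 1.
    return (0 if flag.startswith('--') else 1, flag.lstrip('-').lower())

flags = ['-sd', '--pool', '-A', '--cfg_path', '-f', '--ship_dag', '-int']
print(sorted(flags, key=dash_aware_key))
# ['--cfg_path', '--pool', '--ship_dag', '-A', '-f', '-int', '-sd']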
I agree with you on this point, but it's different from the others; the reason is in the previous comment.
Force-pushed 440e1c4 to f553f77
One other point, since we are discussing CLI changes and cleaning up: I just realised that we have some "short" options that are really "long" ones, namely -int and -sd in the run task. This is against the idea of short and long options in Unix (and argparse follows that). The main reason to have short options is that you can combine them (see the sketch below). When you have "short" options of more than one letter, this can quickly become ambiguous. We do not have -s yet, but if we add a few more options this might become a problem. Also, the argparse documentation https://docs.python.org/2/library/argparse.html mentions that short options should be a single character only. Does it bother others, or is it just me? Should we maybe fix that?
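To illustrate the combining behavior, a minimal argparse sketch (not Airflow code):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-x', action='store_true')
parser.add_argument('-v', action='store_true')

# Single-letter flags can be combined, Unix-style.
print(parser.parse_args(['-xv']))
# Namespace(v=True, x=True)

# A multi-letter "short" option like -int cannot participate in such
# combinations, and it risks ambiguity with prefix matching as options are added.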
Yeah, I think it's a problem too. A short option should represent turning something on or off; if the option takes a value, it should use the long form. IMO, we should fix that.
Force-pushed f553f77 to ee8b766
It's so odd in https://travis-ci.org/apache/airflow/jobs/632880273?utm_medium=notification&utm_source=github_status. Did I miss some step for the full test?
Codecov Report
@@ Coverage Diff @@
## master #6881 +/- ##
==========================================
- Coverage 85.06% 84.78% -0.28%
==========================================
Files 680 680
Lines 38753 38764 +11
==========================================
- Hits 32965 32866 -99
- Misses 5788 5898 +110
Continue to review full report at Codecov.
Seems that CI just failed randomly.
Don't merge this PR yet, as we still have a discussion going in #6881 (comment).
We have some weird, really weird behaviour - from time to time the py37/mysql impersonation_tests are failing - and then this failure propagates to other tests and eventually hangs the whole build. I just added pytest-instafail (#7064) to show failures immediately when they happen, so that next time it happens we will be able to hunt it down. Of course the builds started to work 🥇 when I added it, so we do not yet know the reason :(
BTW. Created https://issues.apache.org/jira/browse/AIRFLOW-6472 "Some of the options we have are long options with single -"
We still have a discussion going in #6881 (comment). It would be great if @potiuk and @mik-laj could share your ideas.
@zhongjiajie: I think it will be best if we sort according to the long option only (especially if we fix the short options to be one letter). The general idea behind long options is that people, when learning a CLI, tend to use the longer options, and those are preferred in automation/scripts: they are easier to memorize and they carry the meaning. Also, when using autocomplete, it is much easier to learn, as you start typing the longer option's prefix and it gets auto-completed. The short (one-letter) options are for "expert mode" only; they are pretty much just a shorthand for when you are familiar with the CLI and want to use it quickly, or have no autocomplete but know the tool very well. That's why I prefer to sort only on long options.
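A minimal sketch of sorting by the long form, assuming each arg's last flag is its long option (stand-in Arg objects, not Airflow's real definitions):

from collections import namedtuple

Arg = namedtuple('Arg', ['flags'])

ARGS = {
    'force': Arg(('-f', '--force')),
    'ignore_all_dependencies': Arg(('-A', '--ignore_all_dependencies')),
    'cfg_path': Arg(('--cfg_path',)),
    'pool': Arg(('--pool',)),
}

def long_option_key(name):
    # Take the last flag (conventionally the long form), drop the dashes,
    # and compare case-insensitively.
    return ARGS[name].flags[-1].lstrip('-').lower()

print(sorted(ARGS, key=long_option_key))
# ['cfg_path', 'force', 'ignore_all_dependencies', 'pool']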
BTW, I added WIP to the title to indicate this is still in progress :). We have a nice bot now that will keep it from being accidentally merged :)
Force-pushed ee8b766 to e249275
The code now sorts by the long option and CI is green; the behavior is like below:

usage: airflow tasks run [-h] [--cfg_path CFG_PATH] [-f] [-A] [-i] [-I] [-int]
[-l] [-m] [-p PICKLE] [--pool POOL] [--ship_dag]
[-sd SUBDIR]
dag_id task_id execution_date
positional arguments:
dag_id The id of the dag
task_id The id of the task
execution_date The execution date of the DAG
optional arguments:
-h, --help show this help message and exit
--cfg_path CFG_PATH Path to config file to use instead of airflow.cfg
-f, --force Ignore previous task instance state, rerun regardless if task already succeeded/failed
-A, --ignore_all_dependencies
Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps
-i, --ignore_dependencies
Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies
-I, --ignore_depends_on_past
Ignore depends_on_past dependencies (but respect upstream dependencies)
-int, --interactive Do not capture standard output and error streams (useful for interactive debugging)
-l, --local Run the task using the LocalExecutor
-m, --mark_success Mark jobs as succeeded without running them
-p PICKLE, --pickle PICKLE
Serialized pickle object of the entire dag (used internally)
--pool POOL Resource pool to use
--ship_dag Pickles (serializes) the DAG and ships it to the worker
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'

usage: airflow dags trigger [-h] [-c CONF] [-e EXEC_DATE] [-r RUN_ID]
[-sd SUBDIR]
dag_id
positional arguments:
dag_id The id of the dag
optional arguments:
-h, --help show this help message and exit
-c CONF, --conf CONF JSON string that gets pickled into the DagRun's conf attribute
-e EXEC_DATE, --exec_date EXEC_DATE
The execution date of the DAG
-r RUN_ID, --run_id RUN_ID
Helps to identify this run
-sd SUBDIR, --subdir SUBDIR
File location or directory from which to look for the dag. Defaults to '[AIRFLOW_HOME]/dags' where [AIRFLOW_HOME] is the value you set for 'AIRFLOW_HOME' config you set in 'airflow.cfg'
Perfect. We sort just like curl does:
Usage: curl [options...] <url>
--abstract-unix-socket <path> Connect via abstract Unix domain socket
--alt-svc <file name> Enable alt-svc with this cache file
--anyauth Pick any authentication method
-a, --append Append to target file when uploading
--basic Use HTTP Basic Authentication
--cacert <file> CA certificate to verify peer against
--capath <dir> CA directory to verify peer against
-E, --cert <certificate[:password]> Client certificate file and password
--cert-status Verify the status of the server certificate
--cert-type <type> Certificate file type (DER/PEM/ENG)
--ciphers <list of ciphers> SSL ciphers to use
--compressed Request compressed response
--compressed-ssh Enable SSH compression
-K, --config <file> Read config from a file
--connect-timeout <seconds> Maximum time allowed for connection
--connect-to <HOST1:PORT1:HOST2:PORT2> Connect to host
-C, --continue-at <offset> Resumed transfer offset
-b, --cookie <data|filename> Send cookies from string/file
-c, --cookie-jar <filename> Write cookies to <filename> after operation
Thanks @zhongjiajie!
Thanks to Jarek and Kamil for their help on this patch. ❤️
Make sure you have checked all steps below.
Jira
Description
Sort CLI commands and args
Tests
No tests needed
Commits