
Dharam/lib configmap #4803

Closed

dharamsk wants to merge 502 commits into apache:master from postmates:dharam/lib-configmap

Conversation

@dharamsk

No description provided.

sergiohgz and others added 30 commits January 10, 2019 13:23
* Better instructions for airflow flower

It is not clear in the documentation that you need to have flower installed to successfully run airflow flower. If you don't have flower installed, running airflow flower shows the following error, which is not of much help:

airflow flower
[2018-11-20 17:01:14,836] {__init__.py:51} INFO - Using executor SequentialExecutor
Traceback (most recent call last):
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/bin/airflow", line 32, in <module>
    args.func(args)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 1221, in flower
    broka, address, port, api, flower_conf, url_prefix])
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 559, in execvp
    _execvpe(file, args)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 604, in _execvpe
    raise last_exc.with_traceback(tb)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 594, in _execvpe
    exec_func(fullname, *argrest)
FileNotFoundError: [Errno 2] No such file or directory
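The fix on the user side is simply to install flower first, e.g. `pip install flower` (or the celery extra, which at the time pulled in flower); the documentation change above makes that requirement explicit.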

* Update use-celery.rst
Related Commits:
1. [AIRFLOW-2657](PR apache#3531)
2. [AIRFLOW-3233](PR apache#4069)

Added for both www/ and www_rbac
apache#3765)

It allows remote_host to be passed to the operator via XCom.
…pache#3793)

There may be different combinations of arguments, and
some processing is done 'silently', so users
may not be fully aware of it.

For example
- The user only needs to provide either `ssh_hook`
  or `ssh_conn_id`, while this is not clear in the docs.
- If both are provided, `ssh_conn_id` will be ignored.
- If `remote_host` is provided, it will replace
  the `remote_host` defined in `ssh_hook`
  or predefined in the connection of `ssh_conn_id`.

These should be documented clearly to ensure the
behaviour is transparent to users (see the sketch
below). log.info() should also be used to remind
users and provide clear logs.

In addition, an instance check is added for ssh_hook to
ensure it is of the correct type (SSHHook).

Tests are updated for this PR.
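A minimal sketch of the precedence described above, assuming 1.10-era contrib import paths; the `ssh_default` connection id and the host are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.hooks.ssh_hook import SSHHook
from airflow.contrib.operators.ssh_operator import SSHOperator

dag = DAG("ssh_example", start_date=datetime(2019, 1, 1), schedule_interval=None)

# Option 1: only ssh_conn_id; the remote host comes from the connection.
via_conn_id = SSHOperator(
    task_id="via_conn_id",
    ssh_conn_id="ssh_default",
    command="uptime",
    dag=dag,
)

# Option 2: an explicit ssh_hook; if ssh_conn_id were also given, it is ignored.
# An explicit remote_host overrides whatever the hook/connection defines.
via_hook = SSHOperator(
    task_id="via_hook",
    ssh_hook=SSHHook(ssh_conn_id="ssh_default"),
    remote_host="10.0.0.5",
    command="uptime",
    dag=dag,
)
```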
Add an operator and hook to manipulate and use Azure
CosmosDB documents, including creating, deleting, and
updating documents and collections.

Includes a sensor to detect documents being added to a
collection.
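A hedged sketch of how the new integration might be wired up; the import paths, class names, and parameter names below follow contrib conventions of the time and are assumptions, not verbatim from this change:

```python
from airflow.contrib.operators.azure_cosmos_operator import (
    AzureCosmosInsertDocumentOperator,  # assumed class name
)
from airflow.contrib.sensors.azure_cosmos_sensor import (
    AzureCosmosDocumentSensor,  # assumed class name
)

# Insert a document into a (hypothetical) database/collection.
insert_doc = AzureCosmosInsertDocumentOperator(
    task_id="insert_doc",
    database_name="mydb",
    collection_name="orders",
    document={"id": "order-1", "status": "new"},
    azure_cosmos_conn_id="azure_cosmos_default",
)

# Wait for a document with a given id to appear in the collection.
wait_for_doc = AzureCosmosDocumentSensor(
    task_id="wait_for_doc",
    database_name="mydb",
    collection_name="orders",
    document_id="order-1",
    azure_cosmos_conn_id="azure_cosmos_default",
)
```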
…#4287)

Users will use either the API or the web UI to delete a DAG (after the DAG file is
removed):

- Using the API: provide one boolean parameter to let users
                 decide if they want to keep records in the Log table
                 when they delete a DAG.
                 The default value is True (keep records in the Log table).
- From the UI: records in the Log table are kept when deleting records for a
               specific DAG ID (the pop-up message is updated accordingly).

See the sketch below for the API side.
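A minimal sketch, assuming the experimental `delete_dag` helper exposes the flag described above (the exact flag name `keep_records_in_log` is an assumption):

```python
from airflow.api.common.experimental.delete_dag import delete_dag

# Assumed flag: True (the default) keeps the records in the Log table,
# False purges them together with the DAG's other metadata.
delete_dag("my_removed_dag", keep_records_in_log=True)
```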
There are two log lines in the k8s executor that can cause schedulers to crash
due to excessive log volume.
Closes apache#3628 from andscoop/Add-connection-close-
to-mongo-hook
…e#4322)

For larger DAGs, topological_sort was found to be very inefficient. This makes
some small changes to improve the data structures used in the
method.
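An illustrative sketch of the kind of data-structure improvement described (Kahn's algorithm over dict/set lookups instead of repeated list scans); this is not the exact Airflow code:

```python
from collections import deque

def topological_sort(tasks, upstream):
    """tasks: iterable of task ids; upstream: task id -> set of upstream ids."""
    indegree = {t: len(upstream.get(t, ())) for t in tasks}
    downstream = {t: set() for t in tasks}
    for task, ups in upstream.items():
        for up in ups:
            downstream[up].add(task)
    # O(1) membership and neighbor lookups keep the whole sort near O(V + E).
    ready = deque(t for t, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for down in downstream[task]:
            indegree[down] -= 1
            if indegree[down] == 0:
                ready.append(down)
    if len(order) != len(indegree):
        raise ValueError("cycle detected")
    return order
```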
The select queries on the sla_miss table produce a large share of DB traffic and
thus make DB CPU usage unnecessarily high. Adding an index is low-hanging
fruit to reduce the load.
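A hedged Alembic sketch of the change; the index name and column choice below are assumptions, and the real migration may differ:

```python
from alembic import op

def upgrade():
    # Assumed: the scheduler's hot queries filter sla_miss by dag_id.
    op.create_index("sm_dag", "sla_miss", ["dag_id"], unique=False)

def downgrade():
    op.drop_index("sm_dag", table_name="sla_miss")
```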
* [AIRFLOW-2747] Explicit re-schedule of sensors

Add `mode` property to sensors. If set to `reschedule`, an
AirflowRescheduleException is raised instead of sleeping, which sets
the task back to state `NONE`. Reschedules are recorded in the new
`task_reschedule` table and visualized in the Gantt view. A new TI
dependency checks whether a sensor task is ready to be re-scheduled.

* Reformat sqlalchemy imports

* Make `_handle_reschedule` private

* Remove print

* Add comment

* Add comment

* Don't record reschedule request in test mode
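A minimal sketch of the new mode, assuming 1.10-era import paths; `FileLandedSensor` and the path it pokes are hypothetical:

```python
import os
from datetime import datetime

from airflow import DAG
from airflow.sensors.base_sensor_operator import BaseSensorOperator

class FileLandedSensor(BaseSensorOperator):  # hypothetical sensor
    def poke(self, context):
        return os.path.exists("/tmp/landed")

dag = DAG("reschedule_example", start_date=datetime(2019, 1, 1),
          schedule_interval=None)

wait = FileLandedSensor(
    task_id="wait_for_file",
    mode="reschedule",   # release the worker slot between pokes
    poke_interval=300,   # seconds between reschedules
    dag=dag,
)
```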
…apache#4276)

Local users were always superusers; this adds a column to the DB (defaulting to false,
which is going to cause a bit of upgrade pain for people, but defaulting to not being an
admin is the only secure default).
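A sketch of granting admin rights under the non-RBAC password auth backend after this change, based on the documented PasswordUser pattern (the credentials are placeholders):

```python
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser

user = PasswordUser(models.User())
user.username = "admin"
user.email = "admin@example.com"
user.password = "set-a-real-password"
user.superuser = True  # new column; defaults to False after this change

session = settings.Session()
session.add(user)
session.commit()
session.close()
```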
Updated documentation to elaborate on the (yesterday|tomorrow)_.*
variables' relationship to the execution date.
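A small sketch of that relationship: these macros derive from the execution_date of the run, not from the wall-clock date the task happens to execute on:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG("macro_example", start_date=datetime(2019, 1, 1))

# For the run with execution_date 2019-01-05, this prints
# run=2019-01-05 yesterday=2019-01-04 tomorrow=2019-01-06,
# regardless of when the task actually runs.
show_dates = BashOperator(
    task_id="show_dates",
    bash_command="echo run={{ ds }} yesterday={{ yesterday_ds }} "
                 "tomorrow={{ tomorrow_ds }}",
    dag=dag,
)
```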
victornoel and others added 26 commits January 19, 2019 15:29
…re (apache#4218)

Signed-off-by: Victor Noel <victor.noel@brennus-analytics.com>
Add a new TriggerRule that triggers only if no upstream task has failed (succeeded or skipped tasks are allowed).
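A short sketch, assuming the rule is exposed as TriggerRule.NONE_FAILED: the join task runs as long as no upstream failed, even if one branch was skipped:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.trigger_rule import TriggerRule

dag = DAG("none_failed_example", start_date=datetime(2019, 1, 1))

branch_a = DummyOperator(task_id="branch_a", dag=dag)  # may be skipped
branch_b = DummyOperator(task_id="branch_b", dag=dag)

join = DummyOperator(
    task_id="join",
    trigger_rule=TriggerRule.NONE_FAILED,  # succeeded or skipped upstreams both pass
    dag=dag,
)

branch_a >> join
branch_b >> join
```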
In emr_create_job_flow_operator.py the :type clearly mismatches
the :param name, suggesting a copy-and-paste mistake.
…4086)

* [AIRFLOW-3245] fix list processing in resolve_template_files

* [AIRFLOW-3245] add tests

* [AIRFLOW-3245] modify tests
…at it does not crash (apache#3650)

If there is any issue with the DB connection, the rest of the functions take care of those exceptions, but there is no such handling in the scheduler heartbeat.

The Airflow scheduler should not crash if a "transient" DB exception occurs in its heartbeat.
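An illustrative sketch of the defensive pattern described, not the actual scheduler code: a transient DB error during the heartbeat is logged and retried on the next loop instead of killing the process:

```python
import logging

from sqlalchemy.exc import OperationalError

log = logging.getLogger(__name__)

def safe_heartbeat(job):
    """Hypothetical wrapper; `job` is anything with a heartbeat() method."""
    try:
        job.heartbeat()
    except OperationalError:
        # Swallow the transient error so the scheduler loop survives;
        # the next iteration will heartbeat again.
        log.exception("Transient DB error during scheduler heartbeat; "
                      "will retry on the next loop")
```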
Fix the operator name from DataFlowOperation to DataFlowJavaOperator in the documentation.
Some function parameters were undocumented. Additional docstrings
were added for clarity.
Monitor Task Instance creation rates by Operator type.
These stats can provide some visibility into how much workload Airflow is
getting. They can be used for resource allocation in the long run (i.e.
to determine when we should scale up workers) and for debugging scenarios
such as the creation rate of a certain type of Task Instance spiking.
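A hedged sketch of per-operator counters via Airflow's statsd client; the metric name is illustrative, not necessarily the one this change emits:

```python
from airflow.settings import Stats

def record_ti_created(task):
    # Emits e.g. "task_instance_created-BashOperator" (illustrative name),
    # so dashboards can break creation rates down by operator type.
    Stats.incr("task_instance_created-{}".format(task.__class__.__name__))
```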
* [AIRFLOW-1298] Fix 'clear only_failed'

* [AIRFLOW-1298] Fix 'clear only_failed'
One violation slipped in via a PR that didn't rebase onto
latest master.
@dharamsk dharamsk closed this Feb 28, 2019
@dharamsk
Author

disregard. this was created erroneously.

@dharamsk dharamsk deleted the dharam/lib-configmap branch February 28, 2019 23:34
@dharamsk dharamsk restored the dharam/lib-configmap branch February 28, 2019 23:35
