[AIRFLOW-2615] Makes sure DagBag doesn't parse DAGs twice during startup #3614

Closed
verdan wants to merge 1 commit into master from verdan:double-dag-parsing

5 participants

Contributor

verdan commented Jul 19, 2018

Make sure you have checked all steps below.

JIRA

  • My PR addresses the following Airflow JIRA issues and references them in the PR title.

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    The webserver parses the DagBag twice during startup, which makes startup slow when there is a large number of DAG files. This change cuts webserver startup time almost in half by making sure the DagBag model doesn't parse DAGs twice during the startup process. More details and discussion here: #3506
    FYI: @yrqls21
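The approach described above can be sketched as an environment-variable guard. This is a minimal illustration, not the PR's actual diff: only the `SKIP_DAGS_PARSING` name and the `os.environ.pop` call appear in the review excerpt on this page. `FakeDagBag`, `build_app`, `start_webserver`, and the points where the flag is set and checked are assumptions made for the example.

```python
import os

SKIP_FLAG = 'SKIP_DAGS_PARSING'  # name taken from the diff excerpt in this PR


class FakeDagBag:
    """Hypothetical stand-in for an expensive DAG collection."""
    parse_count = 0

    def __init__(self):
        # Count every (simulated) full parse of the DAG folder.
        FakeDagBag.parse_count += 1


def build_app():
    """Hypothetical app factory: parses DAGs unless the flag is set."""
    if os.environ.get(SKIP_FLAG) == 'True':
        return None  # assumption: skip the expensive parse
    return FakeDagBag()


def start_webserver():
    """Build the app twice, as a startup path might, but parse only once."""
    os.environ[SKIP_FLAG] = 'True'       # assumption: set before the first build
    try:
        first = build_app()              # parse skipped while the flag is set
    finally:
        os.environ.pop(SKIP_FLAG, None)  # clear the flag again
    second = build_app()                 # the single real parse
    return first, second
```

Each call to `start_webserver()` increases `FakeDagBag.parse_count` by one rather than two, which is the effect the description is after.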

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"
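The mechanical rules in this list (1, 2, 3, and 5) can be checked automatically. The sketch below is a hypothetical helper, not part of Airflow's tooling; rules 4 and 6 need human judgment and are not covered. The sample message reuses this PR's commit subject, with a blank line placed before the body.

```python
def check_commit_message(message):
    """Return a list of rule violations found in a commit message."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ''

    if len(subject) > 50:
        problems.append('subject longer than 50 characters')
    if subject.endswith('.'):
        problems.append('subject ends with a period')
    # The second line, if present, must be blank to separate subject and body.
    if len(lines) > 1 and lines[1].strip():
        problems.append('subject not separated from body by a blank line')
    # Body lines start after the separator line.
    if any(len(line) > 72 for line in lines[2:]):
        problems.append('body line longer than 72 characters')
    return problems


sample = (
    '[AIRFLOW-2615] Limit DAGs parsing to once only\n'
    '\n'
    'Closes #3614 from verdan/double-dag-parsing'
)
print(check_commit_message(sample))  # prints []
```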

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes git diff upstream/master -u -- "*.py" | flake8 --diff

Contributor

KevinYang21 commented Jul 19, 2018

LGTM, tyvm for the change, going to be a huge improvement for us.

codecov-io commented Jul 19, 2018

Codecov Report

Merging #3614 into master will decrease coverage by <.01%.
The diff coverage is 80%.

@@            Coverage Diff             @@
##           master    #3614      +/-   ##
==========================================
- Coverage   77.05%   77.05%   -0.01%     
==========================================
  Files         204      204              
  Lines       15730    15734       +4     
==========================================
+ Hits        12121    12124       +3     
- Misses       3609     3610       +1

| Impacted Files | Coverage Δ |
|---|---|
| airflow/bin/cli.py | 64.35% <100%> (+0.09%) ⬆️ |
| airflow/www_rbac/views.py | 72.85% <66.66%> (-0.04%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e9a09c2...c1ee5c9.
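The percentages in the report can be reproduced from the hit and line counts in the coverage diff. A quick check using only the numbers shown in that diff (`coverage_pct` is a helper named here for illustration):

```python
def coverage_pct(hits, lines):
    """Line coverage as a percentage."""
    return 100.0 * hits / lines


# Counts copied from the coverage diff: (hits, misses, lines).
master = (12121, 3609, 15730)
pr = (12124, 3610, 15734)

before = coverage_pct(master[0], master[2])
after = coverage_pct(pr[0], pr[2])
delta = after - before  # in percentage points

print(f'master: {before:.4f}%')
print(f'#3614:  {after:.4f}%')
print(f'delta:  {delta:+.4f} points')
```

Both values fall between 77.05% and 77.06%, and the difference is a small negative fraction of a percentage point, which is consistent with the reported decrease of less than 0.01%.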

Contributor

feng-tao left a comment

lgtm, thanks.

app = cached_app_rbac(conf) if settings.RBAC else cached_app(conf)
pid, stdout, stderr, log_file = setup_locations(
    "webserver", args.pid, args.stdout, args.stderr, args.log_file)
os.environ.pop("SKIP_DAGS_PARSING")

feng-tao Jul 19, 2018

Contributor

nit: use single quotes instead of double quotes
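A side note on the `pop` call under review: `os.environ.pop(key)` raises `KeyError` when the variable is not set, while passing a default makes the cleanup unconditional. Whether the variable is always set at this point cannot be seen from the excerpt, so the sketch below is only an illustration; `clear_flag` is a hypothetical helper, not part of the PR.

```python
import os


def clear_flag(name):
    """Remove an environment variable if present; return its old value or None."""
    return os.environ.pop(name, None)


os.environ['SKIP_DAGS_PARSING'] = 'True'
print(clear_flag('SKIP_DAGS_PARSING'))  # prints True (the stored string)
print(clear_flag('SKIP_DAGS_PARSING'))  # prints None, no KeyError
```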

Contributor

bolkedebruin commented Jul 19, 2018

@verdan Please make sure your commit messages reference a JIRA issue in the title. See other merged PRs for examples. E.g., I would expect [AIRFLOW-2615] as part of the title, along with the rest of the guidelines. Your other PRs should be updated to do the same.

Great work btw!

verdan force-pushed the verdan:double-dag-parsing branch from d9141cc to c1ee5c9 on Jul 19, 2018

Contributor Author

verdan commented Jul 19, 2018

Oops!! sorry about that. Fixed them all.

asfgit closed this in 4be1ffe on Jul 24, 2018

lxneng added a commit to lxneng/incubator-airflow that referenced this pull request Aug 10, 2018

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing

dlebech added a commit to trustpilot/incubator-airflow that referenced this pull request Sep 11, 2018

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing

dalupus added a commit to modmed/incubator-airflow that referenced this pull request Sep 19, 2018

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing

aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing

kaxil added a commit that referenced this pull request Jan 7, 2019

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes #3614 from verdan/double-dag-parsing

kaxil added a commit that referenced this pull request Jan 9, 2019

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes #3614 from verdan/double-dag-parsing

ashb added a commit to ashb/airflow that referenced this pull request Jan 10, 2019

[AIRFLOW-2615] Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing

cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Jan 23, 2019

Limit DAGs parsing to once only
Closes apache#3614 from verdan/double-dag-parsing