fix concurrency by lucafuji · Pull Request #1120 · apache/airflow

lucafuji · 2016-03-04T23:26:19Z

Hi, I changed the concurrency check to use the base executor instead of the database, so that each task_instance gets a consistent in-memory view of the running instances. The coding style may not be very good, but I would appreciate it if you could review the logic. Thanks a lot!

pass checks for python3

mistercrunch · 2016-03-10T20:07:22Z

Sorry, this doesn't work for us: the DB is the source of truth. The executor's state can get out of sync, and ultimately we trust what is in the DB...

lucafuji · 2016-03-11T00:45:00Z

@mistercrunch
Trusting the DB as the source of truth is actually the root cause of this issue.
As I mentioned in #1085, if concurrency is enforced by counting the running task instances in the database, all tasks can start running at the same time, because when they query the database they all get a count of 0.

And I'm wondering how the executor's state can get out of sync. With multiple schedulers?

Maybe a better solution is to use a Curator/ZooKeeper semaphore to solve this problem (each task acquires a resource from the semaphore).

In summary, simply counting the number of running tasks can lead to concurrency issues when multiple task instances check the database at the same time. We need to solve it with some other mechanism.
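To make the race concrete, here is a minimal sketch, not the Airflow code. `FakeTaskTable`, `run_racy`, and `run_with_semaphore` are hypothetical names, the table is an in-memory stand-in for the real database, and an in-process `threading.BoundedSemaphore` only approximates what a distributed Curator/ZooKeeper semaphore would do. A barrier forces the worst-case interleaving, in which every task reads the count before any task records itself as running.

```python
import threading

CONCURRENCY_LIMIT = 1
NUM_TASKS = 4


class FakeTaskTable:
    """Hypothetical in-memory stand-in for the task_instance table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def count_running(self):
        with self._lock:
            return len(self._running)

    def mark_running(self, task_id):
        with self._lock:
            self._running.add(task_id)


def _run_all(target, num_tasks):
    threads = [threading.Thread(target=target, args=(i,)) for i in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_racy(num_tasks=NUM_TASKS, limit=CONCURRENCY_LIMIT):
    """Check-then-act: count the running tasks, then decide to start."""
    table = FakeTaskTable()
    all_have_counted = threading.Barrier(num_tasks)

    def try_start(task_id):
        observed = table.count_running()  # check: every task sees 0
        all_have_counted.wait()           # force the worst-case interleaving
        if observed < limit:
            table.mark_running(task_id)   # act: based on a stale count

    _run_all(try_start, num_tasks)
    return table.count_running()


def run_with_semaphore(num_tasks=NUM_TASKS, limit=CONCURRENCY_LIMIT):
    """Each task must atomically acquire a slot before it may start."""
    table = FakeTaskTable()
    slots = threading.BoundedSemaphore(limit)
    all_ready = threading.Barrier(num_tasks)

    def try_start(task_id):
        all_ready.wait()                   # same simultaneous start as above
        if slots.acquire(blocking=False):  # check and act in one atomic step
            table.mark_running(task_id)    # slot is held while the task runs

    _run_all(try_start, num_tasks)
    return table.count_running()


if __name__ == "__main__":
    print("count-based check:", run_racy())
    print("semaphore:", run_with_semaphore())
```

With a limit of 1, the count-based variant lets all four tasks through, while the semaphore variant admits exactly one, because acquiring a slot combines the check and the reservation into a single atomic operation.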

mistercrunch · 2016-03-13T23:34:02Z

Right, it's a tricky situation, as we want to support multiple schedulers in the future and we certainly don't want to make ZooKeeper a hard requirement. Since we support multiple executors (with more to come), that leaves only the DB as the source of truth, which means we'd almost have to use it as a message queue, and that isn't great.
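For illustration only, and not a description of any Airflow design: one way a database alone can arbitrate slots is to fold the count check into the same statement that changes the task's state. The sketch below uses SQLite from the Python standard library; the two-column `task_instance` schema, the state names, and the `make_db` and `try_claim` helpers are all hypothetical.

```python
import sqlite3


def make_db():
    """Create a hypothetical, heavily simplified task_instance table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, 'queued')",
        [("t1",), ("t2",), ("t3",)],
    )
    conn.commit()
    return conn


def try_claim(conn, task_id, limit):
    """Move a queued task to 'running' only if fewer than `limit` are running.

    The count check and the state change are a single UPDATE statement,
    so no other writer can slip in between the check and the act.
    """
    cur = conn.execute(
        """
        UPDATE task_instance
           SET state = 'running'
         WHERE task_id = ?
           AND state = 'queued'
           AND (SELECT COUNT(*) FROM task_instance WHERE state = 'running') < ?
        """,
        (task_id, limit),
    )
    conn.commit()
    return cur.rowcount == 1  # one row changed means the slot was claimed


if __name__ == "__main__":
    conn = make_db()
    print(try_claim(conn, "t1", limit=1))
    print(try_claim(conn, "t2", limit=1))
```

SQLite serializes writers, so the conditional update behaves atomically here. On other databases the same pattern may need explicit row locking or a stricter isolation level to hold under concurrent writers, which is part of why using the DB as a coordination mechanism is awkward.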

lucafuji · 2016-03-15T17:28:29Z

@mistercrunch Just curious: how are you going to solve this issue using the DB, though? Thanks.

mistercrunch · 2016-03-15T21:06:47Z

Yeah, I think that's the idea. It may take a little while, though. I'm going to write a design proposal and submit it for review soon.

landscape-bot · 2016-05-02T22:30:06Z

Repository health decreased by 0.05% when pulling cca807f on lucafuji:concurrency_fix into 0fd94d9 on airbnb:master.

No new problems were introduced.
2 problems were fixed (including 0 errors and 1 code smell).

artwr · 2016-10-16T22:42:09Z

@lucafuji I am not sure we have a clear design for this on our end, but if you still want to be involved, feel free to reply. If we do not hear from you, we might close this PR for now as part of a cleanup effort for long-standing PRs.

lucafuji · 2016-10-17T18:17:32Z

@artwr Feel free to abandon this one; this PR also has a race condition issue anyway.
Looking forward to your new design.

lucafuji mentioned this pull request Mar 4, 2016

Concurrency=1 not honored when depends_on_past=False #1085

Closed

fix concurrency

cca807f

pass checks for python3

lucafuji force-pushed the concurrency_fix branch from 182f0d7 to cca807f Compare March 7, 2016 19:59

jlowin added the Missing JIRA Issue label May 2, 2016

lucafuji closed this Oct 17, 2016

lucafuji wants to merge 1 commit into apache:master from lucafuji:concurrency_fix

5 participants
