Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-1746] Add a Nomad operator to trigger job from Airflow #2708

Closed
wants to merge 6 commits into from

Conversation

etrabelsi
Copy link
Contributor

@etrabelsi etrabelsi commented Oct 19, 2017

Add a new operator to run a job on nomad.

Description

  • Nomad allows declarative job file for scheduling virtualized, containerized, and standalone applications. For now it make sense to run only batch jobs because I wait for termination, and services doesn't terminate (if its seems essential its small code change).

Tests

  • [] Written tests for hooks

JIRA

My PR addresses the following Airflow JIRA1746 issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
https://issues.apache.org/jira/browse/AIRFLOW-1746

…omad and wait for success in semiliar way to boto-core waiter
@Acehaidrey
Copy link
Contributor

Acehaidrey commented Oct 20, 2017

You'll need to also provide tests. You will also want to create a JIRA and tie this to that. It's just the procedure to get your PR reviewed.
But this is interesting operator! Cool work.

@etrabelsi etrabelsi changed the title Nomad Operator AIRFLOW-1746 Add a Nomad operator to trigger job from Airflow Oct 21, 2017
@etrabelsi
Copy link
Contributor Author

@Acehaidrey

  • should i use connection_id instead of expecting to get ip and port ? or should it be implentation detail of the person who uses it?
  • where do i add dependency of the python-nomad == 0.5.0?
  • examples needed to be added?
  • documentation ?

@etrabelsi
Copy link
Contributor Author

@Acehaidrey

@Acehaidrey
Copy link
Contributor

Hi @etrabelsi I am so sorry for missing your comments in December, I never saw them. Let me respond in a bit

@christopherRiddersater
Copy link

@Acehaidrey
Have you had any time to look into this regarding the questions from @etrabelsi?
It would be great to have this feature.

@Fokko
Copy link
Contributor

Fokko commented Apr 9, 2018

Could you extend the tests a bit? The current test is really minimal. At least I would expect mocking some of the response objects.

@etrabelsi
Copy link
Contributor Author

@Fokko
I have no problem, let me know what you want me to test more and i will add it.
Regarding my questions:

  • should i use connection_id instead of expecting to get ip and port ? or should it be implentation detail of the person who uses it?
  • where do i add dependency of the python-nomad == 0.5.0?

@Fokko
Copy link
Contributor

Fokko commented Apr 9, 2018

Yes, good point. Your current code isn't fully in line with the design of Airflow. You should get the ip and port form the connection. Like:

conn = self.get_connection(self.redis_conn_id)
self.host = conn.host
self.port = int(conn.port)

https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/redis_hook.py

The dependency should be added to the setup.py in the root.

I would like to see mocking these calls: self.nomad_client.job.get_allocations(job_id). These calls should be mocked so you get an example response, and some edge cases. What will happen if the job doesn't exists, or if the connection to Nomad times out.

Hope this helps, if there are any further questions, please let me know.

Cheers, Fokko

@Acehaidrey
Copy link
Contributor

Hi sorry for late responses. Let me know if Fokko's comments make sense. I can help you in terms of connection and where to put that dependency if you mock a test.

@etrabelsi
Copy link
Contributor Author

etrabelsi commented Apr 10, 2018

@Fokko regarding your statements:

"I would like to see mocking these calls: self.nomad_client.job.get_allocations(job_id)"

This is python-nomad call which is tested https://github.com/jrxFive/python-nomad/blob/6352f513d873710a51be62f24a59c5be66f28ef2/tests/test_job.py
any other test?

@Acehaidrey
I wrote the change and will add it :)

- Installion of python-nomad dependency
@etrabelsi etrabelsi force-pushed the master branch 7 times, most recently from 6b9b1b0 to 1e29353 Compare May 1, 2018 06:21
@ali5h
Copy link

ali5h commented May 7, 2018

Nomad looks more like an airflow executer instead of operator

@etrabelsi
Copy link
Contributor Author

@Fokko regarding your statements:
"I would like to see mocking these calls: self.nomad_client.job.get_allocations(job_id)"

This is python-nomad call which is tested https://github.com/jrxFive/python-nomad/blob/6352f513d873710a51be62f24a59c5be66f28ef2/tests/test_job.py
any other test?

@Fokko @Acehaidrey I see that the tests fail on "The job exceeded the maximum time limit for jobs, and has been terminated." what should i do ?

@shariat why the fact that it can be used as executor contradicting the usability of an operator? (there are similar example for operators jenkins operator,ecs operator.

@ali5h
Copy link

ali5h commented May 7, 2018

You certainly can. Imho, u probably will uncover much more potential by using it as an executer

@eyaltrabelsi
Copy link

@Fokko @Acehaidrey

@Dannnir
Copy link

Dannnir commented Jun 11, 2018

@etrabelsi @Fokko @Acehaidrey we really need this feature, any chance to release it soon?

README.md Outdated
@@ -240,7 +240,7 @@ Currently **officially** using Airflow:
1. [Zenly](https://zen.ly) [[@cerisier](https://github.com/cerisier) & [@jbdalido](https://github.com/jbdalido)]
1. [Zymergen](https://www.zymergen.com/)
1. [99](https://99taxis.com) [[@fbenevides](https://github.com/fbenevides), [@gustavoamigo](https://github.com/gustavoamigo) & [@mmmaia](https://github.com/mmmaia)]

1. [Yotpo](https://www.yotpo.com) [[@etrabelsi](https://github.com/etrabelsi)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list is alphabetical

self.connection = self.get_connection(nomad_conn_id)
self.nomad_client = self.get_nomad_client(*args, **kwargs)

def get_nomad_client(self, *args, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should override get_conn, please refer to other hooks, for example: https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko what do you mean get_nomad_client returns a client and not a connection do I need get_conn anyway ? thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the client represents a connection, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko not an airflow connection. its more like client/resource of boto3 or jdbc for example

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko ?

Copy link

@ndmar ndmar Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand from a quick check, the get_conn seems to be intended to be the convention on a hook for retrieving a connection for an external resource, and it typically uses an airflow connection to supply the connection details. It appears that the proposed function here does that, it's just named get_nomad_client instead.

In this particular case, it seems that simply renaming get_nomad_client to get_conn would be sufficient. Does that sound right?

# See the License for the specific language governing permissions and
# limitations under the License.

"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put this comment in the docs of the Operator itself, for example: https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py#L29

# limitations under the License.
#

""" Hook to manage connection to nomad server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put this comment in the docs of the Hook itself, for example: https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py#L29

setup.py Outdated
@@ -280,6 +281,7 @@ def do_setup():
'devel_hadoop': devel_hadoop,
'doc': doc,
'docker': docker,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this newline

@stale stale bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 11, 2019
@potiuk
Copy link
Member

potiuk commented Sep 11, 2019

Interesting why it has been closed even if you commented it :)

@potiuk
Copy link
Member

potiuk commented Sep 11, 2019

A number of other PRs were un-staled after comments, so this one is strange :D

@yishan-lin
Copy link

@djenriquez Apologies for the delay. As of now, it seems more likely that we may architect some parts of a potential Airflow integration differently. We'd likely not shepherd this exact PR through once we get started but will need to scope it out first. :)

@yishan-lin
Copy link

In the meantime, we would love people's +1s on this ticket to the Apache Airflow Committee, so we can optimize for building and providing users with a first-class integration, rather than the incomplete experience of a lagging fork.

https://issues.apache.org/jira/browse/AIRFLOW-5633

@stale
Copy link

stale bot commented Nov 24, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 24, 2019
@shantanugadgil
Copy link

I have added "+1" on this issue, though I see the "stale-bot" marking this as "stale".

Is this Github issue and JIRA ticket (https://issues.apache.org/jira/browse/AIRFLOW-5633) still on the radar? :wonder:

@stale stale bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 25, 2019
@shantanugadgil
Copy link

Should we login/signup on the Apache JIRA and then click "Vote for this issue"? (I have done that as well). Hope this gets implemented soon! :)

@KarimZaoui
Copy link

@yiheng Any updates on this?

@LanceNorskog
Copy link

I have not used this PR, but based on experience with other Airflow user-contributed code: Please, please pull back all logs from executed Nomad jobs into the Airflow task log. If you already are, that's great, you're already doing better than the GCP control tasks :)

Cheers!

@stale
Copy link

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Feb 3, 2020
@stale stale bot closed this Feb 10, 2020
@shantanugadgil
Copy link

just checking to see if posting a comment revives this issue.

@yishan-lin
Copy link

This issue got closed by stalebot, but a first-class Airflow integration is on the roadmap for us internally. We're working on some other integrations first (e.g Spinnaker) before we'll get there but still in plans!

@potiuk
Copy link
Member

potiuk commented Mar 10, 2020

Looking forward to it! If you need any help with that - please let us know. Happy to help!

@retarpt
Copy link

retarpt commented May 25, 2021

Any updates on this?

@sblanton
Copy link

Very excited about this, Stalebot!

@retarpt
Copy link

retarpt commented Jun 16, 2021

🤕 🤞

@Oloremo
Copy link

Oloremo commented Jun 25, 2021

@yishan-lin any updates on this?

@leonardobsjr
Copy link

Please revive this :|

@ashb
Copy link
Member

ashb commented Jul 8, 2021

@leonardoam We'll happily review and accept such a PR, so "well volunteered?" 😁

@sblanton
Copy link

sblanton commented Jul 8, 2021

What are the next steps to get this "official"...I'll see what I can do. I know nothing about the process and procedures. What I do know is that Airflow crashed not 15 minutes ago due to exceeding available resources with that Docker Compose garbage. This would not have happened with Nomad managing resources. This makes AIRFLOW LOOK BAD and management are never going to look at the details. "Airflow crashed again" is what management hears and they think "Let's use something that doesn't crash"

@sblanton
Copy link

sblanton commented Jul 8, 2021

And isn't this issue already a PR? I don't see any response... who are the reviewers? What needs to be done?

@potiuk
Copy link
Member

potiuk commented Jul 8, 2021

The issue has been closed due to inactivity of the original creator @sblanton. We need someone who "owns" the PR and leads it to successful completion.

The process is described here https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst (and you have even https://github.com/apache/airflow/blob/main/CONTRIBUTORS_QUICK_START.rst ). Airflow is a community managed project so you are free to take overy and lead it to completion. I think it could be a nice contribution (though you might mistake Nomad operator with Nomad managing resources for Airflow). Those are two different things and I doubt either would solve the deployment problem you have.

BTW. The officially released and community supported solution for deploying Airflow is Helm Chart. https://airflow.apache.org/docs/helm-chart/stable/index.html and there you can manage your resources as you wish.

The docker compose we publish here: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html is a quick-start for development only . However there is an open issue to add more production-ready examples (and maybe even a generator for docker-compose) #16031. Not sure if you would like to work on it but maybe that's a good idea? I can assign you the issue if you want.

@sblanton
Copy link

sblanton commented Jul 11, 2021

Excellent... thank you for the tips @potiuk . I'll review.

@shantanugadgil
Copy link

Is this still planned to be merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale PRs per the .github/workflows/stale.yml policy file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet