Skip to content

HTTP Operator and sensor#103

Merged
mistercrunch merged 4 commits intoapache:masterfrom
gtoonstra:http_operator_sensor
Jul 9, 2015
Merged

HTTP Operator and sensor#103
mistercrunch merged 4 commits intoapache:masterfrom
gtoonstra:http_operator_sensor

Conversation

@gtoonstra
Copy link
Contributor

New, clean pull request for issue #92 , not riddled with rebase artifacts.

@mistercrunch
Copy link
Member

History for this here:
#99

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http_conn_id because of how you can set default_args at your DAGs level, conn_id would conflict for default settings

@gtoonstra
Copy link
Contributor Author

Should be complete now.

  • timeout is now by default 0, which means that the server decides when to break the connection. The majority of servers have this specified at around 1-3 minutes, unless it's a keepAlive or streaming connection. Do you want to use a default timeout anywhere, or leave it as is?
  • 'template_fields' is very basic and includes one param. I don't know how this works in depth. Could you have a look if that should be extended please?
  • same for 'template_ext'
  • is ui_color ok?
  • template_fields for sensor?

I'm happy if you take it as is and add what you need, if that makes it easier.

@mistercrunch
Copy link
Member

Awesome! I think this is solid as far as I can tell. A few details:

  • I'd throw endpoint in template_fields for sensor as well. template_fields is a bad name for this feature, but the idea is that sometime between the __init__ and the execute method is run, the object attributes in that list get templated. So say if someone would want a date in their endpoint they would have to use templating for that, as in endpoint = "/apiv2/user/getmodified/{{ execution_date.isoformat()[:10].replace("-", "") }}"
  • add an entry in docs/code.rst for your 2 operators and hook to show up in the API part of the documentation
  • a unit test would be nice in airflow/tests/core.py

Another thought (outside the scope of this PR) is around authentication. I read the requests library part about sessions and authentication and it seem like people could graft that onto the hook you wrote eventually to support different types of authentication.

@gtoonstra
Copy link
Contributor Author

If something is needed for authentication, let's identify how and what and create an issue for it.

I did not run the process to create docs (never done that), please evaluate if that's necessary.

The http_default connection here is google.com. Obviously that needs to be present for this to work. How do you document this, or do you create a default connection in the installer?

If the sensor fails in the test, it will poll forever. Is there a max_retries so the test fail or succeed at some point (mostly useful for CI).

@mistercrunch
Copy link
Member

Awesome. This is coming together nicely!

@gtoonstra
Copy link
Contributor Author

I added a timeout and initdb now creates a http_default connection for google.com .

I ran the tests before and they errored when example_http_operator is in the example_dags, because the constructor is looking for it (when not supplied). The hive and other operators are not part of example dags, so they don't get imported and don't create that error. Maybe better to remove the example_http_operator or put it in a different folder.

I tried to get the sensor to error by changing the response_check to False, but that didn't work. The output gets eaten up too, so I can't see what's happening. The test always appears to succeed for the sensor, this could an issue with the testing platform (does it check for the return value for a sensor?).

@mistercrunch
Copy link
Member

I fetched your branch to run some tests and ended changing things around a bit. The main thing was the missing force=True in the operator run call. Without force=True, the operator doesn't re-run as it knows it has succeeded in the past.

d0f16b1

I think everything else look fine at this point, if you want to keep iterating you can bring this commit to your PR, or I can just merge my local branch as is. I think you get "credited" for your commits in any case.

@gtoonstra
Copy link
Contributor Author

Please go ahead with the merge from your local branch and then close this PR. Anything else that's needed, let's create a separate issue and pr for that.

@mistercrunch mistercrunch mentioned this pull request Jul 9, 2015
@mistercrunch mistercrunch merged commit 5190f8c into apache:master Jul 9, 2015
@mistercrunch
Copy link
Member

img

@gtoonstra gtoonstra deleted the http_operator_sensor branch July 9, 2015 20:38
saguziel added a commit to saguziel/incubator-airflow that referenced this pull request Sep 25, 2017
[AIRFLOW-961] run onkill when SIGTERMed
mobuchowski pushed a commit to mobuchowski/airflow that referenced this pull request Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants