This issue was moved to a discussion.

Airflow webserver emits malformed HTTP Response Status Line when endpoint/resource is not found #29167

Closed
1 of 2 tasks
vitaly-krugl opened this issue Jan 25, 2023 · 2 comments
Labels
affected_version:2.3, area:webserver, Can't Reproduce, kind:bug, pending-response

Comments

vitaly-krugl commented Jan 25, 2023

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

Our app runs the Airflow webserver on Airflow 2.3.2. My team won't be able to upgrade to the latest Airflow version for a long time, so I am logging this bug against v2.3.2.

The issue: when a client requests a non-existent endpoint, the Airflow webserver returns a malformed HTTP status line: HTTP/1.1 b'404 Not Found'. Note the "b" in front of the message and the single quotes surrounding the text; neither is valid in an HTTP status line.
This prevents HTTP clients from parsing the response.
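
To illustrate why clients reject this response, here is a minimal sketch that feeds both status lines to Python's standard `http.client` parser. The `FakeSocket` helper and the canned header bytes are illustrative assumptions, not captured from the server; only the two status lines come from this report.

```python
import http.client
import io


class FakeSocket:
    """Hypothetical stand-in for a socket that replays canned response bytes."""

    def __init__(self, payload: bytes) -> None:
        self._file = io.BytesIO(payload)

    def makefile(self, *args, **kwargs):
        # http.client only needs a readable binary file object.
        return self._file


def parse_status(raw_response: bytes):
    """Return the parsed status code, or None if the status line is rejected."""
    response = http.client.HTTPResponse(FakeSocket(raw_response))
    try:
        response.begin()  # reads and validates the status line and headers
    except http.client.BadStatusLine:
        return None
    return response.status


# Status lines as seen in this report; the headers are made up for the sketch.
MALFORMED = b"HTTP/1.1 b'404 Not Found'\r\nContent-Length: 0\r\n\r\n"
WELL_FORMED = b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"

print(parse_status(MALFORMED))
print(parse_status(WELL_FORMED))
```

The malformed line is rejected because the token after the HTTP version (`b'404`) is not a three-digit status code.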

What you think should happen instead

Airflow web server must return a valid HTTP status line, such as

HTTP/1.1 404 Not Found
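
For reference, a rough check of the expected shape (HTTP version, space, three-digit status code, space, optional reason phrase) could look like the sketch below. The regular expression is my own approximation of the status-line grammar, and `is_valid_status_line` is a hypothetical helper name.

```python
import re

# Approximation of: status-line = HTTP-version SP status-code SP [ reason-phrase ]
STATUS_LINE = re.compile(r"HTTP/\d\.\d \d{3} [\t \x21-\x7e\x80-\xff]*")


def is_valid_status_line(line: str) -> bool:
    """Check a status line (without the trailing CRLF) against the pattern."""
    return STATUS_LINE.fullmatch(line) is not None


for candidate in ("HTTP/1.1 404 Not Found", "HTTP/1.1 b'404 Not Found'"):
    print(candidate, "->", is_valid_status_line(candidate))
```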

How to reproduce

Make an HTTP request to a non-existent endpoint. In the wget session below, note the error message printed by wget: ERROR -1: Malformed status line

wget -d http://localhost:80/healthhhhh
DEBUG output created by Wget 1.20.3 on linux-gnu.

Reading HSTS entries from /root/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
iconv UTF-8 -> ANSI_X3.4-1968
iconv outlen=60 inlen=30
converted 'http://localhost:80/healthhhhh' (ANSI_X3.4-1968) -> 'http://localhost:80/healthhhhh' (UTF-8)
Converted file name 'healthhhhh' (UTF-8) -> 'healthhhhh' (ANSI_X3.4-1968)
--2023-01-25 18:55:26--  http://localhost/healthhhhh
Resolving localhost (localhost)... ::1, 127.0.0.1
Caching localhost => ::1 127.0.0.1
Connecting to localhost (localhost)|::1|:80... Closed fd 3
failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:80... connected.
Created socket 3.
Releasing 0x0000564e6bfc8b90 (new refcount 1).

---request begin---
GET /healthhhhh HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: localhost
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 b'404 Not Found'
Server: gunicorn
Date: Wed, 25 Jan 2023 18:55:26 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/plain

---response end---
-1
2023-01-25 18:55:26 ERROR -1: Malformed status line.
Closed fd 3

Operating System

Ubuntu 5.10

Versions of Apache Airflow Providers

No response

Deployment

Other

Deployment details

No response

Anything else

This bug appears to be isolated to the 404 NOT FOUND case. Here is another wget session that elicits a 400 BAD REQUEST response, which is properly formatted and doesn't have the b' in front of the message. Note that this request passes an invalid query arg to api/v1/connections in order to force the 400 BAD REQUEST:

root@1b69c66c14af:/usr/local/akamai/abattery-app# wget -d http://localhost:80/airflow/api/v1/connections?zzz=123
Setting --header (header) to RemoteUser:vkruglik
Setting --header (header) to RemoteUser:vkruglik
DEBUG output created by Wget 1.20.3 on linux-gnu.

Reading HSTS entries from /root/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
iconv UTF-8 -> ANSI_X3.4-1968
iconv outlen=108 inlen=54
converted 'http://localhost:80/airflow/api/v1/connections?zzz=123' (ANSI_X3.4-1968) -> 'http://localhost:80/airflow/api/v1/connections?zzz=123' (UTF-8)
Converted file name 'connections?zzz=123' (UTF-8) -> 'connections?zzz=123' (ANSI_X3.4-1968)
--2023-01-25 19:03:07--  http://localhost/airflow/api/v1/connections?zzz=123
Resolving localhost (localhost)... ::1, 127.0.0.1
Caching localhost => ::1 127.0.0.1
Connecting to localhost (localhost)|::1|:80... Closed fd 3
failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:80... connected.
Created socket 3.
Releasing 0x0000563e23366e50 (new refcount 1).

---request begin---
GET /airflow/api/v1/connections?zzz=123 HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: localhost
Connection: Keep-Alive
RemoteUser: vkruglik

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 400 BAD REQUEST
Server: gunicorn
Date: Wed, 25 Jan 2023 19:03:07 GMT
Connection: close
Content-Type: application/problem+json
Content-Length: 210
Access-Control-Allow-Headers: 
Access-Control-Allow-Methods: 
Access-Control-Allow-Origin: 
X-Robots-Tag: noindex, nofollow

---response end---
400 BAD REQUEST
Closed fd 3
2023-01-25 19:03:07 ERROR 400: BAD REQUEST.
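
The stray b'...' wrapper in the 404 case matches what Python produces when a `bytes` value is converted to text with `str()` or an f-string instead of being decoded. I have not identified where that would happen, so the sketch below only shows the mechanism; `build_status_line` and `build_status_line_safe` are hypothetical helpers, not Airflow or gunicorn code.

```python
def build_status_line(status) -> str:
    """Naive formatting: a bytes status is rendered via its repr."""
    return f"HTTP/1.1 {status}"


def build_status_line_safe(status) -> str:
    """Decode bytes before formatting so the text appears unwrapped."""
    if isinstance(status, bytes):
        status = status.decode("latin-1")
    return f"HTTP/1.1 {status}"


print(build_status_line(b"404 Not Found"))
print(build_status_line_safe(b"404 Not Found"))
print(build_status_line("400 BAD REQUEST"))
```

A `str` status passes through both helpers unchanged, which would be consistent with the 400 response being well formed.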

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

vitaly-krugl added the area:core and kind:bug labels on Jan 25, 2023
vitaly-krugl changed the title from "Invalid HTTP Response Status Line when endpoint/resource is not found" to "Airflow webserver emits malformed HTTP Response Status Line when endpoint/resource is not found" on Jan 25, 2023
@Taragolis
Contributor

Unable to reproduce on 2.3.2, started with: breeze start-airflow --db-reset --use-airflow-version 2.3.2

curl

curl -v http://localhost:28080/healthhhhh
*   Trying 127.0.0.1:28080...
* Connected to localhost (127.0.0.1) port 28080 (#0)
> GET /healthhhhh HTTP/1.1
> Host: localhost:28080
> User-Agent: curl/7.85.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 NOT FOUND
< Server: gunicorn
< Date: Wed, 25 Jan 2023 21:17:36 GMT
< Connection: close
< Content-Type: text/html; charset=utf-8
< Content-Length: 468
< X-Robots-Tag: noindex, nofollow
< Set-Cookie: session=16ffaa94-10b9-4677-978c-f8ee8c403c0c.0aAFTRvIpMR9oIud6ga0-eYtrvA; Expires=Fri, 24-Feb-2023 21:17:36 GMT; HttpOnly; Path=/; SameSite=Lax
< 


<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Airflow 404</title>
    <link rel="icon" type="image/png" href="/static/pin_32.png">
  </head>
  <body>
    <div style="font-family: verdana; text-align: center; margin-top: 200px;">
      <img src="/static/pin_100.png" width="50px" alt="pin-logo" />
      <h1>Airflow 404</h1>
      <p>Page cannot be found.</p>
      <a href="/">Return to the main page</a>
      <p>7095c21fe1e0</p>
    </div>
  </body>
* Closing connection 0
</html>%

wget

wget -d http://localhost:28080/healthhhhh
DEBUG output created by Wget 1.21.3 on darwin22.1.0.

Reading HSTS entries from /Users/taragolis/.wget-hsts
URI encoding = ‘UTF-8’
Converted file name 'healthhhhh' (UTF-8) -> 'healthhhhh' (UTF-8)
--2023-01-26 01:14:07--  http://localhost:28080/healthhhhh
Resolving localhost (localhost)... ::1, 127.0.0.1
Caching localhost => ::1 127.0.0.1
Connecting to localhost (localhost)|::1|:28080... connected.
Created socket 5.
Releasing 0x0000600003400200 (new refcount 1).

---request begin---
GET /healthhhhh HTTP/1.1
Host: localhost:28080
User-Agent: Wget/1.21.3
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 404 NOT FOUND
Server: gunicorn
Date: Wed, 25 Jan 2023 21:14:07 GMT
Connection: close
Content-Type: text/html; charset=utf-8
Content-Length: 468
X-Robots-Tag: noindex, nofollow
Set-Cookie: session=974d883c-5efa-4470-a73f-e73ace134f09.9ygwQXy1g5ssudTYNmCIfyiVsgA; Expires=Fri, 24-Feb-2023 21:14:07 GMT; HttpOnly; Path=/; SameSite=Lax

---response end---
404 NOT FOUND

Stored cookie localhost 28080 / <permanent> <insecure> [expiry 2023-02-25 01:14:07] session 974d883c-5efa-4470-a73f-e73ace134f09.9ygwQXy1g5ssudTYNmCIfyiVsgA
URI content encoding = ‘utf-8’
Closed fd 5
2023-01-26 01:14:07 ERROR 404: NOT FOUND.

@Taragolis
Contributor

I suspect something is wrong with your locale. Try setting it to UTF-8 instead of C on both the client and the Airflow server.
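
A quick way to see which locale and encoding a Python process picks up is sketched below; it could be run in the same environment as the webserver and on the client. The helper names are made up for this example. Note that `ANSI_X3.4-1968`, which appears in the reporter's wget output, is another name for ASCII, which is what the C locale uses.

```python
import codecs
import locale
import os
import sys


def is_ascii_only(encoding_name: str) -> bool:
    """True if the encoding name resolves to plain ASCII (as in the C locale)."""
    try:
        return codecs.lookup(encoding_name).name == "ascii"
    except LookupError:
        return False


def locale_report() -> dict:
    """Collect the locale-related settings visible to this process."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    report = locale_report()
    for key, value in report.items():
        print(f"{key}: {value}")
    if is_ascii_only(report["preferred_encoding"]):
        print("Preferred encoding is ASCII; consider a UTF-8 locale such as C.UTF-8.")
```

Whether the locale actually explains the malformed status line is still unverified; this only helps confirm or rule out the C locale.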

Taragolis added the Can't Reproduce label on Jan 25, 2023
eladkal added the area:webserver and affected_version:2.3 labels and removed the area:core label on Jan 26, 2023
apache locked and limited conversation to collaborators on Feb 19, 2023
potiuk converted this issue into discussion #29612 on Feb 19, 2023
