Can't import CSV over HTTP due to incorrect HTTP Host header value #13399

Open
tomasherout opened this issue Feb 16, 2024 · 4 comments

@tomasherout

When Neo4j loads CSV data over HTTP, it sets an invalid HTTP Host header (the resolved IP:port instead of the hostname from the URL).

  • Neo4j version: 5.16.0
  • Operating system: Linux (running in Kubernetes with official Helm chart)

Steps to reproduce

Run LOAD CSV WITH HEADERS FROM "http://hostname/path/file.csv" AS row RETURN count(*)

Expected behavior

The HTTP GET request must contain a Host header with the hostname from the URL (instead of the resolved IP address), for example, as in this request made by curl:

curl -v 'http://servername:9000/test.csv?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ACCESSKEY%2F20240216%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240216T064332Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=112eaebab28be522d1ed4b7c0adb9a929337ebe9a847d705e333fae3b8b31337'

> GET /test.csv?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ACCESSKEY%2F20240216%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240216T064332Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=112eaebab28be522d1ed4b7c0adb9a929337ebe9a847d705e333fae3b8b31337 HTTP/1.1
> Host: servername:9000   <---- CORRECT value
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Length: 1800737
< Content-Security-Policy: block-all-mixed-content
< Content-Type: text/csv; charset=utf-8
< ETag: "bf94ac2d2f2f8248c1ede45de7ade6a6"
< Last-Modified: Fri, 16 Feb 2024 06:43:32 GMT
< Vary: Origin
< X-Amz-Request-Id: 17B445C477A2C88E
< X-Xss-Protection: 1; mode=block
< Date: Fri, 16 Feb 2024 07:01:08 GMT
< 
{ [10966 bytes data]

Actual behavior

The Host header contains the resolved IP address and port instead of the hostname:

LOAD CSV WITH HEADERS FROM "http://servername:9000/test.csv?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ACCESSKEY%2F20240216%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240216T064332Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=112eaebab28be522d1ed4b7c0adb9a929337ebe9a847d705e333fae3b8b31337"

makes the following HTTP request:

GET /test.csv?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ACCESSKEY%2F20240216%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240216T064332Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=112eaebab28be522d1ed4b7c0adb9a929337ebe9a847d705e333fae3b8b31337 HTTP/1.1
User-Agent: NeoLoadCSV_Java/17.0.10+7
Host: 10.229.14.29:9000   <---- BAD value
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

This causes an error response because of the unexpected Host header value:

HTTP/1.1 403 Forbidden
Accept-Ranges: bytes
Content-Length: 397
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Vary: Origin
X-Amz-Request-Id: 17B444F268862189
X-Xss-Protection: 1; mode=block
Date: Fri, 16 Feb 2024 06:46:06 GMT

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><Key>test.csv</Key><BucketName>neo4j</BucketName><Resource>/test.csv</Resource><RequestId>17B444F268862189</RequestId><HostId>0cc3fecb-7862-44fa-aa60-e073c8a964ad</HostId></Error>
@phil198
Contributor

phil198 commented Feb 29, 2024

Hi @tomasherout,

I can see that Neo4j does substitute the resolved IP when using HTTP, to avoid an extra DNS lookup and minimise the possibility of DNS spoofing. But is the error below not related to the combination of using HTTP with the X-Amz-SignedHeaders and X-Amz-Signature parameters? Have you tried using HTTPS?

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><Key>test.csv</Key><BucketName>neo4j</BucketName><Resource>/test.csv</Resource><RequestId>17B444F268862189</RequestId><HostId>0cc3fecb-7862-44fa-aa60-e073c8a964ad</HostId></Error>
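
To illustrate the substitution described in this comment, here is a minimal Python sketch (not Neo4j's actual code) in which a local test server echoes back the Host header it receives. The `EchoHost` handler and the `host_seen_by_server` helper are hypothetical names made up for this example, and `localhost` / `127.0.0.1` stand in for `servername` and its resolved IP. It assumes `localhost` resolves to the loopback interface.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHost(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the Host header it received."""

    def do_GET(self):
        body = self.headers.get("Host", "").encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def host_seen_by_server(url, headers=None):
    """Fetch `url` (bypassing any configured proxy) and return the echoed Host value."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers or {})
    with opener.open(request, timeout=5) as response:
        return response.read().decode("ascii")


# Start the echo server on an ephemeral loopback port.
server = HTTPServer(("127.0.0.1", 0), EchoHost)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# 1. URL with the hostname: the client derives Host from the name.
by_name = host_seen_by_server(f"http://localhost:{port}/test.csv")
# 2. URL rewritten to the resolved IP: Host now carries the IP.
by_ip = host_seen_by_server(f"http://127.0.0.1:{port}/test.csv")
# 3. URL rewritten to the IP, but with the original name sent explicitly.
by_ip_named = host_seen_by_server(
    f"http://127.0.0.1:{port}/test.csv", {"Host": f"localhost:{port}"}
)

server.shutdown()
server.server_close()

print(by_name, by_ip, by_ip_named)
```

The second request corresponds to what the issue reports: once the URL is rewritten to the IP, the client derives Host from the IP. The third request shows that a client can connect by IP and still send the original name; whether that fits Neo4j's security goals is a separate question that this sketch does not answer.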

@tomasherout
Author

tomasherout commented Feb 29, 2024

Hello Phil,

The SignatureDoesNotMatch error is the result of the unexpected Host header value: host is a signed header (X-Amz-SignedHeaders=host in the URI search params), and the value sent is different from the one that was signed.
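
As a rough sketch of why the signed `host` matters, the following Python computes an AWS Signature Version 4 query-string signature for the same request with two different Host values. The secret key `"SECRETKEY"` and the function name `presigned_signature` are placeholders invented for this example (the real secret is not part of this issue), so the output will not match the signature in the URL above; the point is only that the two results differ.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presigned_signature(host: str, secret_key: str) -> str:
    """SigV4 query-string signature for GET /test.csv with `host` as the only signed header."""
    date, amz_date = "20240216", "20240216T064332Z"
    scope = f"{date}/us-east-1/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"ACCESSKEY/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": "604800",
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: sorted, URI-encoded, without X-Amz-Signature.
    canonical_query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    canonical_request = "\n".join([
        "GET",
        "/test.csv",
        canonical_query,
        f"host:{host}\n",    # canonical headers; each entry ends with a newline
        "host",              # signed header names
        "UNSIGNED-PAYLOAD",  # payload hash placeholder used for presigned URLs
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    for part in ("us-east-1", "s3", "aws4_request"):
        key = _hmac(key, part)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# "SECRETKEY" is a placeholder, not the key used in this issue.
signed_for_name = presigned_signature("servername:9000", "SECRETKEY")
signed_for_ip = presigned_signature("10.229.14.29:9000", "SECRETKEY")
print(signed_for_name == signed_for_ip)
```

The server recomputes the signature from the Host header it actually receives, so a request that arrives with `10.229.14.29:9000` cannot match a URL that was signed for `servername:9000`.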

RFC 2616 (HTTP/1.1), https://datatracker.ietf.org/doc/html/rfc2616#section-14.23, says: "The Host field value MUST represent the naming authority of the origin server or gateway given by the original URL."

I can try HTTPS tomorrow and I will provide an update.

UPDATE: HTTPS works okay

@phil198
Contributor

phil198 commented Mar 4, 2024

hi @tomasherout, thanks for trying with HTTPS. Just to update you that we are still looking into this.

@phil198
Contributor

phil198 commented Mar 11, 2024

Hi @tomasherout, thanks for your ongoing patience. This could take some time, as we are discussing internally how best to proceed in the most secure way. In the meantime, your options are to stick with HTTPS or to use Neo4j version <=5.15 if possible.
