Add leading slash to the _bulk url #16

Merged
merged 1 commit into from
Apr 25, 2017

zudov
Contributor

@zudov zudov commented Apr 25, 2017

Having a leading forward slash is consistent with e.g. `scroll-chan`
and `utils/url`, and can sometimes save you from serious trouble.

I was setting up my application in an AWS environment with
[Amazon ElasticSearch Service](https://aws.amazon.com/elasticsearch-service/).

Everything was fine, except that `bulk-chan` didn't work.
When I tried to run code like this:

```clojure
;; localhost:9200 is an SSH tunnel that directs me to Amazon's ES cluster
;; through a gateway machine.
(let [es (qbits.spandex/client {:hosts ["http://localhost:9200"]})
      {:keys [input-ch output-ch]} (qbits.spandex/bulk-chan es)
      ops [{:index {:_index :test
                    :_type  :test
                    :_id    1}}
           {}]]
  (clojure.core.async/put! input-ch ops)
  (clojure.core.async/go-loop []
    ;; Parking take: blocking takes (<!!) should not be used inside go blocks.
    (clojure.pprint/pprint (clojure.core.async/<! output-ch)))
  (clojure.core.async/close! input-ch))
```

I would get a 400 BAD_REQUEST with no body:

```clojure
[{:type clojure.lang.ExceptionInfo
  :message "Response Exception"
  :data #qbits.spandex.Response{:body "", :status 400, :headers {"Content-Length" "0", "Connection" "Close"}, :hosts #object[org.apache.http.HttpHost 0x602a2cdc "http://localhost:9200"], :type :qbits.spandex/response-exception}}]
```

Capturing the requests with `ngrep` gave me the following:

```
T *.*.*.*:**** -> *.*.*.*:**** [AP]
  PUT _bulk HTTP/1.1..content-type: text/plain..Content-Length: 54..Host: localhost:9200..Connection: Keep-Alive..User-Agent: Apache-HttpAsyncClient/4.1.2 (Java/1.8.0_74)....{"index":{"_index":"test","_type":"test","_id":1}}.{}.

T *.*.*.*:**** -> *.*.*.*:**** [AP]
  HTTP/1.1 400 BAD_REQUEST..Content-Length: 0..Connection: Close....
```

The same code worked just fine with a self-hosted ES.
After a long time spent staring at the captures and trying many different things,
I realized that the problem was the missing leading slash, which causes some
pedantic AWS proxy to reject the requests.

Adding the slash fixed the problem:

```
T *.*.*.*:**** -> *.*.*.*:**** [AP]
  PUT /_bulk HTTP/1.1..content-type: text/plain..Content-Length: 54..Host: localhost:9200..Connection: Keep-Alive..User-Agent: Apache-HttpAsyncClient/4.1.2 (Java/1.8.0_74)....{"index":{"_index":"test","_type":"test","_id":1}}.{}.

T *.*.*.*:**** -> *.*.*.*:**** [AP]
  HTTP/1.1 200 OK..Access-Control-Allow-Origin: *..Content-Type: application/json; charset=UTF-8..Content-Length: 198..Connection: keep-alive....{"took":9,"errors":false,"items":[{"index":{"_index":"test","_type":"test","_id":"1","_version":10,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"created":false,"status":200}}]}
```
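
A defensive guard against this class of bug can be sketched as a tiny path normalizer. This is only an illustration under stated assumptions, not the change made in this PR: `ensure-leading-slash` is a hypothetical helper and is not part of spandex.

```clojure
(require '[clojure.string :as str])

(defn ensure-leading-slash
  "Returns `path` with exactly one leading slash.
  Hypothetical helper for illustration; not part of spandex."
  [path]
  ;; Strip any leading slashes first, then add a single one back.
  (str "/" (str/replace path #"^/+" "")))

(ensure-leading-slash "_bulk")  ;; => "/_bulk"
(ensure-leading-slash "/_bulk") ;; => "/_bulk"
```

Normalizing at one point means a strict proxy never sees a request line like `PUT _bulk HTTP/1.1`, regardless of how the caller spelled the path.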
@zudov
Contributor Author

zudov commented Apr 25, 2017

I was thinking that it might be better to use utils/url to construct URLs wherever we need a predefined URL (at least scroll-chan and bulk-chan). What do you think?

@mpenet mpenet merged commit 48d70a1 into mpenet:master Apr 25, 2017
@mpenet
Owner

mpenet commented Apr 25, 2017

Good catch, thanks. I'll cut a release tomorrow morning.

@mpenet
Owner

mpenet commented Apr 25, 2017

I don't think using utils/url brings anything in these cases. But with both scroll and bulk the user can overwrite :url and use it there, for instance to qualify the index at that level.
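
As a rough sketch of that override: assuming `bulk-chan` accepts an options map whose `:url` replaces the default (as this comment describes), an index-qualified bulk path could be built as below. `index-bulk-url` is a hypothetical helper, and the commented-out call is an assumption about usage rather than verified API documentation.

```clojure
(defn index-bulk-url
  "Builds a slash-prefixed bulk path scoped to `index` (keyword or string).
  Hypothetical helper for illustration; not part of spandex."
  [index]
  (str "/" (name index) "/_bulk"))

(index-bulk-url :test) ;; => "/test/_bulk"

;; Assumed usage, based on the comment above (needs spandex on the classpath):
;; (qbits.spandex/bulk-chan es {:url (index-bulk-url :test)})
```

Passing an explicit, slash-prefixed `:url` this way would also serve as a stopgap on versions that still send the bare `_bulk` path.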

@zudov
Contributor Author

zudov commented Apr 25, 2017

Thanks for the tip about overwriting :url; that will help me until the release rolls out.

@mpenet
Owner

mpenet commented Apr 26, 2017

It's on Clojars as 0.3.8.

@zudov zudov deleted the bugfix/bulk-leading-slash branch April 29, 2017 19:04
@lvh

lvh commented Jul 10, 2017

@zudov How did you do the request signing? Or do you just use a blessed IP?

@mpenet
Owner

mpenet commented Jul 10, 2017

I remember seeing a project on GitHub that does this. I can't remember its name now, but you might be able to find it with GitHub search.

@mpenet
Owner

mpenet commented Jul 10, 2017

https://github.com/apeckham/elasticsearch-helloworld/blob/master/src/es_typeahead/core.clj

I am on my mobile phone, so I can't go into details. Hopefully it helps.

That's something that would be quite nice to have as a "plugin" project; I might just add this tomorrow.

@lvh

lvh commented Jul 11, 2017

Yeah. If I had to do it now, I'd use the AWS ES proxy. Involves a 2nd process, but totally transparent to spandex.
