Feature to use TransportClient instead of NodeClient #17

With a large cluster, using NodeClient can lead to 'too many open files' errors, due to the full-mesh connections between all elasticsearch nodes (including the Hadoop workers).

To counter that, I added the option to use TransportClient and define the desired entry-point nodes, which limits the number of connections each Hadoop worker holds. This type of client introduces a 'double-hop' request for indexing. So far I haven't seen much of a decrease in performance; there is certainly some, but it is counterbalanced by the ability to use all the nodes in the cluster.
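
To give a rough feel for the connection savings, here is a back-of-the-envelope sketch. It is not code from this pull request: the function name `worker_connections` and the cluster sizes are hypothetical, and it counts (worker, node) pairs only. Each pair may use several sockets in practice, so real file-descriptor counts would be higher in both cases.

```python
def worker_connections(num_workers, num_es_nodes, entry_points=None):
    """Estimate worker-to-cluster links, counted as (worker, node) pairs.

    entry_points=None models the NodeClient case, where every worker is
    connected to every elasticsearch node (full mesh).
    entry_points=k models the TransportClient case, where every worker
    connects only to k configured entry-point nodes.
    """
    if num_workers < 0 or num_es_nodes < 0:
        raise ValueError("counts must be non-negative")
    if entry_points is None:
        targets = num_es_nodes
    else:
        # A worker cannot connect to more entry points than there are nodes.
        targets = min(entry_points, num_es_nodes)
    return num_workers * targets


# Hypothetical cluster: 200 Hadoop workers, 50 elasticsearch nodes.
workers, es_nodes = 200, 50
full_mesh = worker_connections(workers, es_nodes)
via_entry = worker_connections(workers, es_nodes, entry_points=3)
print(full_mesh, via_entry)  # 10000 600
```

Under these assumed numbers the worker-side links drop from 10000 to 600. The price is the extra hop: an entry-point node has to forward each indexing request to the node that holds the target shard.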

Feel free to accept the pull request; I have been using it in production for 1-2 weeks without any issues.

alexmc6 commented on d061ee5 Nov 5, 2012

Can anyone confirm this has actually fixed the problem and that it works fine in Pig 0.10? I am getting a versioning error, which suggests it doesn't.