zookeeper support via initialization inside pydrill #17

@PythonicNinja

Hi Wojtek,

I was just checking out your pydrill module and it looks great -- I'd love to use it in a project I'm working on. One question though, I didn't see any documentation about connecting to a Zookeeper quorum rather than a specific drillbit. Is this currently supported?

Thanks,
Dan


Hello!

Currently pyDrill doesn’t support ZooKeeper; you first have to use ZooKeeper bindings [1] to determine which drillbits are running and then connect to one of them.
State is shared across all drillbits through ZooKeeper, so a settings change made on one drillbit takes effect across the cluster.

I can add ZooKeeper support so that, before each query, pyDrill asks the quorum for the IPs of the connected drillbits.

Here is an example of how to determine the IPs of the drillbits:
import zc.zk
zk = zc.zk.ZooKeeper('127.0.0.1:2181')
zk.get_children('/drill')  # [u'sys.options', u'running', u'sys.storage_plugins', u'drillbit1']
zk.get_children('/drill/drillbit1')  # [u'faa0c8a3-b569-4280-bf04-53f4b76c93e4', u'aacaf088-d72a-4b7f-ae34-f15fad2cddef']
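The two lookups above could be wrapped in a small helper. This is only a minimal sketch, assuming a client object that exposes `get_children(path)` the way `zc.zk.ZooKeeper` does. The name `list_drillbit_ids` and the default path are illustrative and not part of pyDrill. It returns the znode names only; turning them into host and port depends on how Drill stores the node data and is not shown here.

```python
def list_drillbit_ids(zk, cluster_path='/drill/drillbit1'):
    """Return the names of the znodes registered under cluster_path.

    `zk` is any object with a get_children(path) method, for example a
    zc.zk.ZooKeeper instance. Hypothetical helper, not a pyDrill API.
    """
    # Sort so that callers get a stable order regardless of the backend.
    return sorted(zk.get_children(cluster_path))
```

With a real connection this would be called as `list_drillbit_ids(zc.zk.ZooKeeper('127.0.0.1:2181'))`.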

I think it could be supported by an environment variable or by a parameter used to initialize pyDrill, for example:
PYDRILL_ZOOKEEPER = ['127.0.0.1:2181', '127.0.0.2:2181']
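As a rough sketch of how the two configuration sources might be combined: an explicit argument wins, otherwise the `PYDRILL_ZOOKEEPER` environment variable is read. The helper name `resolve_zookeeper_hosts` and the comma-separated format of the variable are assumptions made for illustration; nothing like this exists in pyDrill yet.

```python
import os


def resolve_zookeeper_hosts(hosts=None, environ=None):
    """Return a list of 'host:port' strings for the ZooKeeper quorum.

    Hypothetical helper. An explicit `hosts` value (list or
    comma-separated string) takes precedence; otherwise the
    PYDRILL_ZOOKEEPER environment variable is used, assumed here to be
    a comma-separated list.
    """
    if hosts is None:
        environ = os.environ if environ is None else environ
        hosts = environ.get('PYDRILL_ZOOKEEPER', '')
    if isinstance(hosts, str):
        hosts = hosts.split(',')
    # Drop surrounding whitespace and empty entries.
    return [h.strip() for h in hosts if h.strip()]
```

Passing `environ` explicitly is only there to make the function easy to test without touching the real environment.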

Please share your ideas so that I can enhance pyDrill to support your needs.


Thanks for your reply, the zc code you've provided is very helpful.

I think ZooKeeper connectivity would be a great addition to pydrill, as connecting to a single node isn't particularly suitable for a production environment. Perhaps allowing the pydrill.client.PyDrill class to take keyword arguments similar to the JDBC/ODBC drivers, something like (ConnectionType='ZooKeeper', ZKQuorum='Server1:Port1,Server2:Port2', ZKClusterID='') would be nice.
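To make the suggestion concrete, here is a minimal sketch of how a `ZKQuorum` string in that `Server1:Port1,Server2:Port2` form could be split into host and port pairs. The function name `parse_zk_quorum` is hypothetical, and falling back to 2181 (ZooKeeper's default client port) when no port is given is an assumption rather than documented driver behaviour.

```python
def parse_zk_quorum(zk_quorum, default_port=2181):
    """Split 'host1:port1,host2:port2' into [(host, port), ...].

    Hypothetical helper, not part of pyDrill. Entries without a port
    fall back to default_port.
    """
    servers = []
    for entry in zk_quorum.split(','):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        host, _, port = entry.partition(':')
        servers.append((host, int(port) if port else default_port))
    return servers
```

A client constructor accepting `ZKQuorum` could call this once and then try the servers in order until one answers.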
