Multiple devices per datanode, per tier #26
Comments
I don't believe Crail supports multiple devices per datanode for a specific tier. For instance, exporting two NVMf targets from a storage tier on a single datanode, like:
crail.storage.blkdev.datapath /dev/nvme0n1,/dev/nvme1n1
I'm trying to scope out how much effort this would be, but I first wanted to check whether there are already any plans or existing work to support such functionality. This would probably be most useful in the blk-dev repo, where we could expose multiple iSCSI/NVMf targets to a namenode, as in the conf example above.
Thanks,
Tim
Hi Tim,
The model in Crail is that datanodes are per device (NIC, SSD, etc.).
Scaling to multiple NICs or SSDs is done by starting multiple datanodes,
one per device. Essentially an IP/port is exposing a storage namespace of a
host, so the handling of multiple devices is done at the global Crail
level. The design was chosen to keep datanodes and metadata server simple.
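To make that concrete, here is a sketch of the per-device pattern for the block device tier (the datapath property is the one from your example above; splitting the configuration per process, and the note about ports, illustrate the general pattern rather than quoting Crail documentation):
conf for datanode process 1, managing the first device:
crail.storage.blkdev.datapath /dev/nvme0n1
conf for datanode process 2 on the same host, managing the second device and listening on its own port:
crail.storage.blkdev.datapath /dev/nvme1n1
Both processes register with the same namenode, which then sees two independent storage endpoints on that host.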
That being said, there is nothing in the Crail storage interface that wouldn't
permit building a new datanode which exposes multiple devices. It should be
very simple to build, functionally. One has to be careful, though, with
performance. Currently the metadata server has no knowledge about the
storage topology inside a single datanode. If a datanode exports multiple
devices, these registrations will appear at the namenode as resources of a
single host. Consequently the namenode cannot distribute block allocations
over the different devices during file writes. That could be fixed by
registering individual blocks in a round robin manner with the namenode
instead of registering entire regions. But it's not a great fix. If we see
that there is a real need for more complex datanodes handling multiple
devices, we may need to add explicit support for it at the namenode.
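To illustrate the round-robin variant mentioned above (a sketch only; Device, Namenode, and registerBlock are hypothetical stand-ins, not the actual Crail storage interfaces):

// A datanode managing several devices could register individual blocks
// with the namenode, alternating over its devices, instead of one large
// region per device. All types below are hypothetical.
interface Device {
    long allocate(long size);   // reserve a block on the device, return its offset
    String address();           // device-qualified storage endpoint
}
interface Namenode {
    void registerBlock(String address, long offset, long length);
}
class MultiDeviceDatanode {
    private final java.util.List<Device> devices;  // e.g. /dev/nvme0n1, /dev/nvme1n1
    private final Namenode namenode;
    private int next = 0;                          // round-robin cursor

    MultiDeviceDatanode(java.util.List<Device> devices, Namenode namenode) {
        this.devices = devices;
        this.namenode = namenode;
    }

    // Register blockCount blocks, cycling over the devices so that
    // consecutive registrations come from different devices.
    void registerBlocks(int blockCount, long blockSize) {
        for (int i = 0; i < blockCount; i++) {
            Device dev = devices.get(next);
            next = (next + 1) % devices.size();
            namenode.registerBlock(dev.address(), dev.allocate(blockSize), blockSize);
        }
    }
}

Because blocks from the different devices interleave in the namenode's free list, consecutive allocations during a file write would land on different devices, even though the namenode still sees just one host.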
Currently I don't see such a need, but feel free to provide details about
your use cases and why starting datanodes per device is not sufficient. I'd
be interested in hearing more.
Thanks!
-Patrick
Thanks for the response Patrick. The use case I was thinking of was performance-based: with multiple NVMe drives per datanode, we wanted to see if their aggregate bandwidth can get close to that of memory. However, there are different ways to accomplish this - such as with containers or volume managers - without having to modify Crail's design.
Hi Tim,
Crail is all about performance. If a single datanode managing multiple
devices gives better performance than multiple datanodes managing a single
device each, then we should consider the option. Have a look at our NVMf blog:
http://www.crail.io/blog/2017/08/crail-nvme-fabrics-v1.html
There, each storage server has two SSDs and runs two Crail datanodes, one
per SSD. We reach the network line speed easily by aggregating the
bandwidth of the two SSDs. In the past we have also done similar
experiments with 4 SSDs, and there too we were able to aggregate the
bandwidth of the devices using the "one datanode per device" approach.
Please let us know why you think having a single datanode managing multiple
devices will give better performance.
-Patrick
Hi,
My objective is to run multiple SSDs on a physical server. I didn't fully understand your first reply, but now I see that what you are suggesting is another way to achieve the same thing. I was assuming only one datanode per storage server. I like Crail's design because you don't have to worry about making a datanode scale to the number of SSDs you give it.
Tim
Hi Tim,
Yes, that was one of the ideas behind the design. But we can always
re-evaluate it at any point in time...
-Patrick