Add an option to allow multiple copies on a single host to count towards goals #231

Closed
Zorlin opened this Issue Dec 26, 2014 · 9 comments

Zorlin commented Dec 26, 2014

Hi there,

I've run into some interesting use cases for people who need to store massive amounts of data but for whatever reason (can't afford to, don't want to, etc) want to store it on a single machine. It would be very cool if you could enable LizardFS to recognize each disk/brick/volume of data as separate and count them towards the goals.

Enabling single-system usage of LizardFS is a very strange use case, but it makes sense and as a bonus it may help ease people into the use of the system.

Member

onlyjob commented Dec 26, 2014

I support this request. For example, I have a powerful server with 20 HDDs where I want to keep two replicas and spread a third replica among 10 computers on the LAN. In the absence of documentation, I'm not sure whether LizardFS is capable of doing such a thing...

Author

Zorlin commented Dec 26, 2014

Hi Dmitry,

As far as I'm aware (and I'm fairly confident) neither MooseFS nor LizardFS
support this yet.

~ b

Author

Zorlin commented Dec 26, 2014

Perhaps the most elegant solution would be a "copy factor" value for each
host - if you set it to 2, this host will count up to 2 valid copies, and
so on.
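
To make the proposal concrete, here is a minimal sketch of how such a count could work. It is purely illustrative and not LizardFS code: the effective_copies helper is hypothetical, and for brevity it assumes a single copy factor shared by all hosts rather than one value per host.

```shell
#!/bin/sh
# effective_copies FACTOR HOST...
# Each HOST argument names the host holding one copy of a chunk.
# A host contributes at most FACTOR copies towards the goal.
effective_copies() {
    factor="$1"; shift
    # No copies at all: nothing counts.
    [ "$#" -gt 0 ] || { echo 0; return 0; }
    printf '%s\n' "$@" | sort | uniq -c |
        awk -v f="$factor" '{ n = $1 + 0; total += (n < f + 0 ? n : f + 0) }
                            END { print total + 0 }'
}

effective_copies 1 hostA hostA hostB   # prints 2 (today's behaviour: one copy per host)
effective_copies 2 hostA hostA hostB   # prints 3 (hostA may count twice)
```

With a factor of 1 the two copies on hostA count once, so a goal of 3 is not met; with a factor of 2 the same placement satisfies it.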

Contributor

marcinsulikowski commented Dec 29, 2014

> For example I have powerful server with 20 HDDs where I want to keep two replicas and spread third replica amongst 10 computers on LAN

If you really want to keep 2 copies on a single machine, you can run two chunkservers there. Prepare two configuration files, e.g.: /etc/mfs/mfschunkserver-1.cfg and /etc/mfs/mfschunkserver-2.cfg. The following settings need to be set differently in each of them:

CSSERV_LISTEN_PORT = xxxx <- choose two different ports to listen on
DATA_PATH = xxxx <- choose two different directories to store chunkserver's stats and lockfile
HDD_CONF_FILENAME = xxxx <- create two different hdd files

for example:

$ cat /etc/mfs/mfschunkserver-1.cfg
MASTER_HOST = lizardfs-master
SYSLOG_IDENT = chunkserver-1
CSSERV_LISTEN_PORT = 9561
DATA_PATH = /var/lib/lizardfs/chunkserver-1
HDD_CONF_FILENAME = /etc/mfs/mfshdd-1.cfg
$ cat /etc/mfs/mfschunkserver-2.cfg
MASTER_HOST = lizardfs-master
SYSLOG_IDENT = chunkserver-2
CSSERV_LISTEN_PORT = 9562
DATA_PATH = /var/lib/lizardfs/chunkserver-2
HDD_CONF_FILENAME = /etc/mfs/mfshdd-2.cfg
$ cat /etc/mfs/mfshdd-1.cfg
/mnt/hd_01
/mnt/hd_02
/mnt/hd_03
/mnt/hd_04
$ cat /etc/mfs/mfshdd-2.cfg
/mnt/hd_05
/mnt/hd_06
/mnt/hd_07
/mnt/hd_08

Now you can run both:

# mfschunkserver -c /etc/mfs/mfschunkserver-1.cfg start
# mfschunkserver -c /etc/mfs/mfschunkserver-2.cfg start
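
Writing these files by hand gets tedious with more chunkservers, so here is a rough sketch of a generator script. The setting names and values mirror the example above; CONF_DIR, DATA_ROOT, BASE_PORT and the write_chunkserver helper are illustrative assumptions, and by default the files go to a temporary directory rather than /etc/mfs.

```shell
#!/bin/sh
# Sketch: generate per-chunkserver config files like the ones shown above.
set -eu

CONF_DIR="${CONF_DIR:-$(mktemp -d)}"         # set CONF_DIR=/etc/mfs on a real host
DATA_ROOT="${DATA_ROOT:-/var/lib/lizardfs}"  # parent of each DATA_PATH (assumed layout)
MASTER_HOST="${MASTER_HOST:-lizardfs-master}"
BASE_PORT=9560                               # chunkserver N listens on BASE_PORT + N

mkdir -p "$CONF_DIR"

# write_chunkserver N DISK...
# Writes mfschunkserver-N.cfg and mfshdd-N.cfg for chunkserver number N.
write_chunkserver() {
    n="$1"; shift
    cat > "$CONF_DIR/mfschunkserver-$n.cfg" <<EOF
MASTER_HOST = $MASTER_HOST
SYSLOG_IDENT = chunkserver-$n
CSSERV_LISTEN_PORT = $((BASE_PORT + n))
DATA_PATH = $DATA_ROOT/chunkserver-$n
HDD_CONF_FILENAME = $CONF_DIR/mfshdd-$n.cfg
EOF
    # One mount point per line, as in the mfshdd files above.
    printf '%s\n' "$@" > "$CONF_DIR/mfshdd-$n.cfg"
}

write_chunkserver 1 /mnt/hd_01 /mnt/hd_02 /mnt/hd_03 /mnt/hd_04
write_chunkserver 2 /mnt/hd_05 /mnt/hd_06 /mnt/hd_07 /mnt/hd_08

echo "Wrote configs to $CONF_DIR"
```

Each generated file can then be passed to mfschunkserver -c as in the commands above. The script does not create the DATA_PATH directories or set their ownership; that is left to the administrator.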

If you want to force LizardFS to store two copies of each chunk on these two chunkservers (located on a single machine) and a third copy somewhere else, you can use the 'custom goals' mechanism (http://lizardfs.com/lizardfs-2-5-4-release-announcement/ and man mfsgoals.cfg on the master server) as follows:

  • label the two chunkservers located on the powerful machine with the same label, e.g. powerful, by adding the following line to both /etc/mfs/mfschunkserver-1.cfg and /etc/mfs/mfschunkserver-2.cfg: LABEL = powerful
  • define a custom goal on the master server by adding an entry like this to mfsgoals.cfg: 19 my_goal_name : powerful powerful _
  • set this goal for some files on a client node, e.g. mfssetgoal -r my_goal_name /mnt/lizardfs
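
The goal entry is easy to get wrong, so here is a small sketch that builds one. As far as I can tell from the example above, the line consists of a numeric id, a goal name, a colon, and one label per copy, with _ standing for "any chunkserver"; check man mfsgoals.cfg for the authoritative format. The goal_line helper is hypothetical and only prints the line.

```shell
#!/bin/sh
# goal_line ID NAME LABEL...
# Prints an mfsgoals.cfg-style entry with one LABEL per desired copy.
goal_line() {
    id="$1"; name="$2"; shift 2
    # "$*" joins the remaining labels with single spaces.
    printf '%s %s : %s\n' "$id" "$name" "$*"
}

# Two copies on chunkservers labelled "powerful", one copy anywhere.
goal_line 19 my_goal_name powerful powerful _
# prints: 19 my_goal_name : powerful powerful _
```

The printed line is what would be added to mfsgoals.cfg on the master before running mfssetgoal on a client.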
Author

Zorlin commented Dec 30, 2014

Hi @marcinsulikowski - I understand I can do that, but that's a lot of work. Especially if I'm trying to recommend this to someone new to LizardFS. I view this as a sort of "my first cluster" way to get started :)

Member

onlyjob commented Dec 30, 2014

Very interesting. Thank you very much, @marcinsulikowski, for the detailed instructions.

FWIW in Ceph this is fairly easy to do by setting

osd crush location = host=hostname-fake

in the OSD configuration. Ceph runs a "chunkserver" (i.e. an OSD) per HDD, and host is a property of a volume. Setting host to something other than the hostname (I append -fake) makes the OSD appear to belong to a different host, so the chunk mapping system is tricked into placing a replica on the fake host.

I reckon placing two replicas on one host could be done elegantly if/when LizardFS allows volumes to be combined into a virtual host or "target" that can have a label assigned...


4Dolio commented Jan 23, 2015

As marcinsulikowski has already pointed out, this is already possible by configuring multiple chunkservers, each with its own hdd.conf and a unique port. As such I do not believe there is a need for any changes to support this request other than perhaps documenting that it is already possible.

Member

onlyjob commented Dec 10, 2015

> As such I do not believe there is a need for any changes to support this request other than perhaps documenting that it is already possible.

Agreed. It is easy to set up and operate more than one chunkserver per host.
I think we can close this bug.


hradec commented Sep 19, 2017

You can set up multiple chunkservers on the same machine very easily with Docker.

docker run -d --restart always --net=host  -e MASTER_HOST=localhost \
     --name=chunk_HD01  \
     -e MFS_CSSERV_LISTEN_PORT=9460 \
     -e MFS_LABEL=HD01 \
     -v /mnt/HD01/:/mnt/HD01:rw \
     -e ACTION=chunk   hradec/docker_lizardfs_git 

docker run -d --restart always --net=host  -e MASTER_HOST=localhost \
     --name=chunk_HD02 \
     -e MFS_CSSERV_LISTEN_PORT=9461 \
     -e MFS_LABEL=HD02 \
     -v /mnt/HD02/:/mnt/HD02:rw \
     -e ACTION=chunk   hradec/docker_lizardfs_git 

where your first hard drive is mounted at /mnt/HD01, the second at /mnt/HD02, and so on...
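
For more than a couple of disks, the same commands can be wrapped in a loop. This is only a sketch: the image name, flags and environment variables are copied from the two commands above, while the start_chunk helper, the DRY_RUN switch and the list of disk names are assumptions for illustration. By default it only prints the docker commands instead of running them.

```shell
#!/bin/sh
# Sketch: start one chunkserver container per disk, one port per container.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the docker commands, 0 = actually run them

# start_chunk NAME PORT
# NAME is both the label and the directory name under /mnt.
start_chunk() {
    cs_name="$1"; cs_port="$2"
    set -- docker run -d --restart always --net=host \
        -e MASTER_HOST=localhost \
        --name="chunk_$cs_name" \
        -e MFS_CSSERV_LISTEN_PORT="$cs_port" \
        -e MFS_LABEL="$cs_name" \
        -v "/mnt/$cs_name/:/mnt/$cs_name:rw" \
        -e ACTION=chunk hradec/docker_lizardfs_git
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

port=9460
for disk in HD01 HD02; do      # extend this list to match your mounted disks
    start_chunk "$disk" "$port"
    port=$((port + 1))         # each chunkserver needs its own listen port
done
```

Run it once as is to inspect the generated commands, then with DRY_RUN=0 to start the containers.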

This Docker image lets you set ANY of the mfschunkserver.conf options by passing -e MFS_<option name> on the command line (for example, the chunkserver port number: -e MFS_CSSERV_LISTEN_PORT=9461).

Mounting the local /mnt/HD01 path at the container path /mnt/HD01 (-v /mnt/HD01/:/mnt/HD01:rw) triggers the image to auto-generate the mfshdds.conf file from whatever folders show up under /mnt/*.

You can mount just one path for each chunkserver (as I demonstrated), or you can mount as many paths as you want for a chunkserver, covering both situations: 1 chunkserver for multiple disks (the LizardFS default setup), or multiple chunkservers for multiple disks!

The --net=host option forces the container to use the machine's hardware NICs directly, so there's no overhead from a virtual LAN layer.

I've tested running LizardFS in Docker against a native build on my Arch Linux machine, and there's no difference in speed whatsoever! BUT if you choose to run Docker on OSX or Windows, there will be a big slowdown, since Docker runs in a virtual machine on those OSes.

The image can be used to start any of the servers, including metalogger, shadow, master and cgi. Just set the server in -e ACTION=<server name>.

Running the image like this:
docker run -ti --rm hradec/docker_lizardfs_git

will display a quick help on how to use it!

my 2 cents...
-H
