Gluster 11.0 Creates Distributed-replicate instead of replicate #4107

Open · awptechnologies opened this issue Apr 9, 2023 · 16 comments · Fixed by #4119
@awptechnologies

When creating a Gluster volume, I set replica to 2 with 2 bricks and it automatically creates a Distributed-Replicate instead of just Replicate. I didn't even think that was possible with only 2 bricks. I saw in a few forums that downgrading to 10.1 solved the issue, but I already have a lot of VMs on this volume and don't want to go through the hassle of downgrading.

gluster volume create haVMzGFS replica 2 ZeusGFS.domain.example:/zeusSSD/haVMz AresGFS.domain.example:/aresSSD/haVMz force

The command completed successfully, but it created a Distributed-Replicate volume instead of just Replicate.

I expected it to create a volume of type Replicate only.
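
For comparison, on a release that reports the type as expected (see the Gluster 10.x example later in this thread), the same two-brick layout should show up roughly as:

Volume Name: haVMzGFS
Type: Replicate
Number of Bricks: 1 x 2 = 2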

Mandatory info:
Volume Name: haVMzGFS
Type: Distributed-Replicate
Volume ID: f227c10a-37e0-4822-be2b-9d51066429ba
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ZeusGFS.domain.example:/zeusSSD/haVMz
Brick2: AresGFS.domain.example:/aresSSD/haVMz
Options Reconfigured:
storage.owner-gid: 107
storage.owner-uid: 107
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
cluster.granular-entry-heal: enable
storage.fips-mode-rchecksum: on
transport.address-family: inet
performance.client-io-threads: off
cluster.heal-timeout: 5
cluster.self-heal-daemon: enable
cluster.quorum-reads: false
cluster.quorum-count: 1
network.ping-timeout: 2
cluster.favorite-child-policy: mtime
cluster.data-self-heal-algorithm: full
cluster.quorum-type: fixed

Status of volume: haVMzGFS
Gluster process                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ZeusGFS.domain.example:/zeusSSD/haVMz    58404     0          Y       58405
Brick AresGFS.domain.example:/aresSSD/haVMz    60409     0          Y       105336
Self-heal Daemon on localhost                  N/A       N/A        Y       58472
Self-heal Daemon on AresGFS                    N/A       N/A        Y       105368

Task Status of Volume haVMzGFS
------------------------------------------------------------------------------
There are no active volume tasks

@awptechnologies
Author

Also, I was wondering if there is a way to allow the volume to store a TPM 2.0 VM disk for Proxmox. I can run all of my VMs from the volume, but as soon as I try to enable TPM I get this error:

/bin/swtpm exit with status 256:
TASK ERROR: start failed: command 'swtpm_setup --tpmstate file://gluster://192.168.1.20/haVMzGFS/images/101/vm-101-disk-2.raw --createek --create-ek-cert --create-platform-cert --lock-nvram --config /etc/swtpm_setup.conf --runas 0 --not-overwrite --tpm2 --ecc' failed: exit code 1

@madmax01

Yes, I have this issue too: I created a 2-way replica, but it then shows as Distributed-Replicate.

It's hard to understand why working parts break after a release.

@mykaul
Contributor

mykaul commented Apr 16, 2023

gluster volume create haVMzGFS replica 2 ZeusGFS.domain.example:/zeusSSD/haVMz AresGFS.domain.example:/aresSSD/haVMz force

What was your intention when you used 'force' in the command?

@mykaul
Contributor

mykaul commented Apr 16, 2023

Also, I was wondering if there is a way to allow the volume to store a TPM 2.0 VM disk for Proxmox. I can run all of my VMs from the volume, but as soon as I try to enable TPM I get this error:

Please open a different issue.

@madmax01

Just as feedback: I personally use force to get past the split-brain warning, because it's just a 2-way replica. I'm pretty sure the user had the same reason to use force.

@mykaul
Contributor

mykaul commented Apr 18, 2023

Just as feedback: I personally use force to get past the split-brain warning, because it's just a 2-way replica. I'm pretty sure the user had the same reason to use force.

So you give up on validation of a correct deployment, and then complain that your deployment is not what you intended it to be...?
(I'm not saying there can't be a bug somewhere in Gluster, of course, just that your expectations are not aligned with your actions.)

@madmax01

madmax01 commented Apr 18, 2023

I'm not the one who created the ticket; please re-read. The ticket creator is reporting a bug where a 2-way replication is created as Distributed-Replicate. I just shared my own feedback on what force can be used for. Risking split-brain is up to the user (and there are ways to make it a bit more reliable with client and server quorum settings; see the sketch at the end of this comment).

What's the point of not using force if you are just going to accept the split-brain prompt anyway? You can go straight to force when you know that message is coming. Apologies, it's not my ticket, but your answer does not address the reporter's question: a 2-way setup is not a Distributed-Replicate, it's a replica of 2 ;) (The command above says replica 2, and he did not use it in a way that should end up distributed. A distributed-replicate normally requires that "the number of bricks must be a multiple of the replica count", and that is simply not the case here.)
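
For reference, a minimal sketch of the quorum knobs referred to above (volume name haVMzGFS taken from the report; the values are only an illustration, not a recommendation):

gluster volume set haVMzGFS cluster.quorum-type auto
gluster volume set haVMzGFS cluster.server-quorum-type server

With only two nodes these settings trade availability for consistency, which is part of why people end up tuning them per volume and using force at create time.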

@zemzema

zemzema commented Apr 18, 2023

Same problem here. The first time, I tried add-brick and saw that I was getting Distributed-Replicate; then I created a fresh volume, and the result is the same.

gluster volume create test replica 2 s1:/storage/brick1/test s2:/storage/brick1/test
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator-Guide/Split-brain-and-ways-to-deal-with-it/.
Do you still want to continue?
(y/n) y
volume create: test: success: please start the volume to access data
gluster volume start test
volume start: test: success
gluster volume info test

Volume Name: test
Type: Distributed-Replicate
Volume ID: 6399239d-eb64-4801-8eb0-ad3072d00903
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: s1:/storage/brick1/test
Brick2: s2:/storage/brick1/test
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

@mohit84
Contributor

mohit84 commented Apr 18, 2023

Same problem here. The first time, I tried add-brick and saw that I was getting Distributed-Replicate; then I created a fresh volume, and the result is the same.


Here the volume type "Distributed-Replicate" is expected. By default, when we create a volume with a single brick, the volume type is Distribute; in this case, with replica 2, it means the files will be distributed across the replicated set of bricks (2).
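
To illustrate the distinction (hostnames and paths here are made up): a volume only gains more than one replica set to distribute across when the brick count is a larger multiple of the replica count, e.g.

gluster volume create distrep replica 2 s1:/bricks/b1 s2:/bricks/b1 s3:/bricks/b1 s4:/bricks/b1
# gluster volume info would then report: Number of Bricks: 2 x 2 = 4

whereas the two-brick replica 2 volume above has a single replica set (1 x 2 = 2).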

@zemzema

zemzema commented Apr 19, 2023

But then how do you get a plain Replicate volume from a single-brick Distribute volume by adding one more brick?

I can't remember this happening in previous versions.

@mohit84
Contributor

mohit84 commented Apr 19, 2023

You can convert a pure Distribute (single-brick) volume to replica 2 or more by adding more bricks. Can you please let me know the previous version where the behavior was not the same? I don't remember, so I will have to check it.
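
A minimal sketch of that conversion, assuming the single-brick volume is named test and the new brick lives on s2 (names borrowed from the example earlier in the thread):

gluster volume add-brick test replica 2 s2:/storage/brick1/test
# trigger a full self-heal so the new brick receives a copy of the existing data
gluster volume heal test full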

@zemzema

zemzema commented Apr 19, 2023

In version 8, it seems to me that it behaved differently.

@icolombi

Same behaviour here. I created a 3-node cluster with this command:

gluster volume create share replica 3 qa-gluster-01:/data/glusterfs/share/brick1/brick qa-gluster-02:/data/glusterfs/share/brick1/brick qa-gluster-03:/data/glusterfs/share/brick1/brick

With Gluster v10.x I have this result:

gluster v info share

Volume Name: share
Type: Replicate
Status: Started
Number of Bricks: 1 x 3 = 3

With Gluster 11:

gluster v info share

Volume Name: share
Type: Distributed-Replicate
Status: Started
Number of Bricks: 1 x 3 = 3

Please note that, judging by the data in the bricks, the data seems to be replicated to all three nodes correctly.

@mohit84
Contributor

mohit84 commented Apr 19, 2023

Same behaviour here. I created a 3-node cluster with this command:


I have to check which patch changed the reported volume type, but there is no issue in functionality.
Functionality-wise the behavior is the same; the only difference is the representation.
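
The layout can still be read from the brick-count line; as a quick check (volume name taken from the example above):

gluster volume info share | grep 'Number of Bricks'
# Number of Bricks: 1 x 3 = 3   -> distribute subvolumes x replica count = total bricks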

@mohit84
Contributor

mohit84 commented Apr 20, 2023

I have checked: the behavior changed because of this patch (#3662). I will send a patch to correct it.

mohit84 added a commit to mohit84/glusterfs that referenced this issue Apr 20, 2023
The commit 5118f1a corrects
the distCount value but due to that voltype has changed
because the function get_vol_type is not updated.

Solution: No need to measure voltype if distCount is 1

Fixes: gluster#4107
Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
xhernandez pushed a commit that referenced this issue Apr 20, 2023
The commit 5118f1a corrects
the distCount value but due to that voltype has changed
because the function get_vol_type is not updated.

Solution: No need to measure voltype if distCount is 1

Fixes: #4107
Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
@mykaul
Contributor

mykaul commented Apr 20, 2023

I think it's an important enough fix for an 11.0.1 or so...

mohit84 reopened this Apr 20, 2023
mohit84 added a commit to mohit84/glusterfs that referenced this issue Apr 20, 2023
The commit 5118f1a corrects
the distCount value but due to that voltype has changed
because the function get_vol_type is not updated.

Solution: No need to measure voltype if distCount is 1

> Fixes: gluster#4107
> Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (Reviewed on upstream link gluster#4119)

Fixes: gluster#4107
Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Shwetha-Acharya pushed a commit that referenced this issue Apr 21, 2023
The commit 5118f1a corrects
the distCount value but due to that voltype has changed
because the function get_vol_type is not updated.

Solution: No need to measure voltype if distCount is 1

> Fixes: #4107
> Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (Reviewed on upstream link #4119)

Fixes: #4107
Change-Id: I16e7e906d64b01398b40c0a634924a5bf9069b58

Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>