Can't rebuild meta after disk failure - "No rdir assigned" even though it is #2115

Closed
M-Pixel opened this issue Jan 20, 2021 · 2 comments

M-Pixel commented Jan 20, 2021

ISSUE TYPE
  • Bug Report
  • Documentation Report (if it's not a bug, then it's a documentation/error-message problem)
COMPONENT NAME

meta*

SDS VERSION
openio 7.0.1
CONFIGURATION
# OpenIO managed
[OPENIO]
# endpoints
conscience=10.147.19.4:6000
zookeeper=10.147.19.2:6005,10.147.19.3:6005,10.147.19.4:6005
proxy=10.147.19.4:6006
event-agent=beanstalk://10.147.19.4:6014
ecd=10.147.19.4:6017

udp_allowed=yes

ns.meta1_digits=2
ns.storage_policy=ECLIBEC144D1
ns.chunk_size=104857600
ns.service_update_policy=meta2=KEEP|3|1|;rdir=KEEP|1|1|;

iam.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1
container_hierarchy.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1
bucket_db.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1

sqliterepo.repo.soft_max=1000
sqliterepo.repo.hard_max=1000
sqliterepo.cache.kbytes_per_db=4096
OS / ENVIRONMENT
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
SUMMARY

The meta partition on one of my nodes became corrupted after an improper shutdown. After replacing it with a clean XFS partition, I found the instructions for reinitializing a meta2 partition. Following them, I ran openio-admin meta2 rebuild 10.147.19.4:6001, but the response was ERROR Failed to fill queue: No rdir assigned to volume 10.147.19.4:6001. However, unless I'm misinterpreting it, the output of openio rdir assignments meta2 (below) shows that the meta2 on my .4 node is assigned an rdir on my .3 node.

+------------------+------------------+---------------+----------------+
| Rdir             | Meta2            | Rdir location | Meta2 location |
+------------------+------------------+---------------+----------------+
| 10.147.19.3:6300 | 10.147.19.2:6120 | redacted_20   | redacted_10    |
| 10.147.19.3:6300 | 10.147.19.4:6120 | redacted_20   | redacted_30    |
| 10.147.19.4:6300 | 10.147.19.3:6120 | redacted_30   | redacted_20    |
+------------------+------------------+---------------+----------------+
STEPS TO REPRODUCE
  • Set up a 3 node cluster according to the documentation
  • Delete everything in one of the meta partitions
  • Run the following (replace IP address as relevant)
openio-admin meta2 rebuild 10.147.19.4:6001
EXPECTED RESULTS
OPENIO|045256CF2FA8BBEAC689666EBA5BD7E9A7BFE76BA2E6511CC9C012B98125F56C OK None
OPENIO|237E50CABB28E3A2EA6BB1AC414C8838F5BE2FFA31D226CEEC54CE4D3AE74CF2 OK None
[...]
OPENIO|DFB66CDA3F33A74B4A9E69B09D06159A36DA81EA9AD7C2BF8C899D352CE4E1E7 OK None
OPENIO|DD506C822AA5E2FC32C274C28FCC801F14B9CF5FA46602ED811B1C972F399BD3 OK None
ACTUAL RESULTS
1631 7FD0FD6FA768 oio.cli.common.clientmanager ERROR Failed to fill queue: No rdir assigned to volume 10.147.19.4:6001
@fvennetier
Member

Hello. The service ports are mixed up. Your rdir services all listen on port 6300 and your meta2 services on port 6120, but you are trying to rebuild a service on port 6001, which either does not exist or is not a meta2 service.
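
For anyone hitting the same error, one way to catch this kind of mismatch is to check that the address passed to the rebuild command appears in the Meta2 column of the openio rdir assignments meta2 output. Below is a minimal Python sketch of that check. It assumes the ASCII table layout shown in this issue; the function names and the SAMPLE_TABLE constant are hypothetical helpers for illustration and are not part of OpenIO.

```python
# Sample copied from the issue; in practice, paste the real command output here.
SAMPLE_TABLE = """\
+------------------+------------------+---------------+----------------+
| Rdir             | Meta2            | Rdir location | Meta2 location |
+------------------+------------------+---------------+----------------+
| 10.147.19.3:6300 | 10.147.19.2:6120 | redacted_20   | redacted_10    |
| 10.147.19.3:6300 | 10.147.19.4:6120 | redacted_20   | redacted_30    |
| 10.147.19.4:6300 | 10.147.19.3:6120 | redacted_30   | redacted_20    |
+------------------+------------------+---------------+----------------+
"""


def parse_meta2_assignments(table_text):
    """Return a dict mapping each meta2 address to its assigned rdir address."""
    assignments = {}
    for line in table_text.splitlines():
        line = line.strip()
        # Skip border lines ("+---+") and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and malformed rows.
        if len(cells) < 2 or cells[0] == "Rdir":
            continue
        rdir, meta2 = cells[0], cells[1]
        assignments[meta2] = rdir
    return assignments


def check_rebuild_target(table_text, target):
    """Return (assigned_rdir_or_None, meta2 addresses on the same host)."""
    assignments = parse_meta2_assignments(table_text)
    host = target.rsplit(":", 1)[0]
    same_host = sorted(
        meta2 for meta2 in assignments if meta2.rsplit(":", 1)[0] == host
    )
    return assignments.get(target), same_host


if __name__ == "__main__":
    target = "10.147.19.4:6001"
    rdir, same_host = check_rebuild_target(SAMPLE_TABLE, target)
    if rdir is None:
        print(f"{target} is not listed as a meta2 service")
        print(f"meta2 services on the same host: {same_host}")
    else:
        print(f"{target} is a meta2 service with rdir {rdir}")
```

With the table from this issue, the address on port 6001 is not found, while the same host does have a meta2 entry on port 6120, which points to the port being the problem rather than the rdir assignment.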

Author

M-Pixel commented Jan 20, 2021

Thanks, you're right, the port used in the documentation isn't the same as the one used in the Ansible playbook.

M-Pixel closed this as completed Jan 20, 2021