OpenTSDB should reliably operate on a single node #796
Comments
OpenTSDB has nothing to do with that error. You're going to have to take a look in the HBase log and see what caused the region error. Is it splitting when this happens? How much heap memory do you have for the HBase master and regionserver? |
Thanks for opening the issue, we'll use this as a placeholder for a
single-node mode for OpenTSDB.
|
We had similar issues. Initially, I grudgingly accepted a virtual server. You're not on a virtual machine, are you?
|
We had a similar setup here for about a year and there were no issues. Running on a physical machine too. |
We are running on hardware. (I see my previous email reply hasn't made it onto GitHub.) We have a 1TB drive with the default 12GB of regionserver RAM, which should be enough, according to this article. If I recall correctly, the error appeared during HBase startup, i.e. after restarting either the host or just the HBase regionserver service: the region in question wouldn't come back online. Our first machine has been running for more than a year and has triggered it twice. The second one triggered it 3 times in a couple of months. On the second machine there were other processes also contending for CPU and disk access. |
We run single-node OpenTSDB with HBase writing to the local filesystem (RAID-backed) instead of HDFS when deploying to smaller clusters. OpenTSDB easily handles the ingestion rate (about 7000 dps).
However, we have had repeated file-level corruption problems. Over the last few months, our 2 test systems have five times ended up with an HBase 'tsdb' region stuck in a FAILED_OPEN state. The only way I could recover from this was to delete the region file from the disk.
Is there something we can improve in our setup to avoid these errors? I am thinking about moving to HDFS. Is it possible/worthwhile to run a single-node HDFS (with multiple JBOD disks for reliability)?
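For anyone following the advice in this thread to look in the HBase log, here is a minimal sketch of how the lines leading up to each FAILED_OPEN message could be pulled out. It only searches for the literal string FAILED_OPEN mentioned above; the log path passed on the command line, the function name and the amount of context are assumptions for illustration, not anything HBase or OpenTSDB provides.

```python
from collections import deque
from pathlib import Path
import sys

MARKER = "FAILED_OPEN"  # the region state reported in this issue


def find_marker_context(lines, marker=MARKER, before=5):
    """Return (line_number, context) for every line containing `marker`.

    `context` is the matching line preceded by up to `before` earlier lines,
    which is usually where the underlying exception is logged.
    """
    recent = deque(maxlen=before)  # rolling window of the previous lines
    hits = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if marker in line:
            hits.append((number, list(recent) + [line]))
        recent.append(line)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_failed_open.py /path/to/regionserver.log
    log_path = Path(sys.argv[1])
    with log_path.open(errors="replace") as handle:
        for number, context in find_marker_context(handle):
            print(f"--- match at line {number} ---")
            print("\n".join(context))
```

Running it against both the master and the regionserver logs after a failed restart should show whether the same exception precedes each occurrence.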