OpenTSDB should reliably operate on a single node #796

Open
IzakMarais opened this issue May 19, 2016 · 5 comments

@IzakMarais

We run single-node OpenTSDB with HBase writing to the local file system (RAID-backed) instead of HDFS when deploying to smaller clusters. OpenTSDB easily handles the ingestion rate (about 7000 dps).

However, we have had repeated file-level corruption problems. Over the last few months, an HBase 'tsdb' region has become stuck in a FAILED_OPEN state five times across our two test systems. The only way I could recover was to delete the region file from disk.
[image: regions_in_transition]
https://cloud.githubusercontent.com/assets/8266640/15383276/2e923a2c-1d92-11e6-82c5-92d521ceae62.PNG

Is there something we can improve in our setup to avoid these errors? I am thinking about moving to HDFS. Is it possible, or worthwhile, to run a single-node HDFS (with multiple JBOD disks for reliability)?

@kev009
Contributor

kev009 commented May 19, 2016

OpenTSDB has nothing to do with that error.

You're going to have to take a look in the HBase log and see what caused the region error. Is it splitting when this happens? How much heap memory do you have for the HBase master and regionserver?
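
As a rough aid for that log check, here is a minimal Python sketch that pulls out the lines worth reading first. The keywords and the log file path are assumptions for illustration, not a known HBase log format; adjust them to whatever your regionserver log actually contains.

```python
import sys
from typing import Iterable, List, Sequence, Tuple

# Assumption: interesting lines mention the region state or an error marker.
DEFAULT_KEYWORDS = ("FAILED_OPEN", "ERROR", "Exception")


def find_suspect_lines(
    lines: Iterable[str],
    keywords: Sequence[str] = DEFAULT_KEYWORDS,
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(keyword in line for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path to a regionserver log as the first argument.
    with open(sys.argv[1], errors="replace") as log_file:
        for number, text in find_suspect_lines(log_file):
            print(f"{number}: {text}")
```

Reading the lines just before the first FAILED_OPEN hit is usually the quickest way to see what the regionserver was doing when the region failed to open.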

@johann8384
Member

johann8384 commented May 19, 2016 via email

@nickman
Contributor

nickman commented May 19, 2016

We had similar issues. Initially, I grudgingly accepted a virtual server
(VMware) to run hbase. We had a ton of disk and 8 GB of RAM. After we saw
similar problems (usually after a failed compaction) we switched to
physical and it has been humming ever since.

Not on a virtual are you?

@vitorboschi

We had a similar setup here for about a year and there were no issues. Running on a physical machine too.

@IzakMarais
Author

We are running on physical hardware. (I see my previous email reply hasn't made it onto GitHub.)

We have a 1TB drive with the default 12GB of Regionserver RAM, which should be enough, according to this article.

If I recall correctly, the problem appeared during HBase startup, i.e. after restarting either the host or just the HBase regionserver service: the region in question wouldn't come back online.

Our first machine has been running for more than a year and has triggered it twice. The second one triggered it 3 times in a couple of months. On the second machine there were other processes also contending for CPU and disk access.

@johann8384 johann8384 added this to the v2.4.0 milestone Jul 6, 2016
@johann8384 johann8384 modified the milestones: v2.5.0, 2.6.0 Oct 26, 2020