
RELEASENOTES.md


HBASE 2.1.0 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.


  • HBASE-20691 | Blocker | Storage policy should allow deferring to HDFS

HBASE-20691 changes the default value of hbase.wal.storage.policy from "HOT" back to "NONE", which means the policy is deferred to HDFS. This fixes a problem in release 2.0.0 where the storage policy of the WAL directory deferred to HDFS, and so might not be "HOT", even if you explicitly set hbase.wal.storage.policy to "HOT".
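
If you want the WAL directory to use a specific policy rather than deferring to HDFS, the property can be set explicitly in hbase-site.xml. A minimal sketch using the "HOT" value mentioned in this note; whether it is appropriate depends on your HDFS storage setup:

```xml
<!-- hbase-site.xml: explicitly request the HOT storage policy for the WAL directory -->
<property>
  <name>hbase.wal.storage.policy</name>
  <value>HOT</value>
</property>
```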


  • HBASE-20839 | Blocker | Fallback to FSHLog if we can not instantiated AsyncFSWAL when user does not specify AsyncFSWAL explicitly

Because AsyncFSWAL hacks into the internals of DFSClient to get better performance, a patch release of Hadoop can break it.

So now, if the user does not specify a WAL provider, we first try 'asyncfs', i.e., the AsyncFSWALProvider. If that fails due to compatibility issues, we fall back to 'filesystem', i.e., FSHLog.
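
To pin a provider instead of relying on the fallback, the provider can be named explicitly. A sketch, assuming the hbase.wal.provider property (the property name is not given in this note; the 'filesystem' value is):

```xml
<!-- hbase-site.xml: select FSHLog explicitly rather than relying on the asyncfs-then-filesystem fallback -->
<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>
```

Note that the fallback described above applies only when no provider is specified.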


  • HBASE-20193 | Critical | Basic Replication Web UI - Regionserver

After HBASE-20193, the web UI has a new section that shows the replication status of each WAL group. The section has two parts, both of which show the peerId, WAL group, and currently replicating log of each replication source. One part shows information about the replication log queue, i.e., the size of the current log, the log queue size, and the replicating offset. The other shows the replication delay, i.e., the last shipped age and the replication delay. If the offset shows -1 and the replication delay is UNKNOWN, replication has not started. This may be because the peer is disabled or the replicationEndpoint is sleeping for some reason.


We now have a 'basically working' solution for rolling upgrade from 1.4.x to 2.x. Please see the "Rolling Upgrade from 1.x to 2.x" section in the ref guide for more details.


  • HBASE-20270 | Major | Turn off command help that follows all errors in shell

The command help that used to follow every error is no longer shown. Erroneous command input now shows only the error text, followed by the shell command to run to see the help message. It looks like: For usage try 'help "create"'. Operators can copy and paste the command to get the help message.


  • HBASE-20194 | Critical | Basic Replication WebUI - Master

After HBASE-20194, the master's web page has two new parts. One is Peers, which shows all replication peers and some of their configuration, such as peer id, cluster key, state, bandwidth, and which namespaces or tables each peer replicates. The other is the replication status of all region servers: we added a tab to the region servers section where you can check the replication delay of every region server for any peer. This table shows AgeOfLastShippedOp, SizeOfLogQueue, and ReplicationLag for each region server, and is sorted by ReplicationLag in descending order, which makes it easy to find a problematic region server. If the replication delay is UNKNOWN, the walGroup has not started replicating yet and it may be disabled. ReplicationLag is updated once the peer starts replicating.


  • HBASE-18569 | Major | Add prefetch support for async region locator

Add prefetch support for async region locator. The default value is 10. Set 'hbase.client.locate.prefetch.limit' in hbase-site.xml if you want to use another value for it.
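
As a sketch, overriding the prefetch limit in the client's hbase-site.xml might look like this. The value 20 is only an illustration, not a recommendation:

```xml
<!-- hbase-site.xml (client side): raise the async region locator prefetch limit from the default of 10 -->
<property>
  <name>hbase.client.locate.prefetch.limit</name>
  <value>20</value> <!-- illustrative value -->
</property>
```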


  • HBASE-20642 | Major | IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException

This changes client-side nonce generation to use the same nonce for re-submissions of client RPC DDL operations.


  • HBASE-20708 | Blocker | Remove the usage of RecoverMetaProcedure in master startup

Introduces an InitMetaProcedure to initialize the meta table for a new HBase deploy. Marks RecoverMetaProcedure as deprecated and removes its usage from the current code base. We still need to keep it in place for compatibility. The code in RecoverMetaProcedure has been moved to ServerCrashProcedure; SCP will always be enabled, and we rely on it to bring the meta region online.

For more on the issue addressed by this commit, see the design doc for overview and plan: https://docs.google.com/document/d/1\_872oHzrhJq4ck7f6zmp1J--zMhsIFvXSZyX1Mxg5MA/edit#heading=h.xy1z4alsq7uy


  • HBASE-20334 | Major | add a test that expressly uses both our shaded client and the one from hadoop 3

HBase now includes a helper script in dev-support that can be used to run a basic functionality test for a given HBase installation. The test can optionally be given an HBase client artifact to rely on and can optionally be given specific Hadoop client artifacts to use.

For usage information see ./dev-support/hbase_nightly_pseudo-distributed-test.sh --help.

The project nightly tests now make use of this test to check running on top of Hadoop 2, Hadoop 3, and Hadoop 3 with shaded client artifacts.


  • HBASE-19735 | Major | Create a minimal "client" tarball installation

The HBase convenience binary artifacts now include a client-focused tarball that a) includes more docs and b) does not include scripts or jars only needed for running HBase cluster services.

The new artifact is made as a normal part of the assembly:single maven command.


  • HBASE-20615 | Major | emphasize use of shaded client jars when they're present in an install

HBase's built-in scripts now rely on the downstream-facing shaded artifacts where possible. Of particular interest to downstream users, the hbase classpath and hbase mapredcp commands now return the relevant shaded client artifact and only those third-party jars needed to make use of them (e.g. slf4j-api, commons-logging, htrace, etc).

Downstream users should note that by default the hbase classpath command will treat having hadoop on the shell's PATH as an implicit request to include the output of the hadoop classpath command in the returned classpath. This long-existing behavior can be opted out of by setting the environment variable HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP to the value "true". For example: HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" bin/hbase classpath.


  • HBASE-20333 | Critical | break up shaded client into one with no Hadoop and one that's standalone

Downstream users who need to use both HBase and Hadoop APIs should switch to relying on the new hbase-shaded-client-byo-hadoop artifact rather than the existing hbase-shaded-client artifact. The new artifact no longer includes any Hadoop classes.

It should work in combination with either the output of hadoop classpath or the Hadoop provided client-facing shaded artifacts in Hadoop 3+.


  • HBASE-20332 | Critical | shaded mapreduce module shouldn't include hadoop

The hbase-shaded-mapreduce artifact no longer includes its own copy of Hadoop classes. Users who make use of the artifact via YARN should be able to get these classes from YARN's classpath without having to make any changes.


  • HBASE-20681 | Major | IntegrationTestDriver fails after HADOOP-15406 due to missing hamcrest-core

Users of our integration tests on Hadoop 3 can now add all needed dependencies by pointing at jars included in our binary convenience artifact.

Prior to this fix, downstream users on Hadoop 3 would need to get a copy of the Hamcrest v1.3 jar from elsewhere.


  • HBASE-19852 | Major | HBase Thrift 1 server SPNEGO Improvements

Adds two new properties for hbase-site.xml for Thrift SPNEGO when in HTTP mode:

  * hbase.thrift.spnego.keytab.file
  * hbase.thrift.spnego.principal
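
A sketch of how the two properties might appear in hbase-site.xml. The keytab path and the principal are hypothetical placeholders; substitute the values for your own Kerberos setup:

```xml
<!-- hbase-site.xml: SPNEGO credentials for the Thrift 1 server in HTTP mode -->
<property>
  <name>hbase.thrift.spnego.keytab.file</name>
  <value>/etc/security/keytabs/spnego.service.keytab</value> <!-- hypothetical path -->
</property>
<property>
  <name>hbase.thrift.spnego.principal</name>
  <value>HTTP/thrift.example.com@EXAMPLE.COM</value> <!-- hypothetical principal -->
</property>
```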


  • HBASE-20590 | Critical | REST Java client is not able to negotiate with the server in the secure mode

Adds negotiation logic between a secure Java REST client and server. After this change, the Java REST client responds to the Negotiate challenge sent by the server. Also adds RESTDemoClient, which can be used to verify whether the secure Java REST client works against a secure REST server.


  • HBASE-20634 | Critical | Reopen region while server crash can cause the procedure to be stuck

A second attempt at fixing HBASE-20173. Fixes unfinished keeping of server state inside AM (ONLINE=>SPLITTING=>OFFLINE=>null). Concurrent unassigns look at server state to figure if they should wait on SCP to wake them up or not.


  • HBASE-20579 | Minor | Improve snapshot manifest copy in ExportSnapshot

This patch adds an FSUtil.copyFilesParallel() to help copy files in parallel; it returns the paths of all directories and files traversed. When copying the manifest in ExportSnapshot, we can therefore copy reference files concurrently and use the returned paths for setOwner and setPermission. The size of the thread pool is determined by the configuration snapshot.export.copy.references.threads, and its default value is the number of processors available at runtime.
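
As a sketch, the thread pool size could be fixed in the configuration that ExportSnapshot runs with. The value 8 is only an illustration, and placing the setting in hbase-site.xml is an assumption made for this example:

```xml
<!-- hbase-site.xml: use a fixed number of threads to copy snapshot reference files -->
<property>
  <name>snapshot.export.copy.references.threads</name>
  <value>8</value> <!-- illustrative value; the default is the number of available processors -->
</property>
```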


  • HBASE-18116 | Major | Replication source in-memory accounting should not include bulk transfer hfiles

Before this change we would incorrectly include the size of enqueued store files for bulk replication in the calculation for determining whether or not to rate limit the transfer of WAL edits. Because bulk replication uses a separate and asynchronous mechanism for file transfer, this could incorrectly limit the batch sizes for WAL replication if bulk replication is in progress, with negative impact on latency and throughput.


  • HBASE-20592 | Minor | Create a tool to verify tables do not have prefix tree encoding

A PreUpgradeValidator tool with a DataBlockEncoding validator was added to verify that a cluster can be upgraded to HBase 2.


  • HBASE-20501 | Blocker | Change the Hadoop minimum version to 2.7.1

HBase is no longer able to maintain compatibility with Apache Hadoop versions that are no longer receiving updates. This release raises the minimum supported version to Hadoop 2.7.1. Downstream users are strongly advised to upgrade to the latest Hadoop 2.7 maintenance release.

Downstream users of earlier HBase versions are similarly advised to upgrade to Hadoop 2.7.1+. When doing so, it is especially important to follow the guidance from the HBase Reference Guide's Hadoop section on replacing the Hadoop artifacts bundled with HBase.


  • HBASE-20601 | Minor | Add multiPut support and other miscellaneous to PE
  1. Add multiPut support. Set --multiPut=number to enable batch put (--autoflush must also be set to false).

  2. Add connection count support. A new parameter, connCount, was added to PE. Setting --connCount=2 means all threads share 2 connections. The oneCon and connCount options should not be set at the same time.

  3. Add average RT and average TPS/QPS statistics for all threads.

  4. Delete some redundant code. RandomWriteTest now inherits from SequentialWrite.


  • HBASE-20544 | Blocker | downstream HBaseTestingUtility fails with invalid port

HBase now relies on an internal mechanism to determine when it is running a local hbase cluster meant for external interaction vs an encapsulated test. When created via the HBaseTestingUtility, ports for Master and RegionServer services and UIs will be set to random ports to allow for multiple parallel uses on a single machine. Normally when running a Standalone HBase Deployment (as described in the HBase Reference Guide) the ports will be picked according to the same defaults used in a full cluster set up. If you wish to instead use the random port assignment set hbase.localcluster.assign.random.ports to true.
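
A minimal sketch of opting a standalone deployment into random port assignment, using the property named above:

```xml
<!-- hbase-site.xml: have a local cluster pick random ports for Master and RegionServer services and UIs -->
<property>
  <name>hbase.localcluster.assign.random.ports</name>
  <value>true</value>
</property>
```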


  • HBASE-20004 | Minor | Client is not able to execute REST queries in a secure cluster

Added the 'hbase.rest.http.allow.options.method' configuration property to let users decide whether the REST server should allow the HTTP OPTIONS method. It is enabled by default in HBase 2.1.0+ and disabled by default in other versions. Similarly, 'hbase.thrift.http.allow.options.method' is added in HBase 1.5, 2.1.0, and 3.0.0. It is disabled by default.
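
For example, an operator who does not want the REST server to answer OPTIONS requests could turn the method off. A sketch using the property named above:

```xml
<!-- hbase-site.xml: disallow the HTTP OPTIONS method on the REST server -->
<property>
  <name>hbase.rest.http.allow.options.method</name>
  <value>false</value>
</property>
```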


  • HBASE-20327 | Minor | When qualifier is not specified, append and incr operation do not work (shell)

This change enables users to perform append and increment operations with a null qualifier via hbase-shell.


  • HBASE-18842 | Minor | The hbase shell clone_snaphost command returns bad error message

When attempting to clone a snapshot but using a namespace that does not exist, the HBase shell will now correctly report the exception as caused by the passed namespace. Previously, the shell would report that the problem was an unknown namespace but it would claim the user provided table name was not found as a namespace. Both before and after this change the shell properly used the passed namespace to attempt to handle the request.


  • HBASE-20406 | Major | HBase Thrift HTTP - Shouldn't handle TRACE/OPTIONS methods

When configured to do thrift-over-http, the HBase Thrift API Server no longer accepts the HTTP methods TRACE or OPTIONS.


  • HBASE-20046 | Major | Reconsider the implementation for serial replication

Replication can now guarantee that logs are pushed in the same order as the requests arrived from the client. Set the serial flag to true for a replication peer to enable this feature.


  • HBASE-20159 | Major | Support using separate ZK quorums for client

After HBASE-20159, clients can use different ZK quorums through three new properties: hbase.client.zookeeper.quorum and hbase.client.zookeeper.property.clientPort specify the client ZooKeeper properties (note that the combination of these two properties should be different from the server ZK quorums), and hbase.client.zookeeper.observer.mode indicates whether the client ZK nodes are in observer mode (false by default).
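
A sketch of the three properties together. The host names are hypothetical, and the port and observer-mode values are illustrative; the quorum and port combination must differ from the one the servers use:

```xml
<!-- hbase-site.xml: point clients at a separate ZooKeeper quorum -->
<property>
  <name>hbase.client.zookeeper.quorum</name>
  <value>client-zk1.example.com,client-zk2.example.com</value> <!-- hypothetical hosts -->
</property>
<property>
  <name>hbase.client.zookeeper.property.clientPort</name>
  <value>2181</value> <!-- illustrative port -->
</property>
<property>
  <name>hbase.client.zookeeper.observer.mode</name>
  <value>true</value> <!-- default is false -->
</property>
```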

HConstants.DEFAULT_ZOOKEPER_CLIENT_PORT has been removed in HBase 3.0 and replaced by the correctly spelled DEFAULT_ZOOKEEPER_CLIENT_PORT.


  • HBASE-20242 | Major | The open sequence number will grow if we fail to open a region after writing the max sequence id file

Now, when opening a region, we store the current max sequence id of the region in its max sequence id file instead of the 'next sequence id'. This avoids bumping the sequence id when we fail to open a region, and also aligns with the behavior when we close a region.


  • HBASE-19024 | Critical | Configurable default durability for synchronous WAL

The default durability setting for the synchronous WAL is Durability.SYNC_WAL, which triggers HDFS hflush() to flush edits to the datanodes. We also support Durability.FSYNC_WAL, which instead triggers HDFS hsync() to flush _and_ fsync edits. This change introduces the new configuration setting "hbase.wal.hsync", defaulting to FALSE, that if set to TRUE changes the default durability setting for the synchronous WAL to FSYNC_WAL.
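
A minimal sketch of switching the default durability of the synchronous WAL to FSYNC_WAL with the new setting:

```xml
<!-- hbase-site.xml: make FSYNC_WAL (HDFS hsync) the default durability for the synchronous WAL -->
<property>
  <name>hbase.wal.hsync</name>
  <value>true</value> <!-- default is false, i.e. SYNC_WAL (HDFS hflush) -->
</property>
```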


  • HBASE-19389 | Critical | Limit concurrency of put with dense (hundreds) columns to prevent write handler exhausted

HBASE-19389 introduces a RegionServer self-protection mechanism that prevents write handlers from being exhausted by highly concurrent puts with dense columns. It is controlled mainly through two new properties: hbase.region.store.parallel.put.limit.min.column.count decides which puts to limit, by the number of columns within a single column family (100 by default), and hbase.region.store.parallel.put.limit limits the concurrency (10 by default). There is another property for advanced users; please check the source and javadoc of StoreHotnessProtector for more details.
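
As a sketch, the two properties can be written out in hbase-site.xml as follows. The values shown are the defaults stated above; tune them only after checking the StoreHotnessProtector documentation:

```xml
<!-- hbase-site.xml: RegionServer protection against highly concurrent dense-column puts -->
<property>
  <name>hbase.region.store.parallel.put.limit.min.column.count</name>
  <value>100</value> <!-- puts with at least this many columns in one family are limited -->
</property>
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <value>10</value> <!-- concurrency limit for such puts -->
</property>
```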


  • HBASE-19397 | Major | Design procedures for ReplicationManager to notify peer change event from master

Introduces 5 procedures for peer modifications: AddPeerProcedure, RemovePeerProcedure, UpdatePeerConfigProcedure, EnablePeerProcedure, and DisablePeerProcedure.

The procedures are all executed in the following stages:

  1. Call pre CP hook, if an exception is thrown then give up
  2. Check whether the operation is valid, if not then give up
  3. Update peer storage. Notice that once we have entered this stage, we can no longer roll back.
  4. Schedule sub procedures to refresh the peer config on every RS.
  5. Do post cleanup if any.
  6. Call post CP hook. The exception thrown will be ignored since we have already done the work.

The procedure holds an exclusive lock on the peer id, so there are no concurrent modifications on a single peer.

It is now also guaranteed that once the procedure is done, the peer modification has taken effect on all RSes.

Abstracted a storage layer for replication peer/queue management, and refactored the upper layer to remove zk-related naming/code/comments.

Added pre/postExecuteProcedures CP hooks to RegionServerObserver, and added a permission check for the executeProcedures method, which requires the caller to be a system user or super user.

On rolling upgrade: just do not make any replication peer modifications during the rolling upgrade. There are no pb/layout changes to the peer/queue storage on zk.