Find file
Fetching contributors…
Cannot retrieve contributors at this time
507 lines (357 sloc) 21.3 KB
Yahoo! Distribution of Hadoop Change Log
Patches from the following Apache Jira issues have been applied
to this release in the order indicated. This is in addition to
the patches applied from issues referenced in CHANGES.txt.
HADOOP-6521. Fix backward compatiblity issue with umask when applications
use deprecated param dfs.umask in configuration or use
FsPermission.setUMask(). (suresh)
MAPREDUCE-1372. Fixed a ConcurrentModificationException in jobtracker.
(Arun C Murthy via yhemanth)
MAPREDUCE-1316. Fix jobs' retirement from the JobTracker to prevent memory
leaks via stale references. (Amar Kamat via acmurthy)
MAPREDUCE-1342. Fixed deadlock in global blacklisting of tasktrackers.
(Amareshwari Sriramadasu via acmurthy)
HADOOP-6460. Reinitializes buffers used for serializing responses in ipc
server on exceeding maximum response size to free up Java heap. (suresh)
MAPREDUCE-1100. Truncate user logs to prevent TaskTrackers' disks from
filling up. (Vinod Kumar Vavilapalli via acmurthy)
MAPREDUCE-1143. Fix running task counters to be updated correctly
when speculative attempts are running for a TIP.
(Rahul Kumar Singh via yhemanth)
HADOOP-6151, 6281, 6285, 6441. Add HTML quoting of the parameters to all
of the servlets to prevent XSS attacks. (omalley)
MAPREDUCE-896. Fix bug in earlier implementation to prevent
spurious logging in tasktracker logs for absent file paths.
(Ravi Gummadi via yhemanth)
MAPREDUCE-676. Fix Hadoop Vaidya to ensure it works for map-only jobs.
(Suhas Gogate via acmurthy)
HADOOP-5582. Fix Hadoop Vaidya to use new Counters in
org.apache.hadoop.mapreduce package. (Suhas Gogate via acmurthy)
HDFS-595. umask settings in configuration may now use octal or
symbolic instead of decimal. Update HDFS tests as such. (jghoman)
MAPREDUCE-1068. Added a verbose error message when user specifies an
incorrect -file parameter. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1171. Allow the read-error notification in shuffle to be
configurable. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-353. Allow shuffle read and connection timeouts to be
configurable. (Amareshwari Sriramadasu via acmurthy)
HADOOP-6428. HttpServer sleeps with negative values (cos)
HADOOP-6386. NameNode's HttpServer can't instantiate InetSocketAddress:
IllegalArgumentException is thrown. (cos)
HDFS-781. Namenode metrics PendingDeletionBlocks is not decremented. (suresh)
MAPREDUCE-1185. Redirect running job url to history url if job is already
retired. (Amareshwari Sriramadasu and Sharad Agarwal via sharad)
MAPREDUCE-754. Fix NPE in expiry thread when a TT is lost. (Amar Kamat
via sharad)
MAPREDUCE-896. Modify permissions for local files on tasktracker before
deletion so they can be deleted cleanly. (Ravi Gummadi via yhemanth)
HADOOP-5771. Implements unit tests for LinuxTaskController.
(Sreekanth Ramakrishnan and Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-1124. Import Gridmix3 and Rumen. (cdouglas)
MAPREDUCE-1063. Document gridmix benchmark. (cdouglas)
HDFS-758. Changes to report status of decommissioining on the namenode web
UI. (jitendra)
HADOOP-6234. Add new option dfs.umaskmode to set umask in configuration
to use octal or symbolic instead of decimal. (Jakob Homan via suresh)
MAPREDUCE-1147. Add map output counters to new API. (Amar Kamat via
MAPREDUCE-1182. Fix overflow in reduce causing allocations to exceed the
configured threshold. (cdouglas)
HADOOP-4933. Fixes a ConcurrentModificationException problem that shows up
when the history viewer is accessed concurrently.
(Amar Kamat via ddas)
MAPREDUCE-1140. Fix DistributedCache to not decrement reference counts for
unreferenced files in error conditions.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-6203. FsShell rm/rmr error message indicates exceeding Trash quota
and suggests using -skpTrash, when moving to trash fails.
(Boris Shkolnik via suresh)
HADOOP-5675. Do not launch a job if DistCp has no work to do. (Tsz Wo
(Nicholas), SZE via cdouglas)
HDFS-457. Better handling of volume failure in Data Node storage,
This fix is a port from hdfs-0.22 to common-0.20 by Boris Shkolnik.
Contributed by Erik Steffl
HDFS-625. Fix NullPointerException thrown from ListPathServlet.
Contributed by Suresh Srinivas.
HADOOP-6343. Log unexpected throwable object caught in RPC.
Contributed by Jitendra Nath Pandey
MAPREDUCE-1186. Fixed DistributedCache to do a recursive chmod on just the
per-cache directory, not all of mapred.local.dir.
(Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1231. Add an option to distcp to ignore checksums when used with
the upgrade option.
(Jothi Padmanabhan via yhemanth)
MAPREDUCE-1219. Fixed JobTracker to not collect per-job metrics, thus
easing load on it. (Amareshwari Sriramadasu via acmurthy)
HDFS-761. Fix failure to process rename operation from edits log due to
quota verification. (suresh)
MAPREDUCE-1196. Fix FileOutputCommitter to use the deprecated cleanupJob
api correctly. (acmurthy)
HADOOP-6344. rm and rmr immediately delete files rather than sending
to trash, despite trash being enabled, if a user is over-quota. (jhoman)
MAPREDUCE-1160. Reduce verbosity of log lines in some Map/Reduce classes
to avoid filling up jobtracker logs on a busy cluster.
(Ravi Gummadi and Hong Tang via yhemanth)
HDFS-587. Add ability to run HDFS with MR test on non-default queue,
also updated junit dependendcy from junit-3.8.1 to junit-4.5 (to make
it possible to use Configured and Tool to process command line to
be able to specify a queue). Contributed by Erik Steffl.
MAPREDUCE-1158. Fix JT running maps and running reduces metrics.
MAPREDUCE-947. Fix bug in earlier implementation that was
causing unit tests to fail.
(Ravi Gummadi via yhemanth)
MAPREDUCE-1062. Fix MRReliabilityTest to work with retired jobs
(Contributed by Sreekanth Ramakrishnan)
MAPREDUCE-1090. Modified log statement in TaskMemoryManagerThread to
include task attempt id. (yhemanth)
MAPREDUCE-1098. Fixed the distributed-cache to not do i/o while
holding a global lock. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1048. Add occupied/reserved slot usage summary on
jobtracker UI. (Amareshwari Sriramadasu via sharad)
MAPREDUCE-1103. Added more metrics to Jobtracker. (sharad)
MAPREDUCE-947. Added commitJob and abortJob apis to OutputCommitter.
Enhanced FileOutputCommitter to create a _SUCCESS file for successful
jobs. (Amar Kamat & Jothi Padmanabhan via acmurthy)
MAPREDUCE-1105. Remove max limit configuration in capacity scheduler in
favor of max capacity percentage thus allowing the limit to go over
queue capacity. (Rahul Kumar Singh via yhemanth)
MAPREDUCE-1086. Setup Hadoop logging environment for tasks to point to
task related parameters. (Ravi Gummadi via yhemanth)
MAPREDUCE-739. Allow relative paths to be created inside archives.
HADOOP-6097. Multiple bugs w/ Hadoop archives (mahadev)
HADOOP-6231. Allow caching of filesystem instances to be disabled on a
per-instance basis (ben slusky via mahadev)
MAPREDUCE-826. harchive doesn't use ToolRunner / harchive returns 0 even
if the job fails with exception (koji via mahadev)
HDFS-686. NullPointerException is thrown while merging edit log and
image. (hairong)
HDFS-709. Fix TestDFSShell failure due to rename bug introduced by
HDFS-677. (suresh)
HDFS-677. Rename failure when both source and destination quota exceeds
results in deletion of source. (suresh)
HADOOP-6284. Add a new parameter, HADOOP_JAVA_PLATFORM_OPTS, to so that it allows setting java command options for
JAVA_PLATFORM. (Koji Noguchi via szetszwo)
MAPREDUCE-732. Removed spurious log statements in the node
blacklisting logic. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-144. Includes dump of the process tree in task diagnostics when
a task is killed due to exceeding memory limits.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-979. Fixed JobConf APIs related to memory parameters to
return values of new configuration variables when deprecated
variables are disabled. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-277. Makes job history counters available on the job history
viewers. (Jothi Padmanabhan via ddas)
HADOOP-5625. Add operation duration to clienttrace. (Lei Xu
via cdouglas)
HADOOP-5222. Add offset to datanode clienttrace. (Lei Xu via cdouglas)
HADOOP-6218. Adds a feature where TFile can be split by Record
Sequence number. Contributed by Hong Tang and Raghu Angadi.
MAPREDUCE-1088. Changed permissions on JobHistory files on local disk to
0744. Contributed by Arun C. Murthy.
HADOOP-6304. Use{Readable|Writable|Executable} where
possible in RawLocalFileSystem. Contributed by Arun C. Murthy.
MAPREDUCE-270. Fix the tasktracker to optionally send an out-of-band
heartbeat on task-completion for better job-latency. Contributed by
Arun C. Murthy
Configuration changes:
add mapreduce.tasktracker.outofband.heartbeat
MAPREDUCE-1030. Fix capacity-scheduler to assign a map and a reduce task
per-heartbeat. Contributed by Rahuk K Singh.
MAPREDUCE-1028. Fixed number of slots occupied by cleanup tasks to one
irrespective of slot size for the job. Contributed by Ravi Gummadi.
MAPREDUCE-964. Fixed start and finish times of TaskStatus to be
consistent, thereby fixing inconsistencies in metering tasks.
Contributed by Sreekanth Ramakrishnan.
HADOOP-5976. Add a new command, classpath, to the hadoop
script. Contributed by Owen O'Malley and Gary Murry
HADOOP-5784. Makes the number of heartbeats that should arrive
a second at the JobTracker configurable. Contributed by
Amareshwari Sriramadasu.
MAPREDUCE-945. Modifies MRBench and TestMapRed to use
ToolRunner so that options such as queue name can be
passed via command line. Contributed by Sreekanth Ramakrishnan.
HADOOP:5420 Correct bug in earlier implementation
by Arun C. Murthy
HADOOP-5363 Add support for proxying connections to multiple
clusters with different versions to hdfsproxy. Contributed
by Zhiyong Zhang
HADOOP-5780. Improve per block message prited by -metaSave
in HDFS. (Raghu Angadi)
HADOOP-6227. Fix Configuration to allow final parameters to be set
to null and prevent them from being overridden. Contributed by
Amareshwari Sriramadasu.
MAPREDUCE-430 Added patch supplied by Amar Kamat to allow
roll forward on branch to includ externally committed
MAPREDUCE-768. Provide an option to dump jobtracker configuration in
JSON format to standard output. Contributed by V.V.Chaitanya
MAPREDUCE-834 Correct an issue created by merging this issue with
patch attached to external Jira.
HADOOP-6184 Provide an API to dump Configuration in a JSON format.
Contributed by V.V.Chaitanya Krishna.
MAPREDUCE-745 Patch added for this issue to allow branch-0.20 to
merge cleanly.
MAPREDUCE:478 Allow map and reduce jvm parameters, environment
variables and ulimit to be set separately.
MAPREDUCE:682 Removes reservations on tasktrackers which are blacklisted.
Contributed by Sreekanth Ramakrishnan.
HADOOP:5420 Support killing of process groups in LinuxTaskController
HADOOP-5488 Removes the pidfile management for the Task JVM from the
framework and instead passes the PID back and forth between the
TaskTracker and the Task processes. Contributed by Ravi Gummadi.
MAPREDUCE:467 Provide ability to collect statistics about total tasks and
succeeded tasks in different time windows.
MAPREDUCE-817. Add a cache for retired jobs with minimal job
info and provide a way to access history file url
MAPREDUCE-814. Provide a way to configure completed job history
files to be on HDFS.
MAPREDUCE-838 Fixes a problem in the way commit of task outputs
happens. The bug was that even if commit failed, the task would be
declared as successful. Contributed by Amareshwari Sriramadasu.
MAPREDUCE-809 Fix job-summary logs to correctly record final status of
MAPREDUCE-740 Log a job-summary at the end of a job, while
allowing it to be configured to use a custom appender if desired.
MAPREDUCE-771 Fixes a bug which delays normal jobs in favor of
high-ram jobs.
HADOOP-5420 Support setsid based kill in LinuxTaskController.
MAPREDUCE-733 Fixes a bug that when a task tracker is killed ,
it throws exception. Instead it should catch it and process it and
allow the rest of the flow to go through
MAPREDUCE-734 Fixes a bug which prevented hi ram jobs from being
removed from the scheduler queue.
MAPREDUCE-693 Fixes a bug that when a job is submitted and the
JT is restarted (before job files have been written) and the job
is killed after recovery, the conf files fail to be moved to the
"done" subdirectory.
MAPREDUCE-722 Fixes a bug where more slots are getting reserved
for HiRAM job tasks than required.
MAPREDUCE-683 TestJobTrackerRestart failed because of stale
filemanager cache (which was created once per jvm). This patch makes
sure that the filemanager is inited upon every JobHistory.init()
and hence upon every restart. Note that this wont happen in production
as upon a restart the new jobtracker will start in a new jvm and
hence a new cache will be created.
MAPREDUCE-709 Fixes a bug where node health check script does
not display the correct message on timeout.
MAPREDUCE-708 Fixes a bug where node health check script does
not refresh the "reason for blacklisting".
MAPREDUCE-522 Rewrote TestQueueCapacities to make it simpler
and avoid timeout errors.
MAPREDUCE-532 Provided ability in the capacity scheduler to
limit the number of slots that can be concurrently used per queue
at any given time.
MAPREDUCE-211 Provides ability to run a health check script on
the tasktracker nodes and blacklist nodes if they are unhealthy.
Contributed by Sreekanth Ramakrishnan.
MAPREDUCE-516 Remove .orig file included by mistake.
MAPREDUCE-416 Moves the history file to a "done" folder whenever
a job completes.
HADOOP-5980 Previously, task spawned off by LinuxTaskController
didn't get LD_LIBRARY_PATH in their environment. The tasks will now
get same LD_LIBRARY_PATH value as when spawned off by
HADOOP-5981 This issue completes the feature mentioned in
HADOOP-2838. HADOOP-2838 provided a way to set env variables in
child process. This issue provides a way to inherit tt's env variables
and append or reset it. So now X=$X:y will inherit X (if there) and
append y to it.
HADOOP-5419 This issue is to provide an improvement on the
existing M/R framework to let users know which queues they have
access to, and for what operations. One use case for this would
that currently there is no easy way to know if the user has access
to submit jobs to a queue, until it fails with an access control
HADOOP-5420 Support setsid based kill in LinuxTaskController.
HADOOP-5643 Added the functionality to refresh jobtrackers node
list via command line (bin/hadoop mradmin -refreshNodes). The command
should be run as the jobtracker owner (jobtracker process owner)
or from a super group (mapred.permissions.supergroup).
HADOOP-2838 Now the users can set environment variables using
mapred.child.env. They can do the following X=Y : set X to Y X=$X:Y
: Append Y to X (which should be taken from the tasktracker)
HADOOP-5818. Revert the renaming from FSNamesystem.checkSuperuserPrivilege
to checkAccess by HADOOP-5643. (Amar Kamat via szetszwo)
HADOOP-5801. Fixes the problem: If the hosts file is changed across restart
then it should be refreshed upon recovery so that the excluded hosts are
lost and the maps are re-executed. (Amar Kamat via ddas)
HADOOP-5643. HADOOP-5643. Adds a way to decommission TaskTrackers
while the JobTracker is running. (Amar Kamat via ddas)
HADOOP-5419. Provide a facility to query the Queue ACLs for the
current user. (Rahul Kumar Singh via yhemanth)
HADOOP-5733. Add map/reduce slot capacity and blacklisted capacity to
JobTracker metrics. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-5738. Split "waiting_tasks" JobTracker metric into waiting maps and
waiting reduces. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-4842. Streaming now allows specifiying a command for the combiner.
(Amareshwari Sriramadasu via ddas)
HADOOP-4490. Provide ability to run tasks as job owners.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5442. Paginate jobhistory display and added some search
capabilities. (Amar Kamat via acmurthy)
HADOOP-3327. Improves handling of READ_TIMEOUT during map output copying.
(Amareshwari Sriramadasu via ddas)
HADOOP-5113. Fixed logcondense to remove files for usernames
beginning with characters specified in the -l option.
(Peeyush Bishnoi via yhemanth)
HADOOP-2898. Provide an option to specify a port range for
Hadoop services provisioned by HOD.
(Peeyush Bishnoi via yhemanth)
HADOOP-4930. Implement a Linux native executable that can be used to
launch tasks as users. (Sreekanth Ramakrishnan via yhemanth)