Toddlipcon master jan 03 #40

Merged
merged 8 commits into from

3 participants

@rangadi
Collaborator

This merges https://github.com/toddlipcon/hadoop-lzo master into kw/master.

  • It avoids repeatedly creating a default Configuration when reinit() is called with a null Configuration.
  • It updates the hadoop-core jar to hadoop-core-0.20.2-cdh3u1.jar.
toddlipcon added some commits
@toddlipcon toddlipcon Fix performance issue when reinit() is called with a null Configuration
Previously, this would instantiate a new Configuration object on every call,
which involved re-reading and parsing the configuration XML files to
load the defaults. This was very slow.

The new version caches a default Configuration object statically
and uses that one in this circumstance.
47d4714
@toddlipcon toddlipcon Merge remote branch 'kw/master' ceb643f
@toddlipcon toddlipcon Make LzoInputFormatCommon abstract and add javadoc de79ad3
@toddlipcon toddlipcon Fix some javadoc formatting a61e99b
@toddlipcon toddlipcon Bump version to 0.4.14 8aa0605
@toddlipcon toddlipcon Bump to cdh3u1 release dependency b7f412d
@toddlipcon toddlipcon Merge remote branch 'kw/master' b39db4e
@toddlipcon toddlipcon Bump version to 0.4.15 after merging with kw c7d54ff
@dvryaboy dvryaboy merged commit 855d328 into from
Commits on Sep 1, 2011
  1. @toddlipcon

    Fix performance issue when reinit() is called with a null Configuration

    toddlipcon authored
Commits on Sep 5, 2011
  1. @toddlipcon
  2. @toddlipcon
  3. @toddlipcon
  4. @toddlipcon

    Bump version to 0.4.14

    toddlipcon authored
  5. @toddlipcon
Commits on Nov 26, 2011
  1. @toddlipcon
  2. @toddlipcon
4 build.xml
@@ -28,9 +28,9 @@
<property name="Name" value="Hadoop GPL Compression"/>
<property name="name" value="hadoop-lzo"/>
- <property name="version" value="0.4.14"/>
+ <property name="version" value="0.4.15"/>
<property name="final.name" value="${name}-${version}"/>
- <property name="year" value="2008"/>
+ <property name="year" value="2011"/>
<property name="src.dir" value="${basedir}/src"/>
<property name="java.src.dir" value="${src.dir}/java"/>
BIN  lib/hadoop-core-0.20.3-CDH3-SNAPSHOT.jar → lib/hadoop-core-0.20.2-cdh3u1.jar
Binary file not shown
9 src/java/com/hadoop/compression/lzo/LzoCompressor.java
@@ -60,6 +60,13 @@
private int lzoCompressionLevel;
/**
+ * Used when the user doesn't specify a configuration. We cache a single
+ * one statically, since loading the defaults is expensive.
+ */
+ private static Configuration defaultConfiguration =
+ new Configuration();
+
+ /**
* The compression algorithm for lzo library.
*/
public static enum CompressionStrategy {
@@ -200,7 +207,7 @@ public void reinit(Configuration conf) {
// and the new user of the codec doesn't specify a particular configuration
// to CodecPool.getCompressor(). So we use the defaults.
if (conf == null) {
- conf = new Configuration();
+ conf = defaultConfiguration;
}
LzoCompressor.CompressionStrategy strategy = LzoCodec.getCompressionStrategy(conf);
int compressionLevel = LzoCodec.getCompressionLevel(conf);
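
The LzoCompressor change above can be sketched in isolation. What follows is a minimal, self-contained illustration under stated assumptions, not the project's code: `FakeConfiguration` is a hypothetical stand-in for Hadoop's `Configuration` that only counts how often it is constructed, and `CachedDefaultSketch` and `loadCount()` are names invented here.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CachedDefaultSketch {
    /** Hypothetical stand-in for Hadoop's Configuration; it only counts constructions. */
    static final class FakeConfiguration {
        static final AtomicInteger LOADS = new AtomicInteger();

        FakeConfiguration() {
            // Stands in for the expensive step: re-reading and parsing the XML defaults.
            LOADS.incrementAndGet();
        }
    }

    /** Built once when the class is initialized, like the static field added in the diff. */
    private static final FakeConfiguration DEFAULT_CONFIGURATION = new FakeConfiguration();

    private FakeConfiguration conf;

    /** Falls back to the shared default instead of building a new one on every call. */
    public void reinit(FakeConfiguration conf) {
        if (conf == null) {
            conf = DEFAULT_CONFIGURATION;
        }
        this.conf = conf;
    }

    /** Number of FakeConfiguration objects constructed so far. */
    public static int loadCount() {
        return FakeConfiguration.LOADS.get();
    }

    public static void main(String[] args) {
        CachedDefaultSketch compressor = new CachedDefaultSketch();
        for (int i = 0; i < 1000; i++) {
            compressor.reinit(null);
        }
        System.out.println("default configurations built: " + loadCount());
    }
}
```

In this sketch the construction count stays at one no matter how many times `reinit(null)` is called. Because the cached instance is shared by every caller that passes null, the pattern presumably relies on callers treating it as read-only; that caveat is an observation about the sketch, not something stated in the pull request.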
7 src/java/com/hadoop/compression/lzo/LzoInputFormatCommon.java
@@ -23,7 +23,10 @@
import com.hadoop.compression.lzo.LzoIndexer;
import com.hadoop.compression.lzo.LzopCodec;
-public class LzoInputFormatCommon {
+/**
+ * Utilities used by the two LzoInputFormat implementations.
+ */
+public abstract class LzoInputFormatCommon {
/**
* The boolean property <code>lzo.text.input.format.ignore.nonlzo</code> tells
* the LZO text input format whether it should silently ignore non-LZO input
@@ -72,4 +75,4 @@ public static boolean isLzoFile(String filename) {
public static boolean isLzoIndexFile(String filename) {
return filename.endsWith(FULL_LZO_INDEX_SUFFIX);
}
-}
+}
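
The shape of an abstract, static-only helper class like the one in this diff can be sketched roughly as follows. The class name `LzoSuffixSketch` and the suffix values `.lzo` and `.lzo.index` are assumptions made for illustration; the hunk above shows only the constant name `FULL_LZO_INDEX_SUFFIX`, not its value.

```java
abstract class LzoSuffixSketch {
    // Assumed values; the real constants are defined outside the hunk shown above.
    static final String ASSUMED_LZO_SUFFIX = ".lzo";
    static final String ASSUMED_FULL_LZO_INDEX_SUFFIX = ".lzo.index";

    /** True when the name ends with the assumed LZO data-file suffix. */
    static boolean isLzoFile(String filename) {
        return filename.endsWith(ASSUMED_LZO_SUFFIX);
    }

    /** True when the name ends with the assumed LZO index-file suffix. */
    static boolean isLzoIndexFile(String filename) {
        return filename.endsWith(ASSUMED_FULL_LZO_INDEX_SUFFIX);
    }

    public static void main(String[] args) {
        System.out.println(isLzoFile("part-00000.lzo"));
        System.out.println(isLzoIndexFile("part-00000.lzo.index"));
        // Writing `new LzoSuffixSketch()` here would not compile:
        // an abstract class cannot be instantiated directly.
    }
}
```

Declaring the class abstract marks it as a holder for static helpers and blocks direct instantiation, although a concrete subclass could still be created; a private constructor is the other common way to express the same intent.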
4 src/java/com/hadoop/mapred/DeprecatedLzoTextInputFormat.java
@@ -52,12 +52,12 @@
* com.hadoop.mapred.DeprecatedLzoTextInputFormat, not
* com.hadoop.mapreduce.LzoTextInputFormat. The classes attempt to be alike in
* every other respect.
- *
+ * <p>
* Note that to use this input format properly with hadoop-streaming, you should
* also set the property <code>stream.map.input.ignoreKey=true</code>. That will
* replicate the behavior of the default TextInputFormat by stripping off the byte
* offset keys from the input lines that get piped to the mapper process.
- *
+ * <p>
* See {@link LzoInputFormatCommon} for a description of the boolean property
* <code>lzo.text.input.format.ignore.nonlzo</code> and how it affects the
* behavior of this input format.
2  src/java/com/hadoop/mapreduce/LzoTextInputFormat.java
@@ -47,7 +47,7 @@
* An {@link InputFormat} for lzop compressed text files. Files are broken into
* lines. Either linefeed or carriage-return are used to signal end of line.
* Keys are the position in the file, and values are the line of text.
- *
+ * <p>
* See {@link LzoInputFormatCommon} for a description of the boolean property
* <code>lzo.text.input.format.ignore.nonlzo</code> and how it affects the
* behavior of this input format.