OOZIE-1218 Create a HCatalog Integration Guide (rohini via virag)

git-svn-id: https://svn.apache.org/repos/asf/oozie/trunk@1453463 13f79535-47bb-0310-9956-ffa450edef68
commit 7c6b7c04cfcffefd0527462796f4d577e2685df3 1 parent 5ee1756
Virag Kothari authored
6 core/src/main/resources/oozie-default.xml
@@ -133,7 +133,7 @@
<!-- HCatAccessorService -->
<property>
- <name>oozie.service.HCatAccessorService.connections</name>
+ <name>oozie.service.HCatAccessorService.jmsconnections</name>
<value>
default=java.naming.factory.initial#org.apache.activemq.jndi.ActiveMQInitialContextFactory;java.naming.provider.url#tcp://localhost:61616;connectionFactoryNames#ConnectionFactory
</value>
@@ -142,7 +142,7 @@
identifies the HCatalog server URL. "default" is used if no endpoint is mentioned
in the query. If some JMS property is not defined, the system will use the property
defined jndi.properties. jndi.properties files is retrieved from the application classpath.
- Mapping rules can also be provided for mapping Hcatalog server names to JMS server.
+ Mapping rules can also be provided for mapping HCatalog servers to corresponding JMS providers.
hcat://${1}.${2}.server.com:8020=java.naming.factory.initial#Dummy.Factory;java.naming.provider.url#tcp://broker.${2}:61616
</description>
</property>
@@ -1753,7 +1753,7 @@
<name>oozie.service.URIHandlerService.uri.handlers</name>
<value>org.apache.oozie.dependency.FSURIHandler</value>
<description>
- Enlist the different uri schemes supported for data availability checks.
+ Enlist the different uri handlers supported for data availability checks.
</description>
</property>
<!-- Oozie HTTP Notifications -->
105 docs/src/site/twiki/AG_Install.twiki
@@ -400,6 +400,111 @@ The above value, =hdfs=, which is the default, means that Oozie will only allow
filesystems that Oozie is compatible with are: hdfs, hftp, webhdfs, and viewfs. Multiple filesystems can be specified as
comma-separated values. Putting a * will allow any filesystem type, effectively disabling this check.
+---+++ HCatalog Configuration
+
+Refer to the [[DG_HCatalogIntegration][Oozie HCatalog Integration]] document for an overview of HCatalog and
+integration of Oozie with HCatalog. This section explains the various settings to be configured in oozie-site.xml on
+the Oozie server to enable Oozie to work with HCatalog.
+
+*Adding HCatalog jars to Oozie war:*
+
+ For the Oozie server to talk to the HCatalog server, the HCatalog and Hive jars need to be in the server classpath.
+hive-site.xml, which has the configuration to talk to the HCatalog server, also needs to be in the classpath.
+
+The oozie-[version]-hcataloglibs.tar.gz in the Oozie distribution bundles the required HCatalog and Hive jars that
+need to be placed in the Oozie server classpath. If using a version of HCatalog bundled in
+Oozie hcataloglibs/, copy the corresponding HCatalog jars from hcataloglibs/ to the libext/ directory. If using a
+different version of HCatalog, copy the required HCatalog jars from that version into the libext/ directory.
+This needs to be done before running the =oozie-setup.sh= script so that these jars get added to the Oozie WAR file.
+
+*Configure HCatalog URI Handling:*
+
+<verbatim>
+ <property>
+ <name>oozie.service.URIHandlerService.uri.handlers</name>
+ <value>org.apache.oozie.dependency.FSURIHandler,org.apache.oozie.dependency.HCatURIHandler</value>
+ <description>
+ Enlist the different uri handlers supported for data availability checks.
+ </description>
+ </property>
+</verbatim>
+
+The above configuration defines the different uri handlers which check for existence of data dependencies defined in a
+Coordinator. The default value is =org.apache.oozie.dependency.FSURIHandler=. FSURIHandler supports uris with
+schemes defined in the configuration =oozie.service.HadoopAccessorService.supported.filesystems= which are hdfs, hftp
+and webhdfs by default. HCatURIHandler supports uris with the =hcat= scheme.
+
+*Configure HCatalog services:*
+
+<verbatim>
+ <property>
+ <name>oozie.services.ext</name>
+ <value>
+ org.apache.oozie.service.JMSAccessorService,
+ org.apache.oozie.service.PartitionDependencyManagerService,
+ org.apache.oozie.service.HCatAccessorService
+ </value>
+ <description>
+ To add/replace services defined in 'oozie.services' with custom implementations.
+ Class names must be separated by commas.
+ </description>
+ </property>
+</verbatim>
+
+PartitionDependencyManagerService and HCatAccessorService are required to work with HCatalog and support Coordinators
+having HCatalog uris as data dependency. If the HCatalog server is configured to publish partition availability
+notifications to a JMS compliant messaging provider like ActiveMQ, then JMSAccessorService needs to be added
+to =oozie.services.ext= to handle those notifications.
+
+*Configure JMS Provider JNDI connection mapping for HCatalog:*
+
+<verbatim>
+ <property>
+ <name>oozie.service.HCatAccessorService.jmsconnections</name>
+ <value>
+ hcat://hcatserver.colo1.com:8020=java.naming.factory.initial#Dummy.Factory;java.naming.provider.url#tcp://broker.colo1.com:61616,
+ default=java.naming.factory.initial#org.apache.activemq.jndi.ActiveMQInitialContextFactory;java.naming.provider.url#tcp://broker.colo.com:61616;connectionFactoryNames#ConnectionFactory
+ </value>
+ <description>
+ Specify the map of endpoints to JMS configuration properties. In general, endpoint
+ identifies the HCatalog server URL. "default" is used if no endpoint is mentioned
+ in the query. If some JMS property is not defined, the system will use the property
+ defined in jndi.properties. The jndi.properties file is retrieved from the application classpath.
+ Mapping rules can also be provided for mapping HCatalog servers to corresponding JMS providers.
+ hcat://${1}.${2}.com:8020=java.naming.factory.initial#Dummy.Factory;java.naming.provider.url#tcp://broker.${2}.com:61616
+ </description>
+ </property>
+</verbatim>
+
+ Currently HCatalog does not provide APIs to get the connection details of the JMS Provider it publishes
+notifications to. It only has APIs which provide the topic name in the JMS Provider to which the notifications are
+published for a given database table. So the JMS Provider's connection properties need to be manually configured
+in Oozie using the above setting. You can provide a =default= JNDI configuration which will be used as the
+JMS Provider for all HCatalog servers, specify a configuration per HCatalog server URL, or provide a
+configuration based on a rule matching multiple HCatalog server URLs. For example, with the configuration
+hcat://${1}.${2}.com:8020=java.naming.factory.initial#Dummy.Factory;java.naming.provider.url#tcp://broker.${2}.com:61616,
+a request URL of hcat://server1.colo1.com:8020 will map to tcp://broker.colo1.com:61616, hcat://server2.colo2.com:8020
+will map to tcp://broker.colo2.com:61616, and so on.
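The rule-based mapping described above can be illustrated with a small sketch. This is not Oozie's implementation: the function name =resolve_jms_mapping= is hypothetical, and it assumes that each =${n}= placeholder in the rule key matches one host name segment (no dots, slashes or colons).

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def resolve_jms_mapping(rule_key, rule_value, hcat_url):
    """Fill ${n} placeholders in rule_value from hcat_url.

    Hypothetical helper for illustration only. Returns None when
    hcat_url does not match rule_key.
    """
    # re.split with a capturing group yields literal text and
    # placeholder indexes alternately: literal, index, literal, ...
    parts = PLACEHOLDER.split(rule_key)
    pattern = ""
    indexes = []
    for position, part in enumerate(parts):
        if position % 2 == 0:
            pattern += re.escape(part)   # literal text of the rule key
        else:
            indexes.append(part)         # placeholder number, e.g. "2"
            pattern += r"([^./:]+)"      # assumption: one host segment
    match = re.fullmatch(pattern, hcat_url)
    if match is None:
        return None
    captured = dict(zip(indexes, match.groups()))
    # Leave any placeholder that the key did not define untouched.
    return PLACEHOLDER.sub(
        lambda m: captured.get(m.group(1), m.group(0)), rule_value
    )


rule_key = "hcat://${1}.${2}.com:8020"
rule_value = (
    "java.naming.factory.initial#Dummy.Factory;"
    "java.naming.provider.url#tcp://broker.${2}.com:61616"
)
print(resolve_jms_mapping(rule_key, rule_value, "hcat://server1.colo1.com:8020"))
```

With the rule from the paragraph above, the URL hcat://server1.colo1.com:8020 resolves to a provider URL of tcp://broker.colo1.com:61616, and a URL that does not fit the rule yields no mapping.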
+
+*Configure HCatalog Polling Frequency:*
+
+<verbatim>
+ <property>
+ <name>oozie.service.coord.push.check.requeue.interval</name>
+ <value>600000</value>
+ <description>Command re-queue interval for push dependencies (in milliseconds).
+ </description>
+ </property>
+</verbatim>
+
+ If there is no JMS Provider configured for an HCatalog server, then Oozie polls HCatalog based on the frequency defined
+in =oozie.service.coord.input.check.requeue.interval=. This config also applies to HDFS polling.
+If there is a JMS Provider configured for an HCatalog server, then Oozie polls HCatalog based on the frequency defined
+in =oozie.service.coord.push.check.requeue.interval= as a fallback.
+The defaults for =oozie.service.coord.input.check.requeue.interval= and =oozie.service.coord.push.check.requeue.interval=
+are 1 minute and 10 minutes respectively.
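As a rough sketch of the rule above, the choice between the two intervals can be written as follows. The function and constant names are hypothetical and this is not Oozie code; only the property names and default values come from the text.

```python
# Defaults stated above, in milliseconds.
INPUT_CHECK_REQUEUE_INTERVAL_MS = 60_000    # oozie.service.coord.input.check.requeue.interval
PUSH_CHECK_REQUEUE_INTERVAL_MS = 600_000    # oozie.service.coord.push.check.requeue.interval


def hcat_poll_interval_ms(jms_provider_configured):
    """Return the HCatalog polling interval for one server (hypothetical helper).

    With a JMS Provider, polling is only a fallback and uses the push
    interval; without one, Oozie polls at the input check interval.
    """
    if jms_provider_configured:
        return PUSH_CHECK_REQUEUE_INTERVAL_MS
    return INPUT_CHECK_REQUEUE_INTERVAL_MS


for configured in (False, True):
    minutes = hcat_poll_interval_ms(configured) / 60_000
    print(f"JMS configured={configured}: poll every {minutes:g} minute(s)")
```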
+
---+++ Fine Tuning an Oozie Server
Refer to the [[./oozie-default.xml][oozie-default.xml]] for details.
9 docs/src/site/twiki/CoordinatorFunctionalSpec.twiki
@@ -2313,7 +2313,8 @@ aggregated daily output.
*%GREEN% Example: %ENDCOLOR%*
#HCatPigExampleOne
----++++ Coordinator application definition:
+
+*Coordinator application definition:*
<blockquote>
<coordinator-app name="app-coord" frequency="${coord:days(1)}"
@@ -2390,7 +2391,8 @@ to pass the partition key-value string needed by the *HCatStorer* in Pig job whe
coordinator action.
#HCatWorkflow
----++++ Workflow definition:
+
+*Workflow definition:*
<blockquote>
<workflow-app xmlns="uri:oozie:workflow:0.3" name="logsprocessor-wf">
@@ -2494,7 +2496,8 @@ partition max/min values, output partition value, and database and table.
*%GREEN% Example: %ENDCOLOR%*
#HCatPigExampleTwo
----++++ Coordinator application definition:
+
+*Coordinator application definition:*
<blockquote>
<coordinator-app name="app-coord" frequency="${coord:days(1)}"
101 docs/src/site/twiki/DG_HCatalogIntegration.twiki
@@ -0,0 +1,101 @@
+<noautolink>
+
+[[index][::Go back to Oozie Documentation Index::]]
+
+---+!! HCatalog Integration (Since Oozie 4.x)
+
+%TOC%
+
+---++ HCatalog Overview
+ HCatalog is a table and storage management layer for Hadoop that enables users with different data processing
+tools - Pig, MapReduce, and Hive - to more easily read and write data on the grid. HCatalog's table abstraction presents
+users with a relational view of data in the Hadoop distributed file system (HDFS).
+
+ Read [[http://incubator.apache.org/hcatalog/docs/r0.5.0/index.html][HCatalog Documentation]] to learn more about HCatalog.
+Working with HCatalog using Pig is detailed in
+[[http://incubator.apache.org/hcatalog/docs/r0.5.0/loadstore.html][HCatLoader and HCatStorer]].
+Working with HCatalog using MapReduce directly is detailed in
+[[http://incubator.apache.org/hcatalog/docs/r0.5.0/inputoutput.html][HCatInputFormat and HCatOutputFormat]].
+
+---+++ HCatalog notifications
+ HCatalog provides notifications through a JMS provider like ActiveMQ when a new partition is added to a table in the
+database. This allows applications to consume those events and schedule the work that depends on them. In the case of Oozie,
+the notifications are used to determine the availability of HCatalog partitions defined as data dependencies in the
+Coordinator and trigger workflows.
+
+Read [[http://incubator.apache.org/hcatalog/docs/r0.5.0/notification.html][HCatalog Notification]] to know more about
+notifications in HCatalog.
+
+---++ Oozie HCatalog Integration
+ Until now, Oozie's Coordinators have supported HDFS directories as an input data dependency. When an HDFS URI
+template is specified as a dataset and input events are defined in Coordinator for the dataset, Oozie performs data
+availability checks by polling the HDFS directory URIs resolved based on the nominal time. When all the data
+dependencies are met, the Coordinator's workflow is triggered which then consumes the available HDFS data.
+
+With the addition of HCatalog support, Coordinators also support specifying a set of HCatalog table partitions as a dataset.
+The workflow is triggered when the HCatalog table partitions are available and the workflow actions can then read the
+partition data. A mix of HDFS and HCatalog dependencies can be specified as input data dependencies.
+Similar to HDFS directories, HCatalog table partitions can also be specified as output dataset events.
+
+With HDFS data dependencies, Oozie has to poll HDFS every time to determine the availability of a directory.
+If the HCatalog server is configured to publish partition availability notifications to a JMS provider, Oozie can be
+configured to subscribe to it and trigger jobs immediately. This pub-sub model reduces pressure on the NameNode and also
+cuts down on delays caused by polling intervals.
+
+In the absence of a message bus in the deployment, Oozie will always
+poll the HCatalog server directly for partition availability with the same frequency as the HDFS polling. Even when
+subscribed to notifications, Oozie falls back to polling the HCatalog server for partitions that were available before the
+coordinator action was materialized and to deal with missed notifications due to system downtimes. The fallback polling
+usually runs less often than the constant polling: the default intervals are 10 minutes and 1 minute respectively.
+
+
+---+++ Oozie Server Configuration
+ Refer to the [[AG_Install#HCatalog_Configuration][HCatalog Configuration]] section of the [[AG_Install][Oozie Install]]
+documentation for the Oozie server side configuration required to support HCatalog table partitions as a data dependency.
+
+---+++ HCatalog URI Format
+
+Oozie supports specifying HCatalog partitions as a data dependency through a URI notation. The HCatalog partition URI is
+used to identify a set of table partitions: hcat://bar:8020/logsDB/logsTable/dt=20090415;region=US.
+
+The format to specify an HCatalog table partition URI is
+hcat://[metastore server]:[port]/[database name]/[table name]/[partkey1]=[value];[partkey2]=[value];...
+
+For example,
+<verbatim>
+ <dataset name="logs" frequency="${coord:days(1)}"
+ initial-instance="2009-02-15T08:15Z" timezone="America/Los_Angeles">
+ <uri-template>
+ hcat://myhcatmetastore:9080/database1/table1/datestamp=${YEAR}${MONTH}${DAY}${HOUR};region=USA
+ </uri-template>
+ </dataset>
+</verbatim>
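To make the URI format above concrete, here is a minimal sketch that splits a resolved HCatalog partition URI into its components. The function name =parse_hcat_uri= and the returned dictionary keys are hypothetical, and it assumes that EL variables such as =${YEAR}= have already been resolved.

```python
from urllib.parse import urlsplit


def parse_hcat_uri(uri):
    """Split hcat://server:port/db/table/key=value;key=value into parts.

    Hypothetical helper for illustration only; not part of Oozie.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hcat":
        raise ValueError(f"not an hcat URI: {uri}")
    segments = parts.path.strip("/").split("/")
    if len(segments) != 3:
        raise ValueError(f"expected /database/table/partitions in: {uri}")
    database, table, partition_spec = segments
    # Partition keys are separated by ';', each given as key=value.
    partitions = dict(
        pair.split("=", 1) for pair in partition_spec.split(";") if pair
    )
    return {
        "server": parts.hostname,
        "port": parts.port,
        "database": database,
        "table": table,
        "partitions": partitions,
    }


print(parse_hcat_uri("hcat://bar:8020/logsDB/logsTable/dt=20090415;region=US"))
```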
+
+---+++ HCatalog Libraries
+
+A workflow action interacting with HCatalog requires the following jars in the classpath: hcatalog-core.jar,
+webhcat-java-client.jar, hive-common.jar, hive-exec.jar, hive-metastore.jar, hive-serde.jar and libfb303.jar.
+hive-site.xml, which has the configuration to talk to the HCatalog server, also needs to be in the classpath. The correct
+version of the HCatalog and Hive jars should be placed in the classpath based on the version of HCatalog installed on the cluster.
+
+The jars can be added to the classpath of the action in one of the following ways:
+ * You can place the jars and hive-site.xml in the system shared library. The shared library for a pig, hive or java action can be overridden to include hcatalog shared libraries along with the action's shared library. Refer to [[WorkflowFunctionalSpec.html#a17_HDFS_Share_Libraries_for_Workflow_Applications_since_Oozie_2.3][Shared Libraries]] for more information. The oozie-sharelib-[version].tar.gz in the Oozie distribution bundles the required HCatalog jars in a hcatalog sharelib. If using a different version of HCatalog than the one bundled in the sharelib, copy the required HCatalog jars from that version into the sharelib.
+ * You can place the jars and hive-site.xml in the workflow application lib/ path.
+ * You can specify the location of the jar files in the =archive= tag and the hive-site.xml in the =file= tag of the corresponding pig, hive or java action.
+
+---+++ Coordinator
+
+Refer to [[CoordinatorFunctionalSpec][Coordinator Functional Specification]] for more information about
+ * how to specify HCatalog partitions as a data dependency using input dataset events
+ * how to specify HCatalog partitions as output dataset events
+ * the various EL functions available to work with HCatalog dataset events and how to use them to access HCatalog partitions in pig, hive or java actions in a workflow.
+
+---+++ Workflow
+Refer to [[WorkflowFunctionalSpec][Workflow Functional Specification]] for more information about
+ * how to drop HCatalog partitions in the prepare block of an action
+ * the HCatalog EL functions available to use in workflows
+
+---+++ Known Issues
+ * When rerunning a coordinator action without the -nocleanup option, 'output-event' datasets that are HDFS directories are deleted. However, if the 'output-event' is an HCatalog partition, the partition is currently not dropped.
+
+</noautolink>
8 docs/src/site/twiki/WorkflowFunctionalSpec.twiki
@@ -2296,7 +2296,7 @@ and Mapreduce Streaming share library direcotry is =share/library/mapreduce-stre
Oozie bundles a share library for specific versions of streaming, pig, hive, sqoop, distcp actions. These versions
of streaming, pig, hive, sqoop and distcp have been tested and verified to work correctly with the version of Oozie
that includes them. Oozie also bundles a separate share library for hcatalog, which can be used with pig, hive and java
-actions.
+actions (since Oozie 4.x).
In addition, Oozie provides a mechanism to override the action share library JARs to allow using an alternate version
of of the action JARs.
@@ -2312,9 +2312,9 @@ using the following precedence order:
* action.sharelib.for.#ACTIONTYPE# in the oozie server configuration
* action's =ActionExecutor getDefaultShareLibName()= method
-More than one share library directory name can be specified for an action by using a comma separated list. For example:
-When using HCatLoader and HCatStorer in pig, =action.sharelib.for.pig= can be set to =pig,hcatalog= to include both pig
-and hcatalog jars.
+More than one share library directory name can be specified for an action by using a comma separated list (since Oozie 4.x).
+For example: When using HCatLoader and HCatStorer in pig, =action.sharelib.for.pig= can be set to =pig,hcatalog= to include
+both pig and hcatalog jars.
---++ 18 User-Retry for Workflow Actions (since Oozie 3.1)
1  docs/src/site/twiki/index.twiki
@@ -44,6 +44,7 @@ Enough reading already? Follow the steps in [[DG_QuickStart][Oozie Quick Start]]
* [[http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JSPIntro7.html#wp77280][EL Expression Language Quick Reference]]
* [[DG_CommandLineTool][Command Line Tool]]
* [[DG_WorkflowReRun][Workflow Re-runs Explained]]
+ * [[DG_HCatalogIntegration][HCatalog Integration Explained]]
* [[./client/apidocs/index.html][Oozie Client Javadocs]]
* [[./core/apidocs/index.html][Oozie Core Javadocs]]
215 hcataloglibs/hcatalog-0.5/pom.xml
@@ -0,0 +1,215 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <parent>
+ <groupId>org.apache.oozie</groupId>
+ <artifactId>oozie-main</artifactId>
+ <version>4.1.0-SNAPSHOT</version>
+ <relativePath>../../pom.xml</relativePath>
+ </parent>
+ <groupId>org.apache.oozie</groupId>
+ <artifactId>oozie-hcatalog</artifactId>
+ <version>0.5.0.oozie-4.1.0-SNAPSHOT</version>
+ <description>Apache Oozie HCatalog ${project.version}</description>
+ <name>Apache Oozie HCatalog ${project.version}</name>
+ <packaging>jar</packaging>
+
+ <!-- src/main/assemblies/hcataloglib.xml is configured with useTransitiveDependencies as false
+ because only a few jars are required and there would be too many dependencies to exclude -->
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.hcatalog</groupId>
+ <artifactId>hcatalog-server-extensions</artifactId>
+ <version>0.5.0-incubating</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.activemq</groupId>
+ <artifactId>activemq-core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.activemq</groupId>
+ <artifactId>kahadb</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.activemq</groupId>
+ <artifactId>activeio-core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.activemq.protobuf</groupId>
+ <artifactId>activemq-protobuf</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.geronimo.specs</groupId>
+ <artifactId>geronimo-jms_1.1_spec</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.geronimo.specs</groupId>
+ <artifactId>geronimo-j2ee-management_1.1_spec</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.osgi</groupId>
+ <artifactId>org.osgi.core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>javax.jms</groupId>
+ <artifactId>jms</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hcatalog</groupId>
+ <artifactId>hcatalog-core</artifactId>
+ <version>0.5.0-incubating</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-service</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-cli</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-builtins</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>jline</groupId>
+ <artifactId>jline</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>com.google.code.findbugs</groupId>
+ <artifactId>jsr305</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hcatalog</groupId>
+ <artifactId>webhcat-java-client</artifactId>
+ <version>0.5.0-incubating</version>
+ <scope>compile</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-common</artifactId>
+ <version>${hive.version}</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-metastore</artifactId>
+ <version>${hive.version}</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-exec</artifactId>
+ <version>${hive.version}</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-serde</artifactId>
+ <version>${hive.version}</version>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+
+ <dependency>
+ <groupId>org.apache.thrift</groupId>
+ <artifactId>libfb303</artifactId>
+ <version>0.7.0</version>
+ <scope>compile</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.codehaus.jackson</groupId>
+ <artifactId>jackson-core-asl</artifactId>
+ <version>1.8.8</version>
+ <scope>compile</scope>
+ </dependency>
+
+ <dependency>
+ <groupId>org.codehaus.jackson</groupId>
+ <artifactId>jackson-mapper-asl</artifactId>
+ <version>1.8.8</version>
+ <scope>compile</scope>
+ </dependency>
+
+ </dependencies>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-assembly-plugin</artifactId>
+ <configuration>
+ <descriptors>
+ <descriptor>../../src/main/assemblies/hcataloglib.xml</descriptor>
+ </descriptors>
+ <finalName>hcataloglibs</finalName>
+ <appendAssemblyId>false</appendAssemblyId>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+</project>
+
1  hcataloglibs/pom.xml
@@ -32,6 +32,7 @@
<packaging>pom</packaging>
<modules>
+ <module>hcatalog-0.5</module>
<module>hcatalog-0.6</module>
</modules>
2  pom.xml
@@ -68,7 +68,7 @@
<hadoop.version>1.1.1</hadoop.version>
<hbase.version>0.94.2</hbase.version>
- <hcatalog.version>0.6.0</hcatalog.version>
+ <hcatalog.version>0.5.0</hcatalog.version>
<hadooplib.version>${hadoop.version}.oozie-${project.version}</hadooplib.version>
<hbaselib.version>${hbase.version}.oozie-${project.version}</hbaselib.version>
1  release-log.txt
@@ -4,6 +4,7 @@ OOZIE-1239 Bump up trunk to 4.1.0-SNAPSHOT (virag)
-- Oozie 4.0.0 (unreleased)
+OOZIE-1218 Create a HCatalog Integration Guide (rohini via virag)
OOZIE-1250 Coord action timeout not happening when there is a exception (rohini via mona)
OOZIE-1207 Optimize current EL resolution in case of start-instance and end-instance (rohini via mona)
OOZIE-1247 CoordActionInputCheck shouldn't queue CoordPushInputCheck (rohini via virag)
41 sharelib/hcatalog/pom.xml
@@ -17,7 +17,7 @@
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.oozie</groupId>
@@ -39,19 +39,27 @@
<dependencies>
<dependency>
- <groupId>org.apache.oozie</groupId>
- <artifactId>oozie-hcatalog</artifactId>
- <scope>compile</scope>
- <exclusions>
- <exclusion>
+ <groupId>org.apache.oozie</groupId>
+ <artifactId>oozie-hcatalog</artifactId>
+ <scope>compile</scope>
+ <exclusions>
+ <exclusion>
<groupId>org.apache.hcatalog</groupId>
<artifactId>hcatalog-server-extensions</artifactId>
</exclusion>
- <exclusion>
+ <exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-builtins</artifactId>
</exclusion>
<exclusion>
+ <groupId>org.apache.hive</groupId>
+ <artifactId>hive-shims</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.apache.thrift</groupId>
+ <artifactId>libthrift</artifactId>
+ </exclusion>
+ <exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
@@ -105,6 +113,10 @@
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
+ <artifactId>slf4j-api</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
<exclusion>
@@ -120,6 +132,14 @@
<artifactId>mockito-all</artifactId>
</exclusion>
<exclusion>
+ <groupId>javax.transaction</groupId>
+ <artifactId>jta</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>javax.jdo</groupId>
+ <artifactId>jdo2-api</artifactId>
+ </exclusion>
+ <exclusion>
<groupId>org.datanucleus</groupId>
<artifactId>datanucleus-core</artifactId>
</exclusion>
@@ -183,12 +203,7 @@
<groupId>antlr</groupId>
<artifactId>antlr</artifactId>
</exclusion>
- </exclusions>
- </dependency>
- <dependency>
- <groupId>org.slf4j</groupId>
- <artifactId>slf4j-api</artifactId>
- <scope>compile</scope>
+ </exclusions>
</dependency>
</dependencies>
2  src/main/assemblies/hcataloglibs.xml
@@ -26,7 +26,7 @@
<fileSets>
<!-- HCatalog libs -->
<fileSet>
- <directory>${basedir}/../hcataloglibs/hcatalog-0.6/target/hcataloglibs</directory>
+ <directory>${basedir}/../hcataloglibs/hcatalog-0.5/target/hcataloglibs</directory>
<outputDirectory>/hcataloglibs</outputDirectory>
</fileSet>
</fileSets>