This repository has been archived by the owner on Apr 5, 2022. It is now read-only.

update namespace for consistency
rename pig to pig-factory
rename hive-client to hive-client-factory
rename job attr to job-ref
Costin Leau committed Oct 7, 2012
1 parent 27b326f commit 0e8c9ea
Showing 26 changed files with 81 additions and 84 deletions.
2 changes: 1 addition & 1 deletion docs/src/reference/docbook/reference/hadoop.xml
@@ -317,7 +317,7 @@

<para>For Spring Batch environments, SHDP provides a dedicated tasklet to execute Hadoop jobs as a step in a Spring Batch workflow. An example declaration is shown below:</para>

<programlisting language="xml"><![CDATA[<hdp:job-tasklet id="hadoop-tasklet" job="mr-job" wait-for-job="true" />]]></programlisting>
<programlisting language="xml"><![CDATA[<hdp:job-tasklet id="hadoop-tasklet" job-ref="mr-job" wait-for-job="true" />]]></programlisting>
<para>The tasklet above references a Hadoop job definition named "mr-job". By default, <literal>wait-for-job</literal> is true so that the tasklet will wait for the job to complete when it executes. Setting wait-for-job to false will submit the job to the Hadoop cluster but not wait for it to complete.</para>
</section>
</section>
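For context, a sketch of how such a tasklet might sit inside a Spring Batch step after this rename (the job definition, its attributes, and the batch ids are illustrative, not part of this commit):

```xml
<!-- Illustrative sketch: the "mr-job" definition and batch ids are hypothetical -->
<hdp:job id="mr-job" input-path="/input" output-path="/output"
    mapper="org.company.SomeMapper" reducer="org.company.SomeReducer"/>

<hdp:job-tasklet id="hadoop-tasklet" job-ref="mr-job" wait-for-job="true"/>

<batch:job id="batch-job">
    <batch:step id="hadoop-step">
        <batch:tasklet ref="hadoop-tasklet"/>
    </batch:step>
</batch:job>
```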
14 changes: 7 additions & 7 deletions docs/src/reference/docbook/reference/hive.xml
@@ -32,21 +32,21 @@
<para>Similar to the server, SHDP provides a dedicated namespace element for configuring a Hive client (that is, Hive accessing a server node through Thrift). Likewise, simply specify the host and the port
(the defaults are <literal>localhost</literal> and <literal>10000</literal> respectively) and you're done:</para>

<programlisting language="xml"><![CDATA[<!-- by default, the definition name is 'hive-client' -->
<hdp:hive-client host="some-other-host" port="10001" />]]></programlisting>
<para>Note that since Thrift clients are not thread-safe, <literal>hive-client</literal> returns a factory (named <literal>org.springframework.data.hadoop.hive.HiveClientFactory</literal>)
<programlisting language="xml"><![CDATA[<!-- by default, the definition name is 'hiveClientFactory' -->
<hdp:hive-client-factory host="some-other-host" port="10001" />]]></programlisting>
<para>Note that since Thrift clients are not thread-safe, <literal>hive-client-factory</literal> returns a factory (named <literal>org.springframework.data.hadoop.hive.HiveClientFactory</literal>)
for creating new <literal>HiveClient</literal> instances for each invocation. Furthermore, the client definition
also allows Hive scripts (either declared inline or externally) to be executed during initialization, once the client connects; this is quite useful for performing Hive-specific initialization:</para>

<programlisting language="xml"><![CDATA[<hive-client host="some-host" port="some-port" xmlns="http://www.springframework.org/schema/hadoop">
<programlisting language="xml"><![CDATA[<hive-client-factory host="some-host" port="some-port" xmlns="http://www.springframework.org/schema/hadoop">
<hdp:script>
DROP TABLE IF EXISTS testHiveBatchTable;
CREATE TABLE testHiveBatchTable (key int, value string);
</hdp:script>
<hdp:script location="classpath:org/company/hive/script.q">
<arguments>ignore-case=true</arguments>
</hdp:script>
</hive-client>]]></programlisting>
</hive-client-factory>]]></programlisting>

<para>In the example above, the factory executes two scripts each time a new Hive client is created (if the scripts need to be executed only once, consider using a tasklet).
The first executes a script defined inline while the second reads the script from the classpath and passes one parameter
@@ -137,8 +137,8 @@
catching any exceptions and performing clean-up. One can programmatically execute queries (and get the raw results or convert them to longs or ints) or scripts but also interact with the Hive API through the <literal>HiveClientCallback</literal>.
For example:</para>

<programlisting language="xml"><![CDATA[<hdp:hive-client ... />
<!-- Hive template wires automatically to 'hiveClient'-->
<programlisting language="xml"><![CDATA[<hdp:hive-client-factory ... />
<!-- Hive template wires automatically to 'hiveClientFactory'-->
<hdp:hive-template />
<!-- wire hive template into a bean -->
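Following the rename, the template's factory reference attribute becomes `hive-client-factory-ref` (per the XSD changes in this commit). A minimal sketch, assuming the defaults; ids are illustrative:

```xml
<hdp:hive-client-factory id="hiveClientFactory" host="some-host" port="10000"/>
<!-- explicit reference shown for clarity; it can be omitted since
     'hiveClientFactory' is the attribute's default -->
<hdp:hive-template id="hiveTemplate" hive-client-factory-ref="hiveClientFactory"/>
```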
16 changes: 8 additions & 8 deletions docs/src/reference/docbook/reference/pig.xml
@@ -8,13 +8,13 @@

<programlisting language="xml"><![CDATA[<hdp:pig />]]></programlisting>

<para>This will create a <interfacename>org.springframework.data.hadoop.pig.PigServerFactory</interfacename> instance, named <literal>pig</literal>, a factory that creates <literal>PigServer</literal> instances on demand
<para>This will create a <interfacename>org.springframework.data.hadoop.pig.PigServerFactory</interfacename> instance, named <literal>pigFactory</literal>, a factory that creates <literal>PigServer</literal> instances on demand
configured with a default <interfacename>PigContext</interfacename>, executing scripts in <literal>MapReduce</literal> mode. The factory is needed since <literal>PigServer</literal> is not thread-safe and thus cannot
be used by multiple objects at the same time.

In typical scenarios, however, one might want to connect to a remote Hadoop tracker and register some scripts automatically, so let us take a look at how the configuration might look:</para>

<programlisting language="xml"><![CDATA[<pig exec-type="LOCAL" job-name="pig-script" configuration-ref="hadoopConfiguration" properties-location="pig-dev.properties"
<programlisting language="xml"><![CDATA[<pig-factory exec-type="LOCAL" job-name="pig-script" configuration-ref="hadoopConfiguration" properties-location="pig-dev.properties"
xmlns="http://www.springframework.org/schema/hadoop">
source=${pig.script.src}
<script location="org/company/pig/script.pig">
@@ -25,20 +25,20 @@
B = FOREACH A GENERATE name;
DUMP B;
</script>
</pig> />]]></programlisting>
</pig-factory>]]></programlisting>

<para>The example exposes quite a few options, so let us review them one by one. First, the top-level pig definition configures the pig instance: the execution type, the Hadoop configuration used and the job name. Notice that
additional properties can be specified (either by declaring them inlined or/and loading them from an external file) - in fact, <literal><![CDATA[<hdp:pig/>]]></literal> just like the rest of the libraries configuration
additional properties can be specified (either by declaring them inline and/or loading them from an external file) - in fact, <literal><![CDATA[<hdp:pig-factory/>]]></literal>, just like the rest of the library's configuration
elements, supports common properties attributes as described in the <link linkend="hadoop:config:properties">hadoop configuration</link> section.</para>
<para>The definition also contains two scripts: <literal>script.pig</literal> (read from the classpath), to which one pair of arguments
relevant to the script is passed (notice the use of the property placeholder), and an inline script, declared as part of the definition, without any arguments.</para>

<para>As you can tell, the <literal>pig</literal> namespace offers several options pertaining to Pig configuration.</para>
<para>As you can tell, the <literal>pig-factory</literal> namespace offers several options pertaining to Pig configuration.</para>

<section id="pig:runner">
<title>Running a Pig script</title>

<para>Like the rest of the Spring Hadoop components, a runner is provided out of the box for executing PIg scripts, either inlined or from various locations through <literal>pig-runner</literal> element:</para>
<para>Like the rest of the Spring Hadoop components, a runner is provided out of the box for executing Pig scripts, either inlined or from various locations, through the <literal>pig-runner</literal> element:</para>

<programlisting language="xml"><![CDATA[<hdp:pig-runner id="pigRunner" run-at-startup="true">
<hdp:script>
@@ -75,8 +75,8 @@
executing the scripts, catching any exceptions and performing clean-up. One can programmatically execute scripts but also interact with the Pig API through the <literal>PigServerCallback</literal>.
For example:</para>

<programlisting language="xml"><![CDATA[<hdp:pig ... />
<!-- Pig template wires automatically to 'pig'-->
<programlisting language="xml"><![CDATA[<hdp:pig-factory ... />
<!-- Pig template wires automatically to 'pigFactory'-->
<hdp:pig-template />
<!-- use component scanning-->
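Analogously on the Pig side, the template wires to the renamed factory via `pig-factory-ref` (per the XSD changes in this commit). A minimal sketch, assuming the defaults; ids are illustrative:

```xml
<hdp:pig-factory id="pigFactory" exec-type="LOCAL"/>
<!-- explicit reference shown for clarity; it can be omitted since
     'pigFactory' is the attribute's default -->
<hdp:pig-template id="pigTemplate" pig-factory-ref="pigFactory"/>
```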
@@ -34,7 +34,7 @@ protected Class<?> getBeanClass(Element element) {

@Override
protected boolean isEligibleAttribute(String attributeName) {
return !("job".equals(attributeName) || "pre-action".equals(attributeName) || "post-action".equals(attributeName))
return !("job-ref".equals(attributeName) || "pre-action".equals(attributeName) || "post-action".equals(attributeName))
&& super.isEligibleAttribute(attributeName);
}

@@ -43,7 +43,7 @@ protected void doParse(Element element, ParserContext parserContext, BeanDefinit
// parse attributes using conventions
super.doParse(element, parserContext, builder);

NamespaceUtils.setCSVProperty(element, builder, "job", "jobNames");
NamespaceUtils.setCSVProperty(element, builder, "job-ref", "jobNames");

NamespaceUtils.setCSVReferenceProperty(element, builder, "pre-action", "preAction");
NamespaceUtils.setCSVReferenceProperty(element, builder, "post-action", "postAction");
@@ -34,14 +34,14 @@ protected Class<?> getBeanClass(Element element) {

@Override
protected boolean isEligibleAttribute(String attributeName) {
return (!"job".equals(attributeName)) && super.isEligibleAttribute(attributeName);
return (!"job-ref".equals(attributeName)) && super.isEligibleAttribute(attributeName);
}

@Override
protected void doParse(Element element, ParserContext parserContext, BeanDefinitionBuilder builder) {
// parse attributes using conventions
super.doParse(element, parserContext, builder);

NamespaceUtils.setCSVProperty(element, builder, "job", "jobNames");
NamespaceUtils.setCSVProperty(element, builder, "job-ref", "jobNames");
}
}
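Since the renamed attribute is still parsed via `setCSVProperty` into `jobNames`, multiple jobs can presumably be listed in a single `job-ref` value, comma-separated (a sketch; the job ids are illustrative):

```xml
<!-- both referenced jobs are hypothetical definitions -->
<hdp:job-tasklet id="multi-job-tasklet" job-ref="cleanup-job, mr-job" wait-for-job="true"/>
```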
@@ -51,12 +51,12 @@ public void init() {
registerBeanDefinitionParser("script", new ScriptParser());
registerBeanDefinitionParser("script-tasklet", new ScriptTaskletParser());

registerBeanDefinitionParser("pig", new PigServerParser());
registerBeanDefinitionParser("pig-factory", new PigServerParser());
registerBeanDefinitionParser("pig-tasklet", new PigTaskletParser());
registerBeanDefinitionParser("pig-template", new PigTemplateParser());
registerBeanDefinitionParser("pig-runner", new PigRunnerParser());

registerBeanDefinitionParser("hive-client", new HiveClientParser());
registerBeanDefinitionParser("hive-client-factory", new HiveClientParser());
registerBeanDefinitionParser("hive-server", new HiveServerParser());
registerBeanDefinitionParser("hive-tasklet", new HiveTaskletParser());
registerBeanDefinitionParser("hive-template", new HiveTemplateParser());
@@ -37,7 +37,7 @@ protected Class<?> getBeanClass(Element element) {

@Override
protected String defaultId(ParserContext context, Element element) {
return "hiveClient";
return "hiveClientFactory";
}

@Override
@@ -57,7 +57,7 @@ protected boolean isEligibleAttribute(String attributeName) {

@Override
protected String defaultId(ParserContext context, Element element) {
return "pig";
return "pigFactory";
}

@Override
@@ -65,7 +65,7 @@ public HiveClientFactory getObject() {
}

public Class<?> getObjectType() {
return HiveClient.class;
return HiveClientFactory.class;
}

public boolean isSingleton() {
@@ -67,7 +67,7 @@ public void setScripts(Collection<HiveScript> scripts) {
*
* @param hiveFactory HiveClientFactory to set
*/
public void setHiveClient(HiveClientFactory hiveFactory) {
public void setHiveClientFactory(HiveClientFactory hiveFactory) {
this.hiveClientFactory = hiveFactory;
}

@@ -46,7 +46,7 @@ public class HiveTemplate implements InitializingBean, HiveOperations, ResourceL

/**
* Constructs a new <code>HiveTemplate</code> instance.
* Expects {@link #setHiveClient(ObjectFactory)} to be called before using it.
* Expects {@link #setHiveClientFactory(ObjectFactory)} to be called before using it.
*/
public HiveTemplate() {
}
@@ -276,7 +276,7 @@ protected HiveClient createHiveClient() {
*
* @param hiveClientFactory
*/
public void setHiveClient(HiveClientFactory hiveClientFactory) {
public void setHiveClientFactory(HiveClientFactory hiveClientFactory) {
this.hiveClientFactory = hiveClientFactory;
}

@@ -68,7 +68,7 @@ public void setScripts(Collection<PigScript> scripts) {
*
* @param pigFactory The pigFactory to set.
*/
public void setPigServer(PigServerFactory pigFactory) {
public void setPigFactory(PigServerFactory pigFactory) {
this.pigFactory = pigFactory;
}

@@ -70,7 +70,7 @@ public PigServerFactory getObject() throws Exception {
}

public Class<?> getObjectType() {
return ObjectFactory.class;
return PigServerFactory.class;
}

public boolean isSingleton() {
@@ -46,7 +46,7 @@ public class PigTemplate implements InitializingBean, PigOperations, ResourceLoa

/**
* Constructs a new <code>PigTemplate</code> instance.
* Expects {@link #setPigServer(ObjectFactory)} to be called before using it.
* Expects {@link #setPigFactory(ObjectFactory)} to be called before using it.
*/
public PigTemplate() {
}
@@ -190,7 +190,7 @@ protected PigServer createPigServer() {
*
* @param pigServerFactory
*/
public void setPigServer(PigServerFactory pigServerFactory) {
public void setPigFactory(PigServerFactory pigServerFactory) {
this.pigServerFactory = pigServerFactory;
}

@@ -28,7 +28,7 @@ Bean id.]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<!-- the job reference -->
<xsd:attribute name="job">
<xsd:attribute name="job-ref">
<xsd:annotation>
<xsd:documentation source="java:org.apache.hadoop.mapreduce.Job"><![CDATA[
Hadoop Job. Multiple names can be specified using comma (,) as a separator.]]></xsd:documentation>
@@ -961,15 +961,15 @@ Argument(s) to pass to this script. Defined in Properties format (key=value).
</xsd:complexContent>
</xsd:complexType>

<xsd:element name="pig">
<xsd:element name="pig-factory">
<xsd:annotation>
<xsd:documentation><![CDATA[
Defines a PigServer 'template' (note that since PigServer is not thread-safe, each bean invocation will create a new PigServer instance).
Defines a Pig (Server) factory. The factory is thread-safe and allows creation of PigServer instances (which are not thread-safe).
]]>
</xsd:documentation>
<xsd:appinfo>
<tool:annotation>
<tool:exports type="org.apache.pig.PigServer"/>
<tool:exports type="org.springframework.data.hadoop.pig.PigServerFactory"/>
</tool:annotation>
</xsd:appinfo>
</xsd:annotation>
@@ -987,7 +987,7 @@ Pig script.]]></xsd:documentation>
<xsd:attribute name="id" type="xsd:ID" use="optional">
<xsd:annotation>
<xsd:documentation><![CDATA[
Bean id (default is "pig").
Bean id (default is "pigFactory").
]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
@@ -1051,10 +1051,10 @@ Pig script.]]></xsd:documentation>
Bean id.]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="pig-server-ref" type="xsd:string" use="optional" default="pig">
<xsd:attribute name="pig-factory-ref" type="xsd:string" use="optional" default="pigFactory">
<xsd:annotation>
<xsd:documentation source="java:org.apache.pig.PigServer"><![CDATA[
Reference to a PigServer factory. Defaults to 'pig'.
<xsd:documentation source="java:org.springframework.data.hadoop.pig.PigServerFactory"><![CDATA[
Reference to a PigServer factory. Defaults to 'pigFactory'.
]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation kind="ref">
@@ -1154,10 +1154,10 @@ Bean id (default is "pigTemplate").
]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="pig-server-ref" type="xsd:string" use="optional" default="pig">
<xsd:attribute name="pig-factory-ref" type="xsd:string" use="optional" default="pigFactory">
<xsd:annotation>
<xsd:documentation source="java:org.apache.pig.PigServer"><![CDATA[
Reference to a PigServer factory. Defaults to 'pig'.
<xsd:documentation source="java:org.springframework.data.hadoop.pig.PigServerFactory"><![CDATA[
Reference to a PigServer factory. Defaults to 'pigFactory'.
]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation kind="ref">
@@ -1210,15 +1210,16 @@ Reference to the Hadoop configuration. Defaults to 'hadoopConfiguration'.]]></xs
</xsd:element>

<!-- Hive -->
<xsd:element name="hive-client">
<xsd:element name="hive-client-factory">
<xsd:complexType>
<xsd:annotation>
<xsd:documentation><![CDATA[
Defines a Hive client for connecting to a Hive server through the Thrift protocol.
Defines a HiveClient factory for connecting to a Hive server through the Thrift protocol. The factory is thread-safe and allows
creation of HiveClient instances (which are not thread-safe).
]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation>
<tool:exports type="org.apache.hadoop.hive.service.HiveClient"/>
<tool:exports type="org.springframework.data.hadoop.hive.HiveClientFactory"/>
</tool:annotation>
</xsd:appinfo>
</xsd:annotation>
@@ -1233,7 +1234,7 @@ Hive script to be executed during start-up.]]></xsd:documentation>
<xsd:attribute name="id" type="xsd:ID" use="optional">
<xsd:annotation>
<xsd:documentation><![CDATA[
Bean id (default is "hiveClient").
Bean id (default is "hiveClientFactory").
]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
@@ -1298,22 +1299,22 @@ Hive script.]]></xsd:documentation>
Bean id.]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="hive-client-ref" type="xsd:string" use="optional" default="hiveClient">
<xsd:attribute name="hive-client-factory-ref" type="xsd:string" use="optional" default="hiveClientFactory">
<xsd:annotation>
<xsd:documentation source="java:org.apache.hadoop.hive.service.HiveClient"><![CDATA[
Reference to a HiveClient instance.
<xsd:documentation source="java:org.springframework.data.hadoop.hive.HiveClientFactory"><![CDATA[
Reference to a HiveClient factory instance. Defaults to 'hiveClientFactory'.
]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation kind="ref">
<tool:expected-type type="org.apache.hadoop.hive.service.HiveClient" />
<tool:expected-type type="org.springframework.data.hadoop.hive.HiveClientFactory" />
</tool:annotation>
</xsd:appinfo>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="hive-template-ref" type="xsd:string" use="optional">
<xsd:annotation>
<xsd:documentation source="java:org.springframework.data.hadoop.hive.HiveTemplate"><![CDATA[
Reference to a HiveTemplate instance. Alternative to 'hive-client-ref' attribute..
Reference to a HiveTemplate instance. Alternative to the 'hive-client-factory-ref' attribute.
]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation kind="ref">
@@ -1401,10 +1402,10 @@ Bean id (default is "hiveTemplate").
]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="hive-client-ref" default="hiveClient">
<xsd:attribute name="hive-client-factory-ref" default="hiveClientFactory">
<xsd:annotation>
<xsd:documentation source="java:org.apache.hadoop.hive.service.HiveClient"><![CDATA[
Reference to HiveClient factory. Defaults to 'hiveClient'.]]></xsd:documentation>
<xsd:documentation source="java:org.springframework.data.hadoop.hive.HiveClientFactory"><![CDATA[
Reference to HiveClient factory. Defaults to 'hiveClientFactory'.]]></xsd:documentation>
<xsd:appinfo>
<tool:annotation kind="ref">
<tool:expected-type type="org.springframework.data.hadoop.hive.HiveClientFactory" />
