HCATALOG-20: Create a package target. New files.
git-svn-id: https://svn.apache.org/repos/asf/incubator/hcatalog/trunk@1130209 13f79535-47bb-0310-9956-ffa450edef68
ashutoshc committed Jun 1, 2011
1 parent 6471584 commit 881bda4
Showing 10 changed files with 3,128 additions and 0 deletions.
404 changes: 404 additions & 0 deletions LICENSE.txt

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions NOTICE.txt
@@ -0,0 +1,10 @@
Apache HCatalog
Copyright 2011 The Apache Software Foundation

This product includes/uses software developed by The Apache Software
Foundation (http://www.apache.org/).
76 changes: 76 additions & 0 deletions README.txt
@@ -0,0 +1,76 @@
Apache HCatalog
===============
HCatalog is a table and storage management service for data created using Apache
Hadoop.

The vision of HCatalog is to provide table management and storage management layers
for Apache Hadoop. This includes:

* Providing a shared schema and data type mechanism.
* Providing a table abstraction so that users need not be concerned with where
or how their data is stored.
* Providing interoperability across data processing tools such as Pig, Map
Reduce, Streaming, and Hive.

Data processors using Apache Hadoop have a common need for table management
services. The goal of this table management service is to track data that exists in
a Hadoop grid and present that data to users in a tabular format. HCatalog
provides a single input and output format to users so that individual users need
not be concerned with the storage formats that are chosen for particular data
sets. Data is described by a schema and shares a datatype system.

Users are free to choose the best tools for their use cases. The Hadoop project
includes Map Reduce, Streaming, Pig, and Hive, and additional tools exist such
as Cascading. Each of these tools has users who prefer it, and there are use
cases best addressed by each of these tools. Two users on the same grid who
share data are not constrained to use the same tool but with HCatalog are free
to choose the best tool for their use case. HCatalog presents data in the same
way to all of the tools, providing interfaces to each of them.

For the latest information about HCatalog, please visit our website at:

http://incubator.apache.org/hcatalog

and our wiki, at:

https://cwiki.apache.org/confluence/display/HCATALOG


70 changes: 70 additions & 0 deletions RELEASE_NOTES.txt
@@ -0,0 +1,70 @@
These notes are for HCatalog 0.1.0 release.

Highlights
==========

This is the initial release of Apache HCatalog. It provides read and write capability for Pig and Hadoop, and read capability for Hive.

System Requirements
===================

1. Java 1.6.x or newer, preferably from Sun. Set JAVA_HOME to the root of your
Java installation.
2. Ant build tool, version 1.8 or higher: http://ant.apache.org (needed only to
build from source).
3. This release is compatible with Hadoop 0.20.x with security. Currently this
is available from Cloudera in their CDH3 release or from the 0.20.203 branch
of Apache Hadoop (not yet released).
4. This release is compatible with Pig 0.8.1.
5. This release is compatible with Hive 0.7.0.

Trying the Release
==================
1. Download hcatalog-0.1.0.tar.gz
2. Unpack the file: tar -xzvf hcatalog-0.1.0.tar.gz
3. Move into the installation directory: cd hcatalog-0.1.0
TODO need install instructions
4. To use with Hadoop MapReduce jobs, use the HCatInputFormat and
HCatOutputFormat classes (a sketch follows this list).
5. To use with Pig, use the HCatLoader and HCatStorer classes.
6. To use the command line interface, set HADOOP_CLASSPATH to the directory
that contains the configuration files for your cluster, and use bin/hcat.sh
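
A rough sketch of step 4 follows, assuming the org.apache.hcatalog.mapreduce package layout used in this source tree; the driver class name is hypothetical and the table-binding calls are omitted, since their exact signatures are covered in the r0.1.0 documentation.

// Minimal sketch of a MapReduce driver that reads from and writes to
// HCatalog-managed tables. Class name and omitted table-binding calls are
// illustrative only; see http://incubator.apache.org/hcatalog/docs/r0.1.0.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hcatalog.mapreduce.HCatOutputFormat;

public class HCatExampleDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "hcatalog-example");
    job.setJarByClass(HCatExampleDriver.class);

    // Let HCatalog resolve the storage format: the job reads and writes
    // tables rather than raw HDFS paths.
    job.setInputFormatClass(HCatInputFormat.class);
    job.setOutputFormatClass(HCatOutputFormat.class);

    // Mapper/reducer classes and the input/output table bindings (database
    // and table names) would be configured here.

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}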

Relevant Documentation
======================
See http://incubator.apache.org/hcatalog/docs/r0.1.0
36 changes: 36 additions & 0 deletions conf/jndi.properties
@@ -0,0 +1,36 @@
## ---------------------------------------------------------------------------
## Licensed to the Apache Software Foundation (ASF) under one or more
## contributor license agreements. See the NOTICE file distributed with
## this work for additional information regarding copyright ownership.
## The ASF licenses this file to You under the Apache License, Version 2.0
## (the "License"); you may not use this file except in compliance with
## the License. You may obtain a copy of the License at
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
## See the License for the specific language governing permissions and
## limitations under the License.
## ---------------------------------------------------------------------------

# If ActiveMQ is used, uncomment the following properties; otherwise substitute them accordingly.
#java.naming.factory.initial = org.apache.activemq.jndi.ActiveMQInitialContextFactory

# use the following property to provide location of MQ broker.
#java.naming.provider.url = tcp://localhost:61616

# use the following property to specify the JNDI name the connection factory
# should appear as.
#connectionFactoryNames = connectionFactory, queueConnectionFactory, topicConnectionFactry

# register some queues in JNDI using the form
# queue.[jndiName] = [physicalName]
# queue.MyQueue = example.MyQueue


# register some topics in JNDI using the form
# topic.[jndiName] = [physicalName]
# topic.MyTopic = example.MyTopic
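
For context, a JMS client consumes these settings through an ordinary JNDI lookup. The snippet below is a minimal sketch assuming the ActiveMQ lines above are uncommented and this jndi.properties is on the classpath; the class name is hypothetical.

// Sketch: InitialContext reads jndi.properties from the classpath, so the
// connection factory and broker URL defined above are picked up automatically.
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.naming.InitialContext;

public class JndiLookupExample {
  public static void main(String[] args) throws Exception {
    InitialContext ctx = new InitialContext();
    // "connectionFactory" is one of the names listed in connectionFactoryNames.
    ConnectionFactory factory = (ConnectionFactory) ctx.lookup("connectionFactory");
    Connection connection = factory.createConnection();
    connection.start();
    // ... create sessions and consumers against the queues/topics registered above ...
    connection.close();
  }
}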

106 changes: 106 additions & 0 deletions conf/proto-hive-site.xml
@@ -0,0 +1,106 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<configuration>

<property>
<name>hive.metastore.local</name>
<value>false</value>
<description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
</property>

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://DBHOSTNAME/hivemetastoredb?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>PASSWORD</value>
<description>password to use against metastore database</description>
</property>

<property>
<name>hive.metastore.warehouse.dir</name>
<value>WAREHOUSE_DIR</value>
<description>location of default database for the warehouse</description>
</property>

<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
<description>If true, the metastore thrift interface will be secured with SASL. Clients must authenticate with Kerberos.</description>
</property>

<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>KEYTAB_PATH</value>
<description>The path to the Kerberos Keytab file containing the metastore thrift server's service principal.</description>
</property>

<property>
<name>hive.metastore.kerberos.principal</name>
<value>KERBEROS_PRINCIPAL</value>
<description>The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct host name.</description>
</property>

<property>
<name>hive.metastore.cache.pinobjtypes</name>
<value>Table,Database,Type,FieldSchema,Order</value>
<description>List of comma separated metastore object types that should be pinned in the cache</description>
</property>

<property>
<name>hive.metastore.uris</name>
<value>thrift://SVRHOST:3306</value>
<description>URI for client to contact metastore server</description>
</property>

<property>
<name>hive.semantic.analyzer.factory.impl</name>
<value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
<description>controls which SemanticAnalyzerFactory implementation class is used by CLI</description>
</property>

<property>
<name>hadoop.clientside.fs.operations</name>
<value>true</value>
<description>FS operations are owned by client</description>
</property>

<property>
<name>hive.metastore.client.socket.timeout</name>
<value>60</value>
<description>MetaStore Client socket timeout in seconds</description>
</property>

</configuration>
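
The template above points clients at a remote, SASL-secured metastore. As a minimal sketch of the client side, the class below (hypothetical name) assumes the generated hive-site.xml is on the classpath so HiveConf picks up hive.metastore.uris and the Kerberos settings.

// Sketch: HiveConf loads hive-site.xml from the classpath, including the
// hive.metastore.uris value configured above; a valid Kerberos ticket is
// assumed since hive.metastore.sasl.enabled is true.
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class MetastoreSmokeTest {
  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf(MetastoreSmokeTest.class);
    HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
    // Fetch a table definition to confirm the Thrift connection works.
    Table table = client.getTable("default", args[0]);
    System.out.println("Table location: " + table.getSd().getLocation());
    client.close();
  }
}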
