HIVE-2598. Update README.txt file to use description from wiki

(Carl Steinbach via jvs)



git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1203885 13f79535-47bb-0310-9956-ffa450edef68
commit 8a19652ef0d5573baf68fd2b680c68baadf6b08c 1 parent 4910f33
John Sichi authored
Showing with 24 additions and 11 deletions.
  1. +24 −11 README.txt
35 README.txt
@@ -1,14 +1,27 @@
-Apache Hive @VERSION@
-=================
-
-Apache Hive is a data warehouse system for Hadoop that facilitates
-easy data summarization, ad-hoc querying and analysis of large
-datasets stored in Hadoop compatible file systems. Hive provides a
-mechanism to put structure on this data and query the data using a
-SQL-like language called HiveQL. At the same time this language also
-allows traditional map/reduce programmers to plug in their custom
-mappers and reducers when it is inconvenient or inefficient to express
-this logic in HiveQL.
+Apache Hive (TM) @VERSION@
+======================
+
+The Apache Hive (TM) data warehouse software facilitates querying and
+managing large datasets residing in distributed storage. Built on top
+of Apache Hadoop (TM), it provides:
+
+* Tools to enable easy data extract/transform/load (ETL)
+
+* A mechanism to impose structure on a variety of data formats
+
+* Access to files stored either directly in Apache HDFS (TM) or in other
+ data storage systems such as Apache HBase (TM)
+
+* Query execution via MapReduce
+
+Hive defines a simple SQL-like query language, called QL, that enables
+users familiar with SQL to query the data. At the same time, this
+language also allows programmers who are familiar with the MapReduce
+framework to be able to plug in their custom mappers and reducers to
+perform more sophisticated analysis that may not be supported by the
+built-in capabilities of the language. QL can also be extended with
+custom scalar functions (UDF's), aggregations (UDAF's), and table
+functions (UDTF's).
 Please note that Hadoop is a batch processing system and Hadoop jobs
 tend to have high latency and incur substantial overheads in job