Creating an external table with the 'location' command succeeds even though elasticsearch-hadoop does not support the 'location' function #786

Closed
yangfeiran opened this issue Jun 8, 2016 · 1 comment

yangfeiran commented Jun 8, 2016

Previously I thought this was my mistake, so I raised a question here: https://discuss.elastic.co/t/to-map-hive-table-into-es/51900

But after searching for days, I learned that the only way to add data through elasticsearch-hadoop is the 'insert' command.

This is a problem when you want to sqoop data from MySQL to HDFS.

For instance, an external Hive table created with the 'location' command is mapped to an HDFS directory, so every time sqoop adds more data to that directory, the table grows automatically. However, elasticsearch-hadoop does not support 'location' mapping. When I create an external Hive table stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler', every time I want to add data I have to create a separate Hive table and use the 'insert' command to add the data manually.
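
To make the contrast concrete, here is a minimal Python sketch that renders the two kinds of table definition being compared. The table names, columns, HDFS path, and index name are hypothetical placeholders, not taken from this issue. `es.resource` is the elasticsearch-hadoop setting that names the target index, and the `index/type` form shown is an assumption based on the 2016-era releases.

```python
def hdfs_table_ddl(table: str, location: str) -> str:
    """Render DDL for an external table backed by an HDFS directory.

    The two-column schema is a hypothetical example.
    """
    return (
        f"CREATE EXTERNAL TABLE {table} (id BIGINT, name STRING)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}';"
    )


def es_table_ddl(table: str, es_resource: str) -> str:
    """Render DDL for an external table backed by Elasticsearch.

    There is no LOCATION clause: the storage handler reads and writes
    the index named by 'es.resource' instead of an HDFS directory.
    """
    return (
        f"CREATE EXTERNAL TABLE {table} (id BIGINT, name STRING)\n"
        "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'\n"
        f"TBLPROPERTIES ('es.resource' = '{es_resource}');"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(hdfs_table_ddl("staging_users", "/data/sqoop/users"))
    print()
    print(es_table_ddl("es_users", "users/user"))
```

Files that sqoop drops into the first table's directory become visible to Hive on the next query. Nothing comparable happens for the second table, which is why an explicit 'insert' is needed.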

With tons of data being produced every day, tools need to support automatic incremental loading.

jbaiera commented Jun 20, 2016

Based on the resolved conversation from here I'm moving to close this ticket. The LOCATION keyword on Hive's external table DDL assumes that the external table is loaded from HDFS. This is not always the case in the event of using a custom storage handler. Since Hive queries the data directly from Elasticsearch, the Elasticsearch-Hadoop Hive integration does not leverage the LOCATION functionality on external table definitions.

If incremental loads are required, it is generally advised to use alternative methods rather than relying on support for the LOCATION DDL property.
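
One possible alternative, sketched here in Python under stated assumptions: a scheduled job renders and runs an insert from an HDFS-backed staging table into the Elasticsearch-backed table after each sqoop import. The table names, the key column, and the idea of tracking a "last loaded key" are all hypothetical and are not something the thread prescribes. The `INSERT OVERWRITE TABLE ... SELECT` form mirrors the Hive examples in the elasticsearch-hadoop documentation. How it interacts with documents already in the index should be checked against that documentation for your version.

```python
def incremental_insert(es_table: str, staging_table: str,
                       key_column: str, last_loaded_key: int) -> str:
    """Render a HiveQL statement that copies only new rows into Elasticsearch.

    Assumes a monotonically increasing numeric key (hypothetical), so rows
    with a key above the last loaded value have not been indexed yet.
    """
    if not isinstance(last_loaded_key, int):
        # Refuse anything but an integer so the value is safe to inline.
        raise TypeError("last_loaded_key must be an int")
    return (
        f"INSERT OVERWRITE TABLE {es_table}\n"
        f"SELECT * FROM {staging_table}\n"
        f"WHERE {key_column} > {last_loaded_key};"
    )


if __name__ == "__main__":
    # Hypothetical values; a real job would persist the last loaded key.
    print(incremental_insert("es_users", "staging_users", "id", 1000))
```

The rendered statement could then be handed to whatever runs Hive queries in your environment, for example a cron-driven script or a workflow scheduler.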
