Previously I thought this was my mistake, so I raised a question here: https://discuss.elastic.co/t/to-map-hive-table-into-es/51900
But after searching for days, I learned that one can only use the 'INSERT' command to add data through elasticsearch-hadoop.
This becomes a problem when you want to use Sqoop to move data from MySQL into HDFS.
For instance, an external Hive table created with the 'LOCATION' clause is mapped to an HDFS directory; every time Sqoop adds more data to that directory, the table grows by itself. However, since elasticsearch-hadoop does not support 'LOCATION' mapping, when I create an external Hive table stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler', every time I want to add data I have to create a Hive table and use the 'INSERT' command to add it manually.
With the amount of data being produced every day, it is necessary for the tooling to support automatic incremental loading.
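To make the difference concrete, here is roughly what I mean (all table, index and field names below are just placeholders, not my real schema):

```sql
-- HDFS-backed external table (placeholder names): Sqoop appends files to the
-- directory, and the table picks up the new rows automatically.
CREATE EXTERNAL TABLE logs_hdfs (
  id  BIGINT,
  msg STRING,
  ts  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/warehouse/logs_hdfs';

-- Elasticsearch-backed external table: no LOCATION mapping is possible, so new
-- data has to be pushed with an explicit INSERT every time.
CREATE EXTERNAL TABLE logs_es (
  id  BIGINT,
  msg STRING,
  ts  STRING
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'logs/log', 'es.nodes' = 'localhost:9200');

-- The manual step I would like to avoid repeating for every new batch:
INSERT INTO TABLE logs_es
SELECT id, msg, ts FROM logs_hdfs;
```

With the first table, each Sqoop run is picked up automatically; with the second, the last INSERT has to be repeated by hand for every new batch of data.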
Based on the resolved conversation from here, I'm moving to close this ticket. The LOCATION keyword in Hive's external table DDL assumes that the external table is loaded from HDFS. This is not always the case when a custom storage handler is used. Since Hive queries the data directly from Elasticsearch, the Elasticsearch-Hadoop Hive integration does not leverage the LOCATION functionality on external table definitions.
In the event that incremental loads are required, it is generally advised to use alternative methods instead of relying on support for the LOCATION DDL property.
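For example (a rough sketch with hypothetical table and column names, assuming the Sqoop-loaded HDFS table is partitioned by a `dt` string column), a scheduled Hive job can push only the newest slice into the Elasticsearch-backed table:

```sql
-- Alternative incremental load (hypothetical names): the Sqoop-loaded HDFS
-- table stays the source of truth; a scheduled job pushes only the newest
-- partition into the Elasticsearch-backed table.
SET hivevar:load_date=2016-07-01;

INSERT INTO TABLE logs_es
SELECT id, msg, ts
FROM   logs_hdfs
WHERE  dt = '${hivevar:load_date}';
```

If 'es.mapping.id' is set on the Elasticsearch-backed table, re-running a slice should also be reasonably safe, since documents with the same id are overwritten rather than duplicated.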