Hive Elasticsearch insert a date or timestamp #61

Closed
nmaillard opened this issue Jul 16, 2013 · 3 comments

Comments

@nmaillard

Hello everyone

I am right now playing around with this great piece of software.
I have looked at the code but have not found a clear answer.
Say I want to push a date field from Hive into an Elasticsearch index: how would I go about writing it?

-- mydate: TIMESTAMP or STRING?
CREATE EXTERNAL TABLE es_write (
  clientId STRING,
  mydate TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'test/test/');

Since Hive only knows timestamps, would I use these, or put in a string and let Elasticsearch do the trick if I format it correctly?

Thanks for your help and the software.

@costin
Member

costin commented Jul 16, 2013

Hi,

Mapping is currently work in progress - we're working on two main features:

  1. have direct type conversion (Hive/Pig/Cascading/etc... <-> ES types)
  2. allow mapping to be customized. This means doing things like field aliasing (field X in ES is mapped to Y in Hive, for example).

The current code base uses implicit naming for 2, and in the case of 1 it works for most types (though there are some corner cases where it fails).
That being said, try using a timestamp and let elasticsearch-hadoop do the conversion. Please report any failures.

Cheers

@costin
Member

costin commented Aug 5, 2013

Timestamp for Hive and datetime for Pig are supported in master.
In the case of Hive, use the timestamp type, or simply use a string that contains a date/time in ISO format (according to the format configured in ES - by default dateOptionalTime).
When adding data without a schema, the conversion will happen automatically. Additionally, one can create the index beforehand, causing the conversion (and thus validation) to be forced.
The conversion between Hive and ES happens automatically.

P.S. I recommend using the Timestamp type in Hive since you get to use the built-in UDFs.
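
For the string route mentioned above, a minimal Python sketch (not part of elasticsearch-hadoop) of producing an ISO 8601 string of the shape that the default dateOptionalTime format accepts. The helper name `to_iso_utc` and the sample value are hypothetical and only illustrate the format:

```python
from datetime import datetime, timedelta, timezone

def to_iso_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    if dt.tzinfo is None:
        # A naive datetime has no defined instant, so refuse to guess its zone.
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# Hypothetical sample value, used only to show the output shape.
sample = datetime(2013, 8, 5, 10, 15, 30, tzinfo=timezone.utc)
print(to_iso_utc(sample))
```

Writing the string with an explicit UTC designator avoids relying on whichever default time zone the writing process happens to use.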

@costin costin closed this as completed Aug 5, 2013
@SagarBalai

Hi @costin
I have seen some incorrect behaviour in the mapping of a Hive timestamp to the Elasticsearch date datatype.
Versions: Hive 1.6, Elasticsearch 2.4

In Hive, I have a column 'c1' of timestamp datatype with the value '2017-01-30 23:41:06'. With the Hive-ES integration (a Spark job using the third-party jar 'elasticsearch-hadoop-2.4.0.jar'), 'c1' is indexed with the value '1485848466000'.
When the same indexed value '1485848466000' is converted to java.util.Date, either with the format 'yyyy-MM-dd HH:mm:ss.SSS' (same as the Hive timestamp) or without any format, the output is '2017-01-31 13:11:06.000'.

In short,
'2017-01-30 23:41:06' ==> '1485848466000' ==> '2017-01-31 13:11:06.000'

Here the Hive date and the ES date are different. Could you please help?
Thanks in advance,
Sagar
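
The three values above are consistent with a time zone difference rather than a changed instant, assuming (this is not stated in the thread) that the value was written by a process running at UTC-8 and read back by one running at UTC+5:30. A minimal Python sketch of that arithmetic, with both offsets as assumptions inferred only from the numbers:

```python
from datetime import datetime, timedelta, timezone

EPOCH_MS = 1485848466000  # indexed value reported above

# Assumed offsets, inferred from the arithmetic only; not confirmed in the thread.
WRITER_TZ = timezone(timedelta(hours=-8))
READER_TZ = timezone(timedelta(hours=5, minutes=30))

def render(epoch_ms: int, tz: timezone) -> str:
    """Render epoch milliseconds as 'YYYY-MM-DD HH:MM:SS' wall-clock time in tz."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz).strftime("%Y-%m-%d %H:%M:%S")

print(render(EPOCH_MS, timezone.utc))  # 2017-01-31 07:41:06
print(render(EPOCH_MS, WRITER_TZ))     # 2017-01-30 23:41:06
print(render(EPOCH_MS, READER_TZ))     # 2017-01-31 13:11:06
```

If this is indeed the cause, the stored instant is the same in all three renderings, and formatting it with an explicit time zone on the reading side would reproduce the Hive value.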
