Hive to Elasticsearch Column names are lost #84

Closed
nmaillard opened this issue Sep 12, 2013 · 6 comments

Comments

@nmaillard nmaillard commented Sep 12, 2013

Hi everyone

While inserting data into Elasticsearch from Hive, I have run into this weird bug where the Elasticsearch columns all end up being called _col0 ... _colN.

In my earlier tests I don't remember this happening.

My test table and insert are:

CREATE EXTERNAL TABLE es_mock_write (
  user STRING,
  session STRING,
  medium_source STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'dmp_demo/db_demo/', 'es.host' = 'esaddr');

INSERT OVERWRITE TABLE es_mock_write
SELECT
  userid,
  concat_ws("|", collect_set(session)),
  concat_ws("|", collect_set(concat(medium, '/', source)))
FROM mock.mock_events

@vvakar vvakar commented Sep 12, 2013

I've had that happen once when my ES schema was out of sync with the Hive table. Did you set an explicit ES schema?
Also, I would try using column aliases in the INSERT statement (userid AS userid, ...).
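
For illustration, the alias suggestion could look like the sketch below. It reuses the table and column names from the original post and aliases each selected expression to the matching column of es_mock_write. The GROUP BY clause is an assumption: it does not appear in the posted statement, but collect_set is an aggregate function, so some grouping was presumably intended. The backticks around user are also an assumption, added because USER is a reserved word in newer Hive versions. Note that later in this thread aliases alone are reported not to fix the problem.

```sql
-- Sketch only: the posted INSERT with explicit column aliases.
-- Assumption: GROUP BY userid is not in the original post.
-- Assumption: `user` is quoted because USER is reserved in newer Hive versions.
INSERT OVERWRITE TABLE es_mock_write
SELECT
  userid                                                   AS `user`,
  concat_ws("|", collect_set(session))                     AS session,
  concat_ws("|", collect_set(concat(medium, '/', source))) AS medium_source
FROM mock.mock_events
GROUP BY userid;
```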

@nmaillard nmaillard commented Sep 26, 2013

Hello,

I have tried a couple of things: recreating different indexes and tables from scratch, and using column aliases.
Nothing seems to do the trick; I still have _col0 ... _colN in the resulting ES index.

@vvakar vvakar commented Sep 26, 2013

Can you post the Hive output of EXPLAIN?

Hive uses _col* as temporary variables and I wonder if that somehow made its way to the output.
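
As a rough sketch of what is being asked for, the plan can be obtained by prefixing the statement from the original post with EXPLAIN. The GROUP BY clause here is an assumption (it is not in the posted statement, but collect_set is an aggregate). In the resulting plan, Hive typically lists its internal names under entries such as outputColumnNames: _col0, _col1, ..., which is where names of this form normally live.

```sql
-- Sketch only: print the query plan instead of running the insert.
-- Assumption: GROUP BY userid is not in the original post.
EXPLAIN
INSERT OVERWRITE TABLE es_mock_write
SELECT
  userid,
  concat_ws("|", collect_set(session)),
  concat_ws("|", collect_set(concat(medium, '/', source)))
FROM mock.mock_events
GROUP BY userid;
```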


@costin costin commented Sep 26, 2013

Have you tried the latest build? It was published a couple of hours ago.

@nmaillard nmaillard commented Sep 30, 2013

Hi @costin, @vvakar,
OK, I used the latest build and it did the trick: the column names are back.
Thanks for the help.
I'll go ahead and close.

@nmaillard nmaillard closed this Sep 30, 2013

@costin costin commented Sep 30, 2013

Glad to hear it. Thanks!
