rhive.query("select * from abc limit 30000000") rhive.size.table rhive.load.table2 functions problems! #75

Closed
suolemen opened this issue Nov 6, 2014 · 14 comments

suolemen commented Nov 6, 2014

First problem: the table holds about 30,000,000 records.
When a table is this large, which function should I use to fetch the data set?
Neither rhive.query nor rhive.big.query works.

Second problem:
rhive.query("select * from kc_tel")
returns:
phoneno
1 13531542675
2 13531542297
3 13531541982
4 13531541667

But rhive.size.table("kc_tel") returns NULL.
Why is the result NULL?

Third problem:
rhive.load.table2("kc_tel") fails with:
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Can anyone help me? Thank you very much!

Contributor

ssshow16 commented Nov 6, 2014

Please share your environment settings.

You can print them with the RHive function "rhive.env()".

Thanks

Author

suolemen commented Nov 6, 2014

rhive.env()
hadoop home: /usr/lib/hadoop
fs: hdfs://hdktmaster.infobird.com:8020
hive home: /usr/lib/hive
user name: root
user home: /root
temp dir: /tmp/root

Contributor

ssshow16 commented Nov 6, 2014

The bug is fixed and a new version, "nexr-rhive-2.0.4", has been released.

Please try again!

Author

suolemen commented Nov 6, 2014

I installed RHive with:
install.packages("RHive")
How do I get "nexr-rhive-2.0.4"?

Do I need to reinstall the RHive package?

http://cran.r-project.org/
Package source: RHive_2.0-0.2.tar.gz

Contributor

ssshow16 commented Nov 6, 2014

It takes a long time to register an R package on CRAN, and I have not registered the new version yet.
So you have to download the new version from GitHub (https://github.com/nexr/RHive).

After that, build and install RHive.
There is an RHive install guide on the GitHub page.

Author

suolemen commented Nov 6, 2014

I have installed nexr-rhive-2.0.4.
rhive.load.table2("kc_tel") now works!

But rhive.size.table("kc_tel") still returns NULL.

tableName <- "kc_tel"
metaInfo <- .rhive.desc.table(tableName, detail=TRUE)
location <- strsplit(strsplit(as.character(metaInfo[[1]]), "location:")[[1]][2],",")[[1]][1]
The location does not get the right value!
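
For reference, the nested strsplit expression can be tried outside RHive on a plain string. The sketch below is only an illustration: extract_location is a hypothetical helper name, and the sample string is an abbreviated, assumed form of a detailed table description. It suggests that the expression itself works when the input contains "location:", and returns NA when the input is empty.

```r
# Hypothetical helper wrapping the expression quoted above:
# take the text after "location:" and cut it at the next comma.
extract_location <- function(desc) {
  strsplit(strsplit(as.character(desc), "location:")[[1]][2], ",")[[1]][1]
}

# Abbreviated sample of a detailed table description (assumed shape).
desc_ok <- paste0(
  "Table(tableName:kc_tel, dbName:default, sd:StorageDescriptor(",
  "cols:[FieldSchema(name:phoneno, type:string, comment:null)], ",
  "location:hdfs://hdktmaster.infobird.com:8020/user/hive/warehouse/kc_tel, ",
  "inputFormat:org.apache.hadoop.mapred.TextInputFormat)"
)

loc_ok <- extract_location(desc_ok)
loc_empty <- extract_location("")  # input without any "location:" marker

print(loc_ok)
print(loc_empty)
```

If the second call matches what you see, the problem is more likely in the description passed to the expression than in the parsing itself.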

Contributor

ssshow16 commented Nov 6, 2014

Which Hive version are you using?

Please step through the function in the debugger as shown below and share the output:

debug(.rhive.size.table)
rhive.size.table("kc_tel")
debugging in: .rhive.size.table(tableName = tableName)
debug: {
if (missing(tableName)) {
stop("missing tableName")
}
tableName <- tolower(tableName)
metaInfo <- .rhive.desc.table(tableName, detail = TRUE)
location <- strsplit(strsplit(as.character(metaInfo[[1]]),
"location:")[[1]][2], ",")[[1]][1]
dataInfo <- .rhive.hdfs.du(location, summary = TRUE)
return(dataInfo$length)
}
Browse[2]>
debug: if (missing(tableName)) {
stop("missing tableName")
}
Browse[2]>
debug: tableName <- tolower(tableName)
Browse[2]>
debug: metaInfo <- .rhive.desc.table(tableName, detail = TRUE)
Browse[2]>
debug: location <- strsplit(strsplit(as.character(metaInfo[[1]]), "location:")[[1]][2],
",")[[1]][1]
Browse[2]>
debug: dataInfo <- .rhive.hdfs.du(location, summary = TRUE)
Browse[2]> location

Please check whether location is correct!

Author

suolemen commented Nov 7, 2014

debug(.rhive.size.table)
rhive.size.table("kc_tel")
debugging in: .rhive.size.table(tableName = tableName)
debug: {
if (missing(tableName)) {
stop("missing tableName")
}
tableName <- tolower(tableName)
metaInfo <- .rhive.desc.table(tableName, detail = TRUE)
location <- strsplit(strsplit(as.character(metaInfo[[1]]),
"location:")[[1]][2], ",")[[1]][1]
dataInfo <- .rhive.hdfs.du(location, summary = TRUE)
return(dataInfo$length)
}
Browse[2]>
debug: if (missing(tableName)) {
stop("missing tableName")
}
Browse[2]>
debug: tableName <- tolower(tableName)
Browse[2]>
debug: metaInfo <- .rhive.desc.table(tableName, detail = TRUE)
Browse[2]>
debug: location <- strsplit(strsplit(as.character(metaInfo[[1]]), "location:")[[1]][2],
",")[[1]][1]
Browse[2]>
debug: dataInfo <- .rhive.hdfs.du(location, summary = TRUE)
Browse[2]> location
[1] NA
Browse[2]>

Author

suolemen commented Nov 7, 2014

hive version : hive-service-0.12.0-cdh5.0.0.jar
Release Notes - Hive - Version 0.12.0

Contributor

ssshow16 commented Nov 7, 2014

What is the result for "rhive.desc.table("kc_tel",detail=TRUE)"?

Author

suolemen commented Nov 7, 2014

rhive.desc.table("kc_tel",detail=TRUE)
X..
1

rhive.desc.table("kc_tel",detail=FALSE)
col_name data_type comment
1 phoneno string None

Author

suolemen commented Nov 7, 2014

Browse[2]> tableInfo <- .rhive.query(paste("DESCRIBE EXTENDED",tableName))
Browse[2]> res <- tableInfo[[2]][length(rownames(tableInfo))]
Browse[2]> res
[1]
3 Levels: ... Table(tableName:kc_tel, dbName:default, owner:hive, createTime:1409708239, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:phoneno, type:string, comment:null)], location:hdfs://hdktmaster.infobird.com:8020/user/hive/warehouse/kc_tel, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{colelction.delim=|, serialization.format=,, line.delim=

Browse[2]> str(res)
Factor w/ 3 levels "","string ",..: 1
Browse[2]> res <- lapply(res, function(v) { gsub("(^ +)|( +$)", "", v) })
Browse[2]> res
[[1]]
[1] ""
Browse[2]> as.data.frame(res)
X..
1

Contributor

ssshow16 commented Nov 7, 2014

I guess that your table's line.delim is '\n'.
RHive currently has a bug that affects this case.
I will fix it as soon as possible.

Until then, create the table again without setting line.delim and try again.
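
To illustrate the likely failure mode: if the table description contains a newline (the line.delim value), it may arrive split across rows, so that the last row no longer holds the description. The sketch below uses a hypothetical data frame as a stand-in for the DESCRIBE EXTENDED result (its column names and row layout are assumptions) and a hypothetical helper, find_location, that scans every cell instead of only the last one. It is not the actual RHive fix, only one possible way to make the lookup more tolerant.

```r
# Hypothetical stand-in for the DESCRIBE EXTENDED result: the description
# is cut at the newline, so the last row is empty.
table_info <- data.frame(
  col_name = c("phoneno", "", "Detailed Table Information", ""),
  data_type = c(
    "string",
    "",
    paste0(
      "Table(tableName:kc_tel, ",
      "location:hdfs://hdktmaster.infobird.com:8020/user/hive/warehouse/kc_tel, ",
      "parameters:{line.delim="
    ),
    ""
  ),
  stringsAsFactors = FALSE
)

# Taking only the last row of the second column yields an empty string.
last_cell <- table_info[[2]][nrow(table_info)]

# Assumed workaround: look for "location:" in every cell of the column.
find_location <- function(cells) {
  hit <- grep("location:", cells, fixed = TRUE, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  # Drop everything up to "location:", then everything from the next comma.
  sub(",.*$", "", sub("^.*location:", "", hit[1]))
}

location <- find_location(table_info[[2]])
print(last_cell)
print(location)
```

Scanning all cells avoids depending on which row the description ends up in, at the cost of assuming that "location:" appears only in the description.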

Author

suolemen commented Nov 7, 2014

OK, thank you.

I changed the table definition from

create table kc_tel (phoneno string)
row format delimited
fields terminated by ','
collection items terminated by '|'
lines terminated by '\n'
stored as textfile;

to

create table kc_tel (phoneno string)

Now rhive.size.table("kc_tel") returns the correct result! Thank you very much! -_-

@suolemen suolemen closed this as completed Nov 7, 2014