Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hive 常用语法汇总 #89

Open
myyyy opened this issue Apr 1, 2019 · 0 comments
Open

Hive 常用语法汇总 #89

myyyy opened this issue Apr 1, 2019 · 0 comments
Labels

Comments

@myyyy
Copy link
Owner

@myyyy myyyy commented Apr 1, 2019

Hive 常用语法汇总

  1. ROW_NUMBER()
从1开始,按照顺序,生成分组内记录的序列
SELECT cookieid,createtime,pv,
ROW_NUMBER() OVER(PARTITION BY cookieid ORDER BY pv desdatedatec) AS rn 
FROM lxw1234;
  1. lateral view json_tuple()
针对json数据格式解析的函数
lateral view json_tuple(t0.value, "B", "H") t1 as b, h
  1. explode
将hive一行中复杂的array或者map结构拆分成多行

select deviceid,sum(RecoveryDay.time) as duration from device.device_active_accumulator 
lateral view explode(access) t AS accessDay
where 
date=20190106 and accessDay.name="audiorc" and accessDay.ot=20190101
  1. 时间转换
转时间戳:
unix_timestamp('20180901','yyyyMMdd')*1000
转日期:
from_unixtime(1441565203,'yyyy/MM/dd HH:mm:ss')
  1. struct 解析
event|struct|{"eventType":"event","key":"boost","value":"1","params":{"model":"xxx","boosttype":"1"}}

 event.key
 event.params['model']
  1. regexp_replace
TODO
  1. parse_url
返回给定URL的指定部分,partToExtract的有效值包括HOST,PATH, QUERY, REF, PROTOCOL, AUTHORITY,FILE和USERINFO。例如:  parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') 返回 'facebook.com'.。当第二个参数为QUERY时,可以使用第三个参数提取特定参数的值,例如: parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1',
'QUERY', 'k1') 返回'v1'

8.union all

并行
set hive.exec.parallel=true;

set hive.exec.parallel.thread.number=8;

复杂的 map struct结构如何解析

array_contains(map_keys(installmap), "xxxxx") 判断是否包含 某个key
tva.lastReportInfo.versionName 取的 versionName

select tva.lastReportInfo.versionName,count(*) from(
select installmap['xxxxx'] as tva from xxx
where date= 20191007
and array_contains(map_keys(installmap), "xxxxx")
)t1
group by tva.lastReportInfo.versionName

参考

Hive常用字符串函数

@myyyy myyyy added the 数据分析 label Apr 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.