[CH] New byte buffer takes most of time in SourceFromJavalter::generate #4943
Comments
optoruntime::new_array_c is probably what gets passed in
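Cause: while the query runs, there are 26,200 new byte[1024*1024] allocations, an average of 78 per task, taking 8s in total, while the whole query takes only 30+s.
Question: why does the read path go through the copying OnHeapCopyShuffleInputStream instead of the zero-copy LowCopyNettyShuffleInputStream?
Call chain: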
public static ShuffleInputStream create(
    InputStream in, boolean forceCompress, boolean isCustomizedShuffleCodec) {
  final InputStream unwrapped = unwrapInputStream(in, forceCompress, isCustomizedShuffleCodec);
  if (unwrapped != null) {
    return createCompressedShuffleInputStream(in, unwrapped);
  }
  return new OnHeapCopyShuffleInputStream(in, false);
}

private static InputStream unwrapInputStream(
    InputStream in, boolean forceCompress, boolean isCustomizedShuffleCodec) {
  if (forceCompress) {
    return unwrapSparkInputStream(in);
  } else if (isCustomizedShuffleCodec) {
    return unwrapSparkWithCompressedInputStream(in);
  }
  return null;
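}

Because my local environment does not set Celeborn as the shuffle manager, the code ends up taking the OnHeapCopyShuffleInputStream path. The current implementation of OnHeapCopyShuffleInputStream is not very efficient, which is what ultimately causes the problem described in the title.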
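
To illustrate why repeated new byte[1024*1024] allocations are costly, here is a minimal sketch that contrasts allocating a fresh 1 MiB buffer on every read with reusing a single buffer. The class BufferReuseSketch and both methods are hypothetical names made up for this illustration; this is not the actual OnHeapCopyShuffleInputStream code, and the thread does not state whether the real allocations happen per read or per stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch; not Gluten's OnHeapCopyShuffleInputStream. */
public class BufferReuseSketch {
    // Same size as the byte[] allocations mentioned in the report.
    public static final int CHUNK = 1024 * 1024;

    /** Allocates a new byte[CHUNK] for every read call (assumed costly pattern). */
    public static long readAllocatingEachTime(InputStream in) throws IOException {
        long total = 0;
        while (true) {
            byte[] buf = new byte[CHUNK]; // fresh 1 MiB array, zero-filled by the JVM
            int n = in.read(buf, 0, buf.length);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    /** Allocates the buffer once and reuses it for every read call. */
    public static long readWithReusedBuffer(InputStream in) throws IOException {
        byte[] buf = new byte[CHUNK]; // single allocation for the whole stream
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) >= 0) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[3 * CHUNK + 123];
        long a = readAllocatingEachTime(new ByteArrayInputStream(data));
        long b = readWithReusedBuffer(new ByteArrayInputStream(data));
        // Both variants read the same number of bytes; only the allocation count differs.
        System.out.println(a + " bytes vs " + b + " bytes");
    }
}
```

Both methods return the same byte count; the second one simply avoids one large array allocation per read. Whether a reused buffer is safe in the real stream depends on who holds references to it after each read, which this sketch does not model.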
You may need to check your local call chain here. It should go through LowCopyFileSegmentShuffleInputStream, since the data is read directly from a local file.
Description
Reproduce sqls:
Two issues cc @baibaichen
memory.m_capacity in ReadBufferFromJavaInputStream::readFromJava should be memory.m_size.