Processing of siva files bigger than 2Gb #31
The siva reader uses MappedByteBuffer, and there is a limit in the JDK: https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/sun/nio/ch/FileChannelImpl.java#L788. Then there is an issue in Spark: it can't have a partition larger than 2 GB, and we can't change that.
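For context, the JDK limit referenced above is easy to reproduce with plain NIO: `FileChannel.map` rejects regions larger than `Integer.MAX_VALUE` bytes (~2 GB). The following is a minimal Scala sketch, not siva-java code, showing the failure mode and a chunked-mapping workaround; the file path and chunk size are placeholders.

```scala
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

object LargeFileRead {
  def main(args: Array[String]): Unit = {
    val path = Paths.get(args(0)) // placeholder: path to a > 2 GB .siva file
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      // Mapping the whole file at once is what hits the JDK limit:
      //   channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
      //   -> IllegalArgumentException: Size exceeds Integer.MAX_VALUE

      // Workaround sketch: map the file in chunks that each stay below 2 GB.
      val chunkSize = 256L * 1024 * 1024 // 256 MB per chunk, arbitrary choice
      val size = channel.size()
      var position = 0L
      while (position < size) {
        val length = math.min(chunkSize, size - position)
        val buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, length)
        // ... process `buffer` here ...
        position += length
      }
    } finally {
      channel.close()
    }
  }
}
```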
There are two issues here.
If it works, log an issue in https://github.com/src-d/siva-java.
But here is a test:
Big files are available in
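The test itself is not quoted above; purely as an illustration, a regression check for the mapping failure could be sketched like this. ScalaTest, the class name, and the fixture path are assumptions, not details from the thread.

```scala
import java.nio.channels.FileChannel
import java.nio.file.{Files, Paths, StandardOpenOption}

import org.scalatest.funsuite.AnyFunSuite

// Hypothetical regression test: reproduce the mapping failure on a > 2 GB
// siva file. The fixture path is a placeholder.
class BigSivaFileSpec extends AnyFunSuite {

  private val bigSiva = Paths.get("/path/to/big-fixture.siva")

  test("mapping a whole > 2 GB file in one region fails") {
    assume(Files.exists(bigSiva), "big fixture not available locally")
    val channel = FileChannel.open(bigSiva, StandardOpenOption.READ)
    try {
      assume(channel.size() > Int.MaxValue.toLong)
      // A single whole-file mapping is rejected by the JDK for regions
      // over Integer.MAX_VALUE bytes.
      assertThrows[IllegalArgumentException] {
        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
      }
    } finally {
      channel.close()
    }
  }
}
```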
Nice, why not link the issue here?
Good point, haha. I honestly believed I did, but no. Here you go: src-d/siva-java#18
The siva-java issue was fixed (src-d/siva-java#18 (comment)) and a new v0.1.3 was released, plus a new engine version that includes it.
With the latest engine, processing works.
There is a 2 GB limit for a job in Spark. We need to investigate whether it can be changed, if that is possible at all, and how changing it would affect Spark (the limit was introduced for some reason).
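Until that limit is lifted upstream, the usual Spark-side workaround is to keep each partition well under 2 GB by increasing the partition count. A rough sketch follows; the input/output paths and the partition count are placeholders, not values from this project.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative mitigation, not a fix for the underlying limit: Spark cannot
// handle a single partition/block over ~2 GB, so the practical workaround is
// to spread the data over more, smaller partitions.
object RepartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("siva-2gb-workaround")
      .getOrCreate()

    // Any large DataFrame works here; the source is a placeholder.
    val df = spark.read.parquet("/path/to/large/input")

    // Increase the partition count so no single partition approaches 2 GB
    // when it is shuffled or cached.
    val safer = df.repartition(2000) // assumed count, tune to the data size

    safer.write.parquet("/path/to/output") // placeholder path
    spark.stop()
  }
}
```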
If somebody else looks at this, here is a tip: it looks like the limit actually comes from the JVM, not Spark. I could be wrong, JFYI.
Exception: