Issues/54: on the multifile problem in spark-fits #55

Merged 6 commits on Oct 19, 2018


JulienPeloton
Member

What has changed?

This PR brings two major improvements:

How has this been tested?

The unit test suite passes, and additional integration tests were performed. All seems good, though I need to keep an eye on this.

Is there anything left?

Yes. For 20,000+ input files, the job blows up, throwing many errors at the same time (probably related to each other):

java.lang.AssertionError:
        HDU number 0 does not exist!

java.lang.ArithmeticException: / by zero

java.util.NoSuchElementException: key not found: BITPIX

org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1151505977-134.158.75.222-1469858775214:blk_1075726089_1986406

... (and then these four errors repeat in a loop)

I need to investigate a bit more.

JulienPeloton added the bug, enhancement, and IO labels on Oct 18, 2018
JulienPeloton self-assigned this on Oct 18, 2018
@JulienPeloton
Member Author

For the record, the multifile problem was solved with a minimal change: instead of looping over the files and unioning the resulting RDDs, I now give Spark the full list of files and it fills the partitions on its own... Just magic.
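
For readers who want to see the difference between the two loading strategies, here is a minimal sketch. It is not the spark-fits code: `sc.textFile` stands in for the project's own FITS reader (an assumption made only to keep the example short), and `MultiFileLoad`, `loadByUnion`, and `loadAtOnce` are hypothetical names. The second variant relies on the fact that Spark's Hadoop-based readers accept a comma-separated list of paths.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object, for illustration only.
object MultiFileLoad {

  // Previous approach: build one RDD per file and union them pairwise.
  // The lineage grows with the number of files.
  // Note: reduce throws on an empty list of files.
  def loadByUnion(sc: SparkContext, files: Seq[String]): RDD[String] =
    files.map(f => sc.textFile(f)).reduce(_ union _)

  // New approach: pass every path in a single call and let Spark
  // compute the partitions itself.
  // Assumption: textFile is a stand-in for the actual FITS reader.
  def loadAtOnce(sc: SparkContext, files: Seq[String]): RDD[String] =
    sc.textFile(files.mkString(","))
}
```

Both functions should return the same records; the single-call variant simply avoids building and chaining one RDD per input file.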

@codecov-io

codecov-io commented Oct 18, 2018

Codecov Report

Merging #55 into master will decrease coverage by 0.21%.
The diff coverage is 90.9%.


@@            Coverage Diff            @@
##           master     #55      +/-   ##
=========================================
- Coverage   89.52%   89.3%   -0.22%     
=========================================
  Files           9       9              
  Lines         487     477      -10     
  Branches       87      88       +1     
=========================================
- Hits          436     426      -10     
  Misses         51      51

Impacted Files                                          Coverage Δ
...strolabsoftware/sparkfits/FitsSourceRelation.scala   97.36% <90.9%> (-0.31%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dff168a...6789c37.

@JulienPeloton
Member Author

The last error probably arises because the open-file limit (ulimit -n) is too small (1024). I need to investigate.

@JulienPeloton
Member Author

OK - the problem with >> 10,000 files seems deeper than expected.
I will merge this PR since it solves a large part of the problem, and open a separate one for this remaining issue.
