pds-generate crashes when runs out of heap memory #31

Open
plawton-umd opened this issue Oct 4, 2022 · 2 comments

plawton-umd commented Oct 4, 2022

PDS4_SPACECRAFT_1H00_1000.sch.gz
PDS4_SPACECRAFT_1H00_1000.xsd.gz
Template_NH_Pepssi_data.vm.gz
pepssi_example.sh.gz

🐛 Describe the bug

Running pds-generate on 1274 PDS3 lbl files, each more than 2 MB in size,
crashes with the error output below:

pds-generate -p data//.lbl -t Template_test.vm
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.antlr.runtime.ANTLRReaderStream.load(ANTLRReaderStream.java:78)
at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:68)
at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:52)
at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:48)
at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:40)
at gov.nasa.pds.tools.label.parser.DefaultLabelParser.parseLabel(DefaultLabelParser.java:265)
at gov.nasa.pds.tools.label.parser.DefaultLabelParser.parseLabel(DefaultLabelParser.java:249)
at gov.nasa.pds.tools.label.parser.DefaultLabelParser.parseLabel(DefaultLabelParser.java:154)
at gov.nasa.pds.imaging.generate.readers.ProductToolsLabelReader.parseLabel(ProductToolsLabelReader.java:302)
at gov.nasa.pds.imaging.generate.label.PDS3Label.setMappings(PDS3Label.java:463)
at gov.nasa.pds.imaging.generate.GenerateLauncher.query(GenerateLauncher.java:225)
at gov.nasa.pds.imaging.generate.GenerateLauncher.main(GenerateLauncher.java:293)

Note that no PDS4 xml labels were produced.
When tests were run using small numbers of products (1, 2, 5, and 10), the percentage of memory
used at the start increased with the number of products included, kept increasing as processing
time passed, and was never seen to decrease.

📜 To Reproduce

Steps to reproduce the behavior:

  1. Go to 'https://pdssbn.astro.umd.edu/holdings/nh-a-pepssi-3-kem1-v5.0/dataset.shtml'
  2. Click on 'DOWNLOAD (1010.3 MB)'
  3. Create a working directory
  4. Download the checksum and tgz file
  5. Place them in the working directory
  6. Confirm the checksum and open the tgz file
  7. Create a link from the working directory to the nh-a-pepssi-3-kem1-v5.0/data directory
    ln -s nh-a-pepssi-3-kem1-v5.0/data data
  8. Download the 4 attached files and unzip them.
  9. Place the PDS4_SPACECRAFT_1H00_1000.sch and PDS4_SPACECRAFT_1H00_1000.xsd files in the working directory
  10. Place the Template_NH_Pepssi_data.vm file in the working directory
  11. Update the PDS4_SPACECRAFT_1H00_1000.sch and PDS4_SPACECRAFT_1H00_1000.xsd lines in Template_NH_Pepssi_data.vm with the correct local path
  12. Place pepssi_example.sh in working directory (I put the command in this small script since this form appears to be corrupting the command line)
  13. Update pepssi_example.sh with the correct local path
  14. Run pepssi_example.sh. Note: consider running the script via cron, since it may take a while
  15. Be sure to note if there are any xml files in the same data subdirectories as the lbl and fit files
  16. See error.
  17. To see the behavior with increased memory use, you can create new directories subset1, subset2, subset5, and subset10. Based on the number in the directory name, copy subdirectories from nh-a-pepssi-3-kem1-v5.0/data/ to the subset directories. Redirect the data link. Watch the information provided by top.
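Step 17 could be scripted roughly as follows. This is only a sketch: `make_subset` is a hypothetical helper name, and it assumes the number in each subset directory name is the count of subdirectories to copy from nh-a-pepssi-3-kem1-v5.0/data/, taken in sorted order.

```shell
#!/bin/sh
# Hypothetical helper (not part of pds-generate): copy the first N
# subdirectories of SRC, in glob (sorted) order, into DEST.
# Prints the number of subdirectories actually copied.
make_subset() {
    src=$1
    dest=$2
    n=$3
    mkdir -p "$dest" || return 1
    count=0
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        [ "$count" -lt "$n" ] || break
        # Strip the trailing slash so cp copies the directory itself,
        # not just its contents, on both GNU and BSD cp.
        dir=${dir%/}
        cp -R "$dir" "$dest"/ || return 1
        count=$((count + 1))
    done
    echo "$count"
}
```

For example, `make_subset nh-a-pepssi-3-kem1-v5.0/data subset5 5` followed by `ln -sfn subset5 data` would build a five-subdirectory subset and redirect the data link to it before rerunning pepssi_example.sh while watching top.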

🕵️ Expected behavior

PDS4 xml labels to be produced for each PDS3 lbl label and the program to not crash.

📚 Version of Software Used

/sbnops/lcltools/bin/pds-generate -V

gov.nasa.pds:mi-label
Version 1.2.2
Release Date: 2022-04-14 05:41:49

🩺 Test Data / Additional context

See Reproduce section. Note: that is the latest public version of the dataset we are having issues with. It only has 847 products versus 1274 products in the version in review. The public version can run on the server we are using for the migration work, but will crash on a laptop. The in review version is half again larger and crashes on the server. Specifications of the computer will matter, so it is suggested to not use your best computer for the testing.

🏞Screenshots

🖥 System Info

  • OS: RHEL 3.10.0-1160.66.1.el7.x86_64 #1 SMP Wed May 18 16:02:34 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Browser: n/a
  • Version [e.g. 22]

🦄 Related requirements

⚙️ Engineering Details

plawton-umd added the bug (Something isn't working) and needs:triage labels Oct 4, 2022
@jordanpadams
Member

@plawton-umd per our discussion, this may be an error in the MILabel logic, or a memory leak.

in the meantime, a workaround that may help get this working would be to increase your Java heap by updating the last line of the pds-generate command-line script.

Before:

${JAVA_CMD} -jar ${GENERATE_JAR} "$@"

After:

"${JAVA_CMD}" -Xms2048m -Xmx4096m -jar ${GENERATE_JAR} "$@"

Depending on how much memory is available and what else is running on the server where you are executing MILabel, you can continue to double each of those numbers until it stops failing. This is a temporary spike in memory usage while the software runs, but Java garbage collection will clean it all up as soon as (or very soon after) the run completes.
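The doubling approach could be automated along these lines. This is a sketch under stated assumptions, not part of the pds-generate distribution: `run_with_growing_heap`, `START_MB`, and `MAX_MB` are hypothetical names introduced here, while `JAVA_CMD` and `GENERATE_JAR` mirror the variables in the launcher line above. It retries after any non-zero exit, not only after an OutOfMemoryError, so check the output before trusting a retry.

```shell
#!/bin/sh
# Sketch: rerun the jar, doubling -Xmx (and keeping -Xms at half of it)
# after each failed attempt, until the run succeeds or MAX_MB is exceeded.
# START_MB and MAX_MB are hypothetical knobs for this example only.
# On success, heap_mb holds the -Xmx value (in MB) that worked.
run_with_growing_heap() {
    heap_mb=${START_MB:-4096}
    max_mb=${MAX_MB:-32768}
    while [ "$heap_mb" -le "$max_mb" ]; do
        echo "Trying -Xms$((heap_mb / 2))m -Xmx${heap_mb}m" >&2
        if "${JAVA_CMD:-java}" -Xms"$((heap_mb / 2))"m -Xmx"${heap_mb}"m \
               -jar "${GENERATE_JAR}" "$@"; then
            return 0
        fi
        # Any failure triggers a retry with twice the heap.
        heap_mb=$((heap_mb * 2))
    done
    echo "Still failing at the ${max_mb} MB limit" >&2
    return 1
}
```

With `GENERATE_JAR` set to the jar path used by the launcher, the function takes the same arguments as pds-generate itself (the `-p` and `-t` options shown in the report). Keep `MAX_MB` below the physical memory of the machine so the loop stops before the server starts swapping.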

let me know how this goes. we will add this to our backlog to investigate the issue further.

@jordanpadams
Member

@plawton-umd As a heads up, we have not yet had an opportunity to test this app thoroughly enough to determine the definitive cause here, but we just fixed a potential memory leak in the code that may have been a cause of this.

Projects
Status: ToDo
Development

No branches or pull requests

4 participants