Problems converting large DTBooks to ZedAI/EPUB3 #207

Open
ghost opened this issue Apr 13, 2014 · 4 comments

@ghost

ghost commented Apr 13, 2014

From john.bru...@gmail.com on August 30, 2012 23:38:55

What steps will reproduce the problem?

  1. Start the Pipeline with a large heap (e.g. export JAVA_MIN_MEM=2G && export JAVA_MAX_MEM=2G && export JAVA_PERM_MEM=128M && export JAVA_MAX_PERM_MEM=128M)
  2. Run a conversion of a large book (e.g. http://goo.gl/XNtgV )
    cli/dp2 dtbook-to-zedai --i-source ./Big_Book.xml --data /path/to/Big_Book.zip
    --file /path/to/Big_Book-output-zedai.zip --persistent
    --x-assert-valid false > ~/testing/epub3/Big_Book-client.log
  3. Follow the Pipeline log.
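
The heap settings from step 1 can also be written one per line and checked before launching, which makes it easier to confirm that the values actually reached the environment. This is only a sketch: it assumes the Pipeline launcher reads these variables from the environment, as step 1 implies, and the `env | grep` check is an addition that is not part of the original report.

```shell
#!/bin/sh
# Heap settings from step 1, one per line for readability.
export JAVA_MIN_MEM=2G
export JAVA_MAX_MEM=2G
export JAVA_PERM_MEM=128M
export JAVA_MAX_PERM_MEM=128M

# Assumed sanity check (not in the original report): list the memory
# variables that a process started from this shell would inherit.
env | grep '^JAVA_.*MEM=' | sort
```

Run the `cli/dp2` command from step 2 in the same shell session so that it inherits these values.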

Processing either slows to a crawl while the CPU is pegged, or fails with a "GC overhead limit exceeded" error.

What is the expected output? What do you see instead?

Instead of a successful conversion, I see this in the console log for the sample book linked above (1,060 pages, 1,333 images, 13.8 MB DTBook XML file):

INFO  [ o.d.common.xproc.calabash.steps.Message] bundle://31.0:1/xml/dtbook-to-zedai.convert.xpl:544:25:Message:file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/context/images/p0730_001.jpg --> file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/output/output-dir/images/p0730_001.jpg   @o.d.c.x.c.slf4jXProcMessageListener:80#info
DEBUG [ o.d.common.xproc.calabash.steps.Message] bundle://31.0:1/xml/dtbook-to-zedai.convert.xpl:544:25:Message step !1.69.7.7 read file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/context/./Glencoe_World_History.xml   @o.d.c.x.c.slf4jXProcMessageListener:119#finest
Aug 30, 2012 4:12:58 PM org.restlet.engine.log.LogFilter afterHandle
INFO: 2012-08-30    16:12:58    127.0.0.1   -   127.0.0.1   8182    GET /ws/jobs/72830d43-a693-4966-b4d5-926748790d40   msgSeq=20802    200 -   0   2951    http://localhost:8182   -   -
DEBUG [ o.d.common.xproc.calabash.steps.Message] Running { http://xmlcalabash.com/ns/extensions}message !1.69.7.7   @o.d.c.x.c.slf4jXProcMessageListener:93#fine
INFO  [ o.d.common.xproc.calabash.steps.Message] bundle://31.0:1/xml/dtbook-to-zedai.convert.xpl:544:25:Message:file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/context/images/readingcheck.jpg --> file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/output/output-dir/images/readingcheck.jpg   @o.d.c.x.c.slf4jXProcMessageListener:80#info
DEBUG [ o.d.common.xproc.calabash.steps.Message] bundle://31.0:1/xml/dtbook-to-zedai.convert.xpl:544:25:Message step !1.69.7.7 read file:/Users/john/testing/epub3/bks-dp2/daisy-pipeline/data/72830d43-a693-4966-b4d5-926748790d40/context/./Glencoe_World_History.xml   @o.d.c.x.c.slf4jXProcMessageListener:119#finest
Aug 30, 2012 4:13:04 PM org.restlet.engine.log.LogFilter afterHandle
INFO: 2012-08-30    16:13:04    127.0.0.1   -   127.0.0.1   8182    GET /ws/jobs/72830d43-a693-4966-b4d5-926748790d40   msgSeq=20804    200 -   0   2966    http://localhost:8182   -   -
INFO  [  o.d.p.webservice.PipelineStatusService] Error caught from application(resource): Internal Server Error   @o.d.p.w.PipelineStatusService:50#getStatus
Exception in thread "Thread-11" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:215)
    at java.lang.StringBuffer.toString(StringBuffer.java:585)
    at java.net.URI.defineString(URI.java:1962)
    at java.net.URI.toString(URI.java:1594)
    at net.sf.saxon.functions.ResolveURI.resolve(ResolveURI.java:140)
    at net.sf.saxon.functions.ResolveURI.evaluateItem(ResolveURI.java:95)
    at net.sf.saxon.expr.GeneralComparison.effectiveBooleanValue(GeneralComparison.java:596)
    at net.sf.saxon.expr.FilterIterator$NonNumeric.matches(FilterIterator.java:198)
    at net.sf.saxon.expr.FilterIterator.getNextMatchingItem(FilterIterator.java:80)
    at net.sf.saxon.expr.FilterIterator.next(FilterIterator.java:59)
    at net.sf.saxon.expr.ContextMappingIterator.next(ContextMappingIterator.java:52)
    at net.sf.saxon.expr.instruct.BlockIterator.next(BlockIterator.java:44)
    at net.sf.saxon.expr.FirstItemExpression.evaluateItem(FirstItemExpression.java:56)
    at net.sf.saxon.expr.Expression.iterate(Expression.java:429)
    at net.sf.saxon.sxpath.XPathExpression.iterate(XPathExpression.java:154)
    at net.sf.saxon.s9api.XPathSelector.iterator(XPathSelector.java:201)
    at com.xmlcalabash.runtime.XAtomicStep.evaluateXPath(Unknown Source)
    at com.xmlcalabash.runtime.XAtomicStep.computeValue(Unknown Source)
    at com.xmlcalabash.runtime.XForEach.run(Unknown Source)
    at com.xmlcalabash.runtime.XCompoundStep.run(Unknown Source)
    at com.xmlcalabash.runtime.XPipeline.doRun(Unknown Source)
    at com.xmlcalabash.runtime.XPipeline.run(Unknown Source)
    at com.xmlcalabash.runtime.XPipelineCall.run(Unknown Source)
    at com.xmlcalabash.runtime.XPipeline.doRun(Unknown Source)
    at com.xmlcalabash.runtime.XPipeline.run(Unknown Source)
    at org.daisy.common.xproc.calabash.CalabashXProcPipeline.run(CalabashXProcPipeline.java:239)
    at org.daisy.pipeline.job.Job.run(Job.java:131)
    at org.daisy.pipeline.job.DefaultJobExecutionService$1.run(DefaultJobExecutionService.java:69)
    at java.lang.Thread.run(Thread.java:680)
Aug 30, 2012 4:13:24 PM org.restlet.resource.ServerResource doCatch
WARNING: Exception or error caught in server resource
Internal Server Error (500) - The server encountered an unexpected condition which prevented it from fulfilling the request
    at org.restlet.resource.ServerResource.doHandle(ServerResource.java:510)
    at org.restlet.resource.ServerResource.get(ServerResource.java:700)
    at org.restlet.resource.ServerResource.doHandle(ServerResource.java:582)
    at org.restlet.resource.ServerResource.doNegotiatedHandle(ServerResource.java:642)
    at org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:341)
    at org.restlet.resource.ServerResource.handle(ServerResource.java:944)
    at org.restlet.resource.Finder.handle(Finder.java:246)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at org.restlet.routing.Filter.handle(Filter.java:206)
    at org.restlet.routing.Router.doHandle(Router.java:431)
    at org.restlet.routing.Router.handle(Router.java:648)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at org.restlet.routing.Filter.handle(Filter.java:206)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at org.restlet.routing.Filter.handle(Filter.java:206)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:154)
    at org.restlet.routing.Filter.handle(Filter.java:206)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at org.restlet.routing.Filter.handle(Filter.java:206)
    at org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:211)
    at org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:84)
    at org.restlet.Application.handle(Application.java:381)
    at org.restlet.routing.Filter.doHandle(Filter.java:159)
    at o...

Original issue: http://code.google.com/p/daisy-pipeline/issues/detail?id=207

@ghost
Author

ghost commented Apr 13, 2014

From rdeltour@gmail.com on August 31, 2012 17:44:45

Some recent (uncommitted) changes improve the dtbook-to-zedai conversion: the given sample succeeds with the suggested memory settings.

I still have not tested the zedai-to-epub3 part.

Status: Accepted
Labels: Component-Modules Module-dtbook-to-zedai Module-zedai-to-epub3

@ghost
Author

ghost commented Apr 13, 2014

From rdeltour@gmail.com on September 07, 2012 05:39:25

Owner: rdeltour@gmail.com

@bertfrees
Member

@mccallum-sgd Can you also try to reproduce this?

@bertfrees
Member

This has lower priority because ideally we should wait for the removal of the ZedAI step in dtbook-to-epub3, and dtbook-to-zedai is probably not used in production by anyone.
