Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to use store for all temp IO #3548

Merged
merged 2 commits into from
Aug 7, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 9 additions & 3 deletions src/main/java/org/dita/dost/ant/ExtensibleAntInvoker.java
Original file line number Diff line number Diff line change
Expand Up @@ -307,7 +307,7 @@ public static Job getJob(final Project project) {
Job job = project.getReference(ANT_REFERENCE_JOB);
if (job != null) {
if (job.isStale()) {
project.log("Reload stale job configuration reference", Project.MSG_VERBOSE);
project.log("Reload stale job configuration reference", Project.MSG_ERR);
try {
job = new Job(tempDir, job.getStore());
} catch (final IOException ioe) {
Expand All @@ -316,10 +316,16 @@ public static Job getJob(final Project project) {
project.addReference(ANT_REFERENCE_JOB, job);
}
} else {
XMLUtils xmlUtils = project.getReference(ANT_REFERENCE_XML_UTILS);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store XMLUtils into Ant project reference, because we want to reuse the same Saxon Processor everywhere. This allows reusing the same name pool etc. that's needed for in-memory Store implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious -- in the past we had memory crashes in steps where Saxon ended up reading 5,000 or 10,000 topics (like the mappull step). Is reusing the same Saxon Processor going to have any impact on that sort of case, either better or worse, or is there no change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the Processor class is what maintains the name pool etc. The XsltExecutable is the one that would be a problem and we don't reuse it by default.

if (xmlUtils == null) {
project.log("XML utils not found from Ant project reference", Project.MSG_ERR);
xmlUtils = new XMLUtils();
xmlUtils.setLogger(new DITAOTAntLogger(project));
}
Store store = project.getReference(ANT_REFERENCE_STORE);
if (store == null) {
project.log("Store not found from Ant project reference", Project.MSG_VERBOSE);
store = new StreamStore( new XMLUtils());
project.log("Store not found from Ant project reference", Project.MSG_ERR);
store = new StreamStore(job.tempDir, xmlUtils);
}
project.log("Job not found from Ant project reference", Project.MSG_VERBOSE);
try {
Expand Down
35 changes: 29 additions & 6 deletions src/main/java/org/dita/dost/ant/InitializeProjectTask.java
Original file line number Diff line number Diff line change
Expand Up @@ -12,14 +12,14 @@
import org.apache.tools.ant.Task;
import org.dita.dost.log.DITAOTAntLogger;
import org.dita.dost.store.Store;
import org.dita.dost.store.StreamStore;
import org.dita.dost.store.StoreBuilder;
import org.dita.dost.util.CatalogUtils;
import org.dita.dost.util.XMLUtils;

import java.io.File;
import java.util.ServiceLoader;

import static org.dita.dost.util.Constants.ANT_REFERENCE_STORE;
import static org.dita.dost.util.Constants.ANT_REFERENCE_XML_UTILS;
import static org.dita.dost.util.Constants.*;
import static org.dita.dost.util.URLUtils.toFile;

/**
Expand All @@ -28,6 +28,11 @@
* @since 3.5
*/
public final class InitializeProjectTask extends Task {

private static ServiceLoader<StoreBuilder> storeBuilderLoader = ServiceLoader.load(StoreBuilder.class);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use service loader to load StoreBuilder which in turn can be used to create the actual Store implementations.


private String storeType = "file";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default StoreBuilder name, plain old temporary directory in disk implementation.


@Override
public void execute() throws BuildException {
log("Initializing project", Project.MSG_INFO);
Expand All @@ -42,10 +47,28 @@ public void execute() throws BuildException {
xmlUtils.setLogger(new DITAOTAntLogger(getProject()));
getProject().addReference(ANT_REFERENCE_XML_UTILS, xmlUtils);
}
final Store store = getStore(xmlUtils);
getProject().addReference(ANT_REFERENCE_STORE, store);
}

private Store getStore(XMLUtils xmlUtils) {
Store store = getProject().getReference(ANT_REFERENCE_STORE);
if (store == null) {
store = new StreamStore(xmlUtils);
getProject().addReference(ANT_REFERENCE_STORE, store);
if (store != null) {
return store;
}
File tempDir = toFile(getProject().getUserProperty(ANT_TEMP_DIR));
if (tempDir == null) {
tempDir = toFile(getProject().getProperty(ANT_TEMP_DIR));
}
for (StoreBuilder storeBuilder : storeBuilderLoader) {
if (storeBuilder.getType().equals(storeType)) {
return storeBuilder.setTempDir(tempDir).setXmlUtils(xmlUtils).build();
}
}
throw new BuildException(String.format("Unsupported store type %s", storeType));
}

public void setStoreType(final String storeType) {
this.storeType = storeType;
}
}
5 changes: 3 additions & 2 deletions src/main/java/org/dita/dost/ant/types/JobSourceSet.java
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
import org.apache.tools.ant.types.resources.FileResource;
import org.apache.tools.ant.types.resources.URLResource;
import org.dita.dost.ant.ExtensibleAntInvoker;
import org.dita.dost.store.ant.types.StoreResource;
import org.dita.dost.util.Constants;
import org.dita.dost.util.FileUtils;
import org.dita.dost.util.Job;
Expand Down Expand Up @@ -58,9 +59,9 @@ private Collection<Resource> getResults() {
for (final FileInfo f : job.getFileInfo(this::filter)) {
log("Scanning for " + f.file.getPath(), Project.MSG_VERBOSE);
final File tempFile = new File(job.tempDir, f.file.getPath());
if (tempFile.exists()) {
if (job.getStore().exists(tempFile.toURI())) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All cases where we used to test file existance with File.exists(), we use Store.exists(URI) instead. The default StreamStore implementation will just test if the file exists. The added abstraction here allows in-memory Store to first check if the resource exists in memory, then fall back to checking if the resource exists in disk.

log("Found temporary directory file " + tempFile, Project.MSG_VERBOSE);
res.add(new FileResource(job.tempDir, f.file.toString()));
res.add(new StoreResource(job, job.tempDirURI.relativize(f.uri)));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom Ant resource that can read from any Store, not just disk.

} else if (f.src.getScheme().equals("file")) {
final File srcFile = new File(f.src);
if (srcFile.exists()) {
Expand Down
5 changes: 1 addition & 4 deletions src/main/java/org/dita/dost/module/BranchFilterModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,7 @@ protected void processMap(final URI map) {
logger.debug("Writing " + currentFile);

try {
doc.setDocumentURI(currentFile.toString());
job.getStore().writeDocument(doc, currentFile);
} catch (final IOException e) {
logger.error("Failed to serialize " + map.toString() + ": " + e.getMessage(), e);
Expand Down Expand Up @@ -374,10 +375,6 @@ private void generateCopies(final Element topicref, final List<FilterUtils> filt
writer.setCurrentFile(dstAbsUri);
final List<XMLFilter> pipe = singletonList(writer);

final File dstDirUri = new File(dstAbsUri.resolve("."));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to create directories into temporary directory in disk, the Store will handle that when resource is created.

if (!dstDirUri.exists() && !dstDirUri.mkdirs()) {
logger.error("Failed to create directory " + dstDirUri);
}
try {
job.getStore().transform(srcAbsUri, dstAbsUri, pipe);
} catch (final DITAOTException e) {
Expand Down
24 changes: 14 additions & 10 deletions src/main/java/org/dita/dost/module/ChunkModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,6 @@

import static net.sf.saxon.s9api.streams.Steps.attribute;
import static net.sf.saxon.s9api.streams.Steps.child;
import static org.apache.commons.io.FileUtils.deleteQuietly;
import static org.apache.commons.io.FileUtils.moveFile;
import static org.dita.dost.util.Constants.*;
import static org.dita.dost.util.FileUtils.getExtension;
import static org.dita.dost.util.URLUtils.getRelativePath;
Expand Down Expand Up @@ -233,10 +231,14 @@ private void updateList(final Map<URI, URI> changeTable, final Map<URI, URI> con
}
// removed extra topic files
for (final URI s : oldTopicList) {
final File f = new File(job.tempDirURI.resolve(s));
logger.debug("Delete " + f.toURI());
if (f.exists() && !f.delete()) {
logger.error("Failed to delete " + f.getAbsolutePath());
final URI f = job.tempDirURI.resolve(s);
if (job.getStore().exists(f)) {
logger.debug("Delete " + f);
try {
job.getStore().delete(f);
} catch (IOException e) {
logger.error("Failed to delete " + f);
}
}
}

Expand All @@ -252,7 +254,7 @@ private void updateList(final Map<URI, URI> changeTable, final Map<URI, URI> con
final URI targetPath = conflictTable.get(oldFile);
if (targetPath != null) {
final URI target = targetPath;
if (!new File(target).exists()) {
if (!job.getStore().exists(target)) {
// newly chunked file
URI relativePath = getRelativePath(xmlDitalist, from);
final URI relativeTargetPath = getRelativePath(xmlDitalist, target);
Expand All @@ -262,10 +264,12 @@ private void updateList(final Map<URI, URI> changeTable, final Map<URI, URI> con
}
// ensure the newly chunked file to the old one
try {
logger.debug("Delete " + target);
deleteQuietly(new File(target));
if (job.getStore().exists(target)) {
logger.debug("Delete " + target);
job.getStore().delete(target);
}
logger.debug("Move " + from + " to " + target);
moveFile(new File(from), new File(target));
job.getStore().move(from, target);
final FileInfo fi = job.getFileInfo(from);
if (fi != null) {
job.remove(fi);
Expand Down
7 changes: 4 additions & 3 deletions src/main/java/org/dita/dost/module/CleanPreprocessModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
import org.dita.dost.util.Job;
import org.dita.dost.util.Job.FileInfo;
import org.dita.dost.util.URLUtils;
import org.dita.dost.util.XMLUtils;
import org.dita.dost.writer.AbstractXMLFilter;
import org.dita.dost.writer.LinkFilter;
import org.dita.dost.writer.MapCleanFilter;
Expand Down Expand Up @@ -129,7 +130,7 @@ public AbstractPipelineOutput execute(final Map<String, String> input) throws DI
logger.debug("Skip format " + fi.format);
} else {
final File srcFile = new File(job.tempDirURI.resolve(fi.uri));
if (srcFile.exists()) {
if (job.getStore().exists(srcFile.toURI())) {
final File destFile = new File(job.tempDirURI.resolve(fi.result));
final List<XMLFilter> processingPipe = getProcessingPipe(fi, srcFile, destFile);
if (!processingPipe.isEmpty()) {
Expand Down Expand Up @@ -196,12 +197,12 @@ private Collection<FileInfo> rewrite(final Collection<FileInfo> fis) throws DITA

private Document serialize(final Collection<FileInfo> fis) throws IOException {
try {
final Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
final Document doc = XMLUtils.getDocumentBuilder().newDocument();
final DOMResult result = new DOMResult(doc);
XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(result);
job.serialize(out, emptyMap(), fis);
return (Document) result.getNode();
} catch (final XMLStreamException | ParserConfigurationException e) {
} catch (final XMLStreamException e) {
throw new IOException("Failed to serialize job file: " + e.getMessage());
}
}
Expand Down
4 changes: 2 additions & 2 deletions src/main/java/org/dita/dost/module/ConrefPushModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -39,13 +39,13 @@ public AbstractPipelineOutput execute(final AbstractPipelineInput input) {
reader.setJob(job);
for (final FileInfo f: fis) {
final File file = new File(job.tempDir, f.file.getPath());
logger.info("Reading " + file.getAbsolutePath());
logger.info("Reading " + file.toURI());
//FIXME: this reader calculate parent directory
reader.read(file.getAbsoluteFile());
}
final Map<File, Hashtable<MoveKey, DocumentFragment>> pushSet = reader.getPushMap();
for (final Map.Entry<File, Hashtable<MoveKey, DocumentFragment>> entry: pushSet.entrySet()) {
logger.info("Processing " + entry.getKey().getAbsolutePath());
// logger.info("Processing " + entry.getKey().toURI());
final ConrefPushParser parser = new ConrefPushParser();
parser.setJob(job);
parser.setLogger(logger);
Expand Down
6 changes: 1 addition & 5 deletions src/main/java/org/dita/dost/module/CopyToModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,7 @@ private void performCopytoTask(final Map<FileInfo, FileInfo> copyToMap) {
final URI srcFile = job.tempDirURI.resolve(copytoSource);
final URI targetFile = job.tempDirURI.resolve(copytoTarget);

if (new File(targetFile).exists()) {
if (job.getStore().exists(targetFile)) {
logger.warn(MessageUtils.getMessage("DOTX064W", copytoTarget.getPath()).toString());
} else {
final FileInfo input = job.getFileInfo(fi -> fi.isInput).iterator().next();
Expand Down Expand Up @@ -193,10 +193,6 @@ private void copyFileWithPIReplaced(final URI src, final URI target, final URI c
assert !copytoTargetFilename.isAbsolute();
assert inputMapInTemp.isAbsolute();
final File workdir = new File(target).getParentFile();
if (!workdir.exists() && !workdir.mkdirs()) {
logger.error("Failed to create copy-to target directory " + workdir.toURI());
return;
}
final File path2project = getPathtoProject(copytoTargetFilename, target, inputMapInTemp, job);
final File path2rootmap = getPathtoRootmap(target, inputMapInTemp);
XMLFilter filter = new CopyToFilter(workdir, path2project, path2rootmap, src, target);
Expand Down
14 changes: 2 additions & 12 deletions src/main/java/org/dita/dost/module/DebugAndFilterModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
*/
package org.dita.dost.module;

import net.sf.saxon.s9api.Serializer;
import org.dita.dost.exception.DITAOTException;
import org.dita.dost.exception.DITAOTXMLErrorHandler;
import org.dita.dost.module.reader.TempFileNameScheme;
Expand Down Expand Up @@ -121,11 +120,6 @@ private void processFile(final FileInfo f) {
return;
}
outputFile = new File(job.tempDir, f.file.getPath());
final File outputDir = outputFile.getParentFile();
if (!outputDir.exists() && !outputDir.mkdirs()) {
logger.error("Failed to create output directory " + outputDir.getAbsolutePath());
return;
}
logger.info("Processing " + f.src + " to " + outputFile.toURI());

final Set<URI> schemaSet = dic.get(f.uri);
Expand Down Expand Up @@ -167,9 +161,9 @@ private void processFile(final FileInfo f) {

in = new InputSource(f.src.toString());

final Serializer result = processor.newSerializer(outputFile);
final ContentHandler result = job.getStore().getContentHandler(outputFile.toURI());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of manually creating a S9api Serializer ask for a ContentHandler from the Store; all IO to temp goes through Store.


xmlSource.setContentHandler(result.getContentHandler());
xmlSource.setContentHandler(result);
xmlSource.parse(in);
} catch (final RuntimeException e) {
throw e;
Expand Down Expand Up @@ -476,10 +470,6 @@ private Element searchForKey(final Element root, final String key) {
* @throws DITAOTException if generation fails
*/
private void generateScheme(final File filename, final Document root) throws DITAOTException {
final File p = filename.getParentFile();
if (!p.exists() && !p.mkdirs()) {
throw new DITAOTException("Failed to make directory " + p.getAbsolutePath());
}
try {
job.getStore().writeDocument(root, filename.toURI());
} catch (final IOException e) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -885,9 +885,6 @@ private void outputResult() throws DITAOTException {

try {
logger.info("Serializing job specification");
if (!job.tempDir.exists() && !job.tempDir.mkdirs()) {
throw new DITAOTException("Failed to create " + job.tempDir + " directory");
}
job.write();
} catch (final IOException e) {
throw new DITAOTException("Failed to serialize job configuration files: " + e.getMessage(), e);
Expand Down
1 change: 1 addition & 0 deletions src/main/java/org/dita/dost/module/KeyrefModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -440,6 +440,7 @@ private Document readMap(final FileInfo input) throws DITAOTException {
private void writeMap(final FileInfo in, final Document doc) throws DITAOTException {
try {
final URI file = job.tempDirURI.resolve(in.uri);
doc.setDocumentURI(file.toString());
job.getStore().writeDocument(doc, file);
} catch (final IOException e) {
throw new DITAOTException("Failed to write map: " + e.getMessage(), e);
Expand Down
18 changes: 11 additions & 7 deletions src/main/java/org/dita/dost/module/MaprefModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -7,22 +7,23 @@
*/
package org.dita.dost.module;

import com.google.common.io.Files;
import net.sf.saxon.s9api.*;
import net.sf.saxon.trans.UncheckedXPathException;
import net.sf.saxon.trans.XPathException;
import org.dita.dost.exception.DITAOTException;
import org.dita.dost.pipeline.AbstractPipelineInput;
import org.dita.dost.pipeline.AbstractPipelineOutput;
import org.dita.dost.util.CatalogUtils;
import org.dita.dost.util.DelegatingURIResolver;
import org.dita.dost.util.Job.FileInfo;
import org.dita.dost.util.XMLUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.io.*;
import java.io.File;
import java.io.IOException;
import java.util.Collection;
import java.util.List;

Expand Down Expand Up @@ -93,14 +94,16 @@ private void processMap(final FileInfo input) throws DITAOTException {
Document doc;
try {
doc = XMLUtils.getDocumentBuilder().newDocument();
final Source source = job.getStore().getSource(inputFile.toURI());
final Destination serializer = new DOMDestination(doc);
final XsltTransformer transformer = templates.load();
transformer.setErrorListener(toErrorListener(logger));
transformer.setURIResolver(new DelegatingURIResolver(CatalogUtils.getCatalogResolver(), job.getStore()));
transformer.setParameter(new QName("file-being-processed"), XdmItem.makeValue(inputFile.getName()));

final Source source = job.getStore().getSource(inputFile.toURI());
transformer.setSource(source);
final Destination serializer = new DOMDestination(doc);
serializer.setDestinationBaseURI(inputFile.toURI());
transformer.setDestination(serializer);
transformer.setURIResolver(CatalogUtils.getCatalogResolver());
transformer.setParameter(new QName("file-being-processed"), XdmItem.makeValue(inputFile.getName()));
transformer.transform();
} catch (final UncheckedXPathException e) {
throw new DITAOTException("Failed to merge map " + inputFile, e);
Expand All @@ -122,6 +125,7 @@ private void processMap(final FileInfo input) throws DITAOTException {
job.add(updated);

try {
doc.setDocumentURI(outputFile.toURI().toString());
job.getStore().writeDocument(doc, outputFile.toURI());
} catch (final IOException e) {
throw new DITAOTException("Failed to serialize map " + inputFile + ": " + e.getMessage(), e);
Expand Down Expand Up @@ -157,7 +161,7 @@ private void replace(final FileInfo input) throws DITAOTException {
final File inputFile = new File(job.tempDir, input.file.getPath() + FILE_EXTENSION_TEMP);
final File outputFile = new File(job.tempDir, input.file.getPath());
try {
Files.move(inputFile, outputFile);
job.getStore().move(inputFile.toURI(), outputFile.toURI());
} catch (final IOException e) {
throw new DITAOTException("Failed to replace temporary file " + inputFile + ": " + e.getMessage(), e);
}
Expand Down
4 changes: 4 additions & 0 deletions src/main/java/org/dita/dost/module/MoveLinksModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@
import org.dita.dost.exception.DITAOTException;
import org.dita.dost.pipeline.AbstractPipelineInput;
import org.dita.dost.pipeline.AbstractPipelineOutput;
import org.dita.dost.util.CatalogUtils;
import org.dita.dost.util.DelegatingURIResolver;
import org.dita.dost.util.Job.FileInfo;
import org.dita.dost.util.XMLUtils;
import org.dita.dost.writer.DitaLinksWriter;
Expand Down Expand Up @@ -60,6 +62,8 @@ public AbstractPipelineOutput execute(final AbstractPipelineInput input) throws

final XsltTransformer transformer = xsltCompiler.compile(new StreamSource(styleFile)).load();
transformer.setErrorListener(toErrorListener(logger));
transformer.setURIResolver(new DelegatingURIResolver(CatalogUtils.getCatalogResolver(), job.getStore()));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DelegatingURIResolver allow chaining XML catalog resolver and Store (which implements URIResolver), so that in the end Store will return the Source that is used to read the resource.


if (input.getAttribute("include.rellinks") != null) {
transformer.setParameter(new QName("include.rellinks"),
XdmItem.makeValue(input.getAttribute("include.rellinks")));
Expand Down
Loading