Debugging module loading

Rob Rudin edited this page Jan 19, 2024 · 22 revisions

Under the hood, ml-gradle uses the MarkLogic Java Client to connect to a MarkLogic REST API server and load modules. So when you run into errors with loading modules, it's often helpful to run a quick test that uses the Java Client to confirm that you can connect to your REST API server outside the scope of ml-gradle.

Debugging module loading without involving ml-gradle

Here's a build.gradle file with a task that you can customize and run for loading modules via port 8000 (this includes all "asset" modules - i.e. not REST API services, transforms, or options, which must be loaded via your application-specific REST API server):

buildscript {
    repositories { 
        mavenCentral() 
    }
    dependencies { 
        classpath "com.marklogic:marklogic-client-api:6.1.0" 
    }
}

task testLoadModule {
    doLast {
        // See https://docs.marklogic.com/javadoc/client/com/marklogic/client/DatabaseClientFactory.html
        def host = "localhost"
        def port = 8000
        // Change this to the database you want to write the test module to (e.g. your modules database)
        def database = "Documents"
        def username = "admin"
        def password = "admin"

        // See https://docs.marklogic.com/javadoc/client/com/marklogic/client/DatabaseClientFactory.SecurityContext.html
        def context = new com.marklogic.client.DatabaseClientFactory.DigestAuthContext(username, password)
        def client = com.marklogic.client.DatabaseClientFactory.newClient(host, port, database, context)
        try {
            client.newDocumentManager().write("/test/module.xqy", new com.marklogic.client.io.StringHandle("<hello>world</hello>"))
        } finally {
            client.release()
        }
    }
}

You can use this as a starting point for any sort of debugging test, such as for loading a service or transform. Just check out the Java Client javadocs to see what calls need to be made.
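
As one illustration of adapting the task, here is a minimal sketch that loads an XQuery transform through the Java Client's TransformExtensionsManager. It assumes the buildscript block shown above is present. The port value 8010, the task name testLoadTransform, and the transform name test-transform are all hypothetical; replace them with values from your own project. Note that a transform must be loaded via your application-specific REST API server rather than port 8000.

```groovy
// Sketch only: relies on the buildscript block shown earlier on this page.
task testLoadTransform {
    doLast {
        def host = "localhost"
        def port = 8010 // assumption: replace with the port of your REST API server
        def username = "admin"
        def password = "admin"

        // Hypothetical transform name; the module namespace below must end with the same name
        def transformName = "test-transform"

        // Single-quoted Groovy string so that "$content" etc. are not treated as GString expressions
        def source = '''xquery version "1.0-ml";
module namespace transform = "http://marklogic.com/rest-api/transform/test-transform";

declare function transform:transform(
  $context as map:map,
  $params as map:map,
  $content as document-node()
) as document-node()
{
  $content
};
'''

        def context = new com.marklogic.client.DatabaseClientFactory.DigestAuthContext(username, password)
        def client = com.marklogic.client.DatabaseClientFactory.newClient(host, port, context)
        try {
            // Transforms are installed through the server configuration manager, not the document manager
            def transformManager = client.newServerConfigManager().newTransformExtensionsManager()
            transformManager.writeXQueryTransform(transformName, new com.marklogic.client.io.StringHandle(source))
            println "Loaded transform: ${transformName}"
        } finally {
            client.release()
        }
    }
}
```

If this task succeeds but ml-gradle still fails to load your own transform, the connection and credentials are probably fine, and the problem is more likely in the transform module itself or in the properties ml-gradle is using.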

You can change the version number of the Java Client as well. To see what version ml-gradle is using, run "./gradlew buildEnvironment" to see a list of the Gradle plugins you're using and their dependencies.

Configuring SSL

The Java Client SecurityContext javadocs show the methods for configuring an SSL context and hostname verifier on the SecurityContext. You can call these methods to configure an SSL connection to your REST server:

def context = new com.marklogic.client.DatabaseClientFactory.DigestAuthContext(username, password)
// This is one possible implementation of an SSLContext
context.withSSLContext(
    com.marklogic.client.ext.modulesloader.ssl.SimpleX509TrustManager.newSSLContext(),
    new com.marklogic.client.ext.modulesloader.ssl.SimpleX509TrustManager()
)
context.withSSLHostnameVerifier(com.marklogic.client.DatabaseClientFactory.SSLHostnameVerifier.ANY)

To reference the classes in that chunk of code, you'll need to adjust your buildscript as shown below:

buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath "com.marklogic:marklogic-client-api:6.1.0"
        classpath "com.marklogic:ml-javaclient-util:4.5.0"
    }
}

Testing against a DHS instance

If you are having problems loading modules into a DHS Data Hub instance, try the following task, which uses basic authentication and a simple approach for SSL. Be sure to change the host/username/password and ensure that the connectionType value is correct based on whether your host is a load balancer (gateway) or an actual ML host (direct).

Also, you likely do not need to modify the buildscript in your build.gradle file if you're already importing the Data Hub Gradle plugin, as that includes the dependencies needed for the below script.

task testLoadModule {
    doLast {
        def host = "changeme"
        def port = 8011
        def database = "Documents"
        def username = "changeme"
        def password = "changeme"

        // If talking to a load balancer, use GATEWAY instead of DIRECT
        def connectionType = com.marklogic.client.DatabaseClient.ConnectionType.DIRECT
        //def connectionType = com.marklogic.client.DatabaseClient.ConnectionType.GATEWAY

        // See https://docs.marklogic.com/javadoc/client/com/marklogic/client/DatabaseClientFactory.SecurityContext.html
        def context = new com.marklogic.client.DatabaseClientFactory.BasicAuthContext(username, password)

        // Use a very simple accept-everything approach for SSL; only suitable for test purposes
        context.withSSLContext(
            com.marklogic.client.ext.modulesloader.ssl.SimpleX509TrustManager.newSSLContext(),
            new com.marklogic.client.ext.modulesloader.ssl.SimpleX509TrustManager()
        )
        context.withSSLHostnameVerifier(com.marklogic.client.DatabaseClientFactory.SSLHostnameVerifier.ANY)

        // See https://docs.marklogic.com/javadoc/client/com/marklogic/client/DatabaseClientFactory.html
        def client = com.marklogic.client.DatabaseClientFactory.newClient(host, port, database, context, connectionType)
        try {
            def docManager = client.newDocumentManager()
            def writeSet = docManager.newWriteSet()
            writeSet.add("/test/module1.xqy", new com.marklogic.client.io.StringHandle("<hello>world</hello>"))
            writeSet.add("/test/module2.xqy", new com.marklogic.client.io.StringHandle("<another>test</another>"))
            // Reuse the document manager that created the write set
            docManager.write(writeSet)
        } finally {
            client.release()
        }
    }
}

Authenticating with a certificate

If port 8000 and/or your REST API server (used for loading REST extensions like services, transforms, and options) requires certificate authentication, you can use the task below as a starting point for debugging authentication against that app server. It doesn't load a module; it just evaluates a simple XQuery expression, which should suffice for verifying that you can authenticate correctly.

task testAuthenticateWithCertificate {
  doLast {
    def host = "localhost"
    def port = 8123 // the port of your REST API server
    def certFile = "path/to/cert.p12"
    def certPassword = "not-required"

    // See https://docs.marklogic.com/javadoc/client/com/marklogic/client/DatabaseClientFactory.CertificateAuthContext.html
    def context = new com.marklogic.client.DatabaseClientFactory.CertificateAuthContext(certFile, certPassword)
    def client = com.marklogic.client.DatabaseClientFactory.newClient(host, port, context)
    try {
      println client.newServerEval().xquery("fn:current-dateTime()").evalAs(String.class)
    } finally {
      client.release()
    }
  }
}

Verifying that modules are in the right directories

See How modules are loaded to ensure that you have your modules in the directories that ml-gradle expects.

Modules getting corrupted?

If binary files loaded into your modules database end up corrupted, odds are it's because ml-gradle doesn't realize they're binary files and is trying to replace string tokens in them with Gradle property values. ml-gradle is aware of a few dozen common binary extensions. To add your own (such as xlsm and xlsx for Excel files), set this property (see the Property Reference for more information):

mlAdditionalBinaryExtensions=xlsm,xlsx

If you have a text file that is being corrupted, see Encoding issues, particularly if your file contains non-ASCII characters.

A module is bad, but not sure which one?

ml-gradle defaults to loading modules in a batch call to ML, which is much more efficient than loading them one at a time. But if a module is invalid (e.g. a JSON file containing malformed JSON), the error from ML will not state which module/document was bad. In such a scenario, it's useful to load each module in a separate call to ML. That way, when running with Gradle info-level logging (-i), you'll see which file was being processed right before the error message.

To control the number of modules sent in a call to ML, set the following property:

mlModulesLoaderBatchSize=1

Module loading taking a long time?

Some projects have reported that when there are hundreds or more non-REST modules to load (i.e. those under src/main/ml-modules/root or src/main/ml-modules/ext), the single call that loads them (which defaults to port 8000) can take very different amounts of time on different machines, e.g. a few seconds on one machine and a few minutes on another.

As of ml-gradle 3.4.0, you can set the following property (e.g. in gradle.properties) to control how many modules are loaded in a single call to MarkLogic:

mlModulesLoaderBatchSize=100

This property has no value by default, which means all non-REST modules are loaded in a single call.

Prior to ml-gradle 3.4.0 (but not before ml-gradle 3.0.0), you can do some build.gradle surgery instead. Put the following into build.gradle to set the batch size that ml-gradle uses when loading non-REST modules:

ext {
  def loadModulesCommand = mlAppDeployer.getCommand("LoadModulesCommand")
  loadModulesCommand.initializeDefaultModulesLoader(mlCommandContext)
  loadModulesCommand.modulesLoader.assetFileLoader.batchSize = 10
}

Understanding errors from the Java Client

The Java Client can throw errors under a variety of conditions. Below are some common error messages and what might be causing them.

Invalid username/password - if the Java Client fails to authenticate with MarkLogic, you'll see an error like this:

Error occurred while loading REST modules: Local message: /config/query write failed: Unauthorized. Server Message: Unauthorized

To debug this, see Configuring security to understand what username/password properties are used for loading modules.

Resource not found - sometimes the Java Client connects to a MarkLogic app server that isn't a valid REST app server (i.e. it's not using the REST API rewriter). In that case, you'll get an error like this:

Caused by: com.marklogic.client.ResourceNotFoundException: Local message: /config/query not found for write. Server Message: Request failed. Error body not received from server

To debug this, check the value of the "mlRestPort" property that's logged when using Gradle info-level logging (-i or --info on the command line). This tells you what application-specific REST server ml-gradle is trying to connect to. You can try using the script near the top of this page to debug the connection and loading of the REST module as well.
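
The error above comes from a write to /config/query, which is the REST API endpoint for query options. A variation of the script near the top of this page could therefore try to write a trivial set of query options to the same port. The following is a minimal sketch, assuming the buildscript block from the top of this page is present. The port 8010, the task name testLoadOptions, and the options name test-options are hypothetical placeholders.

```groovy
// Sketch only: relies on the buildscript block shown near the top of this page.
task testLoadOptions {
    doLast {
        def host = "localhost"
        def port = 8010 // assumption: use the mlRestPort value logged by "-i"
        def username = "admin"
        def password = "admin"

        // A deliberately trivial set of search options, used only to exercise the endpoint
        def options = '<options xmlns="http://marklogic.com/appservices/search">' +
            '<return-results>true</return-results>' +
            '</options>'

        def context = new com.marklogic.client.DatabaseClientFactory.DigestAuthContext(username, password)
        def client = com.marklogic.client.DatabaseClientFactory.newClient(host, port, context)
        try {
            def optionsManager = client.newServerConfigManager().newQueryOptionsManager()
            // "test-options" is a hypothetical options name
            optionsManager.writeOptions(
                "test-options",
                new com.marklogic.client.io.StringHandle(options).withFormat(com.marklogic.client.io.Format.XML)
            )
            println "Wrote query options to port ${port}"
        } finally {
            client.release()
        }
    }
}
```

If this task fails with the same ResourceNotFoundException, the port most likely does not point at an app server that uses the REST API rewriter.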

Broken pipe - if a java.net.SocketException is thrown with a message of "Broken pipe (Write failed)", then the JVM (Java Virtual Machine) is reporting that the server closed the connection before a module could be written to MarkLogic. Neither the JVM nor ml-gradle has any idea what happened, other than the server closed the connection. Check your MarkLogic server logs to look for signs of a connection failing. In some environments, such as when running MarkLogic locally via Docker, this error can occur when Docker has not provisioned sufficient memory to MarkLogic, causing MarkLogic to restart. In that scenario, the connection is closed, resulting in a "Broken pipe" error. The MarkLogic server logs will show that a restart occurred, thus identifying the problem.
