Support project.build.sourceEncoding from POM when indexing on PackageHub #221

@olafurpg

While indexing packages on PackageHub, I saw the following error in the logs:

maven:org.eclipse.xtend:org.eclipse.xtend.core:2.24.0
com.sourcegraph.packagehub.PackageActor$CommandFailed: {
  "command": [
    "/coursier",
    "launch",
    "--jvm",
    "8",
    "com.sourcegraph:lsif-java_2.13:0.5.2-5-4a5bba31-SNAPSHOT",
    "-r",
    "sonatype:snapshots",
    "--",
    "index",
    "--output",
    "/root/.cache/packagehub/maven/org.eclipse.xtend/org.eclipse.xtend.core/2.24.0/dump.lsif",
    "--build-tool",
    "lsif"
  ],
  "cwd": "/root/.cache/packagehub/maven/org.eclipse.xtend/org.eclipse.xtend.core/2.24.0",
  "exit": 1,
  "stdout": [
    "\u001b[94minfo\u001b[39m: Compiling 316 Java sources",
    "\u001b[94minfo\u001b[39m: $ /root/.cache/coursier/jvm/adopt@1.8.0-292/bin/javac @/root/.cache/packagehub/maven/org.eclipse.xtend/org.eclipse.xtend.core/2.24.0/target/javacopts.txt",
    "\u001b[94minfo\u001b[39m: /root/.cache/packagehub/maven/org.eclipse.xtend/org.eclipse.xtend.core/2.24.0/org/eclipse/xtend/core/formatting2/RichStringToLineModel.java:97: error: unmappable character for encoding utf8",
    "\u001b[94minfo\u001b[39m:       boolean _startsWith_1 = t.startsWith(\"��\");",
    "\u001b[94minfo\u001b[39m:                                             ^",
    "\u001b[94minfo\u001b[39m: /root/.cache/packagehub/maven/org.eclipse.xtend/org.eclipse.xtend.core/2.24.0/org/eclipse/xtend/core/formatting2/RichStringToLineModel.java:97: error: unmappable character for encoding utf8",
    "\u001b[94minfo\u001b[39m:       boolean _startsWith_1 = t.startsWith(\"��\");",
    "\u001b[94minfo\u001b[39m:                                              ^",

By default, PackageHub assumes that all source files are UTF-8 encoded. A brief investigation shows that some packages declare their source encoding in the pom.xml file via the project.build.sourceEncoding property. PackageHub should respect this setting. For example:

META-INF/maven/org.apache.commons/commons-lang3/pom.xml
522:    <project.build.sourceEncoding>ISO-8859-1</project.build.sourceEncoding>
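
One possible shape for the fix, as a minimal sketch rather than the actual PackageHub or lsif-java implementation: read the property from the POM and turn it into a javac `-encoding` option, falling back to UTF-8 when nothing is declared. The class and method names (`PomSourceEncoding`, `fromPom`, `javacEncodingOptions`) are hypothetical. The sketch only looks at the literal element in a single POM, so it assumes no parent-POM inheritance, no profiles, and no `${...}` property interpolation.

```java
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of PackageHub or lsif-java.
final class PomSourceEncoding {
  private PomSourceEncoding() {}

  /** Returns the text of the first <project.build.sourceEncoding> element, if any. */
  static Optional<String> fromPom(InputStream pom) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      // POMs come from untrusted packages, so refuse DOCTYPE declarations.
      factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
      Document doc = factory.newDocumentBuilder().parse(pom);
      NodeList nodes = doc.getElementsByTagName("project.build.sourceEncoding");
      if (nodes.getLength() == 0) {
        return Optional.empty();
      }
      String value = nodes.item(0).getTextContent().trim();
      return value.isEmpty() ? Optional.empty() : Optional.of(value);
    } catch (Exception e) {
      throw new IllegalStateException("could not read source encoding from POM", e);
    }
  }

  /** Builds the javac options for the given encoding, defaulting to UTF-8. */
  static List<String> javacEncodingOptions(Optional<String> encoding) {
    return Arrays.asList("-encoding", encoding.orElse("UTF-8"));
  }
}
```

With the commons-lang3 POM above, `fromPom` would yield `ISO-8859-1`, and the resulting `-encoding ISO-8859-1` pair could be added to the options passed to javac (for instance through the `javacopts.txt` argument file that appears in the log). How that file is actually generated is not shown in this issue, so that wiring is an assumption.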
