URLDecoder: Illegal hex characters in escape (%) pattern - For input string: " S" #747

Closed
anand-singh opened this issue Apr 15, 2016 · 3 comments



anand-singh commented Apr 15, 2016

"org.elasticsearch" % "elasticsearch-spark_2.11" % "2.3.0"

[error] (run-main-0) java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: " S"
java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: " S"
    at java.net.URLDecoder.decode(URLDecoder.java:194)
    at org.elasticsearch.hadoop.util.StringUtils.decodeQuery(StringUtils.java:404)
    at org.elasticsearch.hadoop.util.StringUtils.tokenizeAndUriDecode(StringUtils.java:135)
    at org.elasticsearch.hadoop.serialization.dto.mapping.MappingUtils.validateMapping(MappingUtils.java:45)
    at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:271)
    at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute(AbstractEsRDD.scala:61)
    at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions(AbstractEsRDD.scala:60)
    at org.elasticsearch.spark.rdd.AbstractEsRDD.getPartitions(AbstractEsRDD.scala:27)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:190)
    at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505)
    at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1375)
    at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1374)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
    at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1374)
    at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1456)
    at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:170)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:350)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:311)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:319)
    at com.rklick.engine.example.ESDataTest$.main(ESDataTest.scala:44)
    at com.rklick.engine.example.ESDataTest.main(ESDataTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
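
For what it's worth, the failure reproduces outside Spark with a bare call to java.net.URLDecoder, the same call StringUtils.decodeQuery makes. A minimal sketch ("% Strength" is a hypothetical field name; any '%' not followed by two hex digits triggers it):

import java.net.URLDecoder

object DecodeRepro extends App {
  // '%' followed by " S" (space plus 'S') is not a valid %XX hex escape,
  // so this throws:
  // java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters
  // in escape (%) pattern - For input string: " S"
  URLDecoder.decode("% Strength", "UTF-8")
}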


costin commented Apr 15, 2016

Any code sample to reproduce this?

anand-singh (Author) commented:

Hi @costin,

Thanks for the quick response.

Code Sample:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark.sql._ // adds esDF to SQLContext

object ESDataTest {

  // Placeholder connection settings; substitute your cluster's values.
  val ELASTIC_NODES = "localhost"
  val ELASTIC_PORT = "9200"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(false)
      .setMaster("local[*]")
      .set("es.nodes", ELASTIC_NODES)
      .set("es.port", ELASTIC_PORT)
      .setAppName("Text File Test")

    val sc: SparkContext = new SparkContext(conf)
    val ssc: SQLContext = new SQLContext(sc)

    // Read the pharmadata/test index/type into a DataFrame.
    val df = ssc.esDF("pharmadata/test")
    df.printSchema()
    df.show() // triggers partition computation; the reported exception surfaces here
  }

}

Ingest the file below into ES, then try to read it using the code sample above.
PharmaData1.txt
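
If the attachment isn't handy, any document whose field name contains '%' followed by " S" should reproduce it; a hypothetical ingest sketch (field name, index, and connection settings are assumptions):

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds saveToEs to RDDs

object IngestRepro extends App {
  val conf = new SparkConf(false)
    .setMaster("local[*]")
    .set("es.nodes", "localhost") // placeholder node
    .set("es.port", "9200")       // placeholder port
    .setAppName("Ingest Repro")
  val sc = new SparkContext(conf)

  // Hypothetical document: the "% Strength" field name matches the " S"
  // in the reported exception message.
  val doc = Map("% Strength" -> "10mg", "name" -> "aspirin")
  sc.makeRDD(Seq(doc)).saveToEs("pharmadata/test")
}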

Thanks,
Anand

costin added a commit that referenced this issue May 2, 2016
relates #747

(cherry picked from commit 5e3742ae41c03786c5473c5e6d618a430e621dc8)

costin commented May 2, 2016

Fixed in master and 2.x. The problem was caused by the Spark infrastructure not URI-escaping field names, so field names containing special characters (like %something) caused the exception.
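
In other words (a sketch of the principle, not the connector's actual code path), a field name containing a bare '%' only survives the decode step if it is URI-escaped first:

import java.net.{URLDecoder, URLEncoder}

object EscapeDemo extends App {
  val rawField = "% Strength" // hypothetical field name with a bare '%'

  // Escaping first round-trips safely:
  val escaped = URLEncoder.encode(rawField, "UTF-8") // "%25+Strength"
  println(URLDecoder.decode(escaped, "UTF-8"))       // prints "% Strength"

  // Decoding the raw name throws the reported IllegalArgumentException:
  // URLDecoder: Illegal hex characters in escape (%) pattern - For input string: " S"
  URLDecoder.decode(rawField, "UTF-8")
}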

costin closed this as completed May 2, 2016
costin added commits that referenced this issue on May 2 and May 3, 2016:
relates #747

(cherry picked from commit a09ca09)