# Scala: JSON with json4s

### Defining JSON

* Spark includes the json4s library
    * but you also need json4s-jackson, which provides the `parse` and `write` functions imported below

    import $ivy.`org.json4s::json4s-jackson:3.6.7`

    import org.json4s._
    import org.json4s.JsonDSL._
    import org.json4s.jackson.JsonMethods.parse
    import org.json4s.jackson.Serialization.write

To represent JSON in Scala, json4s provides `JObject()` and related constructors, all of which create `JValue`s.

A `JValue` is a Scala representation of a JSON document. You can create documents manually using these constructors.


In [3]:
import org.json4s._
import org.json4s.JsonDSL._


val jsonResult: JValue = JObject(
    "name" -> JString("Michael"),
    "isAdult" -> JBool(true),
    "tags" -> JArray(List(JString("UK"), JString("Scala")))
)


import org.json4s._
import org.json4s.JsonDSL._
jsonResult: JValue = JObject(
  List(
    ("name", JString("Michael")),
    ("isAdult", JBool(true)),
    ("tags", JArray(List(JString("UK"), JString("Scala"))))
  )
)

When you have a `JValue` the selection operator `\` extracts a value from the document, given a key.

In [4]:
jsonResult \ "name"

res3: JValue = JString("Michael")

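As a small sketch of how `\` behaves beyond a single top-level key: selections can be chained to reach nested values, and selecting a key that is absent yields `JNothing` rather than throwing. The `doc` value and its `address` field are hypothetical and only serve as illustration.

```scala
import org.json4s._

// Hypothetical document with one nested object
val doc: JValue = JObject(
  "name" -> JString("Michael"),
  "address" -> JObject("city" -> JString("London"))
)

// Chained selection walks into the nested object
val city: JValue = doc \ "address" \ "city"

// Selecting a key that does not exist gives JNothing
val missing: JValue = doc \ "nickname"
```
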
Since `JValue`s are awkward to work with directly, the `.extract` method converts them into ordinary Scala types. To do this it needs a `Formats` instance, which describes how to perform the conversion.

The formats are passed to `.extract` as an implicit argument, so an implicit value must be in scope.

json4s provides `DefaultFormats`, which covers many basic Scala types.

In [5]:
implicit val formats = DefaultFormats

(jsonResult \ "name").extract[String]

formats: DefaultFormats.type = org.json4s.DefaultFormats$@1646cd50
res4_1: String = "Michael"

In [6]:
jsonResult \ "tags"

res5: JValue = JArray(List(JString("UK"), JString("Scala")))

In [7]:
(jsonResult \ "tags").extract[List[String]]

res6: List[String] = List("UK", "Scala")

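`.extract` throws when the value cannot be converted, for instance when the key is missing. A minimal sketch of the safer variants `extractOpt` and `extractOrElse` follows; the `doc` value and the `nickname` key are hypothetical.

```scala
import org.json4s._

implicit val formats: Formats = DefaultFormats

// Hypothetical document; "nickname" is deliberately absent
val doc: JValue = JObject(
  "name" -> JString("Michael"),
  "age"  -> JInt(30)
)

// extractOpt wraps the result in an Option instead of throwing
val name: Option[String] = (doc \ "name").extractOpt[String]
val nick: Option[String] = (doc \ "nickname").extractOpt[String]

// extractOrElse falls back to a default value
val nickOrDefault: String = (doc \ "nickname").extractOrElse[String]("n/a")
```
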
### Parsing JSON

To parse a string into a `JValue`, use `parse`.

In [8]:
import org.json4s.jackson.JsonMethods.parse

val jsonText = """
{
   "tags": ["UK","Scala"],
   "name": "Michael",
   "isAdult": true
}
"""

parse(jsonText)

import org.json4s.jackson.JsonMethods.parse
jsonText: String = """
{
   "tags": ["UK","Scala"],
   "name": "Michael",
   "isAdult": true
}
"""
res7_2: JValue = JObject(
  List(
    ("tags", JArray(List(JString("UK"), JString("Scala")))),
    ("name", JString("Michael")),
    ("isAdult", JBool(true))
  )
)

In [9]:
parse(jsonText) == jsonResult

res8: Boolean = true

The general approach is then: parse *into* a JValue then extract the relevant piece as a usable type. In one line:

In [10]:
(parse(jsonText) \ "tags").extract[List[String]].last

res9: String = "Scala"

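The one-liner above assumes the input is valid JSON and that the key exists. A hedged sketch of the same parse-then-extract approach with both failure cases handled is shown here; `nameOf` is a hypothetical helper, and wrapping `parse` in `scala.util.Try` is just one way to catch malformed input.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.util.Try

implicit val formats: Formats = DefaultFormats

// Hypothetical helper: returns None for malformed JSON or a missing "name" key
def nameOf(text: String): Option[String] =
  Try(parse(text)).toOption                                // parse failures become None
    .flatMap(json => (json \ "name").extractOpt[String])   // a missing key becomes None
```

For example, `nameOf("""{"name":"Michael"}""")` gives `Some("Michael")`, while `nameOf("{not json")` gives `None`.
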
### Writing JSON with the DSL

json4s provides the `~` operator, which joins key-value pairs into a `JObject`.

This allows you to describe a JSON document using basic Scala types, which are then converted to json4s's `JValue` representations.

In [11]:
val jsonDsl: JValue =
    ("key1" -> "val1") ~
    ("key2" -> true)   ~
    ("key3" -> List(1, 2, 3))

jsonDsl: JValue = JObject(
  List(
    ("key1", JString("val1")),
    ("key2", JBool(true)),
    ("key3", JArray(List(JInt(1), JInt(2), JInt(3))))
  )
)
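
The DSL also handles nested objects and optional values. The sketch below uses hypothetical `nickname` and `address` fields. Because `~` binds more tightly than `->`, a nested object needs its own parentheses, and a field whose value is `None` is left out of the rendered text.

```scala
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods.{compact, render}

// Hypothetical optional value; None means the field is omitted on output
val nickname: Option[String] = None

val doc: JValue =
  ("name" -> "Michael") ~
  ("nickname" -> nickname) ~
  ("address" -> (("city" -> "London") ~ ("country" -> "UK")))

// compact(render(...)) turns the JValue into a single-line string
val text: String = compact(render(doc))
```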

In [12]:
import org.json4s.jackson.Serialization.{write, writePretty}

val jsonDslText: String = write(jsonDsl)

println(jsonDslText)

{"key1":"val1","key2":true,"key3":[1,2,3]}


import org.json4s.jackson.Serialization.{write, writePretty}
jsonDslText: String = "{\"key1\":\"val1\",\"key2\":true,\"key3\":[1,2,3]}"

In [13]:
writePretty(jsonDsl)

res12: String = """{
  "key1" : "val1",
  "key2" : true,
  "key3" : [ 1, 2, 3 ]
}"""

The approach: define the JSON via the DSL, then use `write()` to convert it to a string and output it, e.g. using `println()`. In one line:

In [14]:
println(write(
    ("name" -> "Michael") ~
    ("isAdult" -> true)   ~
    ("tags" -> List("UK", "Scala"))
))

{"name":"Michael","isAdult":true,"tags":["UK","Scala"]}


### Serializing Case Classes

It's often useful to deserialize JSON directly into a case class and, conversely, to produce JSON directly from a case class, i.e. to `extract` a `Person` and `write` a `Person`.

One way to control this conversion is to define a subclass of `CustomSerializer`, which provides:

    def deserialize(implicit format: Formats)
    def serialize(implicit format: Formats)

These are relatively complex methods to override, so the typical way of defining them is to supply two partial functions as a parameter to the parent constructor.

The first partial function is the extractor: it defines how a `JValue` becomes a `Person`. The second defines how a `Person` is converted to a `JValue`.

(The syntax needs to follow this pattern closely.)

In [15]:
case class Person(name: String, isAdult: Boolean, tags: List[String])

class PersonSerializer extends CustomSerializer[Person](implicit formats => (
   // Deserializer: build a Person from the fields of a JValue
   {
       case j: JValue => Person(
           (j \ "name").extract[String],
           (j \ "isAdult").extract[Boolean],
           (j \ "tags").extract[List[String]]
       )
   },
   // Serializer: describe a Person as a JObject using the DSL
   {
       case j: Person =>
           ("name" -> j.name)       ~
           ("isAdult" -> j.isAdult) ~
           ("tags" -> j.tags)
   }
))

defined class Person
defined class PersonSerializer
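
A custom serializer only takes effect once it has been added to the implicit formats. The sketch below registers one with `DefaultFormats + ...`. To make its effect visible it uses a hypothetical variant, `SnakeCasePersonSerializer`, which renames `isAdult` to `is_adult`; that renaming is an assumption for illustration and not part of the serializer above.

```scala
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods.parse
import org.json4s.jackson.Serialization.write

case class Person(name: String, isAdult: Boolean, tags: List[String])

// Hypothetical variant: stores isAdult under the key "is_adult"
class SnakeCasePersonSerializer extends CustomSerializer[Person](implicit formats => (
  {
    case j: JObject => Person(
      (j \ "name").extract[String],
      (j \ "is_adult").extract[Boolean],
      (j \ "tags").extract[List[String]]
    )
  },
  {
    case p: Person =>
      ("name" -> p.name) ~ ("is_adult" -> p.isAdult) ~ ("tags" -> p.tags)
  }
))

// Registration: without this line the serializer is never consulted
implicit val formats: Formats = DefaultFormats + new SnakeCasePersonSerializer

val text: String = write(Person("Michael", true, List("UK", "Scala")))
val back: Person = parse(text).extract[Person]
```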

### Parse and Write with custom classes

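With these defined, we can use `parse`, `extract` and `write` as above. Note that a custom serializer is only consulted once it has been added to the implicit formats (e.g. `DefaultFormats + new PersonSerializer`); the cells below use the `formats` value defined earlier, which is plain `DefaultFormats`.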

In [16]:
jsonText

res15: String = """
{
   "tags": ["UK","Scala"],
   "name": "Michael",
   "isAdult": true
}
"""

In [17]:
parse(jsonText).extract[Person]

res16: Person = Person("Michael", true, List("UK", "Scala"))

In [18]:
println(write(Person("Michael", true, List("UK", "Scala"))))

{"name":"Michael","isAdult":true,"tags":["UK","Scala"]}


### Decomposing

* For simple cases, json4s can automatically serialize a case class
* The `Extraction.decompose` method accepts any object and attempts to produce a `JValue`
    - no custom serializer required!
* `.extract` goes the other way and uses the implicit `Formats` in scope; a custom serializer is only needed when the JSON layout differs from the fields of the case class

In [19]:
import org.json4s.Extraction.decompose

import org.json4s.Extraction.decompose

In [20]:
val auto = decompose(Person("Michael", true, List("UK", "Scala")))

auto: JValue = JObject(
  List(
    ("name", JString("Michael")),
    ("isAdult", JBool(true)),
    ("tags", JArray(List(JString("UK"), JString("Scala"))))
  )
)
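
`decompose` is not limited to case classes. As a minimal sketch, assuming only `DefaultFormats`, it also turns a plain `Map` with string keys into a `JObject`; the `settings` map is hypothetical.

```scala
import org.json4s._
import org.json4s.Extraction.decompose

implicit val formats: Formats = DefaultFormats

// Hypothetical settings map; string keys become JSON field names
val settings: JValue = decompose(Map("retries" -> 3, "verbose" -> true))

val retries: JValue = settings \ "retries"
val verbose: JValue = settings \ "verbose"
```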

In [21]:
println(write(auto))

{"name":"Michael","isAdult":true,"tags":["UK","Scala"]}


### Writing to File

In [22]:
import java.nio.file.{Paths, Files}
import java.nio.charset.StandardCharsets

def writeFile(filename: String, contents: String) = Files.write(
    Paths.get(filename), contents.getBytes(StandardCharsets.UTF_8)
)

import java.nio.file.{Paths, Files}
import java.nio.charset.StandardCharsets
defined function writeFile

In [23]:
writeFile("sample.json", write(auto))

res22: java.nio.file.Path = sample.json

In [30]:
import scala.io.Source

val json = Source.fromFile("sample.json").mkString

println(json)

{"name":"Michael","isAdult":true,"tags":["UK","Scala"]}


import scala.io.Source
json: String = "{\"name\":\"Michael\",\"isAdult\":true,\"tags\":[\"UK\",\"Scala\"]}"

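`Source.fromFile` returns a handle that the cell above never closes. As an alternative sketch, the file can be read back with `java.nio.file.Files.readAllBytes`, which opens and closes the file itself. The temporary file is an assumption made so that the example does not overwrite `sample.json`.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

implicit val formats: Formats = DefaultFormats

// Hypothetical temporary file standing in for sample.json
val path: Path = Files.createTempFile("sample", ".json")
val original = """{"name":"Michael","isAdult":true,"tags":["UK","Scala"]}"""
Files.write(path, original.getBytes(StandardCharsets.UTF_8))

// readAllBytes leaves no open handle behind
val text: String = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
val name: String = (parse(text) \ "name").extract[String]

Files.delete(path) // clean up the temporary file
```
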
In [32]:
parse(Source.fromFile("sample.json").mkString).extract[Person]

res31: Person = Person("Michael", true, List("UK", "Scala"))