# Snowpark Loading JSON Data

In this lab you will perform the following:

- [ ] Upload a JSON file to a Snowflake internal stage
- [ ] Build a DataFrame that reads data from the JSON file
- [ ] Load the entire JSON file to a table

---

## Create a Session

Create a Snowpark Session by passing in the connection properties file created in the [first lab exercise](../A-Dataframes/01-Sessions.ipynb).

In [None]:
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._
import com.snowflake.snowpark.types._

// Build the path to the connection properties file from the working directory
val pwd = sys.env.getOrElse("PWD", "")
val filename = s"$pwd/de_snowpark/connect.properties"

// Create the session from the properties file
val session = Session.builder.configFile(filename).create
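
Before moving on, it can help to confirm that the session connected with the context you expect. The cell below is a minimal sketch, assuming the `session` value created above; the variable name `contextRows` is introduced here only for illustration.

```scala
import com.snowflake.snowpark._

// Assumes `session` was created in the cell above.
// Ask Snowflake which role, warehouse, database, and schema the session is using.
val contextRows = session.sql(
  "SELECT CURRENT_ROLE(), CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA()"
).collect()

// The query returns a single row; print it for inspection.
contextRows.foreach(println)
```

If any of the values come back as `null`, the properties file probably does not set that part of the context.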

---
## Put simple.json File

In this section you will use the Session object's `file` convenience method, which returns a FileOperation object that can `PUT` files to, and `GET` files from, Snowflake internal stages.  

Before calling `file.put()`, let's create a set of options to override the default values.  

In this case, we want the `PUT` command to upload the file without gzip-compressing it, so we set `AUTO_COMPRESS` to `FALSE`, and to replace the file if it already exists, so we set `OVERWRITE` to `TRUE`.  

Run `put` to upload the local `simple.json` file, specified by `localFileName`, to your user stage `@~`, specified by `stageLocation`, with the options in `putOptions`.

See [PUT command](https://docs.snowflake.com/en/sql-reference/sql/put.html) for the full list of options.

In [None]:
val putOptions = Map("AUTO_COMPRESS" -> "FALSE", "OVERWRITE" -> "TRUE")
val localFileName = "./simple.json"
val stageLocation = "@~"

session.file.put(localFileName, stageLocation, putOptions)
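
The `put` call returns one result per uploaded file, which is a convenient way to confirm the upload. The cell below is a sketch, assuming the `session`, `localFileName`, `stageLocation`, and `putOptions` values defined above; `putResults` is a name introduced here for illustration. It runs the upload again, which is harmless because `OVERWRITE` is set to `TRUE`.

```scala
import com.snowflake.snowpark._

// Assumes `session`, `localFileName`, `stageLocation`, and `putOptions` from the cell above.
// Capture the results of the upload instead of discarding them.
val putResults = session.file.put(localFileName, stageLocation, putOptions)

// Print the source file, the staged file, and the upload status for each result.
putResults.foreach { r =>
  println(s"${r.sourceFileName} -> ${r.targetFileName}: ${r.status}")
}

// List the staged file as a second check.
session.sql("LIST @~/simple.json").show()
```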

---
### Progress: Check

- [X] Upload the JSON file to a Snowflake internal stage
- [ ] Build a DataFrame that can read the data from the JSON file
- [ ] Load the entire JSON file to a table

---

## Build a DataFrame to Read the File Data

The Session object's `read` method returns a DataFrameReader, which loads data in any of the supported file formats, with format-specific options, from a Snowflake stage into a DataFrame.  

Call the `json` method to read the JSON file at the stage location given by `jsonFilePath`. Before that, pass the `readOptions` map to the `options` method to set the `STRIP_OUTER_ARRAY` file format option, which removes the outer array so that each element loads as its own row.


In [None]:
val jsonFilePath = "@~/simple.json"
val readOptions = Map("STRIP_OUTER_ARRAY" -> "TRUE")

val dfRawJson = session.read
                       .options(readOptions)
                       .json(jsonFilePath)


dfRawJson.show
dfRawJson.count

Examine the output of the `show` and `count` methods. We can now access and see values in the `simple.json` file from a DataFrame. 

Use the `schema` method to see the column definitions that result from calling `.json`.

In [None]:
dfRawJson.schema

<div class="alert alert-block alert-info">
<b>Tip:</b> For files in formats other than CSV, like JSON, the DataFrameReader object treats the data in the file as a single VARIANT column with the name $1.</div>
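
Because the whole document sits in the single VARIANT column `$1`, individual fields can be pulled out by key. The cell below is a sketch, assuming the `dfRawJson` DataFrame from above. The key `name` is hypothetical, since the contents of `simple.json` are not shown here; replace it with a key that exists in your file. `dfField` is likewise a name introduced for illustration.

```scala
import com.snowflake.snowpark.functions._
import com.snowflake.snowpark.types._

// Assumes `dfRawJson` from the cells above.
// "name" is a HYPOTHETICAL key; substitute a key that exists in simple.json.
val dfField = dfRawJson.select(
  col("$1")("name")      // look up the key inside the VARIANT value
    .cast(StringType)    // convert the VARIANT result to a string
    .as("NAME")          // give the new column a readable name
)

dfField.show()
```

A key that does not exist in the data yields `NULL` values rather than an error, so a column of nulls usually means the key name is wrong.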

---
### Progress: Check

- [X] Upload the JSON file to a Snowflake internal stage
- [X] Build a DataFrame that can read the data from the JSON file
- [ ] Load the entire JSON file to a table

---

## Load to a Table

Using the DataFrame above, let's load a table with the contents of the file.

Because `$1` is not a valid column name for a table, we need to use the `as` function to rename the DataFrame column. Note that the `as`, `alias`, and `name` functions are equivalent; each returns a new, renamed column.

In [None]:
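// saveAsTable returns no value, so the result is not assigned to a variable
dfRawJson.select(col("$1").as("JSON"))
         .write.saveAsTable("raw.SIMPLE_JSON")

// Count the rows in the new table to confirm the load
val tableRows = session.table("raw.SIMPLE_JSON").count()
println(s"The table raw.SIMPLE_JSON has $tableRows rows.")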
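
When no save mode is given, what happens on a rerun depends on the writer's default, and the table may end up with duplicate rows. One way to make the cell safe to rerun is to set the mode explicitly. The cell below is a sketch, assuming the `session` and `dfRawJson` values from above; `rowsAfterOverwrite` is a name introduced here for illustration.

```scala
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._

// Assumes `session` and `dfRawJson` from the cells above.
// SaveMode.Overwrite replaces the table contents on every run.
dfRawJson.select(col("$1").as("JSON"))
         .write
         .mode(SaveMode.Overwrite)
         .saveAsTable("raw.SIMPLE_JSON")

// The table should now hold exactly the rows read from the file.
val rowsAfterOverwrite = session.table("raw.SIMPLE_JSON").count()
println(s"After overwrite, raw.SIMPLE_JSON has $rowsAfterOverwrite rows.")
```

Use `SaveMode.Append` instead if the goal is to add rows to an existing table on each run.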

---
### Progress: Check

- [X] Upload the JSON file to a Snowflake internal stage
- [X] Build a DataFrame that can read the data from the JSON file
- [X] Load the entire JSON file to a table

---