# Java UDFs from Snowpark

Snowflake can run your Java/Scala code inside its own data processing engine, as part of your overall data processing.

![](../assets/java_udf_overview.gif)

<div class="alert alert-block alert-info">
<b>Language Support:</b> Snowflake's initial support for extensible user-defined functions starts with Java/Scala, but in the future it will extend beyond JVM languages (to Python, for example).</div>

This extensibility feature lets you put your own internal programming expertise to work, along with the extensive set of Java libraries that already exist, to handle a wide range of data processing tasks.

In this section, we'll leverage *Snowpark* to collect and deploy our Scala code for building custom Java code that will run inside Snowflake.

In this case, we'll keep things as simple as possible and just prepend a greeting to a `String`, so we can focus on the mechanics of where and how this code is deployed inside Snowflake. We will also run code locally to complement the code that runs on the server. In short, with *Snowpark* you can choose where your code runs based on your specific requirements.

At the end of this lab, you will have run similar code in FOUR different ways:

- [ ] Temporarily for a query, unnamed 
- [ ] Temporarily for a session, named
- [ ] Permanently for Snowpark and SQL, named
- [ ] Locally, leveraging Snowflake DataFrame `collections`


## Connect to Snowflake

In [None]:
// Import Snowpark into our Scala notebook
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._
import com.snowflake.snowpark.types._

In [None]:
// Set connection properties built in de_snowpark/A-Dataframes/01-Sessions.ipynb
val pwd = sys.env.getOrElse("PWD", "")
val filename = s"$pwd/de_snowpark/connect.properties"

val session = Session.builder.configFile(filename).create

In [None]:
// Import existing implicit objects (such as existing UDFs) for use by Snowpark
import session.implicits._

In [None]:
// Create a Snowflake internal stage that will be used by our Java UDFs
session.sql("create stage if not exists raw.JAVA_UDF_STAGE").collect

## Create our UDF class in Scala

Rather than reinventing the wheel, we will create a single wheel (a `class`) and use it four different ways below: each of its methods backs a different kind of UDF or local call. Together, these methods give a sense of how Snowpark can run our Java/Scala code as part of an entire pipeline using [DataFrames](https://docs.snowflake.com/en/developer-guide/snowpark/working-with-dataframes.html).

In [None]:
class parrotClass() extends Serializable {
    // Temporary (unnamed) says Hello
    def sayHello = (s: String) => {
        s"Hello there $s"
    }
    // Temporary (named) says Howdy
    def sayHowdy = (s: String) => {
        s"Howdy $s"
    }
    // Permanent (named) says Goodbye
    def sayGoodbye = (s: String) => {
        s"Goodbye $s"
    }
    // Local says Welcome to my town
    def localSays = (s: String) => {
        s"Welcome to my town $s"
    }
}

In [None]:
// Test locally... just say Hello here in Scala to test our class

// (nothing to do with Snowpark/Snowflake)
new parrotClass().sayHello("Joe")
new parrotClass().sayHello("Cynthia")
new parrotClass().sayHowdy("Bobby")
new parrotClass().sayHowdy("Barbie")
new parrotClass().sayGoodbye("Xer")
new parrotClass().sayGoodbye("Ravi")
new parrotClass().localSays("David")
new parrotClass().localSays("Uday")
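
Each method on the class returns a function value (`String => String`) rather than a `String`, which is what allows it to be handed to `udf` or a `register...` call later on. A minimal sketch of that idea, using a hypothetical stand-in class `ParrotSketch` (not part of the lab) so that it runs without Snowflake:

```scala
// Hypothetical stand-in for the notebook's class, repeated so this cell runs on its own.
class ParrotSketch extends Serializable {
  def sayHello = (s: String) => s"Hello there $s"
}

// The method hands back a function value, not a String.
val hello: String => String = new ParrotSketch().sayHello

// A function value can be passed around like any other value, e.g. to map.
val helloGreetings: Seq[String] = Seq("Joe", "Cynthia").map(hello)
helloGreetings.foreach(println)
```

Nothing here touches Snowpark; it only shows the shape of the value that the later cells pass along.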

### sayHello: Create a Temporary UDF (unnamed)

Let's start with a UDF that Snowpark will deploy as `temporary` and `unnamed`. Temporary, unnamed UDFs are typically used for a single query. They can wrap a small piece of Scala code that performs more complex logic. You cannot reference this UDF in subsequent sessions without re-uploading the code; it's a one-time-use UDF.

![](../assets/java_udf_temp_unnamed.png)

In [None]:
// Create a temporary unnamed UDF in Snowflake
val sayHelloUDF = udf((new parrotClass()).sayHello)

<div class="alert alert-block alert-warning">
<i class="fas fa-question-circle fa-2x"></i>
<b>Queries</b>: What queries were executed on your behalf by Snowpark when you created a new UDF using the <mark>sayHello</mark> method?   Did you notice a .jar was uploaded on your behalf?
</div>


Now we have a temporary function with a long machine-generated name (such as `tempUDF_939266449(arg1 STRING)`) that can be used in a Snowpark DataFrame. Don't worry, we don't need to know the temporary UDF's name. Once it is registered in Snowflake, we can just use the UDF object (`sayHelloUDF`) in our DataFrame.

In [None]:
session
    .sql("select 'Mohit' as name")         // String Literal DataFrame
    .withColumn(
      "fromjava", sayHelloUDF(col("NAME")) // Run through UDF in Snowflake!
    ) 
.show

If your UDF code was successfully uploaded and registered as a temporary, unnamed Snowpark function, you will see the output of the Java code, run on Snowflake, as follows:

```
------------------------------
|"NAME"  |"FROMJAVA"         |
------------------------------
|Mohit   |Hello there Mohit  | <--- `Hello there Mohit` was the output of our Java code run on Snowflake!
------------------------------
```

### Progress: Check

- [X] Temporarily for a query, unnamed 
- [ ] Temporarily for a session, named
- [ ] Permanently for Snowpark and SQL, named
- [ ] Locally, leveraging Snowflake DataFrame `collections`

### sayHowdy: Create a Temporary UDF (named)

More typically, Snowpark collects and deploys a temporary, *named* function, which is available for use in subsequent SQL. This function is still temporary. It can be called from SQL or, instead of through the created UDF object, through the `callUDF` column expression function.

![](../assets/java_udf_temp_named.png)

In [None]:
// Register temporary function RAW.SAY_HOWDY()
session.udf.registerTemporary("raw.SAY_HOWDY", new parrotClass().sayHowdy)

<div class="alert alert-block alert-warning">
<i class="fas fa-search fa-2x"></i>
<b>CREATE TEMPORARY</b>: The output of this looks similar to before... What was the function name that was created?  Was it machine-generated output? Was it what you expected to see?
</div>

In [None]:
session
    .sql("select 'Alfred' as name")          // Holy smokes, batman!
    .withColumn("HOWDYNAME"
        , callUDF("raw.SAY_HOWDY", col("name")) 
    )
.show
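
Because `raw.SAY_HOWDY` has a name, the session that registered it can also call it from plain SQL text. A minimal sketch, assuming the `session` created earlier in this notebook and that the registration cell above succeeded; the helper names `howdySql` and `sayHowdyViaSql` are hypothetical and not part of Snowpark:

```scala
import com.snowflake.snowpark.{DataFrame, Session}

// Hypothetical helper: builds the SQL text that calls the named temporary UDF.
def howdySql(name: String): String = {
  val escaped = name.replace("'", "''") // double single quotes for a SQL string literal
  s"select raw.SAY_HOWDY('$escaped') as howdyname"
}

// Hypothetical helper: runs that SQL through an existing Snowpark session.
def sayHowdyViaSql(session: Session, name: String): DataFrame =
  session.sql(howdySql(name))
```

In the notebook, `sayHowdyViaSql(session, "Alfred").show` should display the greeting produced by the UDF.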

<div class="alert alert-block alert-warning">
<i class="fas fa-surprise fa-2x fa-border"></i>
    <b>Pop Quiz:</b> Will <mark>SAY_HOWDY()</mark> be available in SQL if you log in to Snowsight and run a query?  Why or why not? Let's try it below.
</div>

Head over to [https://app.snowflake.com/](https://app.snowflake.com/) and log in to the class Snowflake account using your animal name and password. Create a worksheet, and run the equivalent SQL:

```sql
// use [login]_db; -- use your default DB
use schema raw;

SELECT  *  FROM ( 
  SELECT 
    "NAME"
    , raw.SAY_HOWDY("NAME") AS "HOWDYNAME" 
  FROM 
    (select 'Alfred' as name)
  ) 
LIMIT 10;
```

### Progress: Check

- [X] Temporarily for a query, unnamed 
- [X] Temporarily for a session, named
- [ ] Permanently for Snowpark and SQL, named
- [ ] Locally, leveraging Snowflake DataFrame `collections`

### sayGoodbye: Create a Permanent UDF (named)

Snowpark can also collect and push a Scala UDF and register it so that it's available in this Snowpark session, subsequent Snowpark sessions, and SQL. The `registerPermanent` method creates a Java UDF that is available in the same way as any other scalar UDF in Snowflake (be it Java, JavaScript, or SQL).

![](../assets/java_udf_perm.png)

In [None]:
// Create a Permanent UDF in Snowflake
session.udf.registerPermanent("raw.SAY_GOODBYE", new parrotClass().sayGoodbye, "raw.JAVA_UDF_STAGE")
// Since this UDF will stick around for a while, we need a permanent place to
// hold the .jars and code for it.  JAVA_UDF_STAGE is a named stage that 
// we will use to hold this jar...

In [None]:
session
    .sql("select 'Henrietta' as name")
    .withColumn("GOODBYE_PERM", callUDF("raw.SAY_GOODBYE", col("name"))) 
.show
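
To see what `registerPermanent` left behind, one option is to list the function and the contents of the stage. This is a sketch under the assumption that the `session` from earlier is available and the UDF registered successfully; `inspectionQueries` and `inspectPermanentUdf` are hypothetical names, and the exact files listed on the stage will depend on your environment:

```scala
import com.snowflake.snowpark.Session

// Hypothetical helper data: statements that inspect the permanent UDF and its stage.
val inspectionQueries: Seq[String] = Seq(
  "show user functions like 'SAY_GOODBYE' in schema raw", // the registered function
  "list @raw.JAVA_UDF_STAGE"                              // files uploaded to the stage
)

// Hypothetical helper: runs each statement and prints the returned rows locally.
def inspectPermanentUdf(session: Session): Unit =
  inspectionQueries.foreach { q =>
    println(s"-- $q")
    session.sql(q).collect().foreach(println)
  }
```

Calling `inspectPermanentUdf(session)` in the notebook prints whatever rows Snowflake returns for each statement.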

Head over to [https://app.snowflake.com/](https://app.snowflake.com/) and log in to the class Snowflake account using your animal name and password. Create a worksheet, and run the equivalent SQL:

```sql
// use [login]_db; -- use your default DB
use schema raw;

SELECT  *  FROM ( 
  SELECT 
    "NAME"
    , raw.SAY_GOODBYE("NAME") AS "GOODBYE_PERM"
  FROM (select 'Henrietta' as name)
  )
LIMIT 10;
```

It runs, right?  That's because `SAY_GOODBYE` has been created as a regular UDF and is available.

### Progress: Check

- [X] Temporarily for a query, unnamed 
- [X] Temporarily for a session, named
- [X] Permanently for Snowpark and SQL, named
- [ ] Locally, leveraging Snowflake DataFrame `collections`

### localSays: Run local (here in notebook)

With Snowpark, you can choose where your Java/Scala code executes. If, for instance, you want to write your own programs that reach remote network locations (to send email or SMS, or to make HTTPS calls), you can pair Snowflake DataFrames with your own local code. The following isn't a UDF at all; it shows that you can use DataFrames, interact with your own classes, **and** provide your own control programs.

![](../assets/java_udf_local.png)

The [Scala world is your oyster](https://docs.scala-lang.org/overviews/collections/trait-traversable.html); you can `.fold`, `.map`, `.foreach` to your heart's content and pair that with your existing code!  
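
As a rough illustration of those collection operations, here is a sketch that uses a plain `Seq` as a stand-in for the rows a collected DataFrame would hand back. The names `collectedNames`, `welcome`, `localGreetings`, and `totalChars` are made up for this example, and no Snowflake connection is involved:

```scala
// Stand-in for the names a collected DataFrame would return to the notebook.
val collectedNames: Seq[String] = Seq("Rick", "Farnaz")

// Same greeting logic as localSays, written as a plain function value.
val welcome: String => String = s => s"Welcome to my town $s"

// map: transform every element locally.
val localGreetings: Seq[String] = collectedNames.map(welcome)

// foldLeft: reduce the results to a single value (total characters here).
val totalChars: Int = localGreetings.foldLeft(0)((sum, g) => sum + g.length)

// foreach: run a side effect per element (printing, in this sketch).
localGreetings.foreach(println)
```

Once the data has been collected, everything is ordinary Scala, so any local library or control flow can take over from there.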

In [None]:
session
    .sql("select 'Rick' name union select 'Farnaz'")
    .collect()
    // Above this line is running in Snowflake in the cloud
    // Below this line is running locally here in the notebook
    .foreach(row => {
        println(new parrotClass().localSays(row.getString(0)))
    })

### Progress: Check

- [X] Temporarily for a query, unnamed 
- [X] Temporarily for a session, named
- [X] Permanently for Snowpark and SQL, named
- [X] Locally, leveraging Snowflake DataFrame `collections`

## Cleanup

In [None]:
// Optionally clean up your objects
session.sql("drop function if exists raw.say_howdy(varchar)").collect
session.sql("drop function if exists raw.say_goodbye(varchar)").collect