# Snowpark Scala in Workspace Notebooks (Prototype)

This notebook demonstrates running **Scala** and **Snowpark Scala** within a
Snowflake Workspace Notebook using a `%%scala` cell magic powered by JPype.

**Architecture:** Python kernel → JPype (JNI) → JVM (in-process) → Scala REPL → Snowpark

---

## Contents

1. [Installation & Configuration](#1)
2. [Basic Scala Execution](#2)
3. [Python ↔ Scala Interoperability](#3)
4. [Snowpark Scala Session](#4)
5. [Diagnostics](#5)

---
<a id="1"></a>
## 1. Installation & Configuration

### 1.1 Install JDK, Scala, and Snowpark JAR

Run the setup script. The first run takes roughly 2 to 4 minutes because it
installs OpenJDK 17, Scala 2.12, Ammonite, and the Snowpark JAR via micromamba
and coursier.

Subsequent runs detect what is already installed and skip those steps.
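
The skip-if-present behaviour can be approximated in Python as shown here. This is a minimal sketch and not the contents of `setup_scala_environment.sh`; the executable names it checks (`java`, `scala`, `cs`) are assumptions about what the script places on `PATH`.

```python
import shutil


def detect_installed(tools=("java", "scala", "cs")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def steps_to_run(detected):
    """Return the tools that still need installing, in the order given."""
    return [tool for tool, path in detected.items() if path is None]


# Report what is present and what a setup run would still have to install.
found = detect_installed()
for tool, path in found.items():
    print(f"{tool:6s} {'found at ' + path if path else 'missing'}")
print("Would install:", steps_to_run(found) or "nothing")
```

Running this before the setup cell gives a quick indication of whether the full installation or only the fast path is likely to run.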

In [None]:
!bash setup_scala_environment.sh

### 1.2 Configure Python Environment & Register %%scala Magic

This cell:
1. Sets `JAVA_HOME` and `PATH`
2. Installs JPype1 into the kernel venv (if needed)
3. Starts the JVM in-process with the Scala + Snowpark classpath
4. Initialises the Scala REPL (Ammonite-lite or IMain)
5. Registers the `%%scala` cell magic
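
Steps 1 and 3 amount to exporting `JAVA_HOME` and handing JPype a classpath. The sketch below shows one way to prepare both with the standard library only. The function names and the directory layout (a JDK home and a folder of JARs) are hypothetical; the actual logic lives in `scala_helpers.setup_scala_environment` and may differ.

```python
import os
from pathlib import Path


def configure_java_env(jdk_home, environ=None):
    """Set JAVA_HOME and prepend its bin/ directory to PATH (idempotent)."""
    environ = os.environ if environ is None else environ
    bin_dir = str(Path(jdk_home) / "bin")
    environ["JAVA_HOME"] = str(jdk_home)
    parts = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in parts:
        # Put the JDK first so its `java` wins over any system install.
        environ["PATH"] = os.pathsep.join([bin_dir, *parts])
    return environ


def build_classpath(jar_dir):
    """Return every .jar under jar_dir as a sorted list of absolute paths."""
    return sorted(str(p.resolve()) for p in Path(jar_dir).rglob("*.jar"))
```

The resulting list could then be passed to `jpype.startJVM(classpath=...)`. JPype allows only one JVM per process, so such a call is normally guarded with `jpype.isJVMStarted()`.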

In [None]:
from scala_helpers import setup_scala_environment

result = setup_scala_environment()

print(f"Success:          {result['success']}")
print(f"Java version:     {result['java_version']}")
print(f"Scala version:    {result['scala_version']}")
print(f"Interpreter type: {result['interpreter_type']}")
print(f"JVM started:      {result['jvm_started']}")
print(f"Magic registered: {result['magic_registered']}")

if result['errors']:
    print("\nErrors:")
    for err in result['errors']:
        print(f"  - {err}")

### 1.3 Verify Scala Execution

In [None]:
%%scala
println(s"Hello from Scala ${util.Properties.versionString}")
println(s"Java: ${System.getProperty("java.version")}")
println(s"OS: ${System.getProperty("os.name")}")

---
<a id="2"></a>
## 2. Basic Scala Execution

State persists across `%%scala` cells — vals, defs, imports, and classes
defined in one cell are available in the next.
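
Persistence of this kind presumably comes from sending every cell to one long-lived interpreter instead of creating a fresh one each time. The sketch below illustrates the idea with Python's own `code.InteractiveInterpreter` standing in for the Scala REPL; `PersistentRunner` is a hypothetical name and is not part of `scala_helpers`.

```python
import code
import contextlib
import io


class PersistentRunner:
    """Runs successive 'cells' against one shared interpreter namespace."""

    def __init__(self):
        self.namespace = {}
        self._interp = code.InteractiveInterpreter(self.namespace)

    def run_cell(self, source):
        """Execute a cell and return whatever it printed to stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            self._interp.runcode(compile(source, "<cell>", "exec"))
        return buffer.getvalue()


runner = PersistentRunner()
runner.run_cell("greeting = 'hello'")                      # first cell defines a name
print(runner.run_cell("print(len(greeting))"), end="")     # second cell still sees it
```

Restarting the kernel discards the interpreter, and with it every definition made in earlier cells.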

In [None]:
%%scala
// Define a value
val greeting = "Hello from Snowflake Workspace Notebook!"
println(greeting)

In [None]:
%%scala
// Previous cell's 'greeting' is still in scope
println(s"Greeting length: ${greeting.length}")

// Define a function
def factorial(n: Int): BigInt = if (n <= 1) 1 else n * factorial(n - 1)

println(s"10! = ${factorial(10)}")
println(s"20! = ${factorial(20)}")

In [None]:
%%scala
// Collections and functional programming
val numbers = (1 to 10).toList
val squares = numbers.map(n => n * n)
val evenSquares = squares.filter(_ % 2 == 0)

println(s"Numbers:      $numbers")
println(s"Squares:      $squares")
println(s"Even squares: $evenSquares")
println(s"Sum:          ${evenSquares.sum}")

In [None]:
%%scala
// Case classes and pattern matching
case class Employee(name: String, department: String, salary: Double)

val employees = List(
  Employee("Alice", "Engineering", 120000),
  Employee("Bob", "Engineering", 115000),
  Employee("Carol", "Data Science", 130000),
  Employee("Dave", "Data Science", 125000),
  Employee("Eve", "Product", 110000)
)

val byDept = employees.groupBy(_.department).map {
  case (dept, emps) => (dept, emps.map(_.salary).sum / emps.size)
}

byDept.toList.sortBy(-_._2).foreach {
  case (dept, avgSalary) =>
    println(f"  $dept%-20s $$${avgSalary}%,.0f")
}

---
<a id="3"></a>
## 3. Python ↔ Scala Interoperability

### 3.1 Push values from Python to Scala

In [None]:
from scala_helpers import push_to_scala

# Push a string and number from Python into the Scala interpreter
push_to_scala("pythonMessage", "Hello from Python!")
push_to_scala("pythonNumber", 42)

In [None]:
%%scala
// Access the variables pushed from Python
println(s"From Python: $pythonMessage")
println(s"Number: $pythonNumber")

### 3.2 Pull values from Scala to Python

In [None]:
%%scala
val scalaResult = (1 to 100).sum
println(s"Sum 1..100 = $scalaResult")

In [None]:
from scala_helpers import pull_from_scala

value = pull_from_scala("scalaResult")
print(f"Pulled from Scala: {value} (type: {type(value).__name__})")

---
<a id="4"></a>
## 4. Snowpark Scala Session

### 4.1 Inject credentials from Python session

The Python kernel already has an active Snowpark session. We extract its
credentials and set them as environment variables for the Scala side.
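
A rough sketch of what such an injection step might look like follows. It is not the implementation of `inject_session_credentials`. It assumes the Snowpark Python `Session` getters `get_current_user`, `get_current_role`, `get_current_database`, `get_current_schema`, and `get_current_warehouse`, and it assumes they return identifiers wrapped in double quotes, which the sketch strips. `SNOWFLAKE_URL` and `SNOWFLAKE_PAT` are left out because they cannot be derived from these getters.

```python
import os

# Environment variable name -> Session getter assumed to supply its value.
ENV_GETTERS = {
    "SNOWFLAKE_USER": "get_current_user",
    "SNOWFLAKE_ROLE": "get_current_role",
    "SNOWFLAKE_DATABASE": "get_current_database",
    "SNOWFLAKE_SCHEMA": "get_current_schema",
    "SNOWFLAKE_WAREHOUSE": "get_current_warehouse",
}


def export_session_context(session, environ=None):
    """Copy the session's current context into environment variables."""
    environ = os.environ if environ is None else environ
    exported = {}
    for env_name, getter in ENV_GETTERS.items():
        value = getattr(session, getter)()
        if value is None:
            continue  # nothing selected in the session, so leave the variable unset
        # Assumption: values arrive quoted, e.g. '"MY_DB"'; strip the quotes.
        exported[env_name] = value.strip('"')
        environ[env_name] = exported[env_name]
    return exported
```

In the notebook this would be called as `export_session_context(get_active_session())`. Stripping quotes is a simplification that would mishandle identifiers whose quoting is significant.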

In [None]:
from snowflake.snowpark.context import get_active_session
from scala_helpers import inject_session_credentials

session = get_active_session()
creds = inject_session_credentials(session)

print("Credentials injected:")
for k, v in creds.items():
    if k == "SNOWFLAKE_PAT":
        print(f"  {k}: {'SET' if v else 'NOT SET'}")
    else:
        print(f"  {k}: {v}")

### 4.2 Create PAT for authentication

If you have not already created a programmatic access token (PAT) during the R
notebook setup, create one now. This cell uses the same `PATManager` that is used for R/ADBC.
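
Before deciding whether to run the cell, it can help to check which of the expected environment variables are still unset. The sketch below is a hypothetical pre-flight check. The variable names are the ones read by the Scala session cell in section 4.3, where `sys.env("NAME")` throws a `NoSuchElementException` for any name that is missing.

```python
import os

REQUIRED_VARS = (
    "SNOWFLAKE_URL", "SNOWFLAKE_USER", "SNOWFLAKE_ROLE", "SNOWFLAKE_DATABASE",
    "SNOWFLAKE_SCHEMA", "SNOWFLAKE_WAREHOUSE", "SNOWFLAKE_PAT",
)


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars()
print("All set" if not missing else f"Missing: {', '.join(missing)}")
```

If `SNOWFLAKE_PAT` appears in the list, uncomment and run the cell that follows.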

In [None]:
# Uncomment and run if SNOWFLAKE_PAT is not already set
# from r_helpers import PATManager
# pat_mgr = PATManager(session)
# pat_result = pat_mgr.create_pat(days_to_expiry=1, force_recreate=True)
# print(pat_result)

### 4.3 Create Snowpark Scala Session

In [None]:
from scala_helpers import create_snowpark_scala_session_code

# Preview the code that will be executed
code = create_snowpark_scala_session_code(use_pat=True)
print(code)

In [None]:
%%scala
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._

val session = Session.builder.configs(Map(
  "URL"           -> sys.env("SNOWFLAKE_URL"),
  "USER"          -> sys.env("SNOWFLAKE_USER"),
  "ROLE"          -> sys.env("SNOWFLAKE_ROLE"),
  "DB"            -> sys.env("SNOWFLAKE_DATABASE"),
  "SCHEMA"        -> sys.env("SNOWFLAKE_SCHEMA"),
  "WAREHOUSE"     -> sys.env("SNOWFLAKE_WAREHOUSE"),
  "TOKEN"         -> sys.env("SNOWFLAKE_PAT"),
  "AUTHENTICATOR" -> "oauth"
)).create

println("Snowpark Scala session created!")

### 4.4 Query Snowflake from Scala

In [None]:
%%scala
// Basic query
session.sql("SELECT CURRENT_USER() AS user, CURRENT_ROLE() AS role, CURRENT_WAREHOUSE() AS warehouse").show()

In [None]:
%%scala
// DataFrame operations
val df = session.sql("SELECT 'Scala' AS language, 'Snowpark' AS framework, CURRENT_TIMESTAMP() AS ts")
df.show()

In [None]:
%%scala
// Show available tables
session.sql("SHOW TABLES LIMIT 5").show()

### 4.5 Cross-language Data Sharing

Both the Python and Scala sessions connect to the same Snowflake account, but
they are separate sessions. Temporary tables are visible only to the session
that created them, so share data through a transient (or permanent) table and
drop it when you are done.

In [None]:
# Python: create a transient table that the Scala session can also see
session.sql("""
    CREATE OR REPLACE TRANSIENT TABLE scala_demo (
        id INT, name STRING, value DOUBLE
    ) AS
    SELECT column1, column2, column3 FROM VALUES
        (1, 'alpha', 10.5),
        (2, 'beta', 20.3),
        (3, 'gamma', 30.7)
""").collect()
print("Transient table 'scala_demo' created from Python")

In [None]:
%%scala
// Scala: read the table created by Python
val demo = session.table("scala_demo")
demo.show()

// Compute something
val total = demo.select(sum(col("VALUE"))).collect()(0).getDouble(0)
println(s"Total value: $total")

---
<a id="5"></a>
## 5. Diagnostics

In [None]:
from scala_helpers import print_diagnostics
print_diagnostics()