# Mixing Spark and PySpark cells in the same Notebook

Sparkmagic lets you run Python, Scala, and R code cells in the same notebook and SparkSession. You can therefore mix UDFs written in different languages in a single DataFrame and use any Spark library, whether it is written in Python, Scala, or R, from the language of your choice.

**Note:** This notebook illustrates the use of Python and Scala, but the process for R is the same.

**Note:** Remember to set `spark.jars` or `spark.submit.pyFiles` (as needed) when you want to import external packages into Spark.
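
As a rough sketch of the note above, the snippet below builds the JSON body that could carry these settings to Livy when the session is created. Both paths are hypothetical placeholders, and placing the Spark properties under a `conf` key assumes the Livy session-creation format. The printed JSON is meant to be pasted as the body of a `%%spark config` cell (or `%%configure` in the sparkmagic wrapper kernels); check which of these your setup provides.

```python
import json

# Hypothetical artifact locations; replace with paths your cluster can read.
JAR_PATH = "hdfs:///path/to/your-library.jar"
PY_FILES_PATH = "hdfs:///path/to/your_module.zip"

# Assumption: Spark properties travel under the "conf" key of the
# session-creation request that sparkmagic sends to Livy.
session_config = {
    "conf": {
        "spark.jars": JAR_PATH,
        "spark.submit.pyFiles": PY_FILES_PATH,
    }
}

# Render the JSON body to paste into the configuration cell.
config_body = json.dumps(session_config, indent=2)
print(config_body)
```

The settings only take effect for sessions created after the configuration cell runs, so it belongs before the first `%%spark` cell.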

### Sharing UDFs

Custom logic, such as UDFs, can be defined in one language and used from another. This example shows how to call a Python function from Scala.

In [13]:
%%spark -l scala

val df = spark.range(10)

df: org.apache.spark.sql.Dataset[Long] = [id: bigint]


In [14]:
%%spark -l python


def plus_one(x):
    return x + 1


def real_world_function(x, y, z):
    # import pandas, networkx, scikit ...
    pass

spark.udf.register("plus_one", plus_one, returnType="int")

<function plus_one at 0x7feb35e11950>
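
Because the registered function is ordinary Python, its logic can be checked locally before it runs on the cluster. The sketch below assumes a null-tolerant variant named `plus_one_safe` (a hypothetical name, not part of the original notebook): Spark passes SQL nulls to Python UDFs as `None`, and `None + 1` would raise a `TypeError`.

```python
from typing import Optional


def plus_one_safe(x: Optional[int]) -> Optional[int]:
    """Hypothetical null-tolerant variant of plus_one."""
    # Spark hands a SQL NULL to a Python UDF as None, so pass it through.
    if x is None:
        return None
    return x + 1


# Quick local checks; no Spark session is needed for this part.
results = [plus_one_safe(v) for v in (0, 9, None)]
print(results)
```

Such a variant could then be registered in the same way as above, for example with `spark.udf.register("plus_one_safe", plus_one_safe, returnType="int")`.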

In [15]:
%%spark -l scala

// plus_one is the Python UDF registered in the previous cell
val df2 = df.withColumn("col2", callUDF("plus_one", $"id"))
df2.explain()
df2.show()

df2: org.apache.spark.sql.DataFrame = [id: bigint, col2: int]
== Physical Plan ==
*(2) Project [id#21L, pythonUDF0#29 AS col2#23]
+- BatchEvalPython [plus_one(id#21L)], [id#21L, pythonUDF0#29]
   +- *(1) Range (0, 10, step=1, splits=4)
+---+----+
| id|col2|
+---+----+
|  0|   1|
|  1|   2|
|  2|   3|
|  3|   4|
|  4|   5|
|  5|   6|
|  6|   7|
|  7|   8|
|  8|   9|
|  9|  10|
+---+----+



Logic can also be shared through temporary views:

In [16]:
%%spark -l python

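df = spark.range(10)
# Register the DataFrame as a temp view so the Scala cell can read it;
# createOrReplaceTempView keeps this cell safe to re-run.
df.createOrReplaceTempView("for_scala")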

In [22]:
%%spark -l scala

val otherDF = spark.range(10).withColumn("scala_col", $"id" * 100)

spark.table("for_scala").join(otherDF, Seq("id")).createOrReplaceTempView("for_python")

otherDF: org.apache.spark.sql.DataFrame = [id: bigint, scala_col: bigint]


In [23]:
%%spark -l python

spark.table("for_python").explain()
spark.table("for_python").show()

== Physical Plan ==
*(2) Project [id#38L, scala_col#75L]
+- *(2) BroadcastHashJoin [id#38L], [id#73L], Inner, BuildLeft
   :- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]))
   :  +- *(1) Range (0, 10, step=1, splits=4)
   +- *(2) Project [id#73L, (id#73L * 100) AS scala_col#75L]
      +- *(2) Range (0, 10, step=1, splits=4)
+---+---------+
| id|scala_col|
+---+---------+
|  0|        0|
|  1|      100|
|  2|      200|
|  3|      300|
|  4|      400|
|  5|      500|
|  6|      600|
|  7|      700|
|  8|      800|
|  9|      900|
+---+---------+