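- createGlobalTempView() registers a DataFrame as a global temporary view. It raises an error if a view with that name already exists; createOrReplaceGlobalTempView() overwrites the existing view instead.

- Unlike a view created with createTempView(), which is visible only in the session that registered it, a global temp view is accessible from every Spark session in the same application.

- It is stored in the global_temp database, must be queried as global_temp.<view_name>, and lives until the Spark application terminates.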

In [3]:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("createGlobalTempViewFunction").getOrCreate()


Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
25/09/15 07:37:46 WARN Utils: Your hostname, KLZPC0015, resolves to a loopback address: 127.0.1.1; using 172.25.17.96 instead (on interface eth0)
25/09/15 07:37:46 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/09/15 07:37:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
25/09/15 07:38:02 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.


In [4]:
data = [
    ("Souvik", "Army", 80000),
    ("Soukarjya", "BDO", 75000),
    ("Sandip", "MTS", 50000),
    ("Prodipta", "Data Analyst", 25000),
    ("RamaSai", "System Engineer", 40000),
    ("Riya", "System Engineer", 16000),
    ("Padma", "Data Analyst", 18000)
]

columns = ["name", "department", "salary"]

df = spark.createDataFrame(data, columns)
df.show()



                                                                                

+---------+---------------+------+
|     name|     department|salary|
+---------+---------------+------+
|   Souvik|           Army| 80000|
|Soukarjya|            BDO| 75000|
|   Sandip|            MTS| 50000|
| Prodipta|   Data Analyst| 25000|
|  RamaSai|System Engineer| 40000|
|     Riya|System Engineer| 16000|
|    Padma|   Data Analyst| 18000|
+---------+---------------+------+



In [5]:
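# create (or replace) a global temporary view; unlike createGlobalTempView(),
# this does not fail when the cell is re-run and the view already exists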
df.createOrReplaceGlobalTempView("employee_global_view")


In [6]:
# query the global temporary view from the current session
result1 = spark.sql("select * from global_temp.employee_global_view")
result1.show()




+---------+---------------+------+
|     name|     department|salary|
+---------+---------------+------+
|   Souvik|           Army| 80000|
|Soukarjya|            BDO| 75000|
|   Sandip|            MTS| 50000|
| Prodipta|   Data Analyst| 25000|
|  RamaSai|System Engineer| 40000|
|     Riya|System Engineer| 16000|
|    Padma|   Data Analyst| 18000|
+---------+---------------+------+



                                                                                

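- Query the global temp view from another session
- In another session of the same Spark application (for example, one created with spark.newSession()), run the query below. A separate notebook can see the view only if it is attached to the same Spark application.
spark.sql("select * from global_temp.employee_global_view").show()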


In [9]:
# Example: get average salary by department
result2 = spark.sql(
    """
select department, avg(salary) as avg_sal
from global_temp.employee_global_view
group by department
"""
)
result2.show()



+---------------+-------+
|     department|avg_sal|
+---------------+-------+
|           Army|80000.0|
|            BDO|75000.0|
|            MTS|50000.0|
|System Engineer|28000.0|
|   Data Analyst|21500.0|
+---------------+-------+



                                                                                

- Notebook Summary:
    - createGlobalTempView() makes a DataFrame queryable from every session in the same Spark application.
    - The view is session-independent, is stored under the 'global_temp' database, and is dropped when the application terminates.
    - Use global_temp.view_name when querying it.

In [11]:
spark.sql("show tables in global_temp").show()


+-----------+--------------------+-----------+
|  namespace|           tableName|isTemporary|
+-----------+--------------------+-----------+
|global_temp|employee_global_view|       true|
+-----------+--------------------+-----------+

