In [0]:
%md
In PySpark, createGlobalTempView() registers a DataFrame as a global temporary view that can be queried from any SparkSession within the same Spark application.

By default, these views live in the global_temp database, and they are dropped when the Spark application terminates.

In [0]:
data = [
    (1, "Alice", 25, "HR"),
    (2, "Bob", 30, "IT"),
    (3, "Cathy", 28, "Finance"),
    (4, "David", 35, "IT"),
    (5, "Eva", 40, "HR")
]

columns = ["id", "name", "age", "department"]

df = spark.createDataFrame(data, columns)

df.display()

id,name,age,department
1,Alice,25,HR
2,Bob,30,IT
3,Cathy,28,Finance
4,David,35,IT
5,Eva,40,HR


In [0]:
# Create a global temporary view
df.createGlobalTempView("employees_global")
# ⚡ Note: Global temporary views are registered in the global_temp database by default
# (the database name is controlled by the spark.sql.globalTempDatabase setting).


In [0]:
# The view name must be qualified with the global_temp database
result = spark.sql("SELECT name, department FROM global_temp.employees_global")
result.display()


name,department
Alice,HR
Bob,IT
Cathy,Finance
David,IT
Eva,HR


In [0]:
# Create a new SparkSession that shares the same SparkContext (same application)
new_spark = spark.newSession()

# Query using the new session
result2 = new_spark.sql("SELECT department, COUNT(*) as emp_count FROM global_temp.employees_global GROUP BY department")
result2.display()


department,emp_count
HR,2
IT,2
Finance,1


### ✅ Key Points

- createTempView() → session-scoped (visible only in the SparkSession that created it).

- createGlobalTempView() → application-scoped (visible to every SparkSession in the same application, until the application ends).

- Always query a global temporary view as global_temp.<view_name>.