####Upload the CSV to DBFS

####Create a managed table all_employee from the CSV

In [0]:
file_path = "/FileStore/tables/Employee.csv"

In [0]:
# read CSV into a Spark DataFrame (header + infer schema)
df = spark.read.option("header", "true").option("inferSchema", "true").csv(file_path)

In [0]:
# quick look
display(df.limit(5))
df.printSchema()

Education,JoiningYear,City,PaymentTier,Age,Gender,EverBenched,ExperienceInCurrentDomain,LeaveOrNot
Bachelors,2017,Bangalore,3,34,Male,No,0,0
Bachelors,2013,Pune,1,28,Female,No,3,1
Bachelors,2014,New Delhi,3,38,Female,No,2,0
Masters,2016,Bangalore,3,27,Male,No,5,1
Masters,2017,Pune,3,24,Male,Yes,2,1


root
 |-- Education: string (nullable = true)
 |-- JoiningYear: integer (nullable = true)
 |-- City: string (nullable = true)
 |-- PaymentTier: integer (nullable = true)
 |-- Age: integer (nullable = true)
 |-- Gender: string (nullable = true)
 |-- EverBenched: string (nullable = true)
 |-- ExperienceInCurrentDomain: integer (nullable = true)
 |-- LeaveOrNot: integer (nullable = true)
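
To see why inferSchema settles on these types, the following is a rough standard-library sketch, not Spark's actual inference algorithm (Spark considers more types and handles nulls). It only distinguishes integer from string, which is enough for the five preview rows above. The names SAMPLE, infer_type and inferred are made up for this illustration.

```python
import csv
import io

# First rows of Employee.csv, copied from the preview output above.
SAMPLE = """\
Education,JoiningYear,City,PaymentTier,Age,Gender,EverBenched,ExperienceInCurrentDomain,LeaveOrNot
Bachelors,2017,Bangalore,3,34,Male,No,0,0
Bachelors,2013,Pune,1,28,Female,No,3,1
Bachelors,2014,New Delhi,3,38,Female,No,2,0
Masters,2016,Bangalore,3,27,Male,No,5,1
Masters,2017,Pune,3,24,Male,Yes,2,1
"""


def infer_type(values):
    """Return 'integer' if every value parses as an int, otherwise 'string'."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return "string"
    return "integer"


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Map each column name to its inferred type, keeping the header order.
inferred = {col: infer_type([row[col] for row in rows]) for col in rows[0]}

for col, typ in inferred.items():
    print(f" |-- {col}: {typ}")
```

On this sample the result lines up with the printSchema() output: text columns such as Education and EverBenched stay strings, while year, tier, age and flag columns become integers.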



In [0]:
# save as a managed table in the metastore
df.write.mode("overwrite").saveAsTable("all_employee")

print("Saved as table: all_employee")

Saved as table: all_employee


####Quick validation


In [0]:
%sql
-- check columns + types

DESCRIBE TABLE all_employee;

col_name,data_type,comment
Education,string,
JoiningYear,int,
City,string,
PaymentTier,int,
Age,int,
Gender,string,
EverBenched,string,
ExperienceInCurrentDomain,int,
LeaveOrNot,int,


In [0]:
%sql
-- count rows

SELECT COUNT(*) FROM all_employee;

count(1)
4653


In [0]:
%sql
-- peek data

SELECT * FROM all_employee LIMIT 10;

Education,JoiningYear,City,PaymentTier,Age,Gender,EverBenched,ExperienceInCurrentDomain,LeaveOrNot
Bachelors,2017,Bangalore,3,34,Male,No,0,0
Bachelors,2013,Pune,1,28,Female,No,3,1
Bachelors,2014,New Delhi,3,38,Female,No,2,0
Masters,2016,Bangalore,3,27,Male,No,5,1
Masters,2017,Pune,3,24,Male,Yes,2,1
Bachelors,2016,Bangalore,3,22,Male,No,0,0
Bachelors,2015,New Delhi,3,38,Male,No,0,0
Bachelors,2016,Bangalore,3,34,Female,No,2,1
Bachelors,2016,Pune,3,23,Male,No,1,0
Masters,2017,New Delhi,2,37,Male,No,2,0


#### Create the experienced_employee view (SQL)

In [0]:
%sql
-- create or replace view (basic)

CREATE OR REPLACE VIEW experienced_employee AS
SELECT
  JoiningYear,
  ExperienceInCurrentDomain,
  City,
  Gender,
  Education
FROM all_employee
WHERE ExperienceInCurrentDomain > 5;

Add comments. The Databricks COMMENT ON statement has no VIEW keyword, so a view is addressed with COMMENT ON TABLE; its columns use COMMENT ON COLUMN:

In [0]:
%sql
COMMENT ON TABLE experienced_employee IS 'View for experienced employees (ExperienceInCurrentDomain > 5)';
COMMENT ON COLUMN experienced_employee.ExperienceInCurrentDomain IS 'Years in current domain';
COMMENT ON COLUMN experienced_employee.JoiningYear IS 'Year employee joined';

Then inspect the view:

In [0]:
%sql
DESCRIBE TABLE EXTENDED experienced_employee;
SELECT * FROM experienced_employee LIMIT 10;

JoiningYear,ExperienceInCurrentDomain,City,Gender,Education
2016,7,Bangalore,Male,Bachelors
2016,7,Pune,Female,Bachelors
2016,6,Bangalore,Male,Bachelors
2014,6,Pune,Male,Bachelors
2017,6,Bangalore,Female,Bachelors
2014,7,Bangalore,Female,Masters
2017,6,New Delhi,Male,Bachelors
2014,7,Bangalore,Male,Bachelors
2018,6,Bangalore,Female,Masters
2012,7,Bangalore,Male,Bachelors


#####Notes / explanation

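- CREATE OR REPLACE VIEW redefines the view in place, so the cell can be rerun while you iterate on the query without dropping the view first.

- COMMENT ON stores descriptive metadata with the view and its columns; it shows up later in DESCRIBE TABLE EXTENDED output.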
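
The iterate-and-redefine workflow can be tried locally with Python's built-in sqlite3 module. This is a stand-in for Databricks SQL, not the same engine: SQLite has no CREATE OR REPLACE VIEW, so the hypothetical helper define_view drops and recreates the view instead. The rows are taken from the table and view outputs shown earlier, reduced to the five view columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_employee ("
    "JoiningYear INTEGER, ExperienceInCurrentDomain INTEGER, "
    "City TEXT, Gender TEXT, Education TEXT)"
)
conn.executemany(
    "INSERT INTO all_employee VALUES (?, ?, ?, ?, ?)",
    [
        (2017, 0, "Bangalore", "Male", "Bachelors"),
        (2013, 3, "Pune", "Female", "Bachelors"),
        (2016, 5, "Bangalore", "Male", "Masters"),
        (2016, 7, "Bangalore", "Male", "Bachelors"),
        (2016, 6, "Bangalore", "Male", "Bachelors"),
    ],
)


def define_view(min_years):
    """Redefine experienced_employee; drop-then-create stands in for OR REPLACE."""
    conn.execute("DROP VIEW IF EXISTS experienced_employee")
    # View definitions cannot take bound parameters, so the validated
    # integer threshold is formatted into the statement.
    conn.execute(
        "CREATE VIEW experienced_employee AS "
        "SELECT JoiningYear, ExperienceInCurrentDomain, City, Gender, Education "
        "FROM all_employee "
        f"WHERE ExperienceInCurrentDomain > {int(min_years)}"
    )


def view_years():
    """Return the sorted experience values currently visible through the view."""
    cursor = conn.execute(
        "SELECT ExperienceInCurrentDomain FROM experienced_employee "
        "ORDER BY ExperienceInCurrentDomain"
    )
    return [row[0] for row in cursor]


define_view(5)
strict = view_years()    # the filter is strict, so the 5-year row is excluded

define_view(4)
relaxed = view_years()   # redefining the view changes what it returns

print(strict, relaxed)
```

Note that the filter in the notebook is a strict inequality, so an employee with exactly 5 years in the current domain does not count as experienced.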

####Temporary-view example (subscribed_movies) — demo data (Python + SQL)

In [0]:
# Python cell: create demo members + movies temp views
members = spark.createDataFrame(
    [(1, "Alice Johnson"), (2, "Bob Smith"), (3, "Carol King")],
    ["id", "full_name"]
)
members.createOrReplaceTempView("members")

movies = spark.createDataFrame(
    [(1, "Inception"), (2, "Toy Story"), (1, "The Matrix")],
    ["member_id", "movie_title"]
)
movies.createOrReplaceTempView("movies")

print("Temporary views 'members' and 'movies' created for this session.")

Temporary views 'members' and 'movies' created for this session.


In [0]:
%sql
CREATE TEMPORARY VIEW subscribed_movies AS
SELECT mo.member_id, mb.full_name, mo.movie_title
FROM movies AS mo
INNER JOIN members AS mb
  ON mo.member_id = mb.id;

-- check it
SELECT * FROM subscribed_movies;

---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
File <command-4997728828851533>, line 1
----> 1 get_ipython().run_cell_magic('sql', '', 'CREATE TEMPORARY VIEW subscribed_movies AS\nSELECT mo.member_id, mb.full_name, mo.movie_title\nFROM movies AS mo\nINNER JOIN members AS mb\n  ON mo.member_id = mb.id;\n\n-- check it\nSELECT * FROM subscribed_movies;\n')

File /databricks/python/lib/python3.12/site-packages/IPython/core/inter

#####Notes

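- A view created with CREATE TEMPORARY VIEW exists only for the current Spark session: once the session or cluster ends, the view disappears. This suits ad-hoc joins and experiments.

- The statement above has no OR REPLACE, so rerunning the cell in the same session raises an AnalysisException because subscribed_movies already exists. The traceback above is cut off before its message, so this may or may not be its cause; either way, CREATE OR REPLACE TEMPORARY VIEW makes the cell safe to rerun.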
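
Session scoping can be observed without a cluster using Python's built-in sqlite3 module, where each connection plays the role of a session. This is an analogy under that assumption, not Spark itself; the names session_a, session_b and db_path are made up for the sketch, and the data mirrors the members and movies demo rows above.

```python
import os
import sqlite3
import tempfile

# A shared on-disk database, so both connections ("sessions") see the same tables.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

session_a = sqlite3.connect(db_path)
session_a.executescript(
    """
    CREATE TABLE members (id INTEGER, full_name TEXT);
    CREATE TABLE movies (member_id INTEGER, movie_title TEXT);
    INSERT INTO members VALUES (1, 'Alice Johnson'), (2, 'Bob Smith'), (3, 'Carol King');
    INSERT INTO movies VALUES (1, 'Inception'), (2, 'Toy Story'), (1, 'The Matrix');
    CREATE TEMP VIEW subscribed_movies AS
    SELECT mo.member_id, mb.full_name, mo.movie_title
    FROM movies AS mo
    INNER JOIN members AS mb ON mo.member_id = mb.id;
    """
)
session_a.commit()

# The session that created the temporary view can query it.
rows_a = session_a.execute(
    "SELECT * FROM subscribed_movies ORDER BY movie_title"
).fetchall()

# A second session sees the permanent tables but not the temporary view.
session_b = sqlite3.connect(db_path)
members_in_b = session_b.execute("SELECT COUNT(*) FROM members").fetchone()[0]
try:
    session_b.execute("SELECT * FROM subscribed_movies")
    visible_in_b = True
except sqlite3.OperationalError:
    visible_in_b = False

# Creating the same temporary view again in the first session fails,
# because the name is already taken there.
try:
    session_a.execute("CREATE TEMP VIEW subscribed_movies AS SELECT 1 AS x")
    recreate_failed = False
except sqlite3.OperationalError:
    recreate_failed = True

print(rows_a, members_in_b, visible_in_b, recreate_failed)

session_a.close()
session_b.close()
```

Carol King has no matching row in movies, so the inner join leaves her out. SQLite has no OR REPLACE for views; its rerun-safe form is DROP VIEW IF EXISTS followed by CREATE TEMP VIEW, whereas Spark SQL offers CREATE OR REPLACE TEMPORARY VIEW directly.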

####Schema-binding demo

This demonstrates how a view behaves when the base table schema changes.

In [0]:
%sql
-- create a simple base table
CREATE TABLE IF NOT EXISTS emp(name STRING, income INT);

-- add sample rows
INSERT INTO emp VALUES ('Alice', 50000), ('Bob', 60000);

-- create a view with explicit schema binding (when the clause is omitted, Databricks defaults to SCHEMA COMPENSATION)
CREATE OR REPLACE VIEW emp_v WITH SCHEMA BINDING AS
SELECT * FROM emp;

-- now alter base table by adding a column
ALTER TABLE emp ADD COLUMN bonus SMALLINT;

-- select from the view
SELECT * FROM emp_v;



#####Explanation

With WITH SCHEMA BINDING the view is tied to the column list it had when it was created. Because the view query is SELECT *, adding the bonus column to emp does not make bonus appear in emp_v: the extra column is ignored and the view still returns only name and income. This is how schema binding preserves the view contract. If you want the view to pick up new columns, use WITH SCHEMA EVOLUTION or recreate the view.

####Helpful utility commands & cleanup

In [0]:
%sql
-- show CREATE VIEW text

SHOW CREATE TABLE experienced_employee;   -- or use right-click in UI to "Show query"



In [0]:
%sql
-- describe view metadata

DESCRIBE TABLE EXTENDED experienced_employee;



In [0]:
%sql
-- drop if you want to clean up

DROP VIEW IF EXISTS experienced_employee;
DROP VIEW IF EXISTS subscribed_movies;
DROP VIEW IF EXISTS emp_v;
DROP TABLE IF EXISTS emp;

