
## Overview

This notebook shows how to create and query a table or DataFrame from data that you uploaded to DBFS. [DBFS](https://docs.databricks.com/user-guide/dbfs-databricks-file-system.html), the Databricks File System, lets you store data for querying inside Databricks. The notebook assumes that the file you want to read is already in DBFS.

This notebook is written in **Python**, so the default cell type is Python. You can switch languages in any cell with the `%LANGUAGE` syntax, as the `%sql` cells below do. Python, Scala, SQL, and R are all supported.

In [0]:
%sql
CREATE OR REPLACE TABLE employee_data (
    employee_id INT,
    department_id INT,
    position VARCHAR(50),
    salary INT,
    tenure_months INT,
    job_experience_years INT
);

INSERT INTO employee_data VALUES
(1, 101, 'Software Engineer', 90000, 24, 2),
(2, 102, 'Data Analyst', 60000, 18, 1),
(3, 101, 'Project Manager', 90000, 36, 5),
(4, 103, 'UX Designer', 75000, 27, 3),
(5, 102, 'Business Analyst', 70000, 22, 2),
(6, 101, 'Software Engineer', 80000, 30, 4),
(7, 103, 'Product Manager', 95000, 42, 7),
(8, 102, 'Data Scientist', 100000, 48, 8);

num_affected_rows,num_inserted_rows
8,8


In [0]:
%sql
SELECT * FROM employee_data;


employee_id,department_id,position,salary,tenure_months,job_experience_years
1,101,Software Engineer,90000,24,2
2,102,Data Analyst,60000,18,1
3,101,Project Manager,90000,36,5
4,103,UX Designer,75000,27,3
5,102,Business Analyst,70000,22,2
6,101,Software Engineer,80000,30,4
7,103,Product Manager,95000,42,7
8,102,Data Scientist,100000,48,8


In [0]:
%sql
SELECT employee_id, department_id, position, salary, tenure_months,
    ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS row_num
FROM employee_data;

employee_id,department_id,position,salary,tenure_months,row_num
1,101,Software Engineer,90000,24,1
3,101,Project Manager,90000,36,2
6,101,Software Engineer,80000,30,3
8,102,Data Scientist,100000,48,1
5,102,Business Analyst,70000,22,2
2,102,Data Analyst,60000,18,3
7,103,Product Manager,95000,42,1
4,103,UX Designer,75000,27,2
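
Employees 1 and 3 in department 101 share a salary of 90000, so `ORDER BY salary DESC` alone does not determine which of them receives the lower `row_num`. One way to make the numbering repeatable is to add a tie-breaker column to the window's `ORDER BY`. The sketch below is an illustration under stated assumptions, not part of the original notebook: it runs the query with Python's built-in `sqlite3` module (SQLite 3.25 or later is needed for window functions) rather than Spark SQL, and the choice of `employee_id` as the tie-breaker is arbitrary.

```python
import sqlite3

# Illustrative stand-in for the Databricks table: an in-memory SQLite database.
# Assumes the bundled SQLite is 3.25+ (required for window functions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_data ("
    "employee_id INT, department_id INT, position TEXT, "
    "salary INT, tenure_months INT, job_experience_years INT)"
)
conn.executemany(
    "INSERT INTO employee_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 101, "Software Engineer", 90000, 24, 2),
        (2, 102, "Data Analyst", 60000, 18, 1),
        (3, 101, "Project Manager", 90000, 36, 5),
        (4, 103, "UX Designer", 75000, 27, 3),
        (5, 102, "Business Analyst", 70000, 22, 2),
        (6, 101, "Software Engineer", 80000, 30, 4),
        (7, 103, "Product Manager", 95000, 42, 7),
        (8, 102, "Data Scientist", 100000, 48, 8),
    ],
)

# Adding employee_id to the window's ORDER BY breaks the salary tie,
# so the same row always receives the same number.
query = """
SELECT employee_id, department_id, salary,
       ROW_NUMBER() OVER (
           PARTITION BY department_id
           ORDER BY salary DESC, employee_id
       ) AS row_num
FROM employee_data
ORDER BY department_id, row_num
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the tie-breaker in place, employee 1 is numbered 1 and employee 3 is numbered 2 within department 101 on every run.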


In [0]:
%sql
SELECT employee_id, department_id, position, salary, tenure_months,
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employee_data;

employee_id,department_id,position,salary,tenure_months,salary_rank
1,101,Software Engineer,90000,24,1
3,101,Project Manager,90000,36,1
6,101,Software Engineer,80000,30,3
8,102,Data Scientist,100000,48,1
5,102,Business Analyst,70000,22,2
2,102,Data Analyst,60000,18,3
7,103,Product Manager,95000,42,1
4,103,UX Designer,75000,27,2


In [0]:
%sql
SELECT employee_id, department_id, position, salary, tenure_months,
    DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_dense_rank
FROM employee_data;

employee_id,department_id,position,salary,tenure_months,salary_dense_rank
1,101,Software Engineer,90000,24,1
3,101,Project Manager,90000,36,1
6,101,Software Engineer,80000,30,2
8,102,Data Scientist,100000,48,1
5,102,Business Analyst,70000,22,2
2,102,Data Analyst,60000,18,3
7,103,Product Manager,95000,42,1
4,103,UX Designer,75000,27,2
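
The three window-function queries above differ only in how they number rows that tie on `salary`: `ROW_NUMBER()` always assigns consecutive numbers, `RANK()` gives tied rows the same number and then skips ahead, and `DENSE_RANK()` gives tied rows the same number without skipping. As a rough illustration only, the following pure-Python sketch mimics that numbering for department 101. `number_rows` is a hypothetical helper written for this example, not a Spark or Databricks API, and it resolves ties for the row number by input order, which a real engine does not guarantee.

```python
from itertools import groupby

# (employee_id, salary) pairs for department 101, taken from employee_data
dept_101 = [(1, 90000), (3, 90000), (6, 80000)]


def number_rows(rows):
    """Return (employee_id, row_number, rank, dense_rank) in salary-descending order."""
    # sorted() is stable, so rows with equal salaries keep their input order
    ordered = sorted(rows, key=lambda r: -r[1])
    result = []
    position = 0  # rows seen so far (drives row_number)
    dense = 0     # distinct salaries seen so far (drives dense_rank)
    for _salary, group in groupby(ordered, key=lambda r: r[1]):
        dense += 1
        rank = position + 1  # every tied row shares the position of the first one
        for employee_id, _ in group:
            position += 1
            result.append((employee_id, position, rank, dense))
    return result


numbered = number_rows(dept_101)
for row in numbered:
    print(row)
# (1, 1, 1, 1)
# (3, 2, 1, 1)
# (6, 3, 3, 2)
```

The last row shows the difference most clearly: employee 6 is third by row number and by rank, but second by dense rank, because only two distinct salaries exist in the department.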
