## SQL - Data Retrieval
    Data retrieval is the process of extracting meaningful information from a database using SQL queries. It is used to:
    - Explore data
    - Filter relevant records
    - Sort and summarize information
    - Prepare datasets for analysis & reporting

    Why is Data Retrieval the Core of Data Analysis?
     -> Raw databases can contain millions of rows
        Analysts rarely need all of the data
        SQL helps in:
        - Removing noise
        - Focusing on required attributes
        - Creating analysis-ready datasets

### Database Connection

In [9]:
%reload_ext sql
%config SqlMagic.style = '_DEPRECATED_DEFAULT'
%sql mysql+pymysql://root:@localhost/test

In [11]:
%%sql
SELECT version();

 * mysql+pymysql://root:***@localhost/test
1 rows affected.


version()
10.4.32-MariaDB


### Create the employees table

In [17]:
%%sql
CREATE TABLE employees (
    emp_id INT,
    emp_name VARCHAR(50),
    department VARCHAR(30),
    salary INT,
    age INT,
    city VARCHAR(30),
    joining_date DATE
);

 * mysql+pymysql://root:***@localhost/test
0 rows affected.


[]

### Insert Data

In [20]:
%%sql
INSERT INTO employees VALUES
(101, 'Amit', 'IT', 60000, 25, 'Pune', '2022-06-10'),
(102, 'Neha', 'HR', 45000, 28, 'Mumbai', '2021-03-15'),
(103, 'Rahul', 'IT', 75000, 30, 'Pune', '2020-01-20'),
(104, 'Sneha', 'Finance', 50000, 26, 'Delhi', '2022-11-05'),
(105, 'Vikas', 'HR', 40000, 35, 'Mumbai', '2019-08-12'),
(106, 'Priya', 'IT', 85000, 29, 'Bangalore', '2018-04-25'),
(107, 'Arjun', 'Finance', 65000, 32, 'Pune', '2020-09-18');

 * mysql+pymysql://root:***@localhost/test
7 rows affected.


[]

### 1. SELECT [ Foundation of Retrieval ]
    - SELECT defines which columns you want to analyze
    - It does not modify data
    - Multiple columns can be selected

### Display all employee records

In [23]:
%%sql
select * from employees;

 * mysql+pymysql://root:***@localhost/test
7 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
101,Amit,IT,60000,25,Pune,2022-06-10
102,Neha,HR,45000,28,Mumbai,2021-03-15
103,Rahul,IT,75000,30,Pune,2020-01-20
104,Sneha,Finance,50000,26,Delhi,2022-11-05
105,Vikas,HR,40000,35,Mumbai,2019-08-12
106,Priya,IT,85000,29,Bangalore,2018-04-25
107,Arjun,Finance,65000,32,Pune,2020-09-18


### Display only employee name and salary

In [26]:
%%sql 
select emp_name,salary from employees;

 * mysql+pymysql://root:***@localhost/test
7 rows affected.


emp_name,salary
Amit,60000
Neha,45000
Rahul,75000
Sneha,50000
Vikas,40000
Priya,85000
Arjun,65000
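The point that SELECT does not modify data can be checked directly. The following is a minimal sketch, not the notebook's own setup: it assumes an in-memory SQLite database (Python's standard `sqlite3` module) as a stand-in for the MariaDB connection, and the `salary + 5000` expression is only an illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook, so this runs without a connection. Table and rows are copied
# from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

# A computed expression in the SELECT list changes only the result set.
projected = conn.execute(
    "SELECT emp_name, salary + 5000 FROM employees WHERE emp_id = 101"
).fetchall()

# Reading the column again shows the stored value is untouched.
stored = conn.execute(
    "SELECT salary FROM employees WHERE emp_id = 101"
).fetchall()

print(projected)  # [('Amit', 65000)]
print(stored)     # [(60000,)]
```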


### 2. DISTINCT [ Uniqueness Control ]
    - Removes duplicate rows from the result
    - Applied after the SELECT list is evaluated, to the whole selected row (not to a single column)
    - Helps in:
        - Identifying unique categories
        - Understanding data diversity

### Display all unique departments

In [36]:
%%sql 
select distinct department from employees;

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


department
IT
HR
Finance


### Display all unique cities

In [39]:
%%sql 
select distinct city from employees;

 * mysql+pymysql://root:***@localhost/test
4 rows affected.


city
Pune
Mumbai
Delhi
Bangalore
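DISTINCT compares the whole selected row, so listing two columns returns unique combinations rather than unique values per column. A minimal sketch, assuming an in-memory SQLite database (Python's `sqlite3`) as a stand-in for the MariaDB connection; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

# Unique (department, city) pairs: 7 rows collapse to 5 combinations,
# because (IT, Pune) and (HR, Mumbai) each occur twice.
pairs = conn.execute(
    "SELECT DISTINCT department, city FROM employees "
    "ORDER BY department, city"
).fetchall()

for pair in pairs:
    print(pair)
# ('Finance', 'Delhi')
# ('Finance', 'Pune')
# ('HR', 'Mumbai')
# ('IT', 'Bangalore')
# ('IT', 'Pune')
```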


### 3. WHERE [ Filtering Logic ]
    - Filters rows before they are returned by applying conditions
    - One of the most frequently used clauses in analysis
    - Comparison: =, !=, >, <, >=, <=
    - Range: BETWEEN (both endpoints are included)
    - Set: IN
    - Pattern: LIKE
    - Logical: AND, OR, NOT

### Find employees working in IT department

In [51]:
%%sql 
select * from employees where department ='IT';

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
101,Amit,IT,60000,25,Pune,2022-06-10
103,Rahul,IT,75000,30,Pune,2020-01-20
106,Priya,IT,85000,29,Bangalore,2018-04-25


### List employees with salary greater than 60,000

In [56]:
%%sql 
select * from employees where salary > 60000;

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
103,Rahul,IT,75000,30,Pune,2020-01-20
106,Priya,IT,85000,29,Bangalore,2018-04-25
107,Arjun,Finance,65000,32,Pune,2020-09-18


### Find employees from Pune city

In [59]:
%%sql
select * from employees where city = 'Pune';

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
101,Amit,IT,60000,25,Pune,2022-06-10
103,Rahul,IT,75000,30,Pune,2020-01-20
107,Arjun,Finance,65000,32,Pune,2020-09-18


### Display employees aged between 25 and 30

In [70]:
%%sql
select * from employees where age between 25 and 30 ;

 * mysql+pymysql://root:***@localhost/test
5 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
101,Amit,IT,60000,25,Pune,2022-06-10
102,Neha,HR,45000,28,Mumbai,2021-03-15
103,Rahul,IT,75000,30,Pune,2020-01-20
104,Sneha,Finance,50000,26,Delhi,2022-11-05
106,Priya,IT,85000,29,Bangalore,2018-04-25
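The queries above cover comparison and BETWEEN. The remaining operators from the list (IN, LIKE, AND, OR, NOT) can be sketched the same way. This is an illustrative sketch only: it assumes an in-memory SQLite database (Python's `sqlite3`) in place of the MariaDB connection, and `names` is a hypothetical helper written for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)


def names(where):
    """Hypothetical helper: employee names matching a WHERE condition, by emp_id."""
    sql = f"SELECT emp_name FROM employees WHERE {where} ORDER BY emp_id"
    return [row[0] for row in conn.execute(sql)]


# Set membership: city is one of the listed values.
in_set = names("city IN ('Pune', 'Delhi')")
# Pattern match: % matches any sequence of characters.
starts_with_a = names("emp_name LIKE 'A%'")
# AND with NOT: IT employees who are not in Pune.
it_not_pune = names("department = 'IT' AND NOT (city = 'Pune')")
# OR: either condition is enough.
hr_or_high = names("department = 'HR' OR salary > 70000")

print(in_set)         # ['Amit', 'Rahul', 'Sneha', 'Arjun']
print(starts_with_a)  # ['Amit', 'Arjun']
print(it_not_pune)    # ['Priya']
print(hr_or_high)     # ['Neha', 'Rahul', 'Vikas', 'Priya']
```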


### 4. ORDER BY (Sorting Results)
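    - Sorts the result set
    - Executed after WHERE
    - Default sorting order is ASC; use DESC for high to low
    - Multiple columns can be listed; ties on the first column are broken by the next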

### Sort employees by salary [ low to high ]

In [89]:
%%sql
select * from employees 
order by salary ;

 * mysql+pymysql://root:***@localhost/test
7 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
105,Vikas,HR,40000,35,Mumbai,2019-08-12
102,Neha,HR,45000,28,Mumbai,2021-03-15
104,Sneha,Finance,50000,26,Delhi,2022-11-05
101,Amit,IT,60000,25,Pune,2022-06-10
107,Arjun,Finance,65000,32,Pune,2020-09-18
103,Rahul,IT,75000,30,Pune,2020-01-20
106,Priya,IT,85000,29,Bangalore,2018-04-25


### Display employees sorted by department and salary

In [97]:
%%sql
select emp_name, department, salary from employees
order by department, salary;

 * mysql+pymysql://root:***@localhost/test
7 rows affected.


emp_name,department,salary
Sneha,Finance,50000
Arjun,Finance,65000
Vikas,HR,40000
Neha,HR,45000
Amit,IT,60000
Rahul,IT,75000
Priya,IT,85000
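Each column in ORDER BY can carry its own direction. The sketch below lists departments alphabetically and, within each department, the highest salary first. It assumes an in-memory SQLite database (Python's `sqlite3`) as a stand-in for the MariaDB connection.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

# First key ascending (department), second key descending (salary).
ranked = conn.execute(
    "SELECT emp_name, department, salary FROM employees "
    "ORDER BY department ASC, salary DESC"
).fetchall()

for row in ranked:
    print(row)
# ('Arjun', 'Finance', 65000)
# ('Sneha', 'Finance', 50000)
# ('Neha', 'HR', 45000)
# ('Vikas', 'HR', 40000)
# ('Priya', 'IT', 85000)
# ('Rahul', 'IT', 75000)
# ('Amit', 'IT', 60000)
```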


### 5. LIMIT (Row Control)
    - Restricts the number of output rows
    - Without ORDER BY, which rows are returned is not guaranteed
    - Mostly used for:
       - Sampling
       - Debugging
       - Previewing large datasets

### Show top 3 highest paid employees

In [77]:
%%sql
select * from employees 
order by salary desc 
limit 3;

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
106,Priya,IT,85000,29,Bangalore,2018-04-25
103,Rahul,IT,75000,30,Pune,2020-01-20
107,Arjun,Finance,65000,32,Pune,2020-09-18


### Display first 5 employees (preview data)

In [114]:
%%sql
select * from employees 
limit 5;

 * mysql+pymysql://root:***@localhost/test
5 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
101,Amit,IT,60000,25,Pune,2022-06-10
102,Neha,HR,45000,28,Mumbai,2021-03-15
103,Rahul,IT,75000,30,Pune,2020-01-20
104,Sneha,Finance,50000,26,Delhi,2022-11-05
105,Vikas,HR,40000,35,Mumbai,2019-08-12
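A preview that relies on LIMIT alone returns rows in whatever order the engine produces them. Adding ORDER BY makes the preview repeatable, and OFFSET (supported by MariaDB and SQLite) skips rows to fetch the next batch. A minimal sketch, assuming an in-memory SQLite database (Python's `sqlite3`) in place of the MariaDB connection; the page size of 5 is an arbitrary choice.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

# Repeatable preview: the sort key fixes which 5 rows come first.
first_page = [
    row[0]
    for row in conn.execute(
        "SELECT emp_id FROM employees ORDER BY emp_id LIMIT 5"
    )
]

# Next batch: skip the 5 rows already seen, then take up to 5 more.
second_page = [
    row[0]
    for row in conn.execute(
        "SELECT emp_id FROM employees ORDER BY emp_id LIMIT 5 OFFSET 5"
    )
]

print(first_page)   # [101, 102, 103, 104, 105]
print(second_page)  # [106, 107]
```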


### 6. Aliases [ AS ] - Readability & Clarity
    - Temporary renaming of columns and tables
    - Exists only during query execution
    - Makes output understandable for non-technical users

### Rename salary column as Monthly_Salary

In [106]:
%%sql 
select emp_name, salary as Monthly_Salary
from employees
limit 3;

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_name,Monthly_Salary
Amit,60000
Neha,45000
Rahul,75000


### Use table alias to fetch IT employees

In [111]:
%%sql
select e.emp_name, e.salary,e.city
from employees as e
where e.department ='IT';

 * mysql+pymysql://root:***@localhost/test
3 rows affected.


emp_name,salary,city
Amit,60000,Pune
Rahul,75000,Pune
Priya,85000,Bangalore
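Aliases are most useful on computed columns, which otherwise get an expression as their header. The sketch below combines a table alias with column aliases and reads the resulting header names back. It assumes an in-memory SQLite database (Python's `sqlite3`) in place of the MariaDB connection, and it assumes `salary` is a monthly figure (as the Monthly_Salary alias above suggests), so `salary * 12` is an illustrative annual value. The names `Employee` and `Annual_Salary` are chosen for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

# Table alias `e` shortens references; column aliases name the output headers.
cur = conn.execute(
    "SELECT e.emp_name AS Employee, e.salary * 12 AS Annual_Salary "
    "FROM employees AS e "
    "WHERE e.department = 'IT' "
    "ORDER BY e.emp_id"
)
columns = [col[0] for col in cur.description]  # header names of the result
rows = cur.fetchall()

print(columns)  # ['Employee', 'Annual_Salary']
print(rows)     # [('Amit', 720000), ('Rahul', 900000), ('Priya', 1020000)]
```

The aliases exist only in this result set; the table still has its original `emp_name` and `salary` columns.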


## Data Analysis Query -> Show top 2 IT employees from Pune with salary above 60k

In [122]:
%%sql 
select * from employees
where department = 'IT' and city ='Pune' and salary > 60000
order by salary desc
limit 2;

 * mysql+pymysql://root:***@localhost/test
1 rows affected.


emp_id,emp_name,department,salary,age,city,joining_date
103,Rahul,IT,75000,30,Pune,2020-01-20
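The query asks for two rows but returns one, because LIMIT is only an upper bound and Amit's salary is exactly 60000, which `salary > 60000` excludes. The sketch below compares the strict and inclusive comparisons. It assumes an in-memory SQLite database (Python's `sqlite3`) as a stand-in for the MariaDB connection.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MariaDB server used in the
# notebook. Table and rows are copied from the CREATE TABLE and INSERT cells.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT, emp_name VARCHAR(50), "
    "department VARCHAR(30), salary INT, age INT, city VARCHAR(30), "
    "joining_date DATE)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, "Amit", "IT", 60000, 25, "Pune", "2022-06-10"),
        (102, "Neha", "HR", 45000, 28, "Mumbai", "2021-03-15"),
        (103, "Rahul", "IT", 75000, 30, "Pune", "2020-01-20"),
        (104, "Sneha", "Finance", 50000, 26, "Delhi", "2022-11-05"),
        (105, "Vikas", "HR", 40000, 35, "Mumbai", "2019-08-12"),
        (106, "Priya", "IT", 85000, 29, "Bangalore", "2018-04-25"),
        (107, "Arjun", "Finance", 65000, 32, "Pune", "2020-09-18"),
    ],
)

base = (
    "SELECT emp_name, salary FROM employees "
    "WHERE department = 'IT' AND city = 'Pune' AND salary {op} 60000 "
    "ORDER BY salary DESC LIMIT 2"
)

# Strictly above 60000: only Rahul qualifies, so LIMIT 2 yields one row.
strict = conn.execute(base.format(op=">")).fetchall()
# 60000 and above: Amit is included as well.
inclusive = conn.execute(base.format(op=">=")).fetchall()

print(strict)     # [('Rahul', 75000)]
print(inclusive)  # [('Rahul', 75000), ('Amit', 60000)]
```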
