#  Data Science Learning Journey  
*Curiosity to Capability — One Notebook at a Time*

---
Compiled and authored by **Partho Sarothi Das**   
	Dhaka, Bangladesh  
	Bachelor's & Master's in Statistics  
	Investment Banking Professional → Aspiring Data Scientist 
    
---

# DML – Data Manipulation Language

## INSERT: Insert data into table

### 1. Insert data into a single row

```sql
INSERT INTO sheren.users (user_id, name, email, password)
VALUES (NULL, 'Sherena', 'sherena@gmail.com', '1234')
```
> **sheren** is the database name; **users** is the table name.

*Since `user_id` is AUTO_INCREMENT, this value can be skipped.*

```sql
INSERT INTO sheren.users (name, email, password)
VALUES ('Sherena', 'sherena@gmail.com', '1234')
```

*If you insert all the values in the correct sequence, you can omit the column names.*

```sql
INSERT INTO sheren.users
VALUES (NULL, 'Sherena', 'sherena@gmail.com', '1234')
```
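
To see the two `AUTO_INCREMENT` styles side by side, here is a minimal sketch on a scratch table. The table name `demo_users` and its columns are assumptions made for illustration, not the real `users` table from these notes.

```sql
-- Hypothetical scratch table (not the real sheren.users table).
CREATE TEMPORARY TABLE demo_users (
    user_id INT AUTO_INCREMENT PRIMARY KEY,
    name    VARCHAR(50) NOT NULL
);

-- Column list given and user_id left out: MySQL generates the id.
INSERT INTO demo_users (name) VALUES ('Sherena');

-- No column list: every column needs a value, so NULL is passed for user_id
-- and MySQL again generates the next id.
INSERT INTO demo_users VALUES (NULL, 'Rani');

-- With default server settings the generated ids should be 1 and 2.
SELECT user_id, name FROM demo_users ORDER BY user_id;
```

Both forms leave the id generation to MySQL; the only difference is whether the column list is written out.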
### 2. Insert data into multiple rows

```sql
INSERT INTO partho.users VALUES
(NULL, 'Amitab', 'amitab@gmail.com','123'),
(NULL, 'Shah Rukh', 'shahrukh@gmail.com','124'),
(NULL, 'Rani', 'rani@gmail.com', '898'),
(NULL, 'Kajol', 'kajol@gmail.com' , '234')
```

### 3. Insert `.csv` file

In MySQL Workbench: under the database name, right-click **Tables** and choose **Table Data Import Wizard**.

## UPDATE : Update existing rows

### Example-1 : 
**Change the processor_brand value from bionic to bionic_pro.**

```sql
UPDATE partho.smartphones
SET processor_brand = 'bionic_pro'
WHERE processor_brand = 'bionic'
```

### Example-2 :
**Modify the email and password for name = Partho**

```sql
UPDATE partho.users
SET email = 'partho@gmail.com', password='1234'
WHERE name = 'Partho'
```

## DELETE : Delete rows from a table

### DELETE Syntax

```sql
DELETE FROM table_name WHERE condition;
```

### Delete All Records

It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes, and indexes will be intact:

```sql
DELETE FROM table_name
```

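### Delete a Table

`DELETE` only removes rows. To remove the table itself, including its structure, use `DROP TABLE` (a DDL statement):

```sql
DROP TABLE table_name;
```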

### Example-1 :
**Delete phones with price > 200000**

```sql
DELETE FROM partho.smartphones
WHERE price > 200000
```

# DQL – Data Query Language

## SELECT: Retrieving Data

### 1. To retrieve all the rows

```sql
SELECT * FROM partho.users
```

### 2. Columns Filter

```sql
SELECT model, price, rating FROM partho.smartphones
```
**Rename columns:**
```sql
SELECT model, battery_capacity AS 'Ah', os AS 'Operating System' FROM partho.smartphones
```
**With some math**

**Example-1:**
```sql
SELECT model, sqrt(resolution_width*resolution_width + resolution_height*resolution_height)/screen_size AS ppi
FROM partho.smartphones
```
**Example-2:**
```sql
SELECT model, rating/10 FROM partho.smartphones
```

### 3. DISTINCT ----> Retrieve unique values

**Unique Values**

```sql
SELECT DISTINCT brand_name AS 'All Brands'
FROM partho.smartphones
```
**DISTINCT Combinations**

```sql
SELECT DISTINCT brand_name, processor_brand 
FROM partho.smartphones;
```

### 4. WHERE ----> Row Filter

**Example-1 :** Find all rows where brand_name is *samsung*

```sql
SELECT * FROM partho.smartphones
WHERE brand_name = 'samsung'
```

**Example-2 :** Finding all smartphones whose price is greater than 50000

```sql
SELECT * FROM partho.smartphones
WHERE price>50000
```

### 5. ORDER BY

**Example-1 :** Find all rows for brand_name *samsung* in **ascending order** of price

```sql
SELECT * FROM partho.smartphones
WHERE brand_name = 'samsung'
ORDER BY price ASC
```

**Example-2 :** Find all rows for brand_name *samsung* in **descending order** of price

```sql
SELECT * FROM partho.smartphones
WHERE brand_name = 'samsung'
ORDER BY price DESC
```
**Example-3 :** Find the **top 10** rows for brand_name *apple* in **descending order** of screen_size

```sql
SELECT brand_name, screen_size FROM partho.smartphones
WHERE brand_name = 'apple'
ORDER BY screen_size DESC LIMIT 10
```

### 6. BETWEEN

**Example-1 :** Find all phones in the price range 10000 to 20000

Process 1: `BETWEEN` includes both endpoints

```sql
SELECT * FROM partho.smartphones
WHERE price BETWEEN 10000 AND 20000
```
Process 2: the equivalent comparison form

```sql
SELECT * FROM partho.smartphones
WHERE price >= 10000 AND price <= 20000
```
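
As a quick check of the endpoints, the sketch below uses a hypothetical scratch table (`demo_prices` is not part of the dataset in these notes). `BETWEEN` is inclusive, so it behaves like `>=` and `<=`, not like `>` and `<`.

```sql
-- Hypothetical scratch table with values on and around both endpoints.
CREATE TEMPORARY TABLE demo_prices (price INT);
INSERT INTO demo_prices VALUES (9999), (10000), (15000), (20000), (20001);

-- Inclusive range: returns 10000, 15000 and 20000.
SELECT price FROM demo_prices
WHERE price BETWEEN 10000 AND 20000
ORDER BY price;

-- Strict comparison drops the endpoints: returns only 15000.
SELECT price FROM demo_prices
WHERE price > 10000 AND price < 20000
ORDER BY price;
```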

### 7. IN

```sql
SELECT * FROM partho.smartphones
WHERE processor_brand IN ('snapdragon', 'exynos', 'bionic')
```

### 8. NOT IN

```sql
SELECT * FROM partho.smartphones
WHERE processor_brand NOT IN ('snapdragon', 'exynos', 'bionic')
```

### Exercise

**Example-1 :** Find all phones with rating > 80 and price < 25000

```sql
SELECT * FROM partho.smartphones
WHERE rating > 80 AND price < 25000
```

**Example-2 :** Find brands that sell phones with price > 50000

```sql
SELECT DISTINCT(brand_name) FROM partho.smartphones
WHERE price> 50000
```

## Aggregate Functions

**1. MAX**: Find the highest price in the price column:

```sql
SELECT MAX(price) FROM partho.smartphones
```

**2. MIN**: Find the lowest price in the price column:

```sql
SELECT MIN(price) FROM partho.smartphones
```
**3. SUM**: Find the total price of all phones

```sql
SELECT SUM(price) FROM partho.smartphones;
```

**4. COUNT**: Find the number of samsung phones
```sql
SELECT COUNT(*) FROM partho.smartphones
WHERE brand_name = 'samsung'
```

**5. COUNT(DISTINCT)**: Find the number of brands available

```sql
SELECT COUNT(DISTINCT(brand_name)) FROM partho.smartphones
```

**6. STD**: Population standard deviation of screen size
```sql
SELECT STD(screen_size) FROM partho.smartphones
```

**7. VARIANCE**: Population variance of screen size

```sql
SELECT VARIANCE(screen_size) FROM partho.smartphones
```


## Scalar Functions


**1. ROUND**: Round the values of screen_size to 1 decimal place

```sql
SELECT ROUND(screen_size,1) FROM partho.smartphones
```

**2. CEIL**

```sql
SELECT CEIL(screen_size) FROM partho.smartphones;
```

**3. FLOOR**

```sql
SELECT FLOOR(screen_size) FROM partho.smartphones;
```
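
The difference between the three functions is easiest to see on a single literal. This is a small sketch that needs no table; the value 6.67 is just an illustrative screen size.

```sql
-- ROUND keeps one decimal, CEIL rounds up, FLOOR rounds down.
SELECT ROUND(6.67, 1) AS rounded,      -- 6.7
       CEIL(6.67)     AS ceil_value,   -- 7
       FLOOR(6.67)    AS floor_value;  -- 6
```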

## Exercise ----> Sorting Data

*1. Find top 5 Samsung phones with biggest screen size*

```sql
SELECT  model, screen_size FROM partho.smartphones
WHERE brand_name = 'samsung'
ORDER BY screen_size DESC LIMIT 5
```

*2. Sort all the phones in descending order by the total number of cameras*

```sql
SELECT * FROM partho.smartphones
ORDER BY (num_rear_cameras+num_front_cameras) DESC
```

*3. Sort data on the basis of ppi in decreasing order*

```sql
SELECT * FROM partho.smartphones
ORDER BY SQRT(resolution_height*resolution_height + resolution_width*resolution_width)/screen_size DESC
```
**Alternative**

```sql
SELECT brand_name, SQRT(resolution_height*resolution_height + resolution_width*resolution_width)/screen_size as ppi 
FROM partho.smartphones
ORDER BY ppi DESC
```

*4. Find the phone with 2nd largest battery*

```sql
SELECT model, battery_capacity FROM partho.smartphones
ORDER BY battery_capacity DESC LIMIT 1, 1
```

Note: `LIMIT x, y` skips the first x rows and returns the next y rows, so `LIMIT 1, 1` returns only the second row.
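
The offset form can be tried on a tiny scratch table. The table `demo_battery` and its four rows are assumptions made up for this sketch.

```sql
-- Hypothetical scratch table with four distinct battery capacities.
CREATE TEMPORARY TABLE demo_battery (model VARCHAR(20), battery_capacity INT);
INSERT INTO demo_battery VALUES ('A', 4000), ('B', 7000), ('C', 5000), ('D', 6000);

-- Skip 1 row (B, 7000), return the next 1 row: D with 6000.
SELECT model, battery_capacity FROM demo_battery
ORDER BY battery_capacity DESC LIMIT 1, 1;

-- The same query written with the more explicit OFFSET keyword.
SELECT model, battery_capacity FROM demo_battery
ORDER BY battery_capacity DESC LIMIT 1 OFFSET 1;
```

Note that if two phones tie for the largest battery, this pattern returns one of the tied rows rather than the second distinct capacity.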

*5. Find the name and rating of the worst rated apple phone*

```sql
SELECT model, rating FROM partho.smartphones
WHERE brand_name = 'apple'
ORDER BY rating ASC LIMIT 1
```

*6. Sort phones alphabetically by brand and then by rating in descending order*

```sql
SELECT * FROM partho.smartphones
ORDER BY brand_name, rating DESC
```

*7. Sort phones alphabetically by brand and then by price in ascending order*

```sql
SELECT * FROM partho.smartphones
ORDER BY brand_name, price
```

## Exercise ----> Grouping Data

*1. Group smartphones by brand and get count, average price, max rating, avg screen size and avg battery capacity*

```sql
SELECT brand_name, COUNT(*) AS 'Num_phone',
ROUND(AVG(price),2) AS 'avg price',
MAX(rating) AS 'max rating',
ROUND(AVG(screen_size),2) AS 'avg screen size',
ROUND(AVG(battery_capacity),2) AS 'avg battery capacity'
FROM partho.smartphones
GROUP BY brand_name
ORDER BY Num_phone DESC
```

*2. Group smartphones by whether they have NFC and get the average price and rating*

```sql
SELECT has_nfc, 
AVG(price) AS 'avg_price',
AVG(rating) AS 'avg_rating'
FROM partho.smartphones
GROUP BY has_nfc
```

*3. Group smartphones by the extended memory available and get the average price*

```sql
SELECT extended_memory_available,
AVG(price) AS 'AVG_Price'
FROM partho.smartphones
GROUP BY extended_memory_available
```

*4. Group smartphones by the brand and processor brand and get the count of models and the average primary camera resolution (rear)*

```sql
SELECT brand_name, processor_brand,
COUNT(*) AS 'num_phone',
ROUND(AVG(primary_camera_rear)) AS 'avg camera resolution'
FROM partho.smartphones
GROUP BY brand_name, processor_brand
```

*5. Find top 5 most costly phone brands*

```sql
SELECT brand_name,
ROUND(AVG(price)) AS 'avg_price'
FROM partho.smartphones
GROUP BY brand_name
ORDER BY avg_price DESC LIMIT 5
```

*6. Which brand makes the smallest screen smartphones.*

```sql
SELECT brand_name, ROUND(avg(screen_size),2) AS 'avg_screen_size'
FROM partho.smartphones
GROUP BY brand_name ORDER BY avg_screen_size LIMIT 1
```

*7. Avg price of 5g phones vs avg price of non 5g phones*

```sql
SELECT has_5g, AVG(price) AS 'avg_price'
FROM partho.smartphones
GROUP BY has_5g
```

*8. Group smartphones by the brand, and find the brand with the highest number of models that have both NFC and an IR blaster.*

```sql
SELECT brand_name, COUNT(*) AS 'model_count'
FROM partho.smartphones
WHERE has_nfc = 'True' AND has_ir_blaster = 'True'
GROUP BY brand_name
ORDER BY model_count DESC LIMIT 1
```

*9. Find all Samsung 5g enabled smartphones and find out the avg price for NFC and Non-NFC phones*

```sql
SELECT has_nfc, AVG(price) AS 'avg_price'
FROM partho.smartphones
WHERE brand_name = 'samsung' AND has_5g = 'True'
GROUP BY has_nfc
```

*10. Find the phone name, price of the costliest phone.*

```sql
SELECT model, price FROM partho.smartphones
ORDER BY price DESC LIMIT 1
```

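## Exercise ----> HAVING

`WHERE` filters individual rows before grouping (it pairs with `SELECT`).  
`HAVING` filters groups after aggregation (it pairs with `GROUP BY`).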
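
A minimal sketch of the two filters working together, on a hypothetical scratch table (`demo_phones` and its rows are invented for illustration):

```sql
-- Hypothetical scratch table: two apple, two samsung, one nokia, one xiaomi.
CREATE TEMPORARY TABLE demo_phones (brand_name VARCHAR(20), price INT);
INSERT INTO demo_phones VALUES
('apple', 90000), ('apple', 70000),
('samsung', 20000), ('samsung', 60000),
('nokia', 8000), ('xiaomi', 15000);

SELECT brand_name, COUNT(*) AS num_phones, AVG(price) AS avg_price
FROM demo_phones
WHERE price > 10000          -- row filter: drops the nokia row before grouping
GROUP BY brand_name
HAVING COUNT(*) >= 2         -- group filter: drops xiaomi, which has one row left
ORDER BY brand_name;
-- Remaining groups: apple (2 phones) and samsung (2 phones).
```

A condition on an aggregate such as `COUNT(*)` cannot go in `WHERE`, because the groups do not exist yet when `WHERE` runs.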

*1. Find the avg rating of smartphone brands which have more than 20 phones.*

```sql
SELECT brand_name, COUNT(*) AS num_phones,
ROUND(AVG(rating), 2) AS 'avg_rating'
FROM partho.smartphones
GROUP BY brand_name
HAVING num_phones > 20
```

*2. Find the top 3 brands with the highest avg ram among phones that have a refresh rate of at least 90 Hz and fast charging available; ignore brands with fewer than 10 such phones.*

```sql
SELECT brand_name,
avg(ram_capacity) AS 'avg_ram'
FROM partho.smartphones
WHERE refresh_rate >= 90 AND fast_charging_available = '1'
GROUP BY brand_name
HAVING COUNT(*) >= 10
ORDER BY avg_ram DESC LIMIT 3
```

*3. Find the avg price of all the phone brands with avg rating> 70 and num_phones more than 10 among all 5g enabled phones.*

```sql
SELECT brand_name, 
AVG(price) AS 'avg_price',
AVG(rating) AS 'avg_rating'
FROM partho.smartphones
WHERE has_5g = 'True'
GROUP BY brand_name
HAVING avg_rating > 70 AND COUNT(*) > 10
```

---
# Exercise on ipl dataset
---

*1. Find the top 5 batsmen in IPL by total runs*

```sql
SELECT batter, SUM(batsman_run) AS 'total_run'
FROM partho.ipl
GROUP BY batter
ORDER BY total_run DESC LIMIT 5
```

*2. Find the batter with the 2nd highest number of sixes in IPL*

```sql
SELECT batter, COUNT(*) AS num_six FROM partho.ipl
WHERE batsman_run = '6'
GROUP BY batter
ORDER BY num_six DESC LIMIT 1,1
```

*3. Find Virat Kohli's performance against all IPL teams.*

```sql
SELECT bowling_team, SUM(batsman_runs) AS 'total_runs' FROM partho.deliveries
WHERE batter = 'V Kohli'
GROUP BY bowling_team
ORDER BY total_runs DESC
```

*4. Find the top 10 individual centuries (100 or more runs by a batter in a single match) in IPL*

```sql
SELECT match_id, batter, SUM(batsman_runs) AS 'total_runs'
FROM partho.deliveries
GROUP BY batter, match_id
HAVING total_runs >= 100
ORDER BY total_runs DESC LIMIT 10
```

*5. Find the top 5 batsmen with the highest strike rate who have faced a minimum of 1000 balls*

```sql
SELECT batter, SUM(batsman_runs) AS 'runs', COUNT(batsman_runs) AS 'total_ball', 
ROUND(SUM(batsman_runs)*100/COUNT(*),2) AS 'strike_rate' 
FROM partho.deliveries
GROUP BY batter
HAVING total_ball >= 1000
ORDER BY strike_rate DESC LIMIT 5
```

---
# Exercise on insurance dataset
---

*1. Show records of 'male' patients from the 'southwest' region.*

```sql
SELECT * FROM partho.insurance
WHERE gender = 'male' AND region = 'southwest'
```

*2. Show all records having bmi in range 30 to 45 both inclusive.*

```sql
SELECT * FROM partho.insurance
WHERE bmi BETWEEN 30 AND 45
```

*3. Show minimum and maximum bloodpressure of diabetic patient who smokes. Make column names as MinBP and MaxBP respectively.*

```sql
SELECT MIN(bloodpressure) AS MinBP,  
MAX(bloodpressure) AS MaxBP
FROM partho.insurance
WHERE smoker = 'Yes' AND diabetic = 'Yes'
```

*4. Find the number of unique patients who are not from the southwest region.*

```sql
SELECT COUNT(DISTINCT(PatientID)) FROM partho.insurance
WHERE region != 'southwest'
```

*5. Total claim amount from male smokers.*

```sql
SELECT SUM(claim) AS total_claim
FROM partho.insurance
WHERE gender = 'male' AND smoker = 'Yes'
```

*6. Select all records of south region.*

```sql
SELECT * FROM partho.insurance
WHERE region LIKE 'south%'
```

*7. Number of patients having normal blood pressure. Normal range [90-120]*

```sql
SELECT COUNT(*) AS normal_bp_patients FROM partho.insurance
WHERE bloodpressure BETWEEN 90 AND 120
```

*8. What is the average claim amount for non-smoking female patients who are diabetic?*

```sql
SELECT AVG(claim) AS 'avg_claim' FROM partho.insurance
WHERE smoker = 'No' AND gender = 'female' AND diabetic = 'Yes'
```

*9. Write a SQL query to delete all records for patients who are smokers and have no children.*

```sql
DELETE FROM partho.insurance
WHERE smoker = 'Yes' AND children=0
```


## JOIN

### Inner Join

*Example:1* **All** the columns

```sql
SELECT * FROM join_class.membership T1
JOIN join_class.users1 T2
ON T1.user_id = T2.user_id
```

*Example:2* With **selective** columns

```sql
SELECT T1.user_id, T1.name, T2.membership_id 
FROM join_class.users1 T1
JOIN join_class.membership T2
ON T1.user_id = T2.user_id
```

### Left Join

```sql
SELECT * FROM join_class.membership T1
LEFT JOIN join_class.users1 T2
ON T1.user_id = T2.user_id
```

### Right Join

```sql
SELECT * FROM join_class.membership T1
RIGHT JOIN join_class.users1 T2
ON T1.user_id = T2.user_id
```

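### Full Join : UNION

>MySQL has no `FULL OUTER JOIN`, so it is emulated by combining a `LEFT JOIN` and a `RIGHT JOIN` with `UNION`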

```sql
SELECT * FROM join_class.membership T1
LEFT JOIN join_class.users1 T2
ON T1.user_id = T2.user_id
UNION
SELECT * FROM join_class.membership T1
RIGHT JOIN join_class.users1 T2
ON T1.user_id = T2.user_id

```

### Cross Join

```sql
SELECT * FROM join_class.users1
CROSS JOIN join_class.groups
```

### Set Operations : Both queries must return the same column structure

### UNION 
>Combine the rows of two tables with the same structure and **remove** duplicates

```sql
SELECT * FROM join_class.person1
UNION
SELECT * FROM join_class.person2
```

### UNION ALL
>Combine the rows of two tables with the same structure and **keep** duplicates

```sql
SELECT * FROM join_class.person1
UNION ALL
SELECT * FROM join_class.person2
```
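
To make the duplicate handling concrete, here is a minimal sketch with two hypothetical scratch tables that share one identical row. The names `demo_person1` and `demo_person2` and their contents are assumptions, not the real `person1` and `person2` tables.

```sql
-- Hypothetical scratch tables with the same structure; (2, 'Rani') is in both.
CREATE TEMPORARY TABLE demo_person1 (id INT, name VARCHAR(20));
CREATE TEMPORARY TABLE demo_person2 (id INT, name VARCHAR(20));
INSERT INTO demo_person1 VALUES (1, 'Amit'), (2, 'Rani');
INSERT INTO demo_person2 VALUES (2, 'Rani'), (3, 'Kajol');

-- UNION removes the duplicate: 3 rows (Amit, Rani, Kajol).
SELECT id, name FROM demo_person1
UNION
SELECT id, name FROM demo_person2;

-- UNION ALL keeps it: 4 rows (Rani appears twice).
SELECT id, name FROM demo_person1
UNION ALL
SELECT id, name FROM demo_person2;
```

`UNION ALL` skips the de-duplication step, so it is the usual choice when duplicates are impossible or wanted.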

### SELF JOIN

```sql
SELECT * FROM join_class.users1 T1
JOIN join_class.users1 T2
ON T1.emergency_contact = T2.user_id
```

### Join ON **more than one column**

```sql
SELECT * FROM join_class.students T1
JOIN join_class.class T2
ON T1.class_id = T2.class_id
AND T1.enrollment_year = T2.class_year
```

### Exercise ----> join

*1. Find order_id, name, city by joining users and orders from flipkart database.*

```sql
SELECT T1.order_id, T2.name, T2.city
FROM flipkart.orders T1
JOIN flipkart.users T2
ON T1.user_id = T2.user_id
```

*2. Find order_id, product category by joining order_details and category from flipkart database.*

```sql
SELECT T1.order_id, T2.category
FROM flipkart.order_details T1
JOIN flipkart.category T2
ON T1.category_id = T2.category_id
```

*3. Find all the orders placed in Pune*

```sql
SELECT * FROM flipkart.orders T1
JOIN flipkart.users T2
ON T1.user_id = T2.user_id
WHERE T2.city = 'Pune'
```

*4. Find all the orders placed in Pune by Priyanka*

```sql
SELECT * FROM flipkart.orders T1
JOIN flipkart.users T2
ON T1.user_id = T2.user_id
WHERE T2.city = 'Pune' AND T2.name = 'Priyanka'
```

*5. Find all orders under Chairs category*

```sql
SELECT * FROM flipkart.order_details T1
JOIN flipkart.category T2
ON T1.category_id = T2.category_id
WHERE T2.category = 'Chairs'
```

*6. Find all profitable orders*

```sql
SELECT T1.order_id, SUM(T2.profit) AS 'total_profit' 
FROM flipkart.orders T1
JOIN flipkart.order_details T2
ON T1.order_id = T2.order_id
GROUP BY T1.order_id
HAVING total_profit>0
```

*7. Find the customer who has placed max number of orders*

```sql
SELECT T1.user_id, T2.name, COUNT(T1.order_id) AS total_orders
FROM flipkart.orders T1
JOIN flipkart.users T2
ON T1.user_id = T2.user_id
GROUP BY T1.user_id, T2.name
ORDER BY total_orders DESC LIMIT 1
```

*8. Which is the most profitable category*

```sql
SELECT T2.category, SUM(T1.profit) AS 'category_wise_profit'
FROM flipkart.order_details T1
JOIN flipkart.category T2
ON T1.category_id = T2.category_id
GROUP BY T2.category
ORDER BY category_wise_profit DESC LIMIT 1
```

*9. Which is the most profitable state*

```sql
SELECT T3.state, SUM(T2.profit) AS 'total_profit' 
FROM flipkart.orders T1
JOIN flipkart.order_details T2
ON T1.order_id = T2.order_id
JOIN flipkart.users T3
ON T1.user_id = T3.user_id
GROUP BY T3.state
ORDER BY total_profit DESC LIMIT 1
```

*10. Find all categories with profit higher than 5000*

```sql
SELECT T2.category, SUM(T1.profit) AS 'category_wise_profit'
FROM flipkart.order_details T1
JOIN flipkart.category T2
ON T1.category_id = T2.category_id
GROUP BY T2.category
HAVING category_wise_profit > 5000
```