<h1 style="color:red" align="center">LIMIT, DISTINCT, and ORDER BY Clauses</h1>

## Use of LIMIT, DISTINCT, and ORDER BY Clauses in Data Science

## LIMIT Clause
- **Purpose**: Retrieve a subset of rows from a dataset.
- **Usage**: Useful for working with large datasets by limiting the number of rows returned.
- **Example**: When exploring a dataset, data scientists might use the LIMIT clause to retrieve a small sample of rows for a quick overview.

## DISTINCT Keyword
- **Purpose**: Identify unique values within a dataset.
- **Usage**: Helps in data cleaning and preprocessing by identifying unique categories, removing duplicates, and understanding data distribution.
- **Example**: Data scientists might use the DISTINCT keyword to identify unique product categories in sales data or unique user IDs in engagement data.

## ORDER BY Clause
- **Purpose**: Sort data based on specific criteria.
- **Usage**: Puts rows in a predictable order before visualization or analysis, and answers "top N" or "bottom N" questions when combined with LIMIT.
- **Example**: Data scientists might use the ORDER BY clause to sort time-series data in chronological order or sort survey responses based on response scores.

Together, these clauses and keywords let data scientists sample, deduplicate, and sort query results, which covers much of the first-pass exploration of a new dataset.

#### Use the ERD to understand the tables and write the queries :

![ERD.png](attachment:ERD.png)


## LIMIT Clause
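The LIMIT clause restricts the number of rows returned by a query to a specified maximum, and it is written as the last clause of the query. Without an ORDER BY clause, the database does not guarantee which rows come back, so treat the result as an arbitrary sample rather than "the first" rows in any meaningful order.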

### Syntax
```sql
SELECT column1, column2, ...
FROM table_name
LIMIT number_of_rows;
```

#### For example, a query that shows just 5 rows of the orders table, with all of its columns, might look like the following :

`SELECT * 
FROM orders 
LIMIT 5;`

### Question

#### Try using LIMIT yourself below by writing a query that displays all the data in the occurred_at, id, and channel columns of the web_events table, and limits the output to only the first 20 rows.

`SELECT occurred_at, id, channel
FROM web_events
LIMIT 20;`
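
To see the behaviour of LIMIT on concrete data, here is a minimal sketch that runs the query through Python's built-in `sqlite3` module. The miniature `web_events` table and its five rows are made up for illustration and are only an assumption about what the real table looks like.

```python
import sqlite3

# Hypothetical miniature web_events table; the rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE web_events (id INTEGER, occurred_at TEXT, channel TEXT)")
conn.executemany(
    "INSERT INTO web_events VALUES (?, ?, ?)",
    [
        (1, "2016-01-03", "direct"),
        (2, "2016-01-01", "organic"),
        (3, "2016-01-02", "adwords"),
        (4, "2016-01-05", "direct"),
        (5, "2016-01-04", "facebook"),
    ],
)

# LIMIT caps the number of rows returned: at most 3 of the 5 rows come back.
limited_rows = conn.execute(
    "SELECT occurred_at, id, channel FROM web_events LIMIT 3"
).fetchall()
print(len(limited_rows))  # 3

# Asking for more rows than exist is not an error; every row is returned.
all_rows = conn.execute(
    "SELECT occurred_at, id, channel FROM web_events LIMIT 20"
).fetchall()
print(len(all_rows))  # 5
```

The sketch only checks how many rows come back, not which ones, because no ORDER BY is given and the choice of rows is therefore left to the database.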

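## DISTINCT Keyword in SQL

### Definition
The DISTINCT keyword in SQL is used to retrieve unique values from a column or set of columns in a database table. When several columns are listed, DISTINCT applies to the combination of those columns, not to each column individually.

### Use Case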
- **Purpose**: Identify unique values within a dataset.
- **Usage**: Helps in data cleaning and preprocessing by removing duplicates and understanding data distribution.
- **Example**: Suppose we have a 'Products' table with a 'Category' column. Using DISTINCT, we can identify unique product categories in the dataset.
  
```sql
SELECT DISTINCT Category
FROM Products;
```


#### An example that returns each distinct combination of id, name, website, lat, and long in the accounts table might look like the following :

`SELECT DISTINCT id, name, website, lat, long
FROM accounts;`
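
The difference between DISTINCT on one column and DISTINCT on several columns is easiest to see on a tiny table. The sketch below uses Python's built-in `sqlite3` module; the `web_events` rows and the `account_id` column are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical miniature web_events table; the rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE web_events (id INTEGER, account_id INTEGER, channel TEXT)")
conn.executemany(
    "INSERT INTO web_events VALUES (?, ?, ?)",
    [
        (1, 1001, "direct"),
        (2, 1001, "direct"),
        (3, 1001, "organic"),
        (4, 1002, "direct"),
    ],
)

# DISTINCT on one column: each channel appears once.
channels = conn.execute(
    "SELECT DISTINCT channel FROM web_events ORDER BY channel"
).fetchall()
print(channels)  # [('direct',), ('organic',)]

# DISTINCT on two columns: uniqueness is judged on the (account_id, channel) pair.
pairs = conn.execute(
    "SELECT DISTINCT account_id, channel FROM web_events "
    "ORDER BY account_id, channel"
).fetchall()
print(pairs)  # [(1001, 'direct'), (1001, 'organic'), (1002, 'direct')]

# Including a column that is unique per row (id) makes every row distinct,
# so DISTINCT removes nothing.
with_id = conn.execute("SELECT DISTINCT id, channel FROM web_events").fetchall()
print(len(with_id))  # 4
```

The last query suggests why selecting a unique identifier together with DISTINCT usually returns the same rows as the query without DISTINCT.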

## ORDER BY Clause in SQL

### Definition
The ORDER BY clause in SQL is used to sort the result set of a query based on one or more columns in ascending or descending order.

### Use Case
- **Purpose**: Sort data based on specific criteria.
- **Usage**: Essential for preparing data for visualization and analysis, where a predictable row order makes patterns easier to spot.
- **Example**: Suppose we have a 'Sales' table with a 'SalesDate' column. Using ORDER BY, we can sort the sales data in chronological order.
  
```sql
SELECT *
FROM Sales
ORDER BY SalesDate ASC;
```


#### Note : ORDER BY sorts in ascending order by default. Add DESC after a column in your ORDER BY clause to sort by that column in descending order.
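
As a quick check of the default sort direction, the following sketch runs the same query with and without DESC using Python's built-in `sqlite3` module. The three `orders` rows are hypothetical values chosen for illustration, and the dates are stored as ISO-formatted text, which is an assumption about the storage format.

```python
import sqlite3

# Hypothetical miniature orders table; the rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, occurred_at TEXT, total_amt_usd REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2016-03-01", 250.0),
        (2, "2016-01-15", 900.0),
        (3, "2016-02-10", 75.5),
    ],
)

# No direction given: ascending order, smallest amount first.
ascending = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY total_amt_usd"
)]
print(ascending)  # [3, 1, 2]

# DESC reverses the order: largest amount first.
descending = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY total_amt_usd DESC"
)]
print(descending)  # [2, 1, 3]

# ORDER BY plus LIMIT gives a well-defined "earliest" row.
# ISO-formatted date strings (YYYY-MM-DD) sort chronologically as text.
earliest = conn.execute(
    "SELECT id, occurred_at FROM orders ORDER BY occurred_at LIMIT 1"
).fetchone()
print(earliest)  # (2, '2016-01-15')
```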

### Let's get some practice using ORDER BY :

- #### 1- Write a query to return the 10 earliest orders in the orders table. Include the id, occurred_at, and total_amt_usd.

`SELECT id, occurred_at, total_amt_usd
FROM orders
ORDER BY occurred_at
LIMIT 10;`

- #### 2- Write a query to return the top 5 orders in terms of largest total_amt_usd. Include the id, account_id, and total_amt_usd.

`SELECT id, account_id, total_amt_usd
FROM orders
ORDER BY total_amt_usd DESC
LIMIT 5;`

- #### 3- Write a query to return the 20 orders with the smallest total_amt_usd. Include the id, account_id, and total_amt_usd.

`SELECT id, account_id, total_amt_usd
FROM orders
ORDER BY total_amt_usd
LIMIT 20;`

#### Note : You can also use ORDER BY on more than one column at a time. When you provide a list of columns in an ORDER BY clause, rows are sorted by the leftmost column first; each following column only breaks ties among rows that have equal values in the columns before it. You can still flip the order using DESC, which applies only to the column it directly follows.
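
The effect of column order in a multi-column ORDER BY can be sketched on a four-row table with Python's built-in `sqlite3` module. The `orders` rows below are hypothetical and were chosen so that two accounts each have two orders and two orders share the same amount.

```python
import sqlite3

# Hypothetical miniature orders table; the rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, account_id INTEGER, total_amt_usd REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1002, 500.0),
        (2, 1001, 300.0),
        (3, 1001, 800.0),
        (4, 1002, 800.0),
    ],
)

# Account first (ascending), then amount (descending) within each account.
by_account = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY account_id, total_amt_usd DESC"
)]
print(by_account)  # [3, 2, 4, 1]

# Amount first (descending); account_id only breaks the tie at 800.0.
by_amount = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY total_amt_usd DESC, account_id"
)]
print(by_amount)  # [3, 4, 1, 2]
```

In the first result the orders stay grouped by account; in the second they are grouped by amount, and the account ID matters only for the two orders that tie.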

### Let's get some practice using ORDER BY with multiple columns :

- #### 1- Write a query that displays the order ID, account ID, and total dollar amount for all the orders, sorted first by the account ID (in ascending order), and then by the total dollar amount (in descending order).

`SELECT id, account_id, total_amt_usd
FROM orders
ORDER BY account_id, total_amt_usd DESC;`

- #### 2- Write a query that again displays order ID, account ID, and total dollar amount for each order, but this time sorted first by total dollar amount (in descending order), and then by account ID (in ascending order).

`SELECT id, account_id, total_amt_usd
FROM orders
ORDER BY total_amt_usd DESC, account_id;`