# Scaling Apps in SQL

### Partitioning

Partitioning is a technique used in relational databases to keep application performance manageable as data grows: a large table is divided into smaller pieces that can be stored and queried more efficiently. There are two basic types, horizontal and vertical, and both can affect the performance and scalability of the applications that use the database.

![Partitioning Example](Pictures/partitioned-external-tables.png)

### Horizontal Partitioning

Horizontal partitioning breaks a large table into smaller partitions, each holding a subset of the rows. Sharding is the closely related practice of placing those partitions on separate database servers. The database can manage each partition much like a separate table, while the logic in the application code stays simple.

Horizontal partitioning can help applications scale because queries can be limited to the relevant partitions, and data can be added or deleted more efficiently. It also enables the use of local indexes at the partition level, which can further improve query performance. Horizontal partitioning is commonly used in data warehouses and for time-oriented or naturally partitioned data, such as sales data by time or by region.
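
As a rough illustration of splitting rows across partitions, the sketch below uses Python's built-in `sqlite3` module. SQLite has no native partitioning, so this only imitates the idea with one table per region and a `UNION ALL` view on top. The table names, columns, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: (sale_id, region, amount).
SALES = [
    (1, "east", 120.0),
    (2, "west", 75.5),
    (3, "east", 42.0),
    (4, "west", 310.0),
]


def build_partitioned_sales(conn, rows):
    """Store each region's rows in its own table and expose them through one view."""
    regions = sorted({region for _, region, _ in rows})
    for region in regions:
        # One physical table per partition; every partition has the same columns.
        # Table names are built from trusted, hard-coded values in this sketch.
        conn.execute(
            f"CREATE TABLE sales_{region} ("
            "sale_id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
        )
    for sale_id, region, amount in rows:
        conn.execute(
            f"INSERT INTO sales_{region} VALUES (?, ?, ?)", (sale_id, region, amount)
        )
    # The view lets application code keep querying a single logical table.
    union = " UNION ALL ".join(f"SELECT * FROM sales_{r}" for r in regions)
    conn.execute(f"CREATE VIEW sales AS {union}")
    return regions


conn = sqlite3.connect(":memory:")
partitions = build_partitioned_sales(conn, SALES)

# A query for one region reads only that region's table.
east_total = conn.execute("SELECT SUM(amount) FROM sales_east").fetchone()[0]
# A query against the view still sees every row.
all_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

A database with built-in partitioning would handle the routing and the combined view itself; the sketch only makes the row-subset idea concrete.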


![Partitioning Example](Pictures/partitioning1.png)

### Vertical Partitioning

Vertical partitioning, on the other hand, separates groups of columns into multiple tables that share the same key. This can improve query performance when certain attributes or columns are frequently queried together, because those queries read only the table that holds them. Narrower rows also mean that more rows fit in a single data block, resulting in more efficient I/O operations. Each resulting table can also be indexed on its own, which can reduce the amount of I/O performed.

Vertical partitioning is often used in conjunction with columnar data storage, where data is stored based on columns rather than rows, and can be useful in data warehouses or for grouping similarly used attributes.
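
The column split can be sketched with Python's built-in `sqlite3` module. The `product_core` and `product_detail` tables, their columns, and the sample row are hypothetical names chosen for illustration; the only point is that both tables share the same key, and a join reassembles the full row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Frequently read columns live in a narrow table...
conn.execute(
    "CREATE TABLE product_core ("
    "product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
)
# ...while rarely read, bulky columns move to a second table with the same key.
conn.execute(
    "CREATE TABLE product_detail ("
    "product_id INTEGER PRIMARY KEY REFERENCES product_core(product_id), "
    "description TEXT, spec_sheet TEXT)"
)

conn.execute("INSERT INTO product_core VALUES (1, 'Widget', 9.99)")
conn.execute(
    "INSERT INTO product_detail VALUES (1, 'A long description', 'Long spec text')"
)

# Common queries touch only the narrow table.
price = conn.execute(
    "SELECT price FROM product_core WHERE product_id = 1"
).fetchone()[0]

# The full row is reassembled with a join on the shared key when it is needed.
full_row = conn.execute(
    "SELECT c.product_id, c.name, c.price, d.description, d.spec_sheet "
    "FROM product_core AS c "
    "JOIN product_detail AS d ON c.product_id = d.product_id"
).fetchone()
```

The trade-off is that any query needing columns from both groups now pays for a join, so the split works best when the column groups really are used separately.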


![Image Title](Pictures/partitioning_2.png)

In short:

- Horizontal partitioning breaks large tables into smaller partitions, each containing a subset of rows.
- Its benefits include more efficient querying and indexing at the partition level. It suits data warehouses and data that is time-oriented or naturally grouped by subject area.
- Vertical partitioning separates groups of columns into multiple tables.
- Its benefits include improved data retrieval performance, more rows stored in a single data block, and fewer I/O operations.
- Each vertical partition can be indexed on its own, which can reduce the amount of I/O required.
- Vertical partitioning is also used in data warehouses and can be based on similarly used attributes.

Whether used separately or together, horizontal and vertical partitioning help applications scale by keeping large amounts of data efficient to store and retrieve.

### Range, List, and Hash Partitioning

Other than horizontal and vertical partitioning, there are other important types as well:

![Image Title](Pictures/partitioning.gif)


Photo credit: Oracle

1. Range partitioning:
   - In range partitioning, data is partitioned based on a specified range of values for a particular column.
   - For example, if you have a table with a date column, you could partition the data by date range: all data for January in one partition, all data for February in another, and so on.
   - This type of partitioning is useful when the data divides logically into ranges and you want to manage it by those ranges.

2. List partitioning:
   - In list partitioning, data is partitioned based on a predefined list of values for a particular column.
   - For example, if you have a table with a country column, you could partition the data by country: all data for the United States in one partition, all data for Canada in another, and so on.
   - This type of partitioning is useful when the data divides logically into discrete values and you want to manage it by those values.

3. Hash partitioning:
   - In hash partitioning, data is partitioned based on a hash function applied to a particular column. The hash function generates a hash value for each row, and the rows are distributed across partitions based on that value.
   - This type of partitioning can distribute data evenly across partitions, which is useful for load balancing and improving query performance.
   - However, because hashing scatters neighboring values across partitions, it is not suitable when you need to partition by specific ranges or values.
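
The three methods above differ only in how a row's partition is chosen, which can be sketched as three small routing functions. This is an illustration of the routing logic rather than any database's actual implementation. The partition names, the 2024 date boundaries, and the use of CRC32 as the hash function are all assumptions made for the example.

```python
import bisect
import zlib
from datetime import date

# Hypothetical range layout: each bound is the exclusive upper limit of a partition.
RANGE_UPPER_BOUNDS = [date(2024, 2, 1), date(2024, 3, 1), date(2024, 4, 1)]
RANGE_NAMES = ["sales_jan", "sales_feb", "sales_mar"]

# Hypothetical list layout: each discrete value maps to exactly one partition.
LIST_PARTITIONS = {"United States": "sales_us", "Canada": "sales_ca"}


def range_partition(sale_date):
    """Return the first partition whose upper bound is greater than sale_date."""
    # Dates earlier than January also fall into the first partition in this sketch.
    idx = bisect.bisect_right(RANGE_UPPER_BOUNDS, sale_date)
    if idx == len(RANGE_NAMES):
        raise ValueError(f"no range partition covers {sale_date}")
    return RANGE_NAMES[idx]


def list_partition(country):
    """Return the partition assigned to this exact value."""
    try:
        return LIST_PARTITIONS[country]
    except KeyError:
        raise ValueError(f"no list partition defined for {country!r}") from None


def hash_partition(key, num_partitions):
    """Map a key to a partition number using a stable hash (CRC32 here)."""
    # Python's built-in hash() is salted per process for strings, so a
    # deterministic checksum is used to keep the routing repeatable.
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions
```

For instance, `range_partition(date(2024, 2, 1))` returns `"sales_feb"` because the January bound is exclusive, while `hash_partition` gives no such ordering guarantee: two adjacent keys may land in different partitions.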

### Example

We will create a sales history table that is partitioned by the month of sale. 

Here's the task: 

- The months will be represented as numbers from 1 to 12. 
- The table will have attributes such as product ID, product name, product type, total units sold, and month of sale. 
- To create a primary key for this table, we will use a combination of the month of sale and product ID.
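
Before any partitioning is applied, the logical shape of this table can be sketched with Python's built-in `sqlite3` module. This is only an unpartitioned stand-in: SQLite does not support partitioned tables, and the column names, types, and sample rows are assumptions based on the attribute list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales_history (
        month_of_sale    INTEGER NOT NULL CHECK (month_of_sale BETWEEN 1 AND 12),
        product_id       INTEGER NOT NULL,
        product_name     TEXT    NOT NULL,
        product_type     TEXT    NOT NULL,
        total_units_sold INTEGER NOT NULL,
        -- Composite key: one row per product per month.
        PRIMARY KEY (month_of_sale, product_id)
    )
    """
)

# The same product may appear once in each month.
conn.execute("INSERT INTO sales_history VALUES (1, 100, 'Widget', 'Hardware', 25)")
conn.execute("INSERT INTO sales_history VALUES (2, 100, 'Widget', 'Hardware', 40)")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO sales_history VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, a second row for product 100 in month 1 is rejected by the primary key, and a row with month 13 is rejected by the `CHECK` constraint.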

### Creating a Range-Partitioned Table

### Create partitioned tables for nodes in the database management system