# Chapter 6: PROC SQL

## Introduction to PROC SQL

SQL (Structured Query Language) is the standard language for querying and managing relational databases. It was developed at IBM in the 1970s and is supported by many database systems, such as MySQL, as well as by tools such as SAP BusinessObjects and SAS.

PROC SQL is the SAS implementation of SQL and serves as an alternative to the DATA step and to other procedures. It allows:
- Creating SAS tables
- Listing table contents (variable names and labels)
- Sorting tables
- Generating descriptive statistics
- Performing table joins (merging multiple tables)
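
As a quick illustration of the second item, the sketch below lists the structure of a table. It uses `sashelp.class`, a sample table that ships with SAS, rather than a table from this chapter. `describe table` writes the table definition to the log, and the `dictionary.columns` query prints one row per variable.

```sas
proc sql;
  /* Writes a CREATE TABLE statement describing the table to the SAS log */
  describe table sashelp.class;

  /* Lists each variable with its type, length, and label */
  select name, type, length, label
  from dictionary.columns
  where libname = 'SASHELP' and memname = 'CLASS';
quit;
```

Library and member names are stored in uppercase in `dictionary.columns`, which is why the `where` clause uses `'SASHELP'` and `'CLASS'`.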

---

## Differences Between PROC SQL and DATA & PROC Steps

### Syntax Differences

#### 1) Separators in Variable Lists
- In DATA and PROC steps, variables are separated by spaces:

```sas
var ident jobcode salary;
```

- In PROC SQL, variables are separated by commas:

```sas
select ident, jobcode, salary;
```

#### 2) Ending the Procedure
- PROC SQL ends with `quit;` instead of `run;`.

#### 3) Syntax Structure
```sas
proc sql <options>;
  SQL statements;
quit;
```
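- PROC SQL statements run as soon as they are submitted, so `run;` has no effect. The procedure stays active until `quit;` (or until another step starts).
- Statements inside PROC SQL can appear in any order, but the clauses within a single `select` statement follow a fixed order: `select`, `from`, `where`, `group by`, `having`, `order by`.
- SQL statements are valid only inside PROC SQL. Conversely, a few non-SQL elements also work there, for example global statements such as `title` and data set options such as `drop=` and `keep=`.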
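
The last point can be sketched as follows, again using the `sashelp.class` sample table as a stand-in (it is not part of this chapter's data). The `title` statement and the `keep=` data set option are ordinary SAS features that PROC SQL accepts alongside SQL syntax.

```sas
proc sql;
  /* Global statement: applies to the output that follows */
  title 'Students older than 13';

  /* keep= is a data set option: only these three variables are read */
  select *
  from sashelp.class(keep=name age height)
  where age > 13;
quit;

/* Clear the title so it does not carry over to later output */
title;
```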

#### 4) Example: Simple Query

```sas
libname in 'd:\mesfichierssas';
proc sql;
  title 'Example Query';
  select ident, jobcode, salary
  from in.payroll
  where jobcode contains 'NA'
  order by salary descending;
quit;
```
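
For comparison, roughly the same report can be produced with PROC SORT and PROC PRINT. This is a sketch that assumes the `in.payroll` table from the query above. `work.payroll_sorted` is a hypothetical name for the intermediate table.

```sas
/* Filter and sort in one step; out= leaves in.payroll untouched */
proc sort data=in.payroll out=work.payroll_sorted;
  where jobcode contains 'NA';
  by descending salary;
run;

title 'Example Query';

/* Variables are separated by spaces here, not commas */
proc print data=work.payroll_sorted noobs;
  var ident jobcode salary;
run;
```

PROC SQL does the filtering, sorting, and printing in a single statement, whereas the traditional approach needs two steps and an intermediate table.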

## 1) Creating a Table

```sas
proc sql;
  /* jobcode is included because later examples group by it */
  create table employees (
    id num,
    name char(50),
    jobcode char(3),
    salary num
  );
quit;
```
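
The table created above has no rows yet. A minimal sketch for adding some, assuming the `employees` table already exists. The names and salaries are made-up illustration values.

```sas
proc sql;
  /* One VALUES clause per row; the clauses are not separated by commas */
  insert into employees (id, name, salary)
    values (1, 'Alice Martin', 52000)
    values (2, 'Bruno Diaz', 38000)
    values (3, 'Chloe Nguyen', 45000);
quit;
```

Only the columns named in the list are filled in. If the table has other columns, such as a job code, add them to the column list and to each `values` clause, otherwise they are left missing.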

## 2) The `SELECT` Statement as an Alternative to PROC PRINT

```sas
proc sql;
  select * from employees;
quit;
```
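
For reference, the PROC PRINT counterpart of this query could look like the sketch below. The `noobs` option suppresses the observation-number column, which PROC SQL does not print by default either.

```sas
proc print data=employees noobs;
run;
```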

## 3) `CREATE TABLE` with Additional Features
### Creating Variables

```sas
proc sql;
  create table new_table as
  select id, name, salary, salary*1.1 as new_salary
  from employees;
quit;
```
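
A DATA step that builds the same table might look like this sketch. The `keep` statement mirrors the column list of the `select` clause.

```sas
data new_table;
  set employees;
  /* Same 10% increase as in the SQL version */
  new_salary = salary * 1.1;
  keep id name salary new_salary;
run;
```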

### Descriptive Statistics

```sas
proc sql;
  select avg(salary) as avg_salary, max(salary) as max_salary
  from employees;
quit;
```

### Grouping Data (`BY`, `CLASS`, `GROUP BY`)

```sas
proc sql;
  select jobcode, avg(salary) as avg_salary
  from employees
  group by jobcode;
quit;
```
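
The `CLASS` and `BY` statements named in the heading are the PROC-step counterparts of `group by`. A sketch with PROC MEANS, assuming (as the query above does) that `employees` contains a `jobcode` column. `employees_by_job` is a hypothetical name for the sorted copy.

```sas
/* CLASS: no prior sorting needed */
proc means data=employees mean maxdec=2;
  class jobcode;
  var salary;
run;

/* BY: the input must first be sorted by the grouping variable */
proc sort data=employees out=employees_by_job;
  by jobcode;
run;

proc means data=employees_by_job mean maxdec=2;
  by jobcode;
  var salary;
run;
```

Like `CLASS`, `group by` does not require the data to be sorted beforehand.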

### Conditional Statements (`IF…THEN…ELSE` Equivalent)

```sas
proc sql;
  select name, salary,
    case when salary > 50000 then 'High'
         else 'Low' end as salary_category
  from employees;
quit;
```
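
The DATA step version of this logic, written with `IF…THEN…ELSE`, could be sketched as follows. `salary_bands` is a hypothetical output table name.

```sas
data salary_bands;
  set employees(keep=name salary);
  /* Declare the length first so neither value is truncated */
  length salary_category $4;
  if salary > 50000 then salary_category = 'High';
  else salary_category = 'Low';
run;
```

In both versions a missing salary falls into the `'Low'` category, because SAS treats a missing numeric value as smaller than any number.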

### Filtering Data (`WHERE` Clause) 

```sas
proc sql;
  select * from employees
  where salary > 40000;
quit;
```

### Sorting Data (Ascending/Descending Order)

```sas
proc sql;
  /* Ascending is the default; desc reverses the order */
  select * from employees
  order by salary desc;
quit;
```
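
Sorting in PROC SQL is done with the `order by` clause, which accepts several keys, each with its own direction. The sketch below sorts by salary from highest to lowest and breaks ties by name, followed by a PROC SORT equivalent. `employees_sorted` is a hypothetical output name.

```sas
proc sql;
  select name, salary
  from employees
  order by salary desc, name;
quit;

/* In PROC SORT, the DESCENDING keyword precedes the variable it applies to */
proc sort data=employees out=employees_sorted;
  by descending salary name;
run;
```

Note the difference in keyword placement: `desc` follows the column in SQL, while `descending` comes before the variable in a `BY` statement.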

## 4) Computing Statistics While Removing Duplicates

```sas
proc sql;
  select jobcode, avg(salary) as avg_salary
  from (select distinct * from employees)
  group by jobcode;
quit;
```
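
`distinct` can also be placed inside an aggregate function, so that only the values being summarized are de-duplicated rather than whole rows. A small sketch, assuming the same `employees` table with a `jobcode` column:

```sas
proc sql;
  select count(*) as n_rows,                      /* every row */
         count(distinct jobcode) as n_jobcodes    /* each job code once */
  from employees;
quit;
```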

## 5) Inserting Statistics into a Table

```sas
proc sql;
  create table summary as
  select jobcode, avg(salary) as avg_salary
  from employees
  group by jobcode;
quit;
```

## 6) Table Joins in PROC SQL
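
The join examples in this section refer to `table1` and `table2`, which are not defined in the text. To run them, two small tables are needed. The sketch below creates them with made-up values. The `name` and `dept` columns are hypothetical, and only `id` is required by the queries.

```sas
data table1;
  input id name $;
  datalines;
1 Ann
2 Bob
3 Cy
;
run;

data table2;
  input id dept $;
  datalines;
2 Sales
3 IT
4 HR
;
run;
```

With these tables, ids 2 and 3 appear in both, id 1 only in `table1`, and id 4 only in `table2`. The cross join therefore returns 3 × 3 = 9 rows, the inner join 2, the left and right joins 3 each, and the full join 4.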
### Cross Join

```sas
proc sql;
  select * from table1, table2;
quit;
```

### Full Join

```sas
proc sql;
  select * from table1
  full join table2
  on table1.id = table2.id;
quit;
```
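
In a full join, a row that exists in only one table has a missing `id` on the other side. One common way to get a single, always-populated key is `coalesce`, which returns its first non-missing argument. A sketch that assumes only that both tables have an `id` column. The aliases `a` and `b` and the output column names are illustrative.

```sas
proc sql;
  select coalesce(a.id, b.id) as id,   /* key from whichever side has it */
         a.id as id_in_table1,
         b.id as id_in_table2
  from table1 as a
  full join table2 as b
    on a.id = b.id
  order by id;
quit;
```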

### Left Join

```sas
proc sql;
  select * from table1
  left join table2
  on table1.id = table2.id;
quit;
```

### Right Join

```sas
proc sql;
  select * from table1
  right join table2
  on table1.id = table2.id;
quit;
```

### Inner Join

```sas
proc sql;
  select * from table1
  inner join table2
  on table1.id = table2.id;
quit;
```
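
For comparison with the DATA step, the inner join above can be approximated with a `MERGE`. This is a sketch that assumes `id` is unique in at least one of the two tables. `inner_merge` is a hypothetical output name.

```sas
/* MERGE with BY requires both inputs to be sorted by the key */
proc sort data=table1; by id; run;
proc sort data=table2; by id; run;

data inner_merge;
  /* in= creates temporary flags: 1 if the table contributed to the row */
  merge table1(in=a) table2(in=b);
  by id;
  if a and b;   /* keep only ids present in both tables */
run;
```

Changing the subsetting condition gives the other join types: `if a;` behaves like a left join, `if b;` like a right join, and no condition like a full join. When `id` is repeated in both tables the two approaches differ, since SQL pairs every matching row while `MERGE` matches rows by position within each `BY` group.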

<br>

<div align="center">
  <span style="font-family: sans-serif; font-size: 24px; color: Black;">
     This document provides an overview of PROC SQL and its applications in SAS.
  </span>
</div>