In [1]:
%load_ext sql
%sql duckdb://

# Errors in this File

- Much of the syntax in the first sections (`DECLARE`, `SET @variable`, `PRINT`) is SQL Server style and does not work with DuckDB, so those cells cannot be run here.
- The concepts, such as data types, still apply.
- The syntax may still be useful on other systems, though it is not widely portable.

# All Built-in Data Types/Literals
This code snippet demonstrates the usage of various built-in data types and literals in SQL.

- Integer data type: Used for storing whole numbers.
- Floating-point data type: Used for storing decimal numbers.
- Character data type: Used for storing fixed-length (CHAR) or variable-length (VARCHAR) strings.
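- Boolean data type: Used for storing true or false values. SQL Server has no `BOOLEAN` type and uses `BIT` (1 or 0) instead.
- Date and time data types: Used for storing dates (`DATE`) and timestamps (`DATETIME` in SQL Server, `TIMESTAMP` in standard SQL).
- NULL: Not a data type in itself but a marker for the absence of a value; a variable or column of any type can hold it.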
- Literal values: Directly specified values in SQL statements.

The code declares variables of different data types, assigns values to them, and prints the values to the console. It also demonstrates the usage of literal values in print statements.

In [2]:
%%sql

-- Demonstration of SQL built-in data types and literals

-- Integer data type
DECLARE @intVariable INT; -- Declaration of an integer variable
SET @intVariable = 10; -- Assignment of a value to the integer variable
PRINT @intVariable; -- Expected output: 10

-- Floating-point data type
DECLARE @floatVariable FLOAT; -- Declaration of a floating-point variable
SET @floatVariable = 3.14; -- Assignment of a value to the floating-point variable
PRINT @floatVariable; -- Expected output: 3.14

-- Character data type
DECLARE @charVariable CHAR(5); -- Declaration of a fixed-length character variable
SET @charVariable = 'Hello'; -- Assignment of a value to the character variable
PRINT @charVariable; -- Expected output: Hello

DECLARE @varcharVariable VARCHAR(10); -- Declaration of a variable-length character variable
SET @varcharVariable = 'World'; -- Assignment of a value to the character variable
PRINT @varcharVariable; -- Expected output: World

-- Boolean data type
DECLARE @bitVariable BIT; -- Declaration of a boolean variable
SET @bitVariable = 1; -- Assignment of a value to the boolean variable
PRINT @bitVariable; -- Expected output: 1

-- Date and time data types
DECLARE @dateVariable DATE; -- Declaration of a date variable
SET @dateVariable = '2022-01-01'; -- Assignment of a value to the date variable
PRINT @dateVariable; -- Expected output: 2022-01-01

DECLARE @datetimeVariable DATETIME; -- Declaration of a datetime variable
SET @datetimeVariable = '2022-01-01 12:34:56'; -- Assignment of a value to the datetime variable
PRINT @datetimeVariable; -- Expected output: Jan  1 2022 12:34PM (PRINT uses the default string conversion; SELECT would show 2022-01-01 12:34:56.000)

-- Null data type
DECLARE @nullVariable INT; -- Declaration of a variable that can hold NULL values
SET @nullVariable = NULL; -- Assignment of NULL to the variable
PRINT @nullVariable; -- Expected output: a blank line (PRINT does not spell out NULL)

-- Literal values
PRINT 42; -- Expected output: 42
PRINT 'Hello, World!'; -- Expected output: Hello, World!
PRINT 3.14; -- Expected output: 3.14
PRINT '2022-01-01'; -- Expected output: 2022-01-01

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.ParserException) Parser Error: syntax error at or near "DECLARE"
LINE 4: DECLARE @intVariable INT; -- Declaration of an integer variable...
        ^
[SQL: -- Demonstration of SQL built-in data types and literals

-- Integer data type
DECLARE @intVariable INT; -- Declaration of an integer variable]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community
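
The cell above is rejected by DuckDB's parser, but the literal kinds it lists can be inspected with a plain `SELECT`. The following is a minimal sketch, not part of the original notebook: it assumes Python's built-in `sqlite3` module as a stand-in engine so that it runs without extra packages. DuckDB also provides a `typeof` function, though the type names it reports differ from SQLite's.

```python
import sqlite3

# In-memory SQLite database; a stand-in for DuckDB (assumption, see lead-in).
conn = sqlite3.connect(":memory:")

# Each selected column is a literal written directly in the statement.
literal_row = conn.execute(
    "SELECT 42, 3.14, 'Hello, World!', NULL"
).fetchone()

# typeof() reports how the engine classifies each literal.
type_row = conn.execute(
    "SELECT typeof(42), typeof(3.14), typeof('Hello, World!'), typeof(NULL)"
).fetchone()

print(literal_row)  # (42, 3.14, 'Hello, World!', None)
print(type_row)     # ('integer', 'real', 'text', 'null')
conn.close()
```

Note that the SQL `NULL` arrives in Python as `None`, which matches the point that NULL marks a missing value rather than being a type of its own.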


# All Ways of Quoting
Explanation:
- SQL supports various ways of quoting strings, allowing flexibility in representing string literals.
- Single quotes (`'`) are the most commonly used way to quote strings in SQL.
- Double quotes (`"`) can quote strings on some systems (MySQL by default, or SQL Server with `QUOTED_IDENTIFIER` turned off), but standard SQL reserves them for identifiers.
- Square brackets (`[]`) are used to quote identifiers, such as table or column names, in SQL Server and some other database systems.
- Backticks (`` ` ``) are used to quote identifiers in MySQL.
- Quoted identifiers (double quotes) are used to preserve case sensitivity or to use reserved words as identifiers in ANSI SQL.
- String concatenation uses the `+` operator in SQL Server; standard SQL and most other systems use `||`.
- A quote can be escaped within a quoted string by doubling it.
- In SQL Server, Unicode strings are written with the `N` prefix before the string literal.

Note: The specific syntax and behavior may vary slightly depending on the database system being used.
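
As a portable illustration of the rules above, here is a minimal sketch that is not part of the original notebook. It assumes Python's built-in `sqlite3` module as the engine and uses only single-quoted strings, doubled-quote escaping, `||` concatenation, and double-quoted identifiers. The table name `my table` and the column name `select` are hypothetical, chosen to show why identifier quoting exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single quotes delimit a string; a doubled single quote is one literal quote.
escaped = conn.execute("SELECT 'It''s quoted'").fetchone()[0]

# || is the standard concatenation operator (SQL Server uses + instead).
joined = conn.execute("SELECT 'Hello ' || 'World'").fetchone()[0]

# Double quotes delimit identifiers, so a name containing a space or a
# reserved word can be used as a table or column name.
conn.execute('CREATE TABLE "my table" ("select" INTEGER)')
conn.execute('INSERT INTO "my table" ("select") VALUES (1)')
value = conn.execute('SELECT "select" FROM "my table"').fetchone()[0]

print(escaped)  # It's quoted
print(joined)   # Hello World
print(value)    # 1
conn.close()
```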

In [None]:
%%sql

-- Single quotes
DECLARE @singleQuote VARCHAR(50) = 'This is a single-quoted string';
PRINT @singleQuote; -- This is a single-quoted string

-- Double quotes (a string in SQL Server only when QUOTED_IDENTIFIER is OFF; otherwise an identifier)
DECLARE @doubleQuote VARCHAR(50) = "This is a double-quoted string";
PRINT @doubleQuote; -- This is a double-quoted string

-- Square brackets quote identifiers, not strings (SQL Server)
SELECT 1 AS [my column];

-- Backticks quote identifiers, not strings (MySQL only, so shown here as a comment)
-- SELECT 1 AS `my column`;

-- Quoted identifier (ANSI SQL)
CREATE TABLE "MyTable" (
    "Column1" INT,
    "Column2" INT
);
SELECT "Column1", "Column2" FROM "MyTable";

-- Concatenation of quoted strings
DECLARE @concatenatedString VARCHAR(50) = 'Hello ' + 'World';
PRINT @concatenatedString; -- Hello World

-- Escaping quotes (the variable must be long enough, or the value is silently truncated)
DECLARE @escapedQuote VARCHAR(60) = 'This is a single-quoted string with a ''quote'' inside';
PRINT @escapedQuote; -- This is a single-quoted string with a 'quote' inside

-- Unicode strings (prefix N)
DECLARE @unicodeString NVARCHAR(50) = N'This is a Unicode string';
PRINT @unicodeString; -- This is a Unicode string

# Grouping/Nesting
Explanation:
This code snippet demonstrates the grouping and nesting syntax in SQL. The `GROUP BY` clause is used to group rows based on a specified column or columns. The aggregate functions (`SUM`, `AVG`, `MAX`, `MIN`, `COUNT`) are then used to perform calculations on the grouped data.

In the first query, the total salary for each department is calculated using the `SUM` function. The result is grouped by the `department` column.

In the second query, the average salary for each department is calculated using the `AVG` function. The `WHERE` clause removes the IT rows before grouping takes place, so IT never forms a group. The result is grouped by the `department` column.

In the third query, the maximum and minimum salary for each department are calculated using the `MAX` and `MIN` functions, respectively. The result is grouped by the `department` column.

In the fourth query, the number of employees in each department is calculated using the `COUNT` function. The result is grouped by the `department` column.

The output of each query will be printed, showing the department and the corresponding calculated value.

In [None]:
%%sql

-- Create a table to store employee information
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);

-- Insert some sample data into the table
INSERT INTO employees (id, name, department, salary)
VALUES (1, 'John Doe', 'IT', 5000),
       (2, 'Jane Smith', 'HR', 6000),
       (3, 'Mike Johnson', 'IT', 5500),
       (4, 'Emily Brown', 'Finance', 7000),
       (5, 'David Lee', 'IT', 4500);

-- Retrieve the total salary for each department
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY department;

-- Retrieve the average salary for each department, excluding the IT department
SELECT department, AVG(salary) AS average_salary
FROM employees
WHERE department <> 'IT'
GROUP BY department;

-- Retrieve the maximum and minimum salary for each department
SELECT department, MAX(salary) AS max_salary, MIN(salary) AS min_salary
FROM employees
GROUP BY department;

-- Retrieve the number of employees in each department
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
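
The queries above cover grouping; the nesting half of the heading can be illustrated with a `HAVING` filter and a subquery. This is a minimal sketch rather than part of the original notebook. It assumes Python's built-in `sqlite3` module as the engine and reuses the sample rows from the cell above, with `salary` stored as an integer for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", "IT", 5000),
        (2, "Jane Smith", "HR", 6000),
        (3, "Mike Johnson", "IT", 5500),
        (4, "Emily Brown", "Finance", 7000),
        (5, "David Lee", "IT", 4500),
    ],
)

# HAVING filters whole groups after aggregation (WHERE filters single rows).
large_departments = conn.execute(
    """
    SELECT department, COUNT(*) AS employee_count
    FROM employees
    GROUP BY department
    HAVING COUNT(*) > 1
    """
).fetchall()

# Nesting: the inner query computes the overall average (5600),
# and the outer query compares each row against it.
above_average = conn.execute(
    """
    SELECT name, salary
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY salary
    """
).fetchall()

print(large_departments)  # [('IT', 3)]
print(above_average)      # [('Jane Smith', 6000), ('Emily Brown', 7000)]
conn.close()
```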

# Variables
Explanation:
In SQL Server's dialect (T-SQL), variables are used to store and manipulate data within a script or a batch of statements. The `DECLARE` keyword is used to declare a variable, specifying its name and data type. The `SET` keyword is used to assign a value to a variable. Variables can be used in various statements, such as `PRINT` and `SELECT`, to display or manipulate data. Other systems use different syntax for variables.

In the code snippet below, we demonstrate the syntax and usage of variables. We start by declaring an integer variable `@myVariable` and assigning it a value of 10. We then print the value of `@myVariable` using the `PRINT` statement.

Next, we declare and assign a value to a variable `@anotherVariable` in a single statement. We print the value of `@anotherVariable` using the `PRINT` statement.

We also demonstrate declaring multiple variables in a single statement (`@var1`, `@var2`, and `@var3`). We assign values to these variables and print their values using the `PRINT` statement.

Finally, we show how variables can be used in a `SELECT` statement to retrieve and display their values.

Variables in SQL provide flexibility and allow for dynamic data manipulation within scripts and queries.

In [None]:
%%sql

-- Declare a variable
DECLARE @myVariable INT;

-- Assign a value to the variable
SET @myVariable = 10;

-- Print the value of the variable
PRINT 'The value of @myVariable is: ' + CAST(@myVariable AS VARCHAR(10));
-- Expected output: The value of @myVariable is: 10

-- Declare and assign a value to a variable in a single statement
DECLARE @anotherVariable VARCHAR(20) = 'Hello, World!';

-- Print the value of the variable
PRINT 'The value of @anotherVariable is: ' + @anotherVariable;
-- Expected output: The value of @anotherVariable is: Hello, World!

-- Declare multiple variables in a single statement
DECLARE @var1 INT, @var2 VARCHAR(10), @var3 DATE;

-- Assign values to the variables
SET @var1 = 5;
SET @var2 = 'SQL';
SET @var3 = GETDATE();

-- Print the values of the variables
PRINT 'The values of @var1, @var2, and @var3 are: ' + CAST(@var1 AS VARCHAR(10)) + ', ' + @var2 + ', ' + CAST(@var3 AS VARCHAR(20));
-- Expected output: The values of @var1, @var2, and @var3 are: 5, SQL, [current date] (@var3 is a DATE, so the time part is dropped)

-- Use variables in a SELECT statement
SELECT @var1 AS Variable1, @var2 AS Variable2, @var3 AS Variable3;
-- Expected output: Variable1: 5, Variable2: SQL, Variable3: [current date]
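
On engines that reject `DECLARE`, one common substitute is to keep the value in the host program and bind it to the query as a named parameter. The following is a minimal sketch of that idea, not part of the original notebook; it assumes Python's built-in `sqlite3` module, and the parameter names `var1` and `var2` simply mirror the variables in the cell above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The values live in Python and are bound to the :var1 / :var2 placeholders.
params = {"var1": 5, "var2": "SQL"}

cursor = conn.execute(
    "SELECT :var1 AS Variable1, :var2 AS Variable2, :var1 * 2 AS Doubled",
    params,
)
column_names = [col[0] for col in cursor.description]
row = cursor.fetchone()

print(column_names)  # ['Variable1', 'Variable2', 'Doubled']
print(row)           # (5, 'SQL', 10)
conn.close()
```

Binding parameters also avoids building SQL text by string concatenation, which is a design choice worth keeping even where variables are available.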

# Macros
Summary:
SQL macros, as implemented in DuckDB, provide a way to define reusable code snippets in SQL. They can simplify complex queries and avoid repetitive code. Macros are created using the `CREATE MACRO` statement and can be used with or without parameters. Parameters can have default values, making them optional. A scalar macro is invoked like a function inside a `SELECT` statement, and a table macro is invoked in the `FROM` clause. Macros can be dropped using the `DROP MACRO` statement (`DROP MACRO TABLE` for table macros) when they are no longer needed.

In [None]:
%%sql

-- SQL Macros (DuckDB syntax)

-- Macros are a way to define reusable code snippets in SQL. They can be used to simplify complex queries or to avoid repetitive code.

-- Creating a scalar macro: the body is an expression, here a parenthesized subquery
CREATE OR REPLACE MACRO get_employee_count() AS (
    SELECT COUNT(*) FROM employees
);

-- Using the macro
SELECT get_employee_count(); -- Expected output: the total number of employees

-- Macros can also accept parameters; AS TABLE makes the macro return rows
-- (columns match the employees table created in the Grouping/Nesting section)
CREATE OR REPLACE MACRO get_employee_by_department(department_name) AS TABLE
    SELECT id, name
    FROM employees
    WHERE department = department_name;

-- Using the macro with a parameter
SELECT * FROM get_employee_by_department('IT'); -- Expected output: employees in the IT department

-- Macros can have optional parameters with default values (name := default)
CREATE OR REPLACE MACRO get_employee_by_salary_range(min_salary, max_salary := NULL) AS TABLE
    SELECT id, name
    FROM employees
    WHERE salary >= min_salary
      AND (max_salary IS NULL OR salary <= max_salary);

-- Using the macro with different parameter combinations
SELECT * FROM get_employee_by_salary_range(5000); -- Expected output: employees with salary >= 5000
SELECT * FROM get_employee_by_salary_range(5000, max_salary := 6000); -- Expected output: employees with salary between 5000 and 6000

-- Macros can be dropped when no longer needed
DROP MACRO get_employee_count;
DROP MACRO TABLE get_employee_by_department;
DROP MACRO TABLE get_employee_by_salary_range;

# Null and Nullability Checks
Explanation:
- The code snippet demonstrates various ways to handle null and perform nullability checks in SQL.
- The `employees` table is created with a nullable `salary` column.
- Data is inserted into the table, including rows with NULL values for the `salary` column.
- The first query selects all employees and their salaries, demonstrating the presence of NULL values.
- The next two queries use `IS NULL` and `IS NOT NULL` to filter employees based on the presence or absence of a salary.
- The `UPDATE` statement sets the salary of an employee to NULL.
- The `DELETE` statement removes employees with NULL salaries.
- The `CASE` statement is used to display a custom message based on the presence or absence of a salary.
- The `COALESCE` function is used to replace NULL values with a default value (0 in this case).
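- The `IFNULL` function is a non-standard, two-argument alternative to `COALESCE` that replaces NULL values with a default value (0 in this case). It is available in MySQL and SQLite, and it also runs in DuckDB, as the cell below shows.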

In [3]:
%%sql

CREATE OR REPLACE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT,
    salary DECIMAL(10,2) NULL
);

INSERT INTO employees (id, name, age, salary)
VALUES (1, 'John Doe', 30, 5000.00),
       (2, 'Jane Smith', 25, NULL),
       (3, 'Mike Johnson', 35, 8000.00),
       (4, 'Sarah Williams', 40, NULL);



Count


In [4]:
%sql SELECT name FROM employees WHERE salary IS NULL;

name
Jane Smith
Sarah Williams


In [5]:
%sql SELECT name FROM employees WHERE salary IS NOT NULL;

name
John Doe
Mike Johnson


In [6]:
%sql UPDATE employees SET salary = NULL WHERE id = 1;

Count


In [7]:
%%sql

SELECT name, 
       CASE WHEN salary IS NULL THEN 'No Salary' ELSE 'Has Salary' END AS salary_status
FROM employees;

name,salary_status
John Doe,No Salary
Jane Smith,No Salary
Mike Johnson,Has Salary
Sarah Williams,No Salary


In [8]:
%sql SELECT name, COALESCE(salary, 0) AS salary FROM employees;

name,salary
John Doe,0.0
Jane Smith,0.0
Mike Johnson,8000.0
Sarah Williams,0.0


In [9]:
%sql SELECT name, IFNULL(salary, 0) AS salary FROM employees;

name,salary
John Doe,0.0
Jane Smith,0.0
Mike Johnson,8000.0
Sarah Williams,0.0
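
Two further NULL behaviors are easy to trip over: comparing with `= NULL` never matches, and aggregates skip NULL values. The following minimal sketch is not part of the original notebook. It assumes Python's built-in `sqlite3` module as the engine and reuses the rows originally inserted into `employees`, before the `UPDATE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", 30, 5000.0),
        (2, "Jane Smith", 25, None),      # None is stored as NULL
        (3, "Mike Johnson", 35, 8000.0),
        (4, "Sarah Williams", 40, None),
    ],
)

# "salary = NULL" evaluates to NULL (unknown), never true, so no row matches.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM employees WHERE salary = NULL"
).fetchone()[0]

# IS NULL is the correct test.
is_null = conn.execute(
    "SELECT COUNT(*) FROM employees WHERE salary IS NULL"
).fetchone()[0]

# COUNT(*) counts rows; COUNT(salary) and AVG(salary) ignore NULL salaries.
total_rows, rows_with_salary, average_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary) FROM employees"
).fetchone()

print(equals_null, is_null)                          # 0 2
print(total_rows, rows_with_salary, average_salary)  # 4 2 6500.0
conn.close()
```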


# Case Sensitivity (Tables, DBs, Variables, Functions, etc.)

In standard SQL, and in systems such as PostgreSQL and Oracle, the case sensitivity of tables, columns, and other named objects can be controlled by using double quotes around their names. When a name is enclosed in double quotes, it becomes case-sensitive, meaning that it must be referenced with the exact casing used during creation. This allows for more precise control over object naming and referencing. DuckDB is an exception: it preserves the casing of quoted names but still matches identifiers case-insensitively.

In [10]:
%%sql

CREATE OR REPLACE TABLE "MyTable" (
    "ID" INT,
    "Name" VARCHAR(50)
);

INSERT INTO "MyTable" ("ID", "Name") VALUES (1, 'John');
INSERT INTO "MyTable" ("ID", "Name") VALUES (2, 'Mary');

Count


In [11]:
%sql SELECT "ID", "Name" FROM "MyTable";

ID,Name
1,John
2,Mary
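
Whether quoting actually makes a name case-sensitive depends on the system. The following minimal sketch is not part of the original notebook: it assumes Python's built-in `sqlite3` module and shows an engine that matches names case-insensitively even when they were created with quotes. PostgreSQL, by contrast, would reject the lowercase spellings used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "MyTable" ("ID" INTEGER, "Name" TEXT)')
conn.execute("""INSERT INTO "MyTable" ("ID", "Name") VALUES (1, 'John')""")

# Exact casing, quoted: works on every system.
exact = conn.execute('SELECT "ID", "Name" FROM "MyTable"').fetchall()

# Lowercase and unquoted: works here because SQLite ignores identifier case.
lowercase = conn.execute("SELECT id, name FROM mytable").fetchall()

# Lowercase and quoted: still matches in SQLite.
quoted_lowercase = conn.execute('SELECT "id", "name" FROM "mytable"').fetchall()

print(exact)             # [(1, 'John')]
print(lowercase)         # [(1, 'John')]
print(quoted_lowercase)  # [(1, 'John')]
conn.close()
```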


# Overall Statement Syntax and Effect of Clause order

- SQL statements are composed of various clauses that can be combined to perform different operations on the data.
- The clauses of a statement must be written in a fixed order; placing them out of order is a syntax error.
- The `SELECT` clause is used to specify the columns to retrieve from the table.
- The `FROM` clause specifies the table(s) from which to retrieve the data.
- The `WHERE` clause is used to filter the rows based on a condition.
- The `ORDER BY` clause is used to sort the result set based on one or more columns.
- The `LIMIT` clause is used to limit the number of rows returned by the query.
- The `UPDATE` statement is used to modify existing records in a table.
- The `DELETE` statement is used to remove records from a table.
- The `JOIN` clause is used to combine rows from two or more tables based on a related column between them.
- The `GROUP BY` clause is used to group rows based on one or more columns.
- The `HAVING` clause is used to filter the groups based on a condition.
- Subqueries can be used within a SQL statement to retrieve data from another query.

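NOTE: in a `SELECT` statement the clauses must be written in the order `SELECT`, `FROM`, `JOIN`, `WHERE`, `GROUP BY`, `HAVING`, `ORDER BY`, `LIMIT`. The backend logically processes them in a different order: `FROM` and `JOIN` first, then `WHERE`, `GROUP BY`, `HAVING`, `SELECT`, `ORDER BY`, and finally `LIMIT`.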
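
To make the ordering rule concrete, here is a minimal sketch that is not part of the original notebook. It assumes Python's built-in `sqlite3` module as the engine and a hypothetical `employees` table with the sample rows used earlier. The first query writes the clauses in the required order; the second places `WHERE` after `GROUP BY` and is rejected by the parser.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", "IT", 5000),
        (2, "Jane Smith", "HR", 6000),
        (3, "Mike Johnson", "IT", 5500),
        (4, "Emily Brown", "Finance", 7000),
        (5, "David Lee", "IT", 4500),
    ],
)

# Written order: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.
# WHERE drops David Lee (4500) before grouping; HAVING then filters groups.
ordered_result = conn.execute(
    """
    SELECT department, COUNT(*) AS n
    FROM employees
    WHERE salary >= 5000
    GROUP BY department
    HAVING COUNT(*) >= 1
    ORDER BY n DESC, department
    LIMIT 2
    """
).fetchall()

# Same clauses, wrong order: WHERE after GROUP BY is a syntax error.
try:
    conn.execute(
        "SELECT department FROM employees GROUP BY department WHERE salary >= 5000"
    )
    misordered_error = None
except sqlite3.OperationalError as exc:
    misordered_error = str(exc)

print(ordered_result)    # [('IT', 2), ('Finance', 1)]
print(misordered_error)  # a message reporting a syntax error near WHERE
conn.close()
```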

# Casting (eg. string to date and other combinations)

Explanation:
In SQL, casting is used to convert one data type to another. `CAST` is the standard form and is the one used in the cells below; `CONVERT` is specific to systems such as SQL Server and MySQL.

To cast a string to a date, you can use either the `CAST` or `CONVERT` function, specifying the target data type as `DATE`. The resulting date will be in the format `YYYY-MM-DD`.

To cast a date to a string, you can use the `CAST` or `CONVERT` function, specifying the target data type as `VARCHAR` and providing the appropriate length. The resulting string will be in the format `YYYY-MM-DD`.

Similarly, you can cast a string to a datetime or a datetime to a string using the `CAST` or `CONVERT` function, specifying the appropriate data types.

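To cast a string to an integer, use the `CAST` or `CONVERT` function, specifying the target data type as `INT`. The resulting integer will be the numeric value of the string, provided the string holds a valid integer; for any other string the behavior depends on the system, and some raise an error.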

To cast an integer to a string, use the `CAST` or `CONVERT` function, specifying the target data type as `VARCHAR` and providing the appropriate length.

To cast a string to a decimal, use the `CAST` or `CONVERT` function, specifying the target data type as `DECIMAL` and providing the precision and scale. The resulting decimal will be the numeric value of the string.

To cast a decimal to a string, use the `CAST` or `CONVERT` function, specifying the target data type as `VARCHAR` and providing the appropriate length.

There is also a __type constructor__ syntax, such as `DATE '2022-01-01'`, in which a type name placed before a string literal states that the literal is a value of that type.

The code snippet demonstrates various casting scenarios and prints the expected results for each casting operation.

In [4]:
%%sql

CREATE OR REPLACE TABLE conversion_example (
    sample_string VARCHAR,
    sample_date DATE,
    sample_int INTEGER,
    sample_decimal DECIMAL(4, 2)
);

INSERT INTO conversion_example (sample_string, sample_date, sample_int, sample_decimal)
VALUES ('2022-01-01', '2022-01-01', 123, 3.14);

Count


In [13]:
%sql SELECT CAST(sample_string AS DATE) FROM conversion_example;

CAST(sample_string AS DATE)
2022-01-01


In [14]:
%sql SELECT DATE '2022-01-01';

CAST('2022-01-01' AS DATE)
2022-01-01


In [15]:
%sql SELECT CAST(sample_date AS VARCHAR) FROM conversion_example;

CAST(sample_date AS VARCHAR)
2022-01-01


In [16]:
%sql SELECT CAST(sample_string AS TIMESTAMP) FROM conversion_example;

CAST(sample_string AS TIMESTAMP)
2022-01-01 00:00:00


In [17]:
%sql SELECT TIMESTAMP '2022-01-01 12:00:00';

CAST('2022-01-01 12:00:00' AS TIMESTAMP)
2022-01-01 12:00:00


In [18]:
%sql SELECT SUBSTR(CAST(sample_date AS VARCHAR), 1, 10) FROM conversion_example;

"substr(CAST(sample_date AS VARCHAR), 1, 10)"
2022-01-01


In [19]:
%sql SELECT CAST(sample_string AS INTEGER) FROM conversion_example;

CAST(sample_string AS INTEGER)


In [3]:
%sql SELECT CAST(sample_int AS VARCHAR) FROM conversion_example;

CAST(sample_int AS VARCHAR)
123


In [4]:
%sql SELECT CAST(sample_string AS DECIMAL(4, 2)) FROM conversion_example;

"CAST(sample_string AS DECIMAL(4,2))"


In [2]:
%sql SELECT DECIMAL '3.14';

"CAST('3.14' AS DECIMAL(18,3))"
3.14


In [5]:
%sql SELECT CAST(sample_decimal AS VARCHAR) FROM conversion_example;

CAST(sample_decimal AS VARCHAR)
3.14
