In [1]:
%load_ext sql
%sql duckdb://

# Built-in Functions (e.g., Date functions, String functions)
This section demonstrates built-in SQL functions for working with dates and strings.

For date functions, it shows how to get the current date, time, and timestamp using `CURRENT_DATE`, `CURRENT_TIME`, and `CURRENT_TIMESTAMP` respectively. It also demonstrates extracting year, month, and day from a date using `EXTRACT`, calculating the difference between two dates using `DATEDIFF`, and formatting dates using `TO_CHAR`.

For string functions, it shows how to get the length of a string using `LENGTH`, convert a string to uppercase or lowercase using `UPPER` and `LOWER`, concatenate two strings using `CONCAT`, replace a substring in a string using `REPLACE`, trim leading and trailing spaces from a string using `TRIM`, get the position of a substring in a string using `POSITION`, repeat a string a specified number of times using `REPEAT`, and reverse a string using `REVERSE`. It also demonstrates converting a string to a date or timestamp using `TO_DATE` and `TO_TIMESTAMP`.

These functions are useful for manipulating and formatting dates and strings in SQL queries.

NOTE: Because SQL dialects differ, not all of these functions work everywhere. In DuckDB, which this notebook uses, the `TO_CHAR`, `TO_DATE`, and two-argument `TO_TIMESTAMP` calls below fail.

In [2]:
%%sql

CREATE OR REPLACE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    hire_date DATE
);

INSERT INTO employees (id, name, hire_date)
VALUES (1, 'John Doe', '2020-01-01'),
       (2, 'Jane Smith', '2019-05-15'),
       (3, 'Mike Johnson', '2021-03-10');

Count


In [3]:
%%sql
SELECT CURRENT_DATE; -- Expected output: current date in YYYY-MM-DD format

CURRENT_DATE
2023-08-30


In [4]:
%%sql
SELECT CURRENT_TIME; -- Expected output: current time in HH:MM:SS format

CURRENT_TIME
20:31:43.108000


In [5]:
%%sql
SELECT CURRENT_TIMESTAMP; -- Expected output: current timestamp in YYYY-MM-DD HH:MM:SS format

CURRENT_TIMESTAMP
2023-08-30 20:31:43.108000


In [6]:
%%sql
SELECT EXTRACT(YEAR FROM hire_date) AS hire_year
FROM employees; -- Expected output: hire_year column with the year of each employee's hire date

hire_year
2020
2019
2021


In [7]:
%%sql
SELECT EXTRACT(MONTH FROM hire_date) AS hire_month
FROM employees; -- Expected output: hire_month column with the month of each employee's hire date

hire_month
1
5
3


In [8]:
%%sql
SELECT EXTRACT(DAY FROM hire_date) AS hire_day
FROM employees; -- Expected output: hire_day column with the day of each employee's hire date

hire_day
1
15
10


In [29]:
%%sql
SELECT DATEDIFF('day', DATE '2021-12-31', DATE '2020-01-01') AS date_diff; -- Expected output: -730 (days counted from the first date to the second; negative because the second date is earlier, and 2020 is a leap year)

date_diff
-730
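
The sign and size of this result can be cross-checked outside the database. The following is a minimal sketch in plain Python (not DuckDB); it assumes `DATEDIFF('day', start, end)` counts from the first date argument to the second, which is consistent with the output above.

```python
from datetime import date

start = date(2021, 12, 31)  # first date argument in the query above
end = date(2020, 1, 1)      # second date argument in the query above

# Subtracting two dates yields a timedelta; .days is the signed day count.
date_diff = (end - start).days
print(date_diff)  # -730: the end date is earlier, and 2020 has 366 days
```

Swapping the two date arguments would give the positive count, 730.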


In [30]:
%%sql
SELECT TO_CHAR(hire_date, 'Month DD, YYYY') AS formatted_date
FROM employees; -- Expected output: formatted_date column with hire dates in Month DD, YYYY format

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.CatalogException) Catalog Error: Scalar Function with name to_char does not exist!
Did you mean "chr"?
LINE 1: SELECT TO_CHAR(hire_date, 'Month DD, YYYY') AS...
               ^
[SQL: SELECT TO_CHAR(hire_date, 'Month DD, YYYY') AS formatted_date
FROM employees; -- Expected output: formatted_date column with hire dates in Month DD, YYYY format]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community


In [16]:
%%sql
SELECT TO_CHAR(CURRENT_TIMESTAMP, 'YYYY-MM-DD HH24:MI:SS') AS formatted_datetime; -- Expected output: current date and time in YYYY-MM-DD HH:MI:SS format

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(Your query contains named parameters (MI, SS, MI, SS) but the named parameters feature is disabled. Enable it with: %config SqlMagic.named_parameters=True)
(duckdb.CatalogException) Catalog Error: Scalar Function with name to_char does not exist!
Did you mean "chr"?
LINE 1: SELECT TO_CHAR(CURRENT_TIMESTAMP, 'YYYY-MM-DD ...
               ^
[SQL: SELECT TO_CHAR(CURRENT_TIMESTAMP, 'YYYY-MM-DD HH24:MI:SS') AS formatted_datetime; -- Expected output: current date and time in YYYY-MM-DD HH:MI:SS format]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community


In [17]:
%%sql
SELECT TO_CHAR(hire_date, 'Day') AS day_of_week
FROM employees; -- Expected output: day_of_week column with the day of the week for each employee's hire date

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.CatalogException) Catalog Error: Scalar Function with name to_char does not exist!
Did you mean "chr"?
LINE 1: SELECT TO_CHAR(hire_date, 'Day') AS day_of_wee...
               ^
[SQL: SELECT TO_CHAR(hire_date, 'Day') AS day_of_week
FROM employees; -- Expected output: day_of_week column with the day of the week for each employee's hire date]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community
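
The three errors above show that `TO_CHAR` is not available in this DuckDB session. Many engines and languages express the same formats with `strftime`-style codes instead. The sketch below uses Python's standard library to show which codes correspond to the three patterns attempted above; it is an illustration in Python, not a DuckDB query, and the month and weekday names assume the default (English) locale. DuckDB documents a `strftime` function with similar codes, though that is not exercised in this notebook.

```python
from datetime import date, datetime

hire_date = date(2020, 1, 1)  # John Doe's hire date from the employees table

# 'Month DD, YYYY' -> full month name, zero-padded day, four-digit year
formatted_date = hire_date.strftime("%B %d, %Y")

# 'Day' -> full weekday name
day_of_week = hire_date.strftime("%A")

# 'YYYY-MM-DD HH24:MI:SS' -> date plus 24-hour clock time
sample = datetime(2023, 8, 30, 20, 31, 43)  # illustrative value, not the live clock
formatted_datetime = sample.strftime("%Y-%m-%d %H:%M:%S")

print(formatted_date)      # January 01, 2020
print(day_of_week)         # Wednesday
print(formatted_datetime)  # 2023-08-30 20:31:43
```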


In [18]:
%%sql
SELECT LENGTH('Hello, World!') AS string_length; -- Expected output: 13 (number of characters in the string)

string_length
13


In [19]:
%%sql
SELECT UPPER('hello') AS uppercase_string; -- Expected output: "HELLO"

uppercase_string
HELLO


In [20]:
%%sql
SELECT LOWER('WORLD') AS lowercase_string; -- Expected output: "world"

lowercase_string
world


In [21]:
%%sql
SELECT CONCAT('Hello', ' ', 'World') AS concatenated_string; -- Expected output: "Hello World"

concatenated_string
Hello World


In [22]:
%%sql
SELECT REPLACE('Hello, World!', 'World', 'SQL') AS replaced_string; -- Expected output: "Hello, SQL!"

replaced_string
"Hello, SQL!"


In [23]:
%%sql
SELECT TRIM('   Hello, World!   ') AS trimmed_string; -- Expected output: "Hello, World!"

trimmed_string
"Hello, World!"


In [24]:
%%sql
SELECT POSITION('World' IN 'Hello, World!') AS substring_position; -- Expected output: 8 (position of "World" in the string)

substring_position
8
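
`POSITION` counts characters starting from 1, which differs from the 0-based indexing of most programming languages. A small Python sketch of the conversion, for comparison only:

```python
text = "Hello, World!"

# str.find returns a 0-based index, or -1 when the substring is absent.
zero_based = text.find("World")

# SQL's POSITION counts from 1 and returns 0 when the substring is absent,
# so adding 1 converts between the two conventions.
substring_position = zero_based + 1
print(substring_position)  # 8
```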


In [25]:
%%sql
SELECT REPEAT('SQL ', 3) AS repeated_string; -- Expected output: "SQL SQL SQL "

repeated_string
SQL SQL SQL


In [26]:
%%sql
SELECT REVERSE('Hello, World!') AS reversed_string; -- Expected output: "!dlroW ,olleH"

reversed_string
"!dlroW ,olleH"


In [27]:
%%sql
SELECT TO_DATE('2022-01-01', 'YYYY-MM-DD') AS converted_date; -- Expected output: 2022-01-01 (date in YYYY-MM-DD format)

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.CatalogException) Catalog Error: Scalar Function with name to_date does not exist!
Did you mean "to_days"?
LINE 1: SELECT TO_DATE('2022-01-01', 'YYYY-MM-DD') AS ...
               ^
[SQL: SELECT TO_DATE('2022-01-01', 'YYYY-MM-DD') AS converted_date; -- Expected output: 2022-01-01 (date in YYYY-MM-DD format)]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community


In [28]:
%%sql
SELECT TO_TIMESTAMP('2022-01-01 12:34:56', 'YYYY-MM-DD HH24:MI:SS') AS converted_timestamp; -- Expected output: 2022-01-01 12:34:56 (timestamp in YYYY-MM-DD HH:MI:SS format)

RuntimeError: (Your query contains named parameters (MI, SS, MI, SS) but the named parameters feature is disabled. Enable it with: %config SqlMagic.named_parameters=True)
(duckdb.BinderException) Binder Error: No function matches the given name and argument types 'to_timestamp(VARCHAR, VARCHAR)'. You might need to add explicit type casts.
	Candidate functions:
	to_timestamp(BIGINT) -> TIMESTAMP

LINE 1: SELECT TO_TIMESTAMP('2022-01-01 12:34:56', 'YY...
               ^
[SQL: SELECT TO_TIMESTAMP('2022-01-01 12:34:56', 'YYYY-MM-DD HH24:MI:SS') AS converted_timestamp; -- Expected output: 2022-01-01 12:34:56 (timestamp in YYYY-MM-DD HH:MI:SS format)]
(Background on this error at: https://sqlalche.me/e/20/f405)
If you need help solving this issue, send us a message: https://ploomber.io/community
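
As the errors show, this DuckDB session has no `TO_DATE`, and its `TO_TIMESTAMP` accepts only a `BIGINT`; JupySQL additionally reads `:MI` and `:SS` in the format string as named parameters. For comparison, here is a minimal Python sketch of the same two conversions using `strptime`-style codes. It illustrates the parsing idea in Python and is not a drop-in DuckDB replacement.

```python
from datetime import datetime

# 'YYYY-MM-DD' corresponds to '%Y-%m-%d'
converted_date = datetime.strptime("2022-01-01", "%Y-%m-%d").date()

# 'YYYY-MM-DD HH24:MI:SS' corresponds to '%Y-%m-%d %H:%M:%S'
converted_timestamp = datetime.strptime("2022-01-01 12:34:56", "%Y-%m-%d %H:%M:%S")

print(converted_date)       # 2022-01-01
print(converted_timestamp)  # 2022-01-01 12:34:56
```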


# User-defined Functions

In this code snippet, we demonstrate the usage of user-defined functions in SQL. We create a table called "employees" with columns for id, name, and salary. Then, we define a user-defined function called "calculate_bonus" that takes the salary as input and returns the bonus amount based on the salary.

Inside the function, we declare a local variable "bonus" to store the calculated bonus. We use an IF statement to determine the bonus amount based on the salary. If the salary is greater than 5000, the bonus is set to 10% of the salary. Otherwise, the bonus is set to 5% of the salary.

After creating the function, we insert some sample data into the "employees" table. Finally, we call the user-defined function for each employee and print their name, salary, and bonus amount using a SELECT statement.

Expected output:
```
+--------------+--------+--------+
| name         | salary | bonus  |
+--------------+--------+--------+
| John Doe     | 6000   | 600.00 |
| Jane Smith   | 4000   | 200.00 |
| Mike Johnson | 8000   | 800.00 |
+--------------+--------+--------+
```

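NOTE: The `CREATE FUNCTION ... BEGIN ... END` syntax shown here does not work in DuckDB/JupySQL, but it works in some other systems (the `DECLARE`/`SET`/`IF` body resembles MySQL's stored-function syntax). In many scenarios, such as using SQL from Java or Python, you can instead write Java or Python code to manipulate the data before passing it to another API.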

In [31]:
%%sql

CREATE OR REPLACE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    salary DECIMAL(10, 2)
);

Count


In [32]:
%%sql

CREATE FUNCTION calculate_bonus(salary DECIMAL(10, 2))
RETURNS DECIMAL(10, 2)
BEGIN
    DECLARE bonus DECIMAL(10, 2);
    
    -- Calculate the bonus based on salary
    IF salary > 5000 THEN
        SET bonus = salary * 0.1;
    ELSE
        SET bonus = salary * 0.05;
    END IF;
    
    RETURN bonus;
END;

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.ParserException) Parser Error: syntax error at or near "DECIMAL"
LINE 1: CREATE FUNCTION calculate_bonus(salary DECIMAL(10, 2))
                                               ^
[SQL: CREATE FUNCTION calculate_bonus(salary DECIMAL(10, 2))
RETURNS DECIMAL(10, 2)
BEGIN
    DECLARE bonus DECIMAL(10, 2);
    
    -- Calculate the bonus based on salary
    IF salary > 5000 THEN
        SET bonus = salary * 0.1;
    ELSE
        SET bonus = salary * 0.05;
    END IF;
    
    RETURN bonus;
END;]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community


In [33]:
%%sql

INSERT INTO employees (id, name, salary)
VALUES (1, 'John Doe', 6000),
       (2, 'Jane Smith', 4000),
       (3, 'Mike Johnson', 8000);

Count


In [34]:
%%sql

SELECT name, salary, calculate_bonus(salary) AS bonus
FROM employees;

RuntimeError: If using snippets, you may pass the --with argument explicitly.
For more details please refer: https://jupysql.ploomber.io/en/latest/compose.html#with-argument


Original error message from DB driver:
(duckdb.CatalogException) Catalog Error: Scalar Function with name calculate_bonus does not exist!
Did you mean "acos"?
LINE 1: SELECT name, salary, calculate_bonus(salary) AS bonus
                             ^
[SQL: SELECT name, salary, calculate_bonus(salary) AS bonus
FROM employees;]
(Background on this error at: https://sqlalche.me/e/20/f405)

If you need help solving this issue, send us a message: https://ploomber.io/community
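
One way to get the intended result when the database rejects the function definition is to write the bonus logic in the host language. The sketch below uses Python's built-in `sqlite3` module rather than DuckDB, purely because it ships with Python; it registers a Python function under the name `calculate_bonus` so the same `SELECT` can call it. The column types and the rounding to two decimals are assumptions made for this illustration.

```python
import sqlite3


def calculate_bonus(salary):
    """Return 10% of the salary if it exceeds 5000, otherwise 5%."""
    rate = 0.10 if salary > 5000 else 0.05
    return round(salary * rate, 2)


conn = sqlite3.connect(":memory:")
# Register the Python function as a one-argument SQL function.
conn.create_function("calculate_bonus", 1, calculate_bonus)

conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees (id, name, salary) VALUES (?, ?, ?)",
    [(1, "John Doe", 6000), (2, "Jane Smith", 4000), (3, "Mike Johnson", 8000)],
)

rows = conn.execute(
    "SELECT name, salary, calculate_bonus(salary) AS bonus FROM employees ORDER BY id"
).fetchall()
for name, salary, bonus in rows:
    print(f"{name:<12} {salary:>8.2f} {bonus:>8.2f}")
conn.close()
```

The printed bonuses are 600.00, 200.00, and 800.00, matching the expected output table at the start of this section.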


# Sequences (for generating unique numbers)

In SQL, sequences are used to generate unique numbers. They are often used to generate primary key values for tables. The code snippet demonstrates various aspects of working with sequences.

- The first step is to create a sequence using the `CREATE SEQUENCE` statement. In this case, a sequence named `my_sequence` is created without any parameters.
- To get the next value from the sequence, the `NEXTVAL` function is used. It returns the next value in the sequence. In the example, `NEXTVAL('my_sequence')` is used, which will return `1` as the initial value.
- Sequences can also be created with parameters such as initial value and increment. The `CREATE SEQUENCE` statement can be used with the `START WITH` and `INCREMENT BY` clauses to specify these parameters. In the code snippet, a sequence named `my_sequence_with_params` is created with an initial value of `100` and an increment of `10`.
- The `NEXTVAL` function can be used with sequences that have parameters as well. In the example, `NEXTVAL('my_sequence_with_params')` is used to get the next value from the sequence, which will be `100` initially and `110` in the subsequent call.
- The `SETVAL` function sets the current value of a sequence. In the code snippet, `SETVAL('my_sequence_with_params', 200)` sets the current value of `my_sequence_with_params` to `200`, so the next `NEXTVAL` call returns `210`.
- The `CURRVAL` function returns the current value of a sequence, that is, the value most recently returned by `NEXTVAL`. In the example, `CURRVAL('my_sequence_with_params')` returns `210` once `NEXTVAL` has been called after `SETVAL`.
- Sequences can be restarted. In PostgreSQL this is done with `ALTER SEQUENCE my_sequence_with_params RESTART`; `RESTART IDENTITY` is an option of `TRUNCATE`, not a function that can be called from `SELECT`.
- After restarting the sequence, the `NEXTVAL` function is used again to get the next value, which will be `100` as specified by the `START WITH` clause.
- Finally, the sequence can be dropped using the `DROP SEQUENCE` statement. In the example, `DROP SEQUENCE my_sequence_with_params` is used to remove the sequence `my_sequence_with_params` from the database.
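
The semantics described in the list above can be modelled in a few lines of Python. This is a toy sketch of the expected PostgreSQL-style behaviour, not a database API: the `Sequence` class and its method names are hypothetical, chosen to mirror the SQL functions.

```python
class Sequence:
    """Toy model of a SQL sequence (hypothetical helper, not a database API)."""

    def __init__(self, start=1, increment=1):
        self.start = start
        self.increment = increment
        self._next = start     # value the next nextval() call will return
        self._current = None   # last value handed out

    def nextval(self):
        self._current = self._next
        self._next += self.increment
        return self._current

    def currval(self):
        if self._current is None:
            raise RuntimeError("currval is undefined before the first nextval")
        return self._current

    def setval(self, value):
        # After setval, the following nextval returns value + increment.
        self._current = value
        self._next = value + self.increment
        return value

    def restart(self):
        # Go back to the START WITH value.
        self._next = self.start


seq = Sequence(start=100, increment=10)
results = [seq.nextval(), seq.nextval()]  # 100, 110
seq.setval(200)
results.append(seq.nextval())             # 210
results.append(seq.currval())             # 210
seq.restart()
results.append(seq.nextval())             # 100
print(results)  # [100, 110, 210, 210, 100]
```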

Sequences are a powerful feature in SQL that provide an easy way to generate unique numbers for various purposes, especially when generating primary key values for tables.

NOTE: In this DuckDB session the parameterized sequence returned `1` and `11` instead of the expected `100` and `110`, so the remaining statements below were left unexecuted.

In [35]:
%%sql

CREATE SEQUENCE my_sequence;

Count


In [36]:
%%sql

SELECT NEXTVAL('my_sequence'); -- Expected output: 1

nextval('my_sequence')
1


In [41]:
%%sql

CREATE OR REPLACE SEQUENCE my_sequence_with_params
    START WITH 100
    INCREMENT BY 10;

Count


In [42]:
%%sql

SELECT NEXTVAL('my_sequence_with_params'); -- Expected output: 100

nextval('my_sequence_with_params')
1


In [43]:
%%sql

SELECT NEXTVAL('my_sequence_with_params'); -- Expected output: 110

nextval('my_sequence_with_params')
11


In [None]:
-- Set the current value of the sequence
SELECT SETVAL('my_sequence_with_params', 200);

-- Get the next value after setting the current value
SELECT NEXTVAL('my_sequence_with_params'); -- Expected output: 210

-- Get the current value of the sequence
SELECT CURRVAL('my_sequence_with_params'); -- Expected output: 210

-- Restart the sequence (PostgreSQL syntax; RESTART IDENTITY is a TRUNCATE option, not a function)
ALTER SEQUENCE my_sequence_with_params RESTART;

-- Get the next value after restarting the sequence
SELECT NEXTVAL('my_sequence_with_params'); -- Expected output: 100

-- Drop the sequence
DROP SEQUENCE my_sequence_with_params;