Here is a more detailed version of the SQL commands, grouped by sublanguage, with a short explanation of what each one does:

---

## **DDL - Data Definition Language**

These commands are used to define and manage database objects like tables, schemas, indexes, views, and so on.

### 1. **CREATE**

Used to create database objects such as tables, views, indexes, and schemas.

```sql
CREATE TABLE employees (
    id INT PRIMARY KEY,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    hire_date DATE
);
```

### 2. **ALTER**

Used to modify existing database objects, such as adding, deleting, or modifying columns in a table.

```sql
ALTER TABLE employees ADD email VARCHAR(100);
ALTER TABLE employees DROP COLUMN email;
-- MODIFY is MySQL/Oracle syntax; PostgreSQL uses ALTER COLUMN ... TYPE instead.
ALTER TABLE employees MODIFY first_name VARCHAR(150);
```

### 3. **DROP**

Used to remove an existing database object, such as a table, view, or index.

```sql
DROP TABLE employees;
DROP INDEX emp_idx;
```

### 4. **RENAME**

Used to rename a database object.

```sql
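-- Oracle syntax; MySQL uses RENAME TABLE ... TO ..., PostgreSQL uses ALTER TABLE ... RENAME TO ...
RENAME employees TO staff;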
```

### 5. **TRUNCATE**

Used to remove all rows from a table while keeping the table itself. Unlike `DELETE`, it does not log individual row deletions, which makes it faster, but it cannot take a `WHERE` clause to remove only some rows.

```sql
TRUNCATE TABLE employees;
```

### 6. **COMMENT**

Used to add or modify comments on database objects.

```sql
COMMENT ON COLUMN employees.first_name IS 'Employee first name';
COMMENT ON TABLE employees IS 'Table containing employee data';
```

---

## **DQL - Data Query Language**

These commands are used to query or retrieve data from the database.

### 1. **SELECT**

Used to retrieve data from one or more tables. It can be combined with clauses like `WHERE`, `JOIN`, `GROUP BY`, `ORDER BY`, etc.

```sql
SELECT first_name, last_name FROM employees WHERE hire_date > '2020-01-01';
SELECT * FROM employees ORDER BY last_name;
```

---

## **DML - Data Manipulation Language**

These commands are used to manipulate data within database tables.

### 1. **INSERT**

Used to insert data into a table.

```sql
INSERT INTO employees (id, first_name, last_name, hire_date)
VALUES (1, 'John', 'Doe', '2021-07-15');
```

### 2. **UPDATE**

Used to modify existing data in a table.

```sql
UPDATE employees SET last_name = 'Smith' WHERE id = 1;
```

### 3. **DELETE**

Used to remove data from a table.

```sql
DELETE FROM employees WHERE id = 1;
```

### 4. **MERGE**

A combination of `INSERT` and `UPDATE`. It is used to synchronize two tables by inserting new rows or updating existing ones based on a condition.

```sql
MERGE INTO employees e
USING temp_employees t
ON (e.id = t.id)
WHEN MATCHED THEN
    UPDATE SET e.first_name = t.first_name
WHEN NOT MATCHED THEN
    INSERT (id, first_name, last_name) VALUES (t.id, t.first_name, t.last_name);
```
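
`MERGE` is not available in every database; PostgreSQL, for example, only added it in version 15. A common alternative there is `INSERT ... ON CONFLICT`, which SQLite also understands. The sketch below is only an illustration of that "insert or update" behavior: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: SQLite 3.24+ for ON CONFLICT).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO employees VALUES (1, 'John', 'Doe')")

# Hypothetical incoming rows: id 1 already exists, id 2 is new.
incoming = [(1, "Jon", "Doe"), (2, "Jane", "Roe")]

# Existing ids are updated, unknown ids are inserted.
conn.executemany(
    """
    INSERT INTO employees (id, first_name, last_name)
    VALUES (?, ?, ?)
    ON CONFLICT (id) DO UPDATE SET first_name = excluded.first_name
    """,
    incoming,
)

rows = conn.execute(
    "SELECT id, first_name, last_name FROM employees ORDER BY id"
).fetchall()
print(rows)
```

Row 1 keeps its id but takes the new first name, while row 2 is inserted, which is the same outcome the `MERGE` statement above describes.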

### 5. **CALL**

Used to execute a stored procedure.

```sql
CALL UpdateEmployeeSalary(1, 50000);
```

### 6. **EXPLAIN PLAN**

Used to display the execution plan for a query, showing how the database will execute the query.

```sql
EXPLAIN PLAN FOR SELECT * FROM employees WHERE hire_date > '2020-01-01';
```

### 7. **LOCK TABLE**

Used to lock a table to prevent other transactions from modifying it until the lock is released. This is often used in concurrent transaction scenarios to ensure data integrity.

```sql
LOCK TABLE employees IN EXCLUSIVE MODE;
```

---

## **DCL - Data Control Language**

These commands are used to control access to data in the database.

### 1. **GRANT**

Used to assign privileges or roles to users or roles in the database.

```sql
GRANT SELECT, INSERT ON employees TO user1;
```

### 2. **REVOKE**

Used to remove privileges or roles from users or roles in the database.

```sql
REVOKE SELECT, INSERT ON employees FROM user1;
```

---

## **TCL - Transaction Control Language**

These commands are used to manage the transactions in the database.

### 1. **COMMIT**

Used to save all changes made in the current transaction to the database.

```sql
COMMIT;
```

### 2. **ROLLBACK**

Used to undo changes made in the current transaction.

```sql
ROLLBACK;
```

### 3. **SAVEPOINT**

Used to set a savepoint within a transaction. This allows you to roll back to a specific point within the transaction.

```sql
SAVEPOINT before_update;
UPDATE employees SET last_name = 'Smith' WHERE id = 1;
ROLLBACK TO SAVEPOINT before_update;
```
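
To see what a partial rollback leaves behind, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database as a stand-in for a real server. The savepoint name and the sample row are assumptions made for the example.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/COMMIT are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("INSERT INTO employees VALUES (1, 'Doe')")

conn.execute("BEGIN")
conn.execute("UPDATE employees SET last_name = 'Smith' WHERE id = 1")  # kept
conn.execute("SAVEPOINT before_second_update")
conn.execute("UPDATE employees SET last_name = 'Jones' WHERE id = 1")  # undone
conn.execute("ROLLBACK TO SAVEPOINT before_second_update")
conn.execute("COMMIT")

last_name = conn.execute(
    "SELECT last_name FROM employees WHERE id = 1"
).fetchone()[0]
print(last_name)
```

Only the work done after the savepoint is discarded; the first update is still committed with the rest of the transaction.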

### 4. **SET TRANSACTION**

Used to set the properties of a transaction, such as its isolation level.

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
```

---

### Summary of SQL Commands

| **Category** | **Command**       | **Description**                                                                    |
| ------------ | ----------------- | ---------------------------------------------------------------------------------- |
| **DDL**      | `CREATE`          | Create a new table, index, or other database objects.                              |
|              | `ALTER`           | Modify an existing table, column, or other database object.                        |
|              | `DROP`            | Delete a table, index, or other database object.                                   |
|              | `RENAME`          | Rename an existing table or other object.                                          |
|              | `TRUNCATE`        | Remove all rows from a table without removing the table itself.                    |
|              | `COMMENT`         | Add comments to tables, columns, or other database objects.                        |
| **DQL**      | `SELECT`          | Retrieve data from one or more tables.                                             |
| **DML**      | `INSERT`          | Insert data into a table.                                                          |
|              | `UPDATE`          | Update existing data in a table.                                                   |
|              | `DELETE`          | Delete data from a table.                                                          |
|              | `MERGE`           | Synchronize data between two tables, performing both INSERT and UPDATE operations. |
|              | `CALL`            | Execute a stored procedure.                                                        |
|              | `EXPLAIN PLAN`    | Display the execution plan for a query.                                            |
|              | `LOCK TABLE`      | Lock a table for exclusive use in a transaction.                                   |
| **DCL**      | `GRANT`           | Grant privileges or roles to users or roles.                                       |
|              | `REVOKE`          | Remove privileges or roles from users or roles.                                    |
| **TCL**      | `COMMIT`          | Save all changes made in the current transaction.                                  |
|              | `ROLLBACK`        | Undo changes made in the current transaction.                                      |
|              | `SAVEPOINT`       | Set a point in a transaction to which you can later roll back.                     |
|              | `SET TRANSACTION` | Set the properties for a transaction, such as isolation level.                     |

---

These commands form the basis for interacting with relational databases, enabling users to define, query, manipulate, and control access to the data.


Here’s a **PostgreSQL cheat sheet** that summarizes the most important SQL commands, functions, and operations that you may encounter while working with PostgreSQL. It’s organized into categories for quick reference.

---

## **PostgreSQL Cheat Sheet**

### **1. Basic PostgreSQL Commands**

- **Login to PostgreSQL**

  ```sh
  psql -U username -d database_name
  ```

- **Connect to a specific database**

  ```sql
  \c database_name
  ```

- **List all databases**

  ```sql
  \l
  ```

- **List all tables in the current database**

  ```sql
  \dt
  ```

- **Describe a table structure**

  ```sql
  \d table_name
  ```

- **Exit psql**
  ```sql
  \q
  ```

---

### **2. Data Definition Language (DDL)**

#### **Create Table**

```sql
CREATE TABLE table_name (
    column1 datatype PRIMARY KEY,
    column2 datatype NOT NULL,
    column3 datatype DEFAULT 'value'
);
```

#### **Alter Table**

- **Add a column**

  ```sql
  ALTER TABLE table_name ADD COLUMN column_name datatype;
  ```

- **Drop a column**

  ```sql
  ALTER TABLE table_name DROP COLUMN column_name;
  ```

- **Modify a column type**

  ```sql
  ALTER TABLE table_name ALTER COLUMN column_name SET DATA TYPE new_datatype;
  ```

- **Rename a table**
  ```sql
  ALTER TABLE old_table_name RENAME TO new_table_name;
  ```

#### **Drop Table**

```sql
DROP TABLE table_name;
```

#### **Create Index**

```sql
CREATE INDEX index_name ON table_name (column_name);
```

#### **Drop Index**

```sql
DROP INDEX index_name;
```

---

### **3. Data Manipulation Language (DML)**

#### **Insert Data**

- **Insert one row**

  ```sql
  INSERT INTO table_name (column1, column2, column3)
  VALUES (value1, value2, value3);
  ```

- **Insert multiple rows**
  ```sql
  INSERT INTO table_name (column1, column2)
  VALUES
    (value1, value2),
    (value3, value4),
    (value5, value6);
  ```

#### **Select Data**

```sql
SELECT column1, column2 FROM table_name WHERE condition;
```

- **Select all columns**

  ```sql
  SELECT * FROM table_name;
  ```

- **Select with condition**

  ```sql
  SELECT column1 FROM table_name WHERE condition;
  ```

- **Limit the number of rows**

  ```sql
  SELECT * FROM table_name LIMIT 10;
  ```

- **Order by**

  ```sql
  SELECT * FROM table_name ORDER BY column_name DESC;
  ```

- **Aggregate functions** (COUNT, AVG, MAX, MIN, SUM)
  ```sql
  SELECT COUNT(*), AVG(salary) FROM employees;
  ```

#### **Update Data**

```sql
UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;
```

#### **Delete Data**

```sql
DELETE FROM table_name WHERE condition;
```

#### **Truncate Table** (Removes all rows, faster than DELETE)

```sql
TRUNCATE TABLE table_name;
```

---

### **4. Data Query Language (DQL)**

#### **JOINs**

- **INNER JOIN**

  ```sql
  SELECT * FROM table1
  INNER JOIN table2 ON table1.id = table2.id;
  ```

- **Left JOIN (or Left Outer JOIN)**

  ```sql
  SELECT * FROM table1
  LEFT JOIN table2 ON table1.id = table2.id;
  ```

- **Right JOIN (or Right Outer JOIN)**

  ```sql
  SELECT * FROM table1
  RIGHT JOIN table2 ON table1.id = table2.id;
  ```

- **Full JOIN**

  ```sql
  SELECT * FROM table1
  FULL JOIN table2 ON table1.id = table2.id;
  ```

- **Cross JOIN**
  ```sql
  SELECT * FROM table1
  CROSS JOIN table2;
  ```
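
The practical difference between an inner and a left join is easiest to see on a row that has no match. The following sketch uses Python's built-in `sqlite3` module with an in-memory database as a stand-in; the two small tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Bob', NULL);
    """
)

# INNER JOIN keeps only employees whose department_id matches a department.
inner_rows = conn.execute(
    """
    SELECT e.name, d.name FROM employees e
    INNER JOIN departments d ON e.department_id = d.id
    ORDER BY e.id
    """
).fetchall()

# LEFT JOIN keeps every employee and fills the department columns with NULL when unmatched.
left_rows = conn.execute(
    """
    SELECT e.name, d.name FROM employees e
    LEFT JOIN departments d ON e.department_id = d.id
    ORDER BY e.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```

Bob has no department, so he is missing from the inner join and appears with `None` in the left join.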

#### **Subqueries**

- **Subquery in WHERE**

  ```sql
  SELECT name FROM employees
  WHERE department_id = (SELECT id FROM departments WHERE name = 'Sales');
  ```

- **Subquery in FROM**
  ```sql
  SELECT * FROM (SELECT column1, column2 FROM table_name) AS subquery;
  ```

---

### **5. Data Control Language (DCL)**

#### **Grant Privileges**

```sql
GRANT ALL PRIVILEGES ON table_name TO username;
```

#### **Revoke Privileges**

```sql
REVOKE ALL PRIVILEGES ON table_name FROM username;
```

---

### **6. Data Types in PostgreSQL**

- **Numeric Types:**

  - `INTEGER`, `BIGINT`
  - `DECIMAL`, `NUMERIC`
  - `REAL`, `DOUBLE PRECISION`

- **String Types:**

  - `CHAR(n)`, `VARCHAR(n)`, `TEXT`

- **Date/Time Types:**

  - `DATE`, `TIME`, `TIMESTAMP`
  - `INTERVAL`

- **Boolean:**

  - `BOOLEAN`

- **Other Types:**

  - `UUID`, `BYTEA`
  - `JSON`, `JSONB`

---

### **7. Constraints in PostgreSQL**

- **Primary Key**

  ```sql
  PRIMARY KEY (column)
  ```

- **Foreign Key**

  ```sql
  FOREIGN KEY (column) REFERENCES other_table (column)
  ```

- **Unique**

  ```sql
  UNIQUE (column)
  ```

- **Check**

  ```sql
  CHECK (condition)
  ```

- **Not Null**
  ```sql
  column_name datatype NOT NULL
  ```
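
Constraints are enforced when data is written, so a violating row is rejected with an error instead of being stored. The sketch below shows this for `UNIQUE`, `NOT NULL`, and `CHECK`, using Python's built-in `sqlite3` module with an in-memory database as a stand-in; the `accounts` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        balance INTEGER CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO accounts (id, email, balance) VALUES (1, 'a@example.com', 10)")

# Each row below breaks exactly one constraint.
bad_rows = {
    "unique": (2, "a@example.com", 5),   # duplicate email
    "not_null": (3, None, 5),            # missing email
    "check": (4, "b@example.com", -1),   # negative balance
}

rejected = []
for label, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO accounts (id, email, balance) VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(label)
        print(f"{label}: {exc}")

remaining = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(remaining)
```

All three inserts fail, and the table still holds only the one valid row.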

---

### **8. Indexing in PostgreSQL**

#### **Create Index**

```sql
CREATE INDEX index_name ON table_name (column_name);
```

#### **Unique Index**

```sql
CREATE UNIQUE INDEX index_name ON table_name (column_name);
```

#### **Drop Index**

```sql
DROP INDEX index_name;
```

---

### **9. Functions and Operators**

#### **String Functions**

- **Concatenate strings**

  ```sql
  SELECT CONCAT(column1, ' ', column2) FROM table_name;
  ```

- **Substring**

  ```sql
  SELECT SUBSTRING(column_name FROM 1 FOR 3) FROM table_name;
  ```

- **Length of string**
  ```sql
  SELECT LENGTH(column_name) FROM table_name;
  ```

#### **Date/Time Functions**

- **Get current date and time**

  ```sql
  SELECT CURRENT_TIMESTAMP;
  ```

- **Extract year**

  ```sql
  SELECT EXTRACT(YEAR FROM timestamp_column) FROM table_name;
  ```

- **Date addition**
  ```sql
  SELECT timestamp_column + INTERVAL '1 day' FROM table_name;
  ```

#### **Aggregation Functions**

- **Count rows**

  ```sql
  SELECT COUNT(*) FROM table_name;
  ```

- **Sum**

  ```sql
  SELECT SUM(column_name) FROM table_name;
  ```

- **Average**
  ```sql
  SELECT AVG(column_name) FROM table_name;
  ```

---

### **10. Transactions in PostgreSQL**

#### **Start Transaction**

```sql
BEGIN;
```

#### **Commit Transaction**

```sql
COMMIT;
```

#### **Rollback Transaction**

```sql
ROLLBACK;
```

#### **Savepoint**

```sql
SAVEPOINT savepoint_name;
```

#### **Rollback to Savepoint**

```sql
ROLLBACK TO SAVEPOINT savepoint_name;
```

---

### **11. PostgreSQL Schemas**

#### **Create Schema**

```sql
CREATE SCHEMA schema_name;
```

#### **List Schemas**

```sql
\dn
```

#### **Set Search Path**

```sql
SET search_path TO schema_name;
```

---

### **12. Backup & Restore**

#### **Backup a Database**

```sh
pg_dump dbname > backup_file.sql
```

#### **Restore a Database**

```sh
psql dbname < backup_file.sql
```

---

### **13. Miscellaneous**

#### **Show Current Database**

```sql
SELECT current_database();
```

#### **Show Current User**

```sql
SELECT current_user;
```

#### **Generate UUID**

```sql
-- Built in since PostgreSQL 13; earlier versions need the pgcrypto extension.
SELECT gen_random_uuid();
```

---

This cheat sheet covers the PostgreSQL commands and functions needed for most basic and intermediate tasks.


In [5]:
import pandas as pd
import psycopg
from contextlib import contextmanager


@contextmanager
def get_connection(config):
    """
    Context manager to handle PostgreSQL connection lifecycle.
    Ensures that the connection is closed after use.

    :param config: Dictionary containing database connection parameters.
    """
    try:
        conn = psycopg.connect(
            dbname=config["dbname"],
            user=config["user"],
            password=config["password"],
            host=config["host"],
            port=config["port"],
        )
    except psycopg.Error as e:
        # Only failures raised by connect() itself are reported as connection errors.
        print(f"Error connecting to database {config['dbname']}: {e}")
        raise

    # Announce the connection before handing it to the caller, not after the block ends.
    print(f"Connected to database {config['dbname']} successfully.")
    try:
        yield conn
    finally:
        conn.close()
        print(f"Connection to database {config['dbname']} closed.")


def fetch_column_names_for_query(conn, query, params=None):
    """
    Fetch column names based on the query.

    :param conn: A psycopg connection object.
    :param query: SQL query to analyze.
    :param params: Optional parameters for the SQL query.
    :return: A list of column names.
    """
    # A trailing semicolon would break the subquery wrapper below.
    inner_query = query.strip().rstrip(";")
    with conn.cursor() as cursor:
        try:
            # Pass params too, so queries with placeholders can be analyzed.
            cursor.execute(f"SELECT * FROM ({inner_query}) AS subquery LIMIT 0;", params)
            return [desc[0] for desc in cursor.description]
        except Exception as e:
            print(f"Error executing query to fetch column names: {e}")
            raise


def fetch_data_in_chunks(conn, query, chunk_size=10000, params=None):
    """
    Fetch data in chunks to handle large datasets.

    :param conn: A psycopg connection object.
    :param query: SQL query to execute.
    :param chunk_size: Number of rows to fetch per chunk.
    :param params: Optional parameters for the SQL query.
    :return: A generator that yields chunks of data.
    """
    with conn.cursor() as cursor:
        cursor.execute(query, params)
        while True:
            data = cursor.fetchmany(chunk_size)
            if not data:
                break
            yield data


def run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=None
):
    """
    Execute a user query across multiple databases and return the results as a pandas DataFrame.

    :param db_configs: List of dictionaries containing database connection parameters.
    :param user_query: SQL query provided by the user.
    :param chunk_size: Number of rows to fetch per chunk.
    :param params: Optional parameters for the SQL query.
    :return: A pandas DataFrame with the combined results from all databases.
             The DataFrame includes a column for the database name.
    """
    dfs = []

    for config in db_configs:
        try:
            with get_connection(config) as conn:
                # Fetch the correct column names based on the user query
                column_names = fetch_column_names_for_query(conn, user_query, params)

                # Fetch data in chunks and append to DataFrames list
                rows_seen = False
                for chunk in fetch_data_in_chunks(conn, user_query, chunk_size, params):
                    rows_seen = True
                    # All rows of one result set have the same width, so checking
                    # the first row of the chunk is enough.
                    if len(chunk[0]) != len(column_names):
                        print(
                            f"Warning: Number of columns in result from {config['dbname']} does not match the provided column names."
                        )
                        continue

                    # Create DataFrame and append to the list
                    df = pd.DataFrame(chunk, columns=column_names)
                    df['database'] = config['dbname']  # Add the database name as a column
                    dfs.append(df)

                # fetch_data_in_chunks never yields an empty chunk, so an empty
                # result is reported here instead of inside the loop.
                if not rows_seen:
                    print(f"No data returned from {config['dbname']}.")
        except Exception as e:
            print(f"Error processing database {config['dbname']}: {e}")

    # Concatenate all DataFrames into a single DataFrame
    combined_df = pd.concat(dfs, ignore_index=True) if dfs else pd.DataFrame()

    # Log message if no data was retrieved
    if combined_df.empty:
        print("No data was retrieved from the databases.")

    return combined_df


# Example usage
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = "SELECT * FROM employees ORDER BY emp_no ASC"  # Adjust the query as needed
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)
print(result_df)


Connected to database Employees successfully.
Connection to database Employees closed.
        emp_no  birth_date first_name last_name gender   hire_date   database
0        10001  1953-09-02     Georgi   Facello      M  1986-06-26  Employees
1        10002  1964-06-02    Bezalel    Simmel      F  1985-11-21  Employees
2        10003  1959-12-03      Parto   Bamford      M  1986-08-28  Employees
3        10004  1954-05-01  Chirstian   Koblick      M  1986-12-01  Employees
4        10005  1955-01-21    Kyoichi  Maliniak      M  1989-09-12  Employees
...        ...         ...        ...       ...    ...         ...        ...
300019  499995  1958-09-24     Dekang  Lichtner      F  1993-01-12  Employees
300020  499996  1953-03-07       Zito      Baaz      M  1990-09-27  Employees
300021  499997  1961-08-03    Berhard    Lenart      M  1986-04-21  Employees
300022  499998  1956-09-05   Patricia   Breugel      M  1993-10-13  Employees
300023  499999  1958-05-01     Sachin   Tsukuda      M 
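

The chunked-fetch pattern used by `fetch_data_in_chunks` can be tried without a PostgreSQL server. The sketch below is a stand-in that uses Python's built-in `sqlite3` module; `iter_chunks` and the `numbers` table are hypothetical names chosen for the example, not part of the notebook's code.

```python
import sqlite3


def iter_chunks(conn, query, chunk_size, params=()):
    """Yield lists of at most chunk_size rows until the result set is exhausted."""
    cursor = conn.cursor()
    try:
        cursor.execute(query, params)
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            yield rows
    finally:
        cursor.close()


# Stand-in data: 25 rows in an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(25)])

chunk_sizes = [
    len(chunk) for chunk in iter_chunks(conn, "SELECT n FROM numbers ORDER BY n", 10)
]
print(chunk_sizes)
```

The last chunk is simply shorter than the others, and the generator never yields an empty list. As far as I know, psycopg's default client-side cursor transfers the whole result set when `execute()` runs, so `fetchmany` there limits the size of each batch handed to pandas rather than the memory held by the driver; a server-side (named) cursor is the usual option when the result itself must be streamed.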

In [6]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = "SELECT * FROM employees"  # Adjust the query as needed
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
        emp_no  birth_date first_name last_name gender   hire_date   database
0        10001  1953-09-02     Georgi   Facello      M  1986-06-26  Employees
1        10002  1964-06-02    Bezalel    Simmel      F  1985-11-21  Employees
2        10003  1959-12-03      Parto   Bamford      M  1986-08-28  Employees
3        10004  1954-05-01  Chirstian   Koblick      M  1986-12-01  Employees
4        10005  1955-01-21    Kyoichi  Maliniak      M  1989-09-12  Employees
...        ...         ...        ...       ...    ...         ...        ...
300019  499995  1958-09-24     Dekang  Lichtner      F  1993-01-12  Employees
300020  499996  1953-03-07       Zito      Baaz      M  1990-09-27  Employees
300021  499997  1961-08-03    Berhard    Lenart      M  1986-04-21  Employees
300022  499998  1956-09-05   Patricia   Breugel      M  1993-10-13  Employees
300023  499999  1958-05-01     Sachin   Tsukuda      M 

In [7]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."departments"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
  dept_no           dept_name   database
0    d001           Marketing  Employees
1    d002             Finance  Employees
2    d003     Human Resources  Employees
3    d004          Production  Employees
4    d005         Development  Employees
5    d006  Quality Management  Employees
6    d007               Sales  Employees
7    d008            Research  Employees
8    d009    Customer Service  Employees


In [8]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."salaries"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
         emp_no  salary   from_date     to_date   database
0         10001   60117  1986-06-26  1987-06-26  Employees
1         10001   62102  1987-06-26  1988-06-25  Employees
2         10001   66074  1988-06-25  1989-06-25  Employees
3         10001   66596  1989-06-25  1990-06-25  Employees
4         10001   66961  1990-06-25  1991-06-25  Employees
...         ...     ...         ...         ...        ...
2844042  499999   63707  1997-11-30  1998-11-30  Employees
2844043  499999   67043  1998-11-30  1999-11-30  Employees
2844044  499999   70745  1999-11-30  2000-11-29  Employees
2844045  499999   74327  2000-11-29  2001-11-29  Employees
2844046  499999   77303  2001-11-29  9999-01-01  Employees

[2844047 rows x 5 columns]


In [9]:
db_configs = [
    {
        "dbname": "Store",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."customers" 
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Store successfully.
Connection to database Store closed.
       customerid firstname    lastname             address1 address2  \
0               1    VKUUXF  ITHOMQJNYX  4608499546 Dell Way     None   
1               2    HQNMZH  UNUKXHJVXB  5119315633 Dell Way     None   
2               3    JTNRNB  LYYSHTQJRE  6297761196 Dell Way     None   
3               4    XMFYXD  WQLQHUHLFE  9862764981 Dell Way     None   
4               5    PGDTDU  ETBYBNEGUT  2841895775 Dell Way     None   
...           ...       ...         ...                  ...      ...   
19995       19996    KKDZUC  NSAXRLLPEM  5392978326 Dell Way     None   
19996       19997    GSGAVT  SWRJHDYMQA  3311555452 Dell Way     None   
19997       19998    MQTIKL  EBVMFUZVUU  7635641998 Dell Way     None   
19998       19999    XSKHEO  JZBOUAKOVK  5488070628 Dell Way     None   
19999       20000    WODADM  AEBUFMJAWZ  6224597470 Dell Way     None   

          city state    zip    country  regi

In [10]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."salaries" WHERE emp_no=10001
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
    emp_no  salary   from_date     to_date   database
0    10001   60117  1986-06-26  1987-06-26  Employees
1    10001   62102  1987-06-26  1988-06-25  Employees
2    10001   66074  1988-06-25  1989-06-25  Employees
3    10001   66596  1989-06-25  1990-06-25  Employees
4    10001   66961  1990-06-25  1991-06-25  Employees
5    10001   71046  1991-06-25  1992-06-24  Employees
6    10001   74333  1992-06-24  1993-06-24  Employees
7    10001   75286  1993-06-24  1994-06-24  Employees
8    10001   75994  1994-06-24  1995-06-24  Employees
9    10001   76884  1995-06-24  1996-06-23  Employees
10   10001   80013  1996-06-23  1997-06-23  Employees
11   10001   81025  1997-06-23  1998-06-23  Employees
12   10001   81097  1998-06-23  1999-06-23  Employees
13   10001   84917  1999-06-23  2000-06-22  Employees
14   10001   85112  2000-06-22  2001-06-22  Employees
15   10001   85097  2001-06-22  2002-06-22  Emplo

In [11]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT emp_no AS "Employee #", birth_date AS "Birthday", first_name AS "First name" FROM "public"."employees"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
        Employee #    Birthday First name   database
0            10001  1953-09-02     Georgi  Employees
1            10002  1964-06-02    Bezalel  Employees
2            10003  1959-12-03      Parto  Employees
3            10004  1954-05-01  Chirstian  Employees
4            10005  1955-01-21    Kyoichi  Employees
...            ...         ...        ...        ...
300019      499995  1958-09-24     Dekang  Employees
300020      499996  1953-03-07       Zito  Employees
300021      499997  1961-08-03    Berhard  Employees
300022      499998  1956-09-05   Patricia  Employees
300023      499999  1958-05-01     Sachin  Employees

[300024 rows x 4 columns]


In [12]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT CONCAT(emp_no, ' is a ', title) AS "Employee Title" FROM "public"."titles"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
                     Employee Title   database
0        10001 is a Senior Engineer  Employees
1                  10002 is a Staff  Employees
2        10003 is a Senior Engineer  Employees
3               10004 is a Engineer  Employees
4        10004 is a Senior Engineer  Employees
...                             ...        ...
443303         499997 is a Engineer  Employees
443304  499997 is a Senior Engineer  Employees
443305     499998 is a Senior Staff  Employees
443306            499998 is a Staff  Employees
443307         499999 is a Engineer  Employees

[443308 rows x 2 columns]


In [13]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT emp_no, CONCAT(first_name, ' ', last_name) AS "Full name" FROM "public"."employees"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
        emp_no          Full name   database
0        10001     Georgi Facello  Employees
1        10002     Bezalel Simmel  Employees
2        10003      Parto Bamford  Employees
3        10004  Chirstian Koblick  Employees
4        10005   Kyoichi Maliniak  Employees
...        ...                ...        ...
300019  499995    Dekang Lichtner  Employees
300020  499996          Zito Baaz  Employees
300021  499997     Berhard Lenart  Employees
300022  499998   Patricia Breugel  Employees
300023  499999     Sachin Tsukuda  Employees

[300024 rows x 3 columns]


In [14]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT COUNT(emp_no) FROM "public"."employees"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
    count   database
0  300024  Employees


In [15]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT SUM(salary) FROM "public"."salaries"
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
            sum   database
0  181480757419  Employees


In [16]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."employees" WHERE first_name='Mayumi' AND last_name='Schueller'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
   emp_no  birth_date first_name  last_name gender   hire_date   database
0   10054  1957-04-04     Mayumi  Schueller      M  1995-03-13  Employees


In [17]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."employees" WHERE gender='F'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
        emp_no  birth_date first_name  last_name gender   hire_date   database
0        10002  1964-06-02    Bezalel     Simmel      F  1985-11-21  Employees
1        10006  1953-04-20     Anneke    Preusig      F  1989-06-02  Employees
2        10007  1957-05-23    Tzvetan  Zielinski      F  1989-02-10  Employees
3        10009  1952-04-19     Sumant       Peac      F  1985-02-18  Employees
4        10010  1963-06-01  Duangkaew   Piveteau      F  1989-08-24  Employees
...        ...         ...        ...        ...    ...         ...        ...
120046  499988  1962-09-28   Bangqing    Kleiser      F  1986-06-06  Employees
120047  499991  1962-02-26      Pohua    Sichman      F  1989-01-12  Employees
120048  499992  1960-10-12     Siamak   Salverda      F  1987-05-10  Employees
120049  499994  1952-02-26      Navin    Argence      F  1990-04-24  Employees
120050  499995  1958-09-24     Dekang   Lich

In [18]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT first_name, last_name, hire_date FROM "public"."employees" 
WHERE first_name='Georgi' AND last_name='Facello' AND hire_date='1986-06-26'
OR first_name='Bezalel' AND last_name='Simmel'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
  first_name last_name   hire_date   database
0     Georgi   Facello  1986-06-26  Employees
1    Bezalel    Simmel  1985-11-21  Employees


In [19]:
db_configs = [
    {
        "dbname": "Employees",
        "user": "postgres",
        "password": "root",
        "host": "localhost",
        "port": "5432",
    }
]

# Sample query
user_query = """
SELECT * FROM "public"."employees" 
WHERE first_name='Georgi' AND last_name='Facello' AND hire_date='1986-06-26'
OR first_name='Bezalel' AND last_name='Simmel'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

print(result_df)

Connected to database Employees successfully.
Connection to database Employees closed.
   emp_no  birth_date first_name last_name gender   hire_date   database
0   10001  1953-09-02     Georgi   Facello      M  1986-06-26  Employees
1   10002  1964-06-02    Bezalel    Simmel      F  1985-11-21  Employees


In [20]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT COUNT(firstname) FROM "public"."customers" 
WHERE gender='F' AND (state='OR' OR state='NY')
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df

Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,count,database
0,200,Store


In [21]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT * FROM "public"."customers" 
WHERE NOT age='55' AND NOT age='20'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

len(result_df)

Connected to database Store successfully.
Connection to database Store closed.


19419

In [22]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT * FROM "public"."customers" 
WHERE age>'44' AND income='100000'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,customerid,firstname,lastname,address1,address2,city,state,zip,country,region,...,phone,creditcardtype,creditcard,creditcardexpiration,username,password,age,income,gender,database
0,1,VKUUXF,ITHOMQJNYX,4608499546 Dell Way,,QSDPAGD,SD,24101,US,1,...,4608499546,1,1979279217775911,2012/03,user1,password,55,100000,M,Store
1,3,JTNRNB,LYYSHTQJRE,6297761196 Dell Way,,LWVIFXJ,OH,96082,US,1,...,6297761196,4,8728086929768325,2010/12,user3,password,47,100000,M,Store
2,6,FXDZBW,BAXPEEKXVJ,6192740010 Dell Way,,OPLRCNT,IN,99300,US,1,...,6192740010,5,7730283664073796,2011/01,user6,password,72,100000,M,Store
3,9,NCGWRC,CJOPRHUHIE,7291678624 Dell Way,,ZAVIELY,VT,78838,US,1,...,7291678624,1,7172072122339160,2009/10,user9,password,86,100000,M,Store
4,15,SIQANV,QQNKJSURDA,3354132892 Dell Way,,BREQSOA,AK,37471,US,1,...,3354132892,4,8717996907886119,2008/05,user15,password,66,100000,M,Store


---

```sql
SELECT state, gender FROM customers
WHERE gender = 'F' AND state = 'OR' OR state = 'NY';
```

GET ME STATE AND GENDER WHERE YOU ARE:
    A FEMALE FROM OREGON 
            OR
    YOU ARE FROM NY 

---

---

```sql
SELECT state, gender FROM customers
WHERE gender = 'F' AND state = 'OR' 
OR gender = 'F' AND state = 'NY';
```

GET ME STATE AND GENDER WHERE YOU ARE:
        A FEMALE FROM OREGON 
        OR A FEMALE FROM NY

---

---

```sql
SELECT state, gender FROM customers
WHERE gender = 'F' AND (state = 'OR' OR state = 'NY');
```

GET ME STATE AND GENDER WHERE YOU ARE:
        A FEMALE AND 
    YOU ARE FROM OREGON OR NY
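
The difference between the unparenthesized and parenthesized filters can be checked on a tiny table. This is a minimal sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL (both give `AND` higher precedence than `OR`); the five rows and the `matches` helper are made up for illustration and are not the Store data.

```python
import sqlite3

# In-memory stand-in for the customers table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (state TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("OR", "F"), ("OR", "M"), ("NY", "F"), ("NY", "M"), ("CA", "F")],
)


def matches(where_clause):
    """Return the sorted (state, gender) rows that satisfy the WHERE clause."""
    sql = f"SELECT state, gender FROM customers WHERE {where_clause}"
    return sorted(conn.execute(sql).fetchall())


# AND binds tighter than OR: (female AND Oregon) OR New York.
no_parens = matches("gender = 'F' AND state = 'OR' OR state = 'NY'")

# Parentheses force the OR to be evaluated first: female AND (Oregon OR New York).
with_parens = matches("gender = 'F' AND (state = 'OR' OR state = 'NY')")

print(no_parens)    # includes the male customer from NY
print(with_parens)  # only female customers
```

Without parentheses the male customer from NY is returned as well, which is rarely what the query author intended.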

---

---

A STATEMENT HAVING MULTIPLE OPERATORS IS EVALUATED BASED ON THE PRIORITY OF OPERATORS.


```sql
SELECT state, gender FROM customers
WHERE gender = 'F' AND state = 'OR' 
OR gender = 'F' AND state = 'NY';
```

FILTER 1
1. GENDER FEMALE
2. FROM OREGON

FILTER 2
1. GENDER FEMALE
2. FROM NY


---

---

```sql
SELECT state, gender FROM customers
WHERE gender = 'F' AND (state = 'OR' OR state = 'NY');
```
1. FROM OREGON
2. FROM NY
3. GENDER FEMALE

---

---

IF THE OPERATORS HAVE EQUAL PRECEDENCE, THEN THE OPERATORS ARE EVALUATED DIRECTIONALLY,
FROM LEFT TO RIGHT OR RIGHT TO LEFT.

```sql
age > 20
AND salary > 1000
AND gender = 'f'
AND NOT state = 'NY'
```

1. NOT FROM NY
2. OLDER THAN 20
3. SALARY > 1000
4. GENDER FEMALE

---

---

IF THE OPERATORS HAVE EQUAL PRECEDENCE, THEN THE OPERATORS ARE EVALUATED DIRECTIONALLY,
FROM LEFT TO RIGHT OR RIGHT TO LEFT.

```sql
(age > 20 OR age < 30)
AND salary > 1000
AND NOT state = 'NY'
AND NOT state = 'OR'
```

1. OLDER THAN 20 OR YOUNGER THAN 30 (TRUE FOR EVERY NON-NULL AGE)
2. NOT FROM NY
3. NOT FROM OR
4. SALARY > 1000

---

---

IF THE OPERATORS HAVE EQUAL PRECEDENCE, THEN THE OPERATORS ARE EVALUATED DIRECTIONALLY,
FROM LEFT TO RIGHT OR RIGHT TO LEFT.

```sql
age > 20 
OR age < 30
AND salary > 1000
AND NOT state = 'NY'
AND NOT state = 'OR'
```

FILTER 1
1. YOUNGER THAN 30
2. NOT FROM NY
3. NOT FROM OR
4. SALARY > 1000

FILTER 2
1. OLDER THAN 20

---

---

IF THE OPERATORS HAVE EQUAL PRECEDENCE, THEN THE OPERATORS ARE EVALUATED DIRECTIONALLY,
FROM LEFT TO RIGHT OR RIGHT TO LEFT.

```sql
age > 20 
OR age < 30
OR salary > 1000
```

FILTER 1
1. OLDER THAN 20

FILTER 2
1. YOUNGER THAN 30

FILTER 3
1. SALARY > 1000


---

---

IF THE OPERATORS HAVE EQUAL PRECEDENCE, THEN THE OPERATORS ARE EVALUATED DIRECTIONALLY,
FROM LEFT TO RIGHT OR RIGHT TO LEFT.

```sql
(
    salary > 10000 AND state = 'NY'
        OR (
            (age > 20 AND age < 30)
            AND salary <= 20000
        )
)
AND gender = 'F'
```

FILTER 1
1. SALARY > 10000
2. FROM NY
3. FEMALE

FILTER 2
1. BETWEEN 21 AND 29
2. SALARY LOWER THAN OR EQUAL TO 20000
3. FEMALE
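
To confirm that the nested condition really splits into the two filters listed above, the sketch below runs it against a handful of made-up rows in an in-memory SQLite database (a stand-in for PostgreSQL). The `people` table, its rows, and the `names` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (name TEXT, age INTEGER, salary INTEGER, state TEXT, gender TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("a", 40, 15000, "NY", "F"),  # passes filter 1
        ("b", 25, 18000, "CA", "F"),  # passes filter 2
        ("c", 25, 18000, "CA", "M"),  # fails the gender check
        ("d", 40, 15000, "CA", "F"),  # passes neither filter
        ("e", 30, 5000, "TX", "F"),   # age 30 is excluded by age < 30
    ],
)


def names(where_clause):
    """Return the sorted names of the rows that satisfy the WHERE clause."""
    sql = f"SELECT name FROM people WHERE {where_clause}"
    return sorted(row[0] for row in conn.execute(sql))


nested = names(
    "(salary > 10000 AND state = 'NY' "
    " OR ((age > 20 AND age < 30) AND salary <= 20000)) "
    "AND gender = 'F'"
)
filter_1 = names("salary > 10000 AND state = 'NY' AND gender = 'F'")
filter_2 = names("age > 20 AND age < 30 AND salary <= 20000 AND gender = 'F'")

print(nested, filter_1, filter_2)
```

The nested query returns exactly the union of the two simpler filters.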




---

---

SHOULD I USE NULL?

BE CAREFUL
BE MINDFUL
BE DELIBERATE

---

---

```sql
1 = 1 -- true
1 != 1 -- false
null = null -- null
null <> null -- null
```
 
---

---

NO MATTER WHAT YOU DO WITH NULL (SUBTRACT, DIVIDE, COMPARE FOR EQUALITY, ...), THE RESULT WILL ALWAYS BE NULL.
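
A quick way to see this behaviour is to evaluate a few expressions directly. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL; SQL `NULL` comes back as Python `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparison and arithmetic with NULL yield NULL; only IS NULL gives a real answer.
row = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, 10 - NULL, 10 / NULL, NULL IS NULL"
).fetchone()

print(row)
```

Only the last column, which uses `IS NULL`, produces a non-NULL result (SQLite reports true as `1`).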

---

---

### THE "IS" OPERATOR

ALLOWS YOU TO FILTER ON VALUES THAT ARE NULL, NOT NULL, TRUE OR FALSE

```sql
SELECT * FROM <table>
WHERE <field> IS [NOT] NULL
```

```sql
SELECT * FROM <table>
WHERE <field> = '' IS NOT NULL
```

```sql
SELECT * FROM <table>
WHERE <field> = '' IS NOT FALSE
```

```sql
SELECT * FROM users
WHERE age = 20 IS FALSE
```
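
The sketch below shows why `IS` is needed: `= NULL` never matches, while `IS NULL` does. It uses an in-memory SQLite table as a stand-in for PostgreSQL. The `users` rows and the `names` helper are made up, and `(age = 20) IS 0` is SQLite's spelling of what PostgreSQL writes as `age = 20 IS FALSE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ann", 20), ("bob", None), ("cy", 35)],
)


def names(where_clause):
    """Return the names of matching users in alphabetical order."""
    sql = f"SELECT name FROM users WHERE {where_clause} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


eq_null = names("age = NULL")           # the comparison is NULL, so nothing passes
is_null = names("age IS NULL")          # finds the row with a missing age
is_not_null = names("age IS NOT NULL")  # finds the rows with a known age
is_false = names("(age = 20) IS 0")     # age known and different from 20

print(eq_null, is_null, is_not_null, is_false)
```

Note that the row with a NULL age is excluded from `is_false`: a NULL comparison is neither true nor false.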


---

In [24]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT * FROM "public"."employees" 
WHERE emp_no IN (100001, 100006, 11008)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,birth_date,first_name,last_name,gender,hire_date,database
0,11008,1962-07-11,Gennady,Menhoudj,M,1988-09-18,Employees
1,100001,1953-02-07,Jasminko,Antonakopoulos,M,1994-12-25,Employees
2,100006,1956-07-13,Janalee,Himler,F,1986-01-15,Employees


In [25]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT * FROM "public"."employees" 
WHERE first_name LIKE 'M%y'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,birth_date,first_name,last_name,gender,hire_date,database
0,10011,1953-11-07,Mary,Sluis,F,1990-01-22,Employees
1,10042,1956-02-26,Magy,Stamatiou,F,1993-03-21,Employees
2,10281,1953-05-13,Moty,Kusakari,M,1994-12-03,Employees
3,10532,1959-08-31,Mary,Wossner,F,1986-05-18,Employees
4,10707,1957-06-04,Moty,Trystram,M,1990-12-24,Employees


In [26]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT * FROM "public"."employees" 
WHERE first_name ILIKE 'g%y'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,birth_date,first_name,last_name,gender,hire_date,database
0,10055,1956-06-06,Georgy,Dredge,M,1992-04-27,Employees
1,11008,1962-07-11,Gennady,Menhoudj,M,1988-09-18,Employees
2,11152,1962-11-13,Georgy,Walstra,F,1988-02-18,Employees
3,11474,1953-04-17,Gennady,Akazan,M,1990-07-26,Employees
4,11514,1963-08-17,Georgy,Iwayama,M,1988-07-30,Employees
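
The two wildcard queries above differ only in case sensitivity. As a rough sketch of how the patterns match, the code below translates a `LIKE` pattern into a regular expression: `%` matches any run of characters and `_` matches exactly one. The helpers `like_to_regex` and `sql_like` are hypothetical, ignore `ESCAPE` handling, and only approximate what PostgreSQL does.

```python
import re


def like_to_regex(pattern, case_insensitive=False):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    flags = re.DOTALL | (re.IGNORECASE if case_insensitive else 0)
    return re.compile("".join(parts), flags)


def sql_like(value, pattern, case_insensitive=False):
    """LIKE (or ILIKE when case_insensitive=True) must match the whole value."""
    return like_to_regex(pattern, case_insensitive).fullmatch(value) is not None


names = ["Mary", "Magy", "Moty", "mary", "Mark", "Georgy", "Gennady"]
like_hits = [n for n in names if sql_like(n, "M%y")]
ilike_hits = [n for n in names if sql_like(n, "g%y", case_insensitive=True)]

print(like_hits)
print(ilike_hits)
```

`LIKE 'M%y'` skips the lowercase `mary`, while the `ILIKE`-style match on `'g%y'` accepts names starting with an uppercase `G`.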


---

### SHOW TIMEZONE;
### ALTER USER postgres SET timezone='UTC';


POSTGRESQL USES ISO-8601

HOW DO DATES LOOK?

YYYY-MM-DDTHH:MM:SS
2017-08-17T12:47:16+02:00


TIMESTAMPS

A TIMESTAMP IS A DATE WITH A TIME; TIMESTAMP WITH TIME ZONE ALSO CARRIES TIMEZONE INFO

---

---

CREATE TABLE timezones (
    ts TIMESTAMP WITHOUT TIME ZONE,
    tz TIMESTAMP WITH TIME ZONE
);

INSERT INTO timezones VALUES(
    TIMESTAMP WITHOUT TIME ZONE '2000-01-01 10:00:00-05',
    TIMESTAMP WITH TIME ZONE '2000-01-01 10:00:00-05'
);
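
The two columns treat the same literal differently: PostgreSQL silently ignores the `-05` offset for `TIMESTAMP WITHOUT TIME ZONE`, while `TIMESTAMP WITH TIME ZONE` keeps the instant and stores it in UTC. The sketch below mimics that with Python's `datetime`; it is an analogy for the two types, not output from the `timezones` table.

```python
from datetime import datetime, timedelta, timezone

# The literal '2000-01-01 10:00:00-05' as an offset-aware value.
literal = datetime(2000, 1, 1, 10, 0, 0, tzinfo=timezone(timedelta(hours=-5)))

# TIMESTAMP WITHOUT TIME ZONE: the offset is dropped, the wall-clock time is kept.
ts = literal.replace(tzinfo=None)

# TIMESTAMP WITH TIME ZONE: the instant is kept, shown here in UTC.
tz = literal.astimezone(timezone.utc)

print(ts.isoformat())  # 2000-01-01T10:00:00
print(tz.isoformat())  # 2000-01-01T15:00:00+00:00
```

When the `tz` column is queried, PostgreSQL displays the stored instant in the session's `timezone` setting, which is why `SHOW TIMEZONE` matters.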

---

In [27]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT now()
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,now,database
0,2025-05-02 13:13:26.610076+05:30,Employees


In [28]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT now()::date
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,now,database
0,2025-05-02,Employees


In [29]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT CURRENT_DATE
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,current_date,database
0,2025-05-02,Employees


In [30]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT to_char( CURRENT_DATE, 'dd/mm/yyyy' )
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,to_char,database
0,02/05/2025,Employees


In [31]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT to_char( CURRENT_DATE, 'DDD' )
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,to_char,database
0,122,Employees


In [32]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT now() - '2001/07/26'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,?column?,database
0,8681 days 13:13:26.840899,Employees


In [33]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT date '1800/01/01'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,date,database
0,1800-01-01,Employees


In [34]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT AGE(date '2001/07/26')
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,age,database
0,8672 days,Employees


In [35]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT AGE(date '1992/11/13', date '1800/01/01')
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,age,database
0,70392 days,Employees
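
`AGE` returns an interval made of years, months and days, not a plain day count. The `70392 days` above looks like that interval flattened with 365-day years and 30-day months. That is an assumption about how the client library converts intervals, not something this notebook confirms. The sketch below compares that flattening with the real number of calendar days.

```python
from datetime import date

start, end = date(1800, 1, 1), date(1992, 11, 13)

# Real number of calendar days between the two dates (leap years included).
actual_days = (end - start).days

# Field-by-field difference, as AGE would report it. This simple subtraction
# is only valid here because no month or day borrow is needed.
years = end.year - start.year
months = end.month - start.month
days = end.day - start.day

# Assumed flattening: 365-day years and 30-day months.
flattened_days = years * 365 + months * 30 + days

print(years, months, days)
print(actual_days, flattened_days)
```

If an exact day count is needed, subtracting two `date` values in SQL (`date '1992-11-13' - date '1800-01-01'`) returns an integer number of days and avoids the approximation.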


In [36]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT EXTRACT (DAY FROM date '1992/11/13') AS DAY
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,day,database
0,13,Employees


In [37]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT EXTRACT (MONTH FROM date '1992/11/13') AS MONTH
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,month,database
0,11,Employees


In [38]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT EXTRACT (YEAR FROM date '1992/11/13') AS YEAR
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,year,database
0,1992,Employees


In [39]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT date_trunc( 'year', date '1992/11/13' )
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,date_trunc,database
0,1992-01-01 00:00:00+05:30,Employees


In [40]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT date_trunc( 'month', date '1992/11/13' )
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,date_trunc,database
0,1992-11-01 00:00:00+05:30,Employees


In [41]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT date_trunc( 'day', date '1992/11/13' )
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)

Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,date_trunc,database
0,1992-11-13 00:00:00+05:30,Employees
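
For comparison, here is a rough Python equivalent of `EXTRACT` and `date_trunc` on the same date. The `date_trunc` function below is a hypothetical helper that handles only the three units used above and returns naive datetimes, so it does not reproduce the `+05:30` session offset shown in the outputs.

```python
from datetime import date, datetime


def date_trunc(unit, d):
    """Zero out every field smaller than `unit` ('year', 'month' or 'day')."""
    if unit == "year":
        return datetime(d.year, 1, 1)
    if unit == "month":
        return datetime(d.year, d.month, 1)
    if unit == "day":
        return datetime(d.year, d.month, d.day)
    raise ValueError(f"unsupported unit: {unit}")


d = date(1992, 11, 13)

# EXTRACT (DAY | MONTH | YEAR FROM ...) corresponds to plain attribute access.
extracted = (d.day, d.month, d.year)

truncated = {unit: date_trunc(unit, d).isoformat() for unit in ("year", "month", "day")}

print(extracted)
print(truncated)
```

`EXTRACT` pulls a single field out of the date, while `date_trunc` keeps the value as a timestamp and resets the smaller fields.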


In [42]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT * 
FROM orders 
WHERE orderdate <=now() - INTERVAL '30 days'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,orderid,orderdate,customerid,netamount,tax,totalamount,database
0,1,2004-01-27,7888,313.24,25.84,339.08,Store
1,2,2004-01-01,4858,54.9,4.53,59.43,Store
2,3,2004-01-17,15399,160.1,13.21,173.31,Store
3,4,2004-01-28,17019,106.67,8.8,115.47,Store
4,5,2004-01-09,14771,256.0,21.12,277.12,Store


In [43]:
# db_configs[0]["dbname"] = "Employees"

# # Sample query
# user_query = """
# INTERVAL '1 year 2 months 3 days'
# INTERVAL '2 weeks ago'
# INTERVAL '1 year 3 hours 20 minutes'
# """
# params = None  # You can add parameters if needed

# # Execute the function
# result_df = run_query_with_dynamic_columns(
#     db_configs, user_query, chunk_size=10000, params=params
# )

# result_df.head(5)



In [44]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select EXTRACT (
    year from interval '5 years 20 months'
)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,extract,database
0,6,Employees
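
The result is 6 rather than 5 because the 20 months are normalized first: 20 months is 1 year and 8 months, which is added to the 5 years. The arithmetic can be sketched as follows (`normalize` is a hypothetical helper, not a PostgreSQL function):

```python
def normalize(years, months):
    """Fold excess months into years, returning (years, months)."""
    total_months = years * 12 + months
    return divmod(total_months, 12)


norm_years, norm_months = normalize(5, 20)
print(norm_years, norm_months)  # 6 8
```

By the same reasoning, extracting `month` from this interval would give the remaining 8 months rather than 20.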


In [45]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT DISTINCT salary FROM public.salaries
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,salary,database
0,102171,Employees
1,42007,Employees
2,63775,Employees
3,94054,Employees
4,110727,Employees


In [46]:
# db_configs[0]["dbname"] = "Employees"

# # Sample query
# user_query = """
# SELECT DISTINCT salary, from_date FROM public.salaries
# """
# params = None  # You can add parameters if needed

# # Execute the function
# result_df = run_query_with_dynamic_columns(
#     db_configs, user_query, chunk_size=10000, params=params
# )

# result_df.head(5)



In [47]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT first_name, last_name FROM employees
ORDER BY first_name DESC, last_name DESC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,first_name,last_name,database
0,Zvonko,Zuberek,Employees
1,Zvonko,Zobel,Employees
2,Zvonko,Zambonelli,Employees
3,Zvonko,Yurov,Employees
4,Zvonko,Yoshizawa,Employees


In [48]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT first_name, last_name FROM employees
ORDER BY LENGTH(first_name) DESC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,first_name,last_name,database
0,Gopalakrishnan,Luit,Employees
1,Chandrasekaran,Muhlberg,Employees
2,Gopalakrishnan,Percebois,Employees
3,Gopalakrishnan,Peck,Employees
4,Gopalakrishnan,Osgood,Employees


In [49]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, concat(a.first_name, ' ',a.last_name) as "name", b.salary
from employees as a, salaries as b
where a.emp_no = '10001'
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,name,salary,database
0,10001,Georgi Facello,60117,Employees
1,10001,Georgi Facello,62102,Employees
2,10001,Georgi Facello,66074,Employees
3,10001,Georgi Facello,66596,Employees
4,10001,Georgi Facello,66961,Employees
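
Note that the query above has no condition linking `a` and `b`, so it pairs employee 10001 with every row of `salaries`; the first five rows only look correct. The sketch below reproduces the effect on a tiny in-memory SQLite database (a stand-in for PostgreSQL, with made-up table contents).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER, first_name TEXT)")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10001, "Georgi"), (10002, "Bezalel")],
)
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [(10001, 60117), (10001, 62102), (10002, 65828)],
)

# No join condition: employee 10001 is paired with every salary row.
cross = conn.execute(
    "SELECT a.emp_no, b.salary FROM employees AS a, salaries AS b "
    "WHERE a.emp_no = 10001 ORDER BY b.salary"
).fetchall()

# With the join condition, only that employee's own salaries remain.
joined = conn.execute(
    "SELECT a.emp_no, b.salary FROM employees AS a, salaries AS b "
    "WHERE a.emp_no = b.emp_no AND a.emp_no = 10001 ORDER BY b.salary"
).fetchall()

print(cross)
print(joined)
```

The first result wrongly attributes another employee's salary (65828) to employee 10001. Adding `a.emp_no = b.emp_no` fixes it.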


In [50]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, concat(a.first_name, ' ',a.last_name) as "name", b.salary
from employees as a, salaries as b
where a.emp_no = b.emp_no
order by a.emp_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,name,salary,database
0,10001,Georgi Facello,60117,Employees
1,10001,Georgi Facello,62102,Employees
2,10001,Georgi Facello,66074,Employees
3,10001,Georgi Facello,66596,Employees
4,10001,Georgi Facello,66961,Employees


In [51]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select a.emp_no, 
        concat(a.first_name, ' ', a.last_name) as "name",
        b.salary
from employees as a 
INNER JOIN salaries as b on b.emp_no = a.emp_no
order by a.emp_no ASC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,name,salary,database
0,10001,Georgi Facello,60117,Employees
1,10001,Georgi Facello,62102,Employees
2,10001,Georgi Facello,66074,Employees
3,10001,Georgi Facello,66596,Employees
4,10001,Georgi Facello,66961,Employees


In [52]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, 
        concat(a.first_name, ' ',a.last_name) as "name",
        b.salary,
        c.title,
        c.from_date as "promoted on"
from employees as a
INNER JOIN salaries as b on a.emp_no = b.emp_no
INNER JOIN titles as c on c.emp_no = a.emp_no
and c.from_date = (b.from_date + interval '2 days')
order by a.emp_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,name,salary,title,promoted on,database
0,10004,Chirstian Koblick,60770,Senior Engineer,1995-12-01,Employees
1,10005,Kyoichi Maliniak,88063,Senior Staff,1996-09-12,Employees
2,10007,Tzvetan Zielinski,70220,Senior Staff,1996-02-11,Employees
3,10009,Sumant Peac,80944,Senior Engineer,1995-02-18,Employees
4,10012,Patricio Bridgland,54794,Senior Engineer,2000-12-18,Employees


In [None]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, 
        CONCAT(a.first_name, ' ',a.last_name) as "name",
        b.salary,
        c.title,
        c.from_date as "promoted on"
FROM employees AS a
INNER JOIN salaries AS b ON a.emp_no = b.emp_no
INNER JOIN titles AS c 
    ON c.emp_no = a.emp_no AND (
        b.from_date = c.from_date
        OR (b.from_date + interval '2 days') = c.from_date
    )
ORDER BY a.emp_no ASC, b.from_date ASC;
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



In [53]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select a.emp_no,b.salary, b.from_date, c.title 
from employees as a
INNER JOIN salaries as b ON b.emp_no = a.emp_no
INNER JOIN titles as c on c.emp_no = a.emp_no
order by a.emp_no asc, b.from_date ASC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,from_date,title,database
0,10001,60117,1986-06-26,Senior Engineer,Employees
1,10001,62102,1987-06-26,Senior Engineer,Employees
2,10001,66074,1988-06-25,Senior Engineer,Employees
3,10001,66596,1989-06-25,Senior Engineer,Employees
4,10001,66961,1990-06-25,Senior Engineer,Employees


In [54]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select a.emp_no,b.salary, b.from_date, c.title 
from employees as a
INNER JOIN salaries as b ON b.emp_no = a.emp_no
INNER JOIN titles as c on c.emp_no = a.emp_no 
and (b.from_date + interval '2 days') = c.from_date
order by a.emp_no asc, b.from_date ASC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,from_date,title,database
0,10004,60770,1995-11-29,Senior Engineer,Employees
1,10005,88063,1996-09-10,Senior Staff,Employees
2,10007,70220,1996-02-09,Senior Staff,Employees
3,10009,80944,1995-02-16,Senior Engineer,Employees
4,10012,54794,2000-12-16,Senior Engineer,Employees


In [55]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select a.emp_no, b.salary, b.from_date, c.title 
from employees as a 
INNER JOIN salaries as b on b.emp_no = a.emp_no
INNER JOIN titles as c
        on c.emp_no = a.emp_no
        and (
        b.from_date = c.from_date
        or (b.from_date + INTERVAL '2 days') = c.from_date
        )
order by a.emp_no asc, b.from_date asc
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,from_date,title,database
0,10001,60117,1986-06-26,Senior Engineer,Employees
1,10002,65828,1996-08-03,Staff,Employees
2,10003,40006,1995-12-03,Senior Engineer,Employees
3,10004,40054,1986-12-01,Engineer,Employees
4,10004,60770,1995-11-29,Senior Engineer,Employees


In [56]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, 
        concat(a.first_name, ' ',a.last_name) as "name",
        b.salary,
        COALESCE(c.title, 'No title change'),
        COALESCE(c.from_date::text, '-') as "title taken on"
from employees as a
INNER JOIN salaries as b on a.emp_no = b.emp_no
INNER JOIN titles as c 
on c.emp_no = a.emp_no and (
        c.from_date = (b.from_date + interval '2 days') OR
        c.from_date = b.from_date        
)
order by a.emp_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,name,salary,coalesce,title taken on,database
0,10001,Georgi Facello,60117,Senior Engineer,1986-06-26,Employees
1,10002,Bezalel Simmel,65828,Staff,1996-08-03,Employees
2,10003,Parto Bamford,40006,Senior Engineer,1995-12-03,Employees
3,10004,Chirstian Koblick,40054,Engineer,1986-12-01,Employees
4,10004,Chirstian Koblick,60770,Senior Engineer,1995-12-01,Employees


In [57]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT COUNT(emp.emp_no)
from employees as emp
left JOIN dept_manager as dep on emp.emp_no = dep.emp_no
where dep.emp_no IS null
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,count,database
0,300000,Employees


In [59]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select emp.emp_no, dep.emp_no
from employees AS emp 
left JOIN dept_manager as dep on emp.emp_no = dep.emp_no
where dep.emp_no is not null
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,emp_no.1,database
0,110022,110022,Employees
1,110039,110039,Employees
2,110085,110085,Employees
3,110114,110114,Employees
4,110183,110183,Employees


In [60]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, b.salary, c.title
from employees as a
INNER JOIN salaries as b on b.emp_no = a.emp_no
INNER JOIN titles as c on c.emp_no = a.emp_no
and (c.from_date = b.from_date or c.from_date = b.from_date + interval '2 days')
order by a.emp_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,title,database
0,10001,60117,Senior Engineer,Employees
1,10002,65828,Staff,Employees
2,10003,40006,Senior Engineer,Employees
3,10004,40054,Engineer,Employees
4,10004,60770,Senior Engineer,Employees


In [61]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT a.emp_no, b.salary, COALESCE(c.title, 'no title change')
from employees as a
INNER JOIN salaries as b on b.emp_no = a.emp_no
INNER JOIN titles as c on c.emp_no = a.emp_no
and (c.from_date = b.from_date or c.from_date = b.from_date + interval '2 days')
order by a.emp_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,coalesce,database
0,10001,60117,Senior Engineer,Employees
1,10002,65828,Staff,Employees
2,10003,40006,Senior Engineer,Employees
3,10004,40054,Engineer,Employees
4,10004,60770,Senior Engineer,Employees


In [63]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select e.emp_no, e.first_name, de.dept_no
from employees AS e 
INNER JOIN dept_emp as de USING(emp_no)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,first_name,dept_no,database
0,10005,Kyoichi,d003,Employees
1,10008,Saniya,d005,Employees
2,10010,Duangkaew,d004,Employees
3,10010,Duangkaew,d006,Employees
4,10011,Mary,d009,Employees


In [64]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select e.emp_no, e.first_name, d.dept_name
from employees AS e 
INNER JOIN dept_emp as de USING(emp_no)
INNER JOIN departments as d USING(dept_no)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,first_name,dept_name,database
0,10005,Kyoichi,Human Resources,Employees
1,10008,Saniya,Development,Employees
2,10010,Duangkaew,Production,Employees
3,10010,Duangkaew,Quality Management,Employees
4,10011,Mary,Customer Service,Employees


In [65]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
select e.emp_no, e.first_name, d.dept_name
from employees AS e 
INNER JOIN dept_emp as de on de.emp_no = e.emp_no
INNER JOIN departments as d USING(dept_no)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,first_name,dept_name,database
0,10005,Kyoichi,Human Resources,Employees
1,10008,Saniya,Development,Employees
2,10010,Duangkaew,Production,Employees
3,10010,Duangkaew,Quality Management,Employees
4,10011,Mary,Customer Service,Employees


---

---

### GROUP BY

```sql
SELECT dept_no, COUNT(emp_no)
FROM dept_emp;
```

This query fails because `dept_no` is selected next to an aggregate without being grouped:

ERROR: column "dept_emp.dept_no" must appear in the GROUP BY clause or be used in an aggregate function


GROUP BY SPLITS DATA INTO GROUPS OR CHUNKS SO WE CAN APPLY FUNCTIONS
AGAINST THE GROUP RATHER THAN THE ENTIRE TABLE
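
As a self-contained illustration of the same idea, here is a minimal sketch that uses Python's built-in `sqlite3` module instead of the PostgreSQL `Employees` database. The `dept_emp` rows are made up for the example; only the table and column names are borrowed from the notes.

```python
import sqlite3

# Toy stand-in for dept_emp; the six rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT)")
conn.executemany(
    "INSERT INTO dept_emp VALUES (?, ?)",
    [(1, "d001"), (2, "d001"), (3, "d002"), (4, "d003"), (5, "d003"), (6, "d003")],
)

# Without GROUP BY, COUNT runs once over the entire table.
total = conn.execute("SELECT COUNT(emp_no) FROM dept_emp").fetchall()

# With GROUP BY, COUNT runs once per dept_no group.
per_dept = conn.execute(
    "SELECT dept_no, COUNT(emp_no) FROM dept_emp GROUP BY dept_no ORDER BY dept_no"
).fetchall()

print(total)     # [(6,)]
print(per_dept)  # [('d001', 2), ('d002', 1), ('d003', 3)]
conn.close()
```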

In [81]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT dept_no, COUNT(emp_no)
FROM dept_emp 
GROUP BY dept_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_no,count,database
0,d001,20211,Employees
1,d002,17346,Employees
2,d003,17786,Employees
3,d004,73485,Employees
4,d005,85707,Employees


In [82]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT dept_no
FROM dept_emp 
ORDER BY dept_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_no,database
0,d001,Employees
1,d001,Employees
2,d001,Employees
3,d001,Employees
4,d001,Employees


In [84]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT dept_no
FROM dept_emp 
GROUP BY dept_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_no,database
0,d001,Employees
1,d002,Employees
2,d003,Employees
3,d004,Employees
4,d005,Employees


In [87]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT dept_no, COUNT(emp_no)
FROM dept_emp  
GROUP BY dept_no
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_no,count,database
0,d001,20211,Employees
1,d002,17346,Employees
2,d003,17786,Employees
3,d004,73485,Employees
4,d005,85707,Employees


ORDER OF OPERATIONS

(top to bottom)

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY



---


```sql
SELECT col1, COUNT(col2)
FROM <table>
WHERE col2 > X
GROUP BY col1
HAVING col1 = y;
```
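
The difference between `WHERE` (filters rows before grouping) and `HAVING` (filters groups after grouping) can be seen on a toy table. This is a sketch using Python's `sqlite3` module as a stand-in for PostgreSQL; the `staff` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (emp_no INTEGER, dept TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        (1, "Dev", "F"), (2, "Dev", "F"), (3, "Dev", "M"),
        (4, "Sales", "F"), (5, "Sales", "M"), (6, "Sales", "M"),
        (7, "HR", "F"),
    ],
)

# HAVING only: every row is grouped, then small groups are dropped.
having_only = conn.execute("""
    SELECT dept, COUNT(emp_no)
    FROM staff
    GROUP BY dept
    HAVING COUNT(emp_no) >= 2
    ORDER BY dept
""").fetchall()

# WHERE + HAVING: rows are filtered first, so the groups being counted are smaller.
where_and_having = conn.execute("""
    SELECT dept, COUNT(emp_no)
    FROM staff
    WHERE gender = 'F'
    GROUP BY dept
    HAVING COUNT(emp_no) >= 2
    ORDER BY dept
""").fetchall()

print(having_only)       # [('Dev', 3), ('Sales', 3)]
print(where_and_having)  # [('Dev', 2)]
conn.close()
```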

In [88]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT d.dept_name, COUNT(e.emp_no) AS "# of employees"
FROM employees AS e
INNER JOIN dept_emp AS de ON de.emp_no = e.emp_no
INNER JOIN departments AS d ON de.dept_no = d.dept_no
-- WHERE e.gender = 'F'
GROUP BY d.dept_name
-- HAVING count(e.emp_no) > 25000
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_name,# of employees,database
0,Customer Service,23580,Employees
1,Development,85707,Employees
2,Finance,17346,Employees
3,Human Resources,17786,Employees
4,Marketing,20211,Employees


In [90]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT d.dept_no, d.dept_name, COUNT(e.emp_no) AS "# of employees"
FROM employees AS e
INNER JOIN dept_emp AS de ON de.emp_no = e.emp_no
INNER JOIN departments AS d ON de.dept_no = d.dept_no
-- WHERE e.gender = 'F'
GROUP BY d.dept_name, d.dept_no
-- HAVING count(e.emp_no) > 25000
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_no,dept_name,# of employees,database
0,d001,Marketing,20211,Employees
1,d002,Finance,17346,Employees
2,d003,Human Resources,17786,Employees
3,d004,Production,73485,Employees
4,d005,Development,85707,Employees


In [93]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT d.dept_name, COUNT(e.emp_no) AS "# of employees"
FROM employees AS e
INNER JOIN dept_emp AS de ON de.emp_no = e.emp_no
INNER JOIN departments AS d ON de.dept_no = d.dept_no
-- WHERE e.gender = 'F'
GROUP BY d.dept_name
HAVING count(e.emp_no) > 25000
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_name,# of employees,database
0,Development,85707,Employees
1,Production,73485,Employees
2,Sales,52245,Employees


In [94]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT d.dept_name, COUNT(e.emp_no) AS "# of employees"
FROM employees AS e
INNER JOIN dept_emp AS de ON de.emp_no = e.emp_no
INNER JOIN departments AS d ON de.dept_no = d.dept_no
WHERE e.gender = 'F'
GROUP BY d.dept_name
HAVING count(e.emp_no) > 25000
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_name,# of employees,database
0,Development,34258,Employees
1,Production,29549,Employees


In [95]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT d.dept_name, COUNT(e.emp_no) AS "# of employees"
FROM employees AS e
INNER JOIN dept_emp AS de ON de.emp_no = e.emp_no
INNER JOIN departments AS d ON de.dept_no = d.dept_no
WHERE e.gender = 'F'
GROUP BY d.dept_name
ORDER BY COUNT(e.emp_no) DESC
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,dept_name,# of employees,database
0,Development,34258,Employees
1,Production,29549,Employees
2,Sales,20854,Employees
3,Customer Service,9448,Employees
4,Research,8439,Employees


In [None]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT emp_no, salary, from_date
FROM salaries
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,from_date,database
0,10001,60117,1986-06-26,Employees
1,10001,62102,1987-06-26,Employees
2,10001,66074,1988-06-25,Employees
3,10001,66596,1989-06-25,Employees
4,10001,66961,1990-06-25,Employees


In [99]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT emp_no, salary, MAX(from_date)
FROM salaries
GROUP BY emp_no, salary
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,max,database
0,10001,60117,1986-06-26,Employees
1,10001,62102,1987-06-26,Employees
2,10001,66074,1988-06-25,Employees
3,10001,66596,1989-06-25,Employees
4,10001,66961,1990-06-25,Employees


### UNION


```sql
SELECT col1, SUM(col2)
FROM table
GROUP BY col1

UNION

SELECT NULL, SUM(col2)
FROM table
```


### UNION ALL


```sql
SELECT col1, SUM(col2)
FROM table
GROUP BY col1

UNION ALL

SELECT NULL, SUM(col2)
FROM table
```
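
A small sketch of how the two operators differ: `UNION` removes duplicate rows from the combined result, while `UNION ALL` keeps every row. It runs on Python's `sqlite3` module rather than PostgreSQL, and the tables `q1` and `q2` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q1 (prod_id INTEGER)")
conn.execute("CREATE TABLE q2 (prod_id INTEGER)")
conn.executemany("INSERT INTO q1 VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO q2 VALUES (?)", [(2,), (3,)])

# UNION: duplicates are removed across (and within) both inputs.
union_rows = conn.execute(
    "SELECT prod_id FROM q1 UNION SELECT prod_id FROM q2 ORDER BY prod_id"
).fetchall()

# UNION ALL: rows are simply concatenated, duplicates included.
union_all_rows = conn.execute(
    "SELECT prod_id FROM q1 UNION ALL SELECT prod_id FROM q2 ORDER BY prod_id"
).fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (2,), (3,)]
conn.close()
```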

In [101]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT NULL AS "prod_id", sum(ol.quantity)
FROM orderlines AS ol

UNION

SELECT prod_id AS "prod_id", sum(ol.quantity)
FROM orderlines AS ol
GROUP BY prod_id
ORDER BY prod_id DESC
"""

params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,prod_id,sum,database
0,,120719,Store
1,10000.0,9,Store
2,9999.0,13,Store
3,9998.0,3,Store
4,9997.0,16,Store


In [106]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
-- SELECT NULL AS "prod_id", sum(ol.quantity)
-- FROM orderlines AS ol
-- 
-- UNION

SELECT prod_id AS "prod_id", orderlineid, sum(ol.quantity)
FROM orderlines AS ol
GROUP BY 
    GROUPING SETS (
        (),
        (prod_id),
        (orderlineid)
    )
ORDER BY prod_id DESC, orderlineid DESC
"""

params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,prod_id,orderlineid,sum,database
0,,,120719,Store
1,,9.0,2708,Store
2,,8.0,5377,Store
3,,7.0,8032,Store
4,,6.0,10780,Store


```sql
SELECT  EXTRACT (YEAR FROM orderdate) AS "year",
        EXTRACT (MONTH FROM orderdate) AS "month",
        EXTRACT (DAY FROM orderdate) AS "day",
        sum(ol.quantity)
FROM orderlines AS ol
GROUP BY
    GROUPING SETS (
        (EXTRACT (YEAR FROM orderdate)),
        (
            (EXTRACT (YEAR FROM orderdate)),
            (EXTRACT (MONTH FROM orderdate))
        ),
        (
            (EXTRACT (YEAR FROM orderdate)),
            (EXTRACT (MONTH FROM orderdate)),
            (EXTRACT (DAY FROM orderdate))
        ),
        (
            EXTRACT (MONTH FROM orderdate),
            EXTRACT (DAY FROM orderdate)
        ),
        (EXTRACT (MONTH FROM orderdate)),
        (EXTRACT (DAY FROM orderdate)),
        ()
    )
ORDER BY
    EXTRACT (YEAR FROM orderdate),
    EXTRACT (MONTH FROM orderdate),
    EXTRACT (DAY FROM orderdate)
```

In [108]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT  EXTRACT (YEAR FROM orderdate) AS "year",
        EXTRACT (MONTH FROM orderdate) AS "month",
        EXTRACT (DAY FROM orderdate) AS "day",
        sum(ol.quantity)
FROM orderlines AS ol
GROUP BY
    GROUPING SETS (
        (EXTRACT (YEAR FROM orderdate)),
        (
            (EXTRACT (YEAR FROM orderdate)),
            (EXTRACT (MONTH FROM orderdate))
        ),
        (
            (EXTRACT (YEAR FROM orderdate)),
            (EXTRACT (MONTH FROM orderdate)),
            (EXTRACT (DAY FROM orderdate))
        ),
        (
            EXTRACT (MONTH FROM orderdate),
            EXTRACT (DAY FROM orderdate)
        ),
        (EXTRACT (MONTH FROM orderdate)),
        (EXTRACT (DAY FROM orderdate)),
        ()
    )
ORDER BY
    EXTRACT (YEAR FROM orderdate),
    EXTRACT (MONTH FROM orderdate),
    EXTRACT (DAY FROM orderdate)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,year,month,day,sum,database
0,2004,1,1,329,Store
1,2004,1,2,266,Store
2,2004,1,3,315,Store
3,2004,1,4,351,Store
4,2004,1,5,420,Store


In [110]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT
    EXTRACT(YEAR FROM orderdate) AS year,
    EXTRACT(MONTH FROM orderdate) AS month,
    EXTRACT(DAY FROM orderdate) AS day,
    SUM(ol.quantity) AS total_quantity
FROM orderlines AS ol
GROUP BY GROUPING SETS (
    (EXTRACT(YEAR FROM orderdate)),
    (EXTRACT(YEAR FROM orderdate), EXTRACT(MONTH FROM orderdate)),
    (EXTRACT(YEAR FROM orderdate), EXTRACT(MONTH FROM orderdate), EXTRACT(DAY FROM orderdate)),
    (EXTRACT(MONTH FROM orderdate), EXTRACT(DAY FROM orderdate)),
    (EXTRACT(MONTH FROM orderdate)),
    (EXTRACT(DAY FROM orderdate)),
    ()
)
ORDER BY
    year, month, day

"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,year,month,day,total_quantity,database
0,2004,1,1,329,Store
1,2004,1,2,266,Store
2,2004,1,3,315,Store
3,2004,1,4,351,Store
4,2004,1,5,420,Store


In [112]:
db_configs[0]["dbname"] = "Store"

# Sample query
user_query = """
SELECT  EXTRACT (YEAR FROM orderdate) AS "year",
        EXTRACT (MONTH FROM orderdate) AS "month",
        EXTRACT (DAY FROM orderdate) AS "day",
        sum(ol.quantity)
FROM orderlines AS ol
GROUP BY
    ROLLUP (
        EXTRACT (YEAR FROM orderdate),
        EXTRACT (MONTH FROM orderdate),
        EXTRACT (DAY FROM orderdate)
    )
ORDER BY
    EXTRACT (YEAR FROM orderdate),
    EXTRACT (MONTH FROM orderdate),
    EXTRACT (DAY FROM orderdate)
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Store successfully.
Connection to database Store closed.


Unnamed: 0,year,month,day,sum,database
0,2004,1,1,329,Store
1,2004,1,2,266,Store
2,2004,1,3,315,Store
3,2004,1,4,351,Store
4,2004,1,5,420,Store


---

---

### WHAT WE LEARNED SO FAR

1. GROUPING DATA IS USEFUL
2. GROUPING HAPPENS AFTER `FROM/WHERE`
3. HAVING IS A SPECIAL FILTER FOR GROUPS
4. GROUPING SETS AND ROLLUPS ARE USEFUL
   FOR MULTIPLE GROUPINGS IN A SINGLE QUERY
5. GROUPING DATA IS NOT A SILVER BULLET


```sql
SELECT d.dept_name, ROUND(AVG(salary))
FROM salaries
JOIN dept_emp AS de USING (emp_no)
JOIN departments AS d USING (dept_no)
GROUP BY dept_no, dept_name
```

## WINDOW FUNCTIONS

WINDOW FUNCTIONS CREATE A NEW COLUMN BASED ON FUNCTIONS PERFORMED ON A SUBSET OR "WINDOW" OF THE DATA

```sql
window_function(arg1, arg2,..) OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression [ASC | DESC] [NULLS {FIRST | LAST}]]
)
```

---

---

In [113]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT MAX(salary) FROM salaries
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,max,database
0,158220,Employees


In [None]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT 
    *,
    MAX(salary) OVER()
FROM salaries
-- WHERE salary < 70000
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,from_date,to_date,max,database
0,10001,60117,1986-06-26,1987-06-26,158220,Employees
1,10001,62102,1987-06-26,1988-06-25,158220,Employees
2,10001,66074,1988-06-25,1989-06-25,158220,Employees
3,10001,66596,1989-06-25,1990-06-25,158220,Employees
4,10001,66961,1990-06-25,1991-06-25,158220,Employees


## PARTITION BY:

DIVIDE ROWS INTO GROUPS TO APPLY THE FUNCTION AGAINST (OPTIONAL)

```sql
AVG(s.salary) OVER () as "average global salary"
```

```sql
SELECT 
    *,
    d.dept_name,
    AVG(salary) OVER()
FROM salaries
JOIN dept_emp AS de USING (emp_no)
JOIN departments AS d USING (dept_no)
```

```sql
SELECT 
    *,
    d.dept_name,
    AVG(salary) OVER(
        PARTITION BY d.dept_name
    )
FROM salaries
JOIN dept_emp AS de USING (emp_no)
JOIN departments AS d USING (dept_no)
```
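
The effect of `PARTITION BY` is easiest to see side by side with an empty `OVER ()`. The sketch below uses Python's `sqlite3` module (window functions need SQLite 3.25 or newer) in place of PostgreSQL, with a hypothetical flattened `salaries` table so no joins are needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, dept_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?, ?)",
    [(1, "Dev", 100), (2, "Dev", 200), (3, "Sales", 300)],
)

rows = conn.execute("""
    SELECT emp_no,
           dept_name,
           salary,
           AVG(salary) OVER ()                       AS global_avg,  -- one window: all rows
           AVG(salary) OVER (PARTITION BY dept_name) AS dept_avg     -- one window per department
    FROM salaries
    ORDER BY emp_no
""").fetchall()

for row in rows:
    print(row)
# (1, 'Dev', 100, 200.0, 150.0)
# (2, 'Dev', 200, 200.0, 150.0)
# (3, 'Sales', 300, 200.0, 300.0)
conn.close()
```

Every input row survives in the output; the window function only adds columns, which is the main difference from `GROUP BY`.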


## ORDER BY 

```sql
SELECT emp_no,
        COUNT(salary) OVER (
            ORDER BY emp_no
        )
FROM salaries
```
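
Adding `ORDER BY` inside `OVER` turns the aggregate into a running one: by default the window covers everything from the start of the partition up to the current row, including rows that tie with it on the sort key. The sketch below shows this with Python's `sqlite3` module (SQLite 3.25 or newer) on made-up rows, assuming the default frame behaves the same way as in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 30), (3, 40), (3, 50), (3, 60)],
)

rows = conn.execute("""
    SELECT emp_no,
           COUNT(salary) OVER (ORDER BY emp_no) AS running_count
    FROM salaries
    ORDER BY emp_no
""").fetchall()

# Rows with the same emp_no are peers, so they share one running count.
print(rows)  # [(1, 2), (1, 2), (2, 3), (3, 6), (3, 6), (3, 6)]
conn.close()
```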

### FRAME CLAUSE 

WHEN USING A FRAME CLAUSE IN A WINDOW FUNCTION WE CAN CREATE A SUB-RANGE OR FRAME

![alt text](image.png)

In [116]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT emp_no,
        salary,
        COUNT(salary) OVER (
            PARTITION BY emp_no
            ORDER BY salary
            RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        )
FROM salaries
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,count,database
0,10001,60117,17,Employees
1,10001,62102,17,Employees
2,10001,66074,17,Employees
3,10001,66596,17,Employees
4,10001,66961,17,Employees


In [117]:
db_configs[0]["dbname"] = "Employees"

# Sample query
user_query = """
SELECT emp_no,
        salary,
        COUNT(salary) OVER (
            PARTITION BY emp_no
            ORDER BY salary
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        )
FROM salaries
"""
params = None  # You can add parameters if needed

# Execute the function
result_df = run_query_with_dynamic_columns(
    db_configs, user_query, chunk_size=10000, params=params
)

result_df.head(5)



Connected to database Employees successfully.
Connection to database Employees closed.


Unnamed: 0,emp_no,salary,count,database
0,10001,60117,17,Employees
1,10001,62102,17,Employees
2,10001,66074,17,Employees
3,10001,66596,17,Employees
4,10001,66961,17,Employees
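
With an unbounded frame on both sides, `RANGE` and `ROWS` return the same counts, as the two results above show. They differ once the frame ends at `CURRENT ROW` and the sort key has ties: `ROWS` stops at the physical row, while `RANGE` also includes every peer with the same sort value. A minimal sketch using Python's `sqlite3` module (SQLite 3.25 or newer) with made-up salaries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
# Two rows tie on salary = 100 so the frame modes can be told apart.
conn.executemany("INSERT INTO salaries VALUES (?, ?)", [(1, 100), (1, 100), (1, 200)])

rows = conn.execute("""
    SELECT salary,
           COUNT(salary) OVER (
               PARTITION BY emp_no ORDER BY salary
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS rows_count,
           COUNT(salary) OVER (
               PARTITION BY emp_no ORDER BY salary
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS range_count
    FROM salaries
    ORDER BY salary, rows_count
""").fetchall()

print(rows)  # [(100, 1, 2), (100, 2, 2), (200, 3, 3)]
conn.close()
```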


