# 1 - Getting started

## First commands

To get DuckDB running, install the `duckdb` package with pip (`pip install duckdb`) and import it.

In [1]:
import duckdb

If you don't need to persist the database after your session ends, you can immediately run queries against an in-memory database with `duckdb.sql`.

In [2]:
query = """
SELECT 'Hello World!'
"""
res = duckdb.sql(query)
print(type(res))
print(res)

<class 'duckdb.duckdb.DuckDBPyRelation'>
┌────────────────┐
│ 'Hello World!' │
│    varchar     │
├────────────────┤
│ Hello World!   │
└────────────────┘



If and when you need to access the query results in Python, you can convert the result to
- a list of Python tuples with `res.fetchall()`
- a Pandas DataFrame with `res.df()` or `res.to_df()`

In [3]:
ls = res.fetchall()
print(ls)

[('Hello World!',)]


In [4]:
df = res.df()
print(type(df))
df

<class 'pandas.core.frame.DataFrame'>


   'Hello World!'
0    Hello World!


Naturally, you can create tables, insert values, create views, and so on, as in any database.

In [5]:
query = """
CREATE OR REPLACE TABLE test_table (
    int_col INTEGER,
    str_col VARCHAR
);
CREATE OR REPLACE TABLE another_table (
    int_col INTEGER
)
"""
duckdb.sql(query)
duckdb.sql("SHOW TABLES")

┌───────────────┐
│     name      │
│    varchar    │
├───────────────┤
│ another_table │
│ test_table    │
└───────────────┘

In [6]:
query = """
INSERT INTO test_table (int_col, str_col) VALUES
    (1, 'Hello'),
    (3, 'World'),
    (2, ' ')
"""
duckdb.sql(query)
duckdb.sql("FROM test_table ORDER BY int_col")

┌─────────┬─────────┐
│ int_col │ str_col │
│  int32  │ varchar │
├─────────┼─────────┤
│       1 │ Hello   │
│       2 │         │
│       3 │ World   │
└─────────┴─────────┘

In [7]:
query = """
CREATE OR REPLACE VIEW test_view AS (
    FROM test_table
    WHERE str_col != ' '
)
"""
duckdb.sql(query)
duckdb.sql("FROM test_view ORDER BY int_col")

┌─────────┬─────────┐
│ int_col │ str_col │
│  int32  │ varchar │
├─────────┼─────────┤
│       1 │ Hello   │
│       3 │ World   │
└─────────┴─────────┘

Note that in the DuckDB SQL dialect you can omit `SELECT *`. You can also
- reorder `SELECT` and `FROM`, i.e. you can query `FROM table SELECT cols`,
- exclude columns instead of listing all of the columns you want, i.e. `SELECT * EXCLUDE(cols, we, do, not, want) FROM table`,
- group by all non-aggregated columns, i.e. `SELECT ... FROM table GROUP BY ALL`.

See the [DuckDB documentation](https://duckdb.org/docs/sql/introduction) for the full SQL syntax.

## Persisting the database

## Extensions

[Extensions](https://duckdb.org/docs/extensions/overview.html) allow you to add functionality to DuckDB. To see the list of extensions, you can use the `duckdb_extensions()` SQL function.

In [8]:
duckdb.sql("FROM duckdb_extensions()")

┌──────────────────┬─────────┬───────────┬──────────────────────┬──────────────────────────────────┬───────────────────┐
│  extension_name  │ loaded  │ installed │     install_path     │           description            │      aliases      │
│     varchar      │ boolean │  boolean  │       varchar        │             varchar              │     varchar[]     │
├──────────────────┼─────────┼───────────┼──────────────────────┼──────────────────────────────────┼───────────────────┤
│ arrow            │ false   │ false     │                      │ A zero-copy data integration b…  │ []                │
│ autocomplete     │ false   │ false     │                      │ Adds support for autocomplete …  │ []                │
│ aws              │ false   │ false     │                      │ Provides features that depend …  │ []                │
│ azure            │ false   │ false     │                      │ Adds a filesystem abstraction …  │ []                │
│ excel            │ false   │ f

This tutorial needs the `postgres` extension (more specifically, `postgres_scanner`). If it is listed as not installed, install and load it now, since we'll use it later.

In [9]:
duckdb.sql("INSTALL postgres")
duckdb.sql("LOAD postgres")

In [10]:
duckdb.sql("FROM duckdb_extensions()")

┌──────────────────┬─────────┬───────────┬──────────────────────┬──────────────────────────────────┬───────────────────┐
│  extension_name  │ loaded  │ installed │     install_path     │           description            │      aliases      │
│     varchar      │ boolean │  boolean  │       varchar        │             varchar              │     varchar[]     │
├──────────────────┼─────────┼───────────┼──────────────────────┼──────────────────────────────────┼───────────────────┤
│ arrow            │ false   │ false     │                      │ A zero-copy data integration b…  │ []                │
│ autocomplete     │ false   │ false     │                      │ Adds support for autocomplete …  │ []                │
│ aws              │ false   │ false     │                      │ Provides features that depend …  │ []                │
│ azure            │ false   │ false     │                      │ Adds a filesystem abstraction …  │ []                │
│ excel            │ false   │ f

Note that you can also install and load extensions with the Python API functions `duckdb.install_extension` and `duckdb.load_extension`.

# 2 - Dataframes

In [11]:
import pandas as pd
import polars as pl

# 3 - Working with files

# 4 - Interacting with databases