Example: Loading data into a Data Warehouse (BigQuery)
First, install the dependencies, define the source, then change the destination name and run the pipeline.


In [None]:
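# Uncomment to install dlt with the BigQuery extras (the quotes stop the shell from expanding the brackets)
# %pip install "dlt[bigquery]"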

Let's use our NY Taxi API and load data from the source into the destination.

In [None]:
import dlt
from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.paginators import PageNumberPaginator


@dlt.resource(name="rides", write_disposition="replace")
def ny_taxi():
    client = RESTClient(
        base_url="https://us-central1-dlthub-analytics.cloudfunctions.net",
        paginator=PageNumberPaginator(
            base_page=1,      # page numbering starts at 1
            total_path=None   # no total page count is read; stop at the first empty page
        )
    )

    for page in client.paginate("data_engineering_zoomcamp_api"):
        yield page
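
With `total_path=None`, the paginator has no total page count to rely on, so it keeps requesting the next page number until a page comes back empty. The loop below is a minimal pure-Python sketch of that behaviour, not dlt's implementation. `paginate_by_page_number`, `fake_fetch`, and `FAKE_API` are hypothetical names, and the in-memory dictionary stands in for the real HTTP endpoint.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, int]


def paginate_by_page_number(
    fetch_page: Callable[[int], List[Record]],
    base_page: int = 1,
) -> Iterator[List[Record]]:
    """Yield pages starting at base_page until a page comes back empty."""
    page_number = base_page
    while True:
        records = fetch_page(page_number)
        if not records:
            # An empty page signals the end of the data.
            break
        yield records
        page_number += 1


# Stand-in for the API: two pages of records, every later page is empty.
FAKE_API: Dict[int, List[Record]] = {
    1: [{"id": 1}, {"id": 2}],
    2: [{"id": 3}],
}


def fake_fetch(page_number: int) -> List[Record]:
    """Return the records for one page, or an empty list past the last page."""
    return FAKE_API.get(page_number, [])


pages = list(paginate_by_page_number(fake_fetch))
print(len(pages))
```

Yielding one page at a time, as `ny_taxi` does, keeps memory use bounded by the page size rather than by the size of the whole dataset.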

Choosing a destination

Switching between data warehouses (BigQuery, Snowflake, Redshift) and data lakes (for example, Parquet files on S3 or Google Cloud Storage) in dlt is straightforward: change the destination parameter in your pipeline configuration.

For example:

In [None]:
pipeline = dlt.pipeline(
    pipeline_name='taxi_data',
    destination='duckdb', # <--- to test pipeline locally
    dataset_name='taxi_rides',
)

pipeline = dlt.pipeline(
    pipeline_name='taxi_data',
    destination='bigquery', # <--- to run pipeline in production
    dataset_name='taxi_rides',
)

This flexibility allows you to easily transition from local development to production-grade environments.
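
One way to make that switch without editing code is to derive the destination name from the environment. This is a sketch under assumptions: `choose_destination`, `build_pipeline`, and the `PIPELINE_ENV` variable are hypothetical names, not part of dlt. Only `dlt.pipeline` and its three arguments come from the example above.

```python
import os


def choose_destination(environment: str) -> str:
    """Return the dlt destination name for a deployment environment."""
    # Anything other than "production" falls back to local DuckDB.
    return "bigquery" if environment == "production" else "duckdb"


def build_pipeline(environment: str):
    """Create the taxi pipeline for the given environment."""
    import dlt  # imported here so choose_destination works without dlt installed

    return dlt.pipeline(
        pipeline_name="taxi_data",
        destination=choose_destination(environment),
        dataset_name="taxi_rides",
    )


# PIPELINE_ENV is a hypothetical variable name; unset means local development.
environment = os.environ.get("PIPELINE_ENV", "development")
print(choose_destination(environment))
```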

💡 No need to rewrite your pipeline: the source and resource code stays the same, and only the destination argument changes.

Setting credentials

The next step is to provide the BigQuery credentials, either in dlt's TOML files (.dlt/secrets.toml) or through environment variables.

In [None]:
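import dlt
import toml

# dlt reads .dlt/secrets.toml on its own when the pipeline runs;
# loading it by hand here only confirms that the credentials are present.
config = toml.load('.dlt/secrets.toml')

# Extract BigQuery credentials (raises KeyError if the section is missing)
bigquery_credentials = config['destination']['bigquery']['credentials']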
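
The environment-variable route uses the same keys as the TOML file: dlt upper-cases each section name and joins the parts with double underscores. The sketch below derives those names for a service-account credential. `to_env_vars` is a hypothetical helper, the three field names assume service-account credentials, and the values are placeholders to replace with your own.

```python
from typing import Dict, List


def to_env_vars(section_path: List[str], values: Dict[str, str]) -> Dict[str, str]:
    """Map TOML-style keys to the environment variable names dlt looks up."""
    prefix = "__".join(part.upper() for part in section_path)
    return {f"{prefix}__{key.upper()}": value for key, value in values.items()}


# Placeholder values only; never commit real credentials to a notebook.
placeholder_credentials = {
    "project_id": "<project_id>",
    "private_key": "<private_key>",
    "client_email": "<client_email>",
}

env_vars = to_env_vars(["destination", "bigquery", "credentials"], placeholder_credentials)

for name in env_vars:
    print(name)
```

Calling `os.environ.update(env_vars)` with real values before the pipeline runs would make them visible to dlt in the same way as the TOML file.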

Run the pipeline:

In [None]:
pipeline = dlt.pipeline(
    pipeline_name="taxi_data",
    destination="bigquery",
    dataset_name="taxi_rides",
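    dev_mode=True,  # each run loads into a fresh dataset with a datetime suffix
)

# run() extracts, normalizes, and loads the data; the returned LoadInfo summarizes the load
info = pipeline.run(ny_taxi)
print(info)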

💡 What’s different?

dlt maps the inferred schema to BigQuery's data types automatically.
Partitioning and clustering hints can be applied to tables to improve query performance.
Data is loaded in batches through load jobs rather than row by row, which scales to large tables.
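
Partitioning and clustering are requested per resource through dlt's `bigquery_adapter`. The following is a hedged sketch rather than a tested configuration: the column names `trip_pickup_date_time` and `payment_type` are assumptions about the loaded taxi table, and `with_bigquery_hints` is a hypothetical wrapper.

```python
# Column names are assumptions; check the schema dlt inferred for the rides table.
PARTITION_COLUMN = "trip_pickup_date_time"
CLUSTER_COLUMNS = ["payment_type"]


def with_bigquery_hints(resource):
    """Attach BigQuery partition and cluster hints to a dlt resource."""
    from dlt.destinations.adapters import bigquery_adapter  # requires dlt[bigquery]

    return bigquery_adapter(
        resource,
        partition=PARTITION_COLUMN,
        cluster=CLUSTER_COLUMNS,
    )


# Usage, assuming the ny_taxi resource and the BigQuery pipeline defined earlier:
# info = pipeline.run(with_bigquery_hints(ny_taxi))
```

The hints only take effect when the destination is BigQuery, so the same resource can still be loaded into DuckDB unwrapped for local testing.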