# How to write a multi-batch `BatchRequest` - Configured `Sql` Example
* A `BatchRequest` facilitates the return of one or more `Batches` of data from a configured `Datasource`. To find out more about `Batches`, please refer to the [related documentation](https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/how_to_get_one_or_more_batches_of_data_from_a_configured_datasource#1-construct-a-batchrequest).
* A `BatchRequest` can return 0 or more Batches of data, depending on the underlying data and how it is configured. This guide will help you configure `BatchRequests` to return multiple Batches, which can be used by:
   1. Self-Initializing Expectations to estimate parameters.
   2. DataAssistants to profile your data and create an Expectation Suite with self-initialized parameters.
   
* Note: Multi-batch BatchRequests are not supported in `RuntimeDataConnector`.

In [37]:
import great_expectations as gx
from great_expectations.core.yaml_handler import YAMLHandler
from great_expectations.core.batch import BatchRequest
import os
from ruamel import yaml

* Load `DataContext`

In [38]:
data_context: gx.DataContext = gx.get_context()

## Sql Example

### Example Database

Imagine we have a database with a table named `yellow_tripdata_sample_2020`, which contains all 12 months of `taxi_trip` data for 2020.


In [3]:
# connect to postgres DB, and print the existing tables
pg_hostname = os.getenv("GE_TEST_LOCAL_DB_HOSTNAME", "localhost")
CONNECTION_STRING = f"postgresql+psycopg2://postgres:@{pg_hostname}/test_ci"
from sqlalchemy import create_engine
from sqlalchemy import inspect

engine = create_engine(CONNECTION_STRING)
insp = inspect(engine)
print(insp.get_table_names())

['yellow_tripdata_sample_2019', 'yellow_tripdata_sample_2020']


## Example Configuration

In our example, we add a `Datasource` named `taxi_multi_batch_sql_datasource` with 1 table. We also have a `ConfiguredAssetSqlDataConnector` named `configured_data_connector_multi_batch_asset`.

The DataConnector contains 2 `assets`, both associated with the `table_name` `yellow_tripdata_sample_2020`.

The asset `yellow_tripdata_sample_2020_full` contains no parameters other than the `table_name` and the optional `schema_name`, which means the whole table will be loaded as one Batch in the asset.

The asset `yellow_tripdata_sample_2020_by_year_and_month` contains `table_name` and `schema_name`, as well as a splitter configuration. The splitter we use is `split_on_year_and_month`, which creates Batches according to the `pickup_datetime` column which is of type timestamp in the database schema.
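To make the idea of splitting concrete, here is a minimal pure-Python sketch of grouping rows by the year and month of a timestamp column. It is only an illustration of the concept, not the way Great Expectations implements `split_on_year_and_month` (which runs against the database). The sample timestamps and the helper `split_rows_on_year_and_month` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical values standing in for the `pickup_datetime` column.
pickup_datetimes = [
    datetime(2020, 1, 5, 8, 30),
    datetime(2020, 1, 17, 22, 10),
    datetime(2020, 2, 1, 0, 5),
    datetime(2020, 3, 9, 14, 45),
]


def split_rows_on_year_and_month(timestamps):
    """Group timestamps into buckets keyed by (year, month)."""
    buckets = defaultdict(list)
    for ts in timestamps:
        buckets[(ts.year, ts.month)].append(ts)
    return dict(buckets)


batches = split_rows_on_year_and_month(pickup_datetimes)

# Shape the keys like the batch identifiers shown in this notebook's output.
batch_identifiers = [
    {"pickup_datetime": {"year": year, "month": month}}
    for year, month in sorted(batches)
]
print(batch_identifiers)
```

Each distinct (year, month) pair becomes one Batch, so a table covering a full calendar year would yield 12 of them.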

**Note**: This example only uses `splitters` but sampling can also be used. For more information, please refer to the document [How to configure a DataConnector for splitting and sampling tables in SQL](https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/advanced/how_to_configure_a_dataconnector_for_splitting_and_sampling_tables_in_sql)

In [34]:
datasource_config = {
    "name": "taxi_multi_batch_sql_datasource",
    "class_name": "Datasource",
    "module_name": "great_expectations.datasource",
    "execution_engine": {
        "module_name": "great_expectations.execution_engine",
        "class_name": "SqlAlchemyExecutionEngine",
        "connection_string": CONNECTION_STRING,
    },
    "data_connectors": {
        "configured_data_connector_multi_batch_asset": {
            "class_name": "ConfiguredAssetSqlDataConnector",
            "assets": {
                "yellow_tripdata_sample_2020_full": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                },
                "yellow_tripdata_sample_2020_by_year_and_month": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                    "splitter_method": "split_on_year_and_month",
                    "splitter_kwargs": {
                        "column_name": "pickup_datetime",
                    },
                },
            },
        },
    },
}

data_context.test_yaml_config(yaml.dump(datasource_config))

Attempting to instantiate class from config...
	Instantiating as a Datasource, since class_name is Datasource
	Successfully instantiated Datasource


ExecutionEngine class name: SqlAlchemyExecutionEngine
Data Connectors:
	configured_data_connector_multi_batch_asset : ConfiguredAssetSqlDataConnector

	Available data_asset_names (2 of 2):
		yellow_tripdata_sample_2020_by_year_and_month (3 of 5): [{'pickup_datetime': {'year': 2020, 'month': 5}}, {'pickup_datetime': {'year': 2020, 'month': 4}}, {'pickup_datetime': {'year': 2020, 'month': 3}}]
		yellow_tripdata_sample_2020_full (1 of 1): [{}]

	Unmatched data_references (0 of 0):[]



<great_expectations.datasource.new_datasource.Datasource at 0x7ff1c46ec160>

We can see that the configuration succeeded because the output shows 2 data assets:
- `yellow_tripdata_sample_2020_full` associated with 1 batch. 
- `yellow_tripdata_sample_2020_by_year_and_month` with 12 batches, each associated with a different month in our `pickup_datetime` column. 
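One way to sanity-check how many Batches a year-and-month splitter should produce is to count the distinct year-month values in the column yourself. The sketch below does this against an in-memory SQLite table that stands in for the Postgres table, with made-up rows; it is an assumption-laden illustration, not part of the Great Expectations API. On PostgreSQL an equivalent query would likely use `date_trunc('month', pickup_datetime)` instead of SQLite's `strftime`.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yellow_tripdata_sample_2020 (pickup_datetime TEXT)")

# Hypothetical rows: one trip per month of 2020, plus a second trip in January.
rows = [(f"2020-{month:02d}-15 12:00:00",) for month in range(1, 13)]
rows.append(("2020-01-20 08:00:00",))
conn.executemany("INSERT INTO yellow_tripdata_sample_2020 VALUES (?)", rows)

# Count the distinct year-month combinations in the splitter column.
query = """
    SELECT COUNT(DISTINCT strftime('%Y-%m', pickup_datetime))
    FROM yellow_tripdata_sample_2020
"""
(expected_batch_count,) = conn.execute(query).fetchone()
print(expected_batch_count)
conn.close()
```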

In [5]:
# add_datasource only if it doesn't already exist in our configuration
try:
    data_context.get_datasource(datasource_config["name"])
except ValueError:
    data_context.add_datasource(**datasource_config)

## BatchRequest

Depending on how you configured your assets, sending a `BatchRequest` will retrieve a different number of `Batches`.

Single Batch returned by `yellow_tripdata_sample_2020_full`

In [6]:
single_batch_batch_request: BatchRequest = BatchRequest(
    datasource_name="taxi_multi_batch_sql_datasource",
    data_connector_name="configured_data_connector_multi_batch_asset",
    data_asset_name="yellow_tripdata_sample_2020_full",
)

In [7]:
batch_list = data_context.get_batch_list(batch_request=single_batch_batch_request)

In [8]:
batch_list

[<great_expectations.core.batch.Batch at 0x7ff1c43336a0>]

Multi Batch returned by `yellow_tripdata_sample_2020_by_year_and_month`

In [31]:
multi_batch_batch_request: BatchRequest = BatchRequest(
    datasource_name="taxi_multi_batch_sql_datasource",
    data_connector_name="configured_data_connector_multi_batch_asset",
    data_asset_name="yellow_tripdata_sample_2020_by_year_and_month",
)

In [32]:
multi_batch_batch_list = data_context.get_batch_list(
    batch_request=multi_batch_batch_request
)

In [33]:
multi_batch_batch_list  # 12 batches

[<great_expectations.core.batch.Batch at 0x7ff1c46f4730>,
 <great_expectations.core.batch.Batch at 0x7ff1c429b940>,
 <great_expectations.core.batch.Batch at 0x7ff1c45794f0>,
 <great_expectations.core.batch.Batch at 0x7ff1c469bac0>,
 <great_expectations.core.batch.Batch at 0x7ff1c42aa370>]

You can also get a single Batch from a multi-batch DataConnector by passing in `data_connector_query`. 

In [12]:
single_batch_batch_request_from_multi: BatchRequest = BatchRequest(
    datasource_name="taxi_multi_batch_sql_datasource",
    data_connector_name="configured_data_connector_multi_batch_asset",
    data_asset_name="yellow_tripdata_sample_2020_by_year_and_month",
    data_connector_query={
        "batch_filter_parameters": {"pickup_datetime": {"year": 2020, "month": 1}}
    },
)

In [13]:
batch_list = data_context.get_batch_list(
    batch_request=single_batch_batch_request_from_multi
)

In [14]:
batch_list  # has a length of 1, as expected

[<great_expectations.core.batch.Batch at 0x7ff1dbe703d0>]
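Conceptually, `batch_filter_parameters` narrows the list of available Batches to those whose batch identifiers match the requested values. The following pure-Python sketch mimics that matching on plain dictionaries. The helper `filter_batches` is hypothetical and only illustrates the idea; the actual filtering logic in Great Expectations may differ.

```python
# Hypothetical batch identifiers, shaped like the ones in this notebook's output.
batch_identifiers = [
    {"pickup_datetime": {"year": 2020, "month": month}} for month in range(1, 13)
]


def filter_batches(identifiers, batch_filter_parameters):
    """Keep identifiers whose values equal every requested filter parameter."""
    return [
        identifier
        for identifier in identifiers
        if all(
            identifier.get(key) == value
            for key, value in batch_filter_parameters.items()
        )
    ]


selected = filter_batches(
    batch_identifiers, {"pickup_datetime": {"year": 2020, "month": 1}}
)
print(selected)
```

With an empty filter, every identifier matches, which corresponds to the multi-batch request shown earlier.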

Let's review our batch:

In [15]:
# our single filtered batch, with 'batch_identifiers': {'pickup_datetime': {'year': 2020, 'month': 1}}
batch = batch_list[0]

In [16]:
batch.to_dict()

{'data': '<great_expectations.execution_engine.sqlalchemy_batch_data.SqlAlchemyBatchData object at 0x7ff1c42fb340>',
 'batch_request': {'datasource_name': 'taxi_multi_batch_sql_datasource',
  'data_connector_name': 'configured_data_connector_multi_batch_asset',
  'data_asset_name': 'yellow_tripdata_sample_2020_by_year_and_month',
  'limit': None,
  'batch_spec_passthrough': None,
  'data_connector_query': {'batch_filter_parameters': {'pickup_datetime': {'year': 2020,
     'month': 1}}}},
 'batch_definition': {'datasource_name': 'taxi_multi_batch_sql_datasource',
  'data_connector_name': 'configured_data_connector_multi_batch_asset',
  'data_asset_name': 'yellow_tripdata_sample_2020_by_year_and_month',
  'batch_identifiers': {'pickup_datetime': {'year': 2020, 'month': 1}}},
 'batch_spec': {'data_asset_name': 'yellow_tripdata_sample_2020_by_year_and_month',
  'table_name': 'yellow_tripdata_sample_2020',
  'batch_identifiers': {'pickup_datetime': {'year': 2020, 'month': 1}},
  'class_name

# Using auto-initializing `Expectations` to generate parameters

We will generate a `Validator` using our `multi_batch_batch_list`.

In [17]:
multi_batch_batch_list = data_context.get_batch_list(
    batch_request=multi_batch_batch_request
)

In [18]:
example_suite = data_context.add_expectation_suite(
    expectation_suite_name="example_sql_suite"
)

In [19]:
validator = data_context.get_validator_using_batch_list(
    batch_list=multi_batch_batch_list, expectation_suite=example_suite
)

When you run methods on the Validator, it will typically run them on the most recent Batch (index `-1`), even if the Validator has access to a longer Batch list. For example, notice that the rows below are all associated with a `pickup_datetime` month of `9` (September 2020). This is because the datetime values are ordered lexicographically, meaning that `1`, `10`, `11`, and `12` appear **before** `2` and `3`.

For simplicity, let's get a `validator` with the December `Batch`, which is at index `3` (after `1`, `10`, and `11`). Notice that we also wrap the Batch in a `list` using square brackets.
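The ordering described above can be reproduced with plain Python strings. This small sketch only illustrates lexicographic versus numeric sorting of month values; it does not call Great Expectations.

```python
# Month numbers stored as strings, as in a lexicographically ordered list.
months_as_strings = [str(month) for month in range(1, 13)]

lexicographic = sorted(months_as_strings)
print(lexicographic)
# ['1', '10', '11', '12', '2', '3', '4', '5', '6', '7', '8', '9']

print(lexicographic[-1])  # '9'  -> September is the "last" entry
print(lexicographic[3])   # '12' -> December sits at index 3

# Sorting by integer value restores calendar order.
numeric = sorted(months_as_strings, key=int)
print(numeric[-1])  # '12'
```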

In [20]:
validator = data_context.get_validator_using_batch_list(
    batch_list=[multi_batch_batch_list[3]], expectation_suite=example_suite
)

In [21]:
validator.head()

Calculating Metrics:   0%|          | 0/1 [00:00<?, ?it/s]

| | vendor_id | pickup_datetime | dropoff_datetime | passenger_count | trip_distance | rate_code_id | store_and_fwd_flag | pickup_location_id | dropoff_location_id | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 2020-04-03 16:23:46 | 2020-04-03 16:34:42 | 5.0 | 2.26 | 1.0 | N | 263 | 41 | 1.0 | 10.0 | 1.0 | 0.5 | 2.86 | 0.0 | 0.3 | 17.16 | 2.5 |
| 1 | 1.0 | 2020-04-29 09:45:35 | 2020-04-29 09:48:36 | 0.0 | 0.4 | 1.0 | N | 43 | 151 | 1.0 | 4.0 | 0.0 | 0.5 | 0.0 | 0.0 | 0.3 | 4.8 | 0.0 |
| 2 | 1.0 | 2020-04-27 19:21:48 | 2020-04-27 19:29:07 | 1.0 | 2.8 | 1.0 | N | 140 | 107 | 1.0 | 10.0 | 3.5 | 0.5 | 3.55 | 0.0 | 0.3 | 17.85 | 2.5 |
| 3 | 2.0 | 2020-04-23 18:01:29 | 2020-04-23 18:06:14 | 1.0 | 1.03 | 1.0 | N | 142 | 238 | 1.0 | 6.0 | 1.0 | 0.5 | 2.58 | 0.0 | 0.3 | 12.88 | 2.5 |
| 4 | 1.0 | 2020-04-29 09:02:24 | 2020-04-29 09:08:48 | 1.0 | 0.8 | 1.0 | N | 161 | 100 | 1.0 | 6.0 | 2.5 | 0.5 | 2.3 | 0.0 | 0.3 | 11.6 | 2.5 |


### Typical Workflow
A `batch_list` becomes really useful when you are calculating parameters for auto-initializing Expectations, because they use a `RuleBasedProfiler` under the hood to calculate parameters.

Let's say we don't know the `min_value` and `max_value` for `expect_column_median_to_be_between()` so we "guess" at the `min_value` and `max_value`.

In [22]:
validator = data_context.get_validator_using_batch_list(
    batch_list=multi_batch_batch_list, expectation_suite=example_suite
)

In [23]:
validator.expect_column_median_to_be_between(
    column="trip_distance", min_value=0, max_value=1
)

Calculating Metrics:   0%|          | 0/9 [00:00<?, ?it/s]

{
  "success": false,
  "result": {
    "observed_value": 1.99
  },
  "meta": {},
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  }
}

The observed median of `trip_distance` for our `yellow_tripdata_sample_2020` is `1.99`, which falls outside the range we guessed, so the Expectation fails. We guessed wrong, but we can do better!

Now we run the same Expectation again, but this time with `auto=True`. This means the median values are going to be calculated across the `batch_list` associated with the `Validator` (i.e. 12 Batches for `yellow_tripdata_sample_2020`), which gives a min value of `1.6` and a max value of `1.99`.

In [24]:
validator.expect_column_median_to_be_between(column="trip_distance", auto=True)




Generating Expectations:   0%|          | 0/1 [00:00<?, ?it/s]

Profiling Dataset:         0%|          | 0/1 [00:00<?, ?it/s]

Calculating Metrics:   0%|          | 0/9 [00:00<?, ?it/s]

{
  "success": true,
  "expectation_config": {
    "expectation_type": "expect_column_median_to_be_between",
    "kwargs": {
      "column": "trip_distance",
      "min_value": 1.6,
      "max_value": 1.99,
      "strict_min": false,
      "strict_max": false
    },
    "meta": {
      "auto_generated_at": "20220913T024338.419229Z",
      "great_expectations_version": "0.15.22+29.g8fc1586df"
    }
  },
  "result": {
    "observed_value": 1.99
  },
  "meta": {},
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  }
}

Setting `auto=True` also automatically runs the Expectation against the most recent Batch (which has an observed value of `1.99`), and the Expectation passes.
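The idea behind estimating a range from multiple Batches can be sketched in a few lines of plain Python: compute the median per Batch, then take the smallest and largest of those medians as the bounds. This is a simplified illustration under stated assumptions. The per-batch values below are made up (chosen so that the bounds come out as `1.6` and `1.99`), and the estimator that the `RuleBasedProfiler` actually applies may differ.

```python
from statistics import median

# Made-up `trip_distance` samples per Batch (not real taxi data).
trip_distance_by_batch = {
    "2020-01": [1.2, 1.6, 2.4],
    "2020-02": [1.5, 1.8, 2.0],
    "2020-03": [1.9, 1.99, 3.1],
}

# One median per Batch.
medians = {
    batch_id: median(values) for batch_id, values in trip_distance_by_batch.items()
}

# The estimated range spans the observed per-batch medians.
min_value = min(medians.values())
max_value = max(medians.values())
print(min_value, max_value)

# Validate the most recent Batch against the estimated range.
latest_batch_id = max(medians)  # zero-padded keys sort chronologically
success = min_value <= medians[latest_batch_id] <= max_value
print(latest_batch_id, success)
```

Because the most recent Batch contributed to the estimate, its median necessarily lies inside the resulting range.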

You can now save the `ExpectationSuite`.

In [25]:
validator.save_expectation_suite()

### Running the `ExpectationSuite` against single `Batch`

The ExpectationSuite we built using all Batches can now be used to validate single Batches using a Checkpoint. For example, we can run this Checkpoint on new data when it comes in next month. In our example, let's validate a different Batch, from February 2020, using the ExpectationSuite we built from `yellow_tripdata_sample_2020`.



In [26]:
single_batch_batch_request_from_multi: BatchRequest = BatchRequest(
    datasource_name="taxi_multi_batch_sql_datasource",
    data_connector_name="configured_data_connector_multi_batch_asset",
    data_asset_name="yellow_tripdata_sample_2020_by_year_and_month",
    data_connector_query={
        "batch_filter_parameters": {"pickup_datetime": {"year": 2020, "month": 2}}
    },
)

In [27]:
checkpoint_config = {
    "name": "my_checkpoint",
    "config_version": 1,
    "class_name": "SimpleCheckpoint",
    "validations": [
        {
            "batch_request": single_batch_batch_request_from_multi,
            "expectation_suite_name": "example_sql_suite",
        }
    ],
}
data_context.add_checkpoint(**checkpoint_config)

{
  "action_list": [
    {
      "name": "store_validation_result",
      "action": {
        "class_name": "StoreValidationResultAction"
      }
    },
    {
      "name": "store_evaluation_params",
      "action": {
        "class_name": "StoreEvaluationParametersAction"
      }
    },
    {
      "name": "update_data_docs",
      "action": {
        "class_name": "UpdateDataDocsAction",
        "site_names": []
      }
    }
  ],
  "batch_request": {},
  "class_name": "Checkpoint",
  "config_version": 1.0,
  "evaluation_parameters": {},
  "module_name": "great_expectations.checkpoint",
  "name": "my_checkpoint",
  "profilers": [],
  "runtime_configuration": {},
  "validations": [
    {
      "batch_request": {
        "datasource_name": "taxi_multi_batch_sql_datasource",
        "data_connector_name": "configured_data_connector_multi_batch_asset",
        "data_asset_name": "yellow_tripdata_sample_2020_by_year_and_month",
        "data_connector_query": {
          "batch_filter_par

In [28]:
results = data_context.run_checkpoint(checkpoint_name="my_checkpoint")

Calculating Metrics:   0%|          | 0/9 [00:00<?, ?it/s]

In [29]:
results.success

True

# Appendix


## Other Parameters for `ConfiguredAssetSqlDataConnector`

The signature of the `ConfiguredAssetSqlDataConnector` also contains the following required parameters:
* `name`: The name of this DataConnector.
* `datasource_name`: The name of the Datasource that contains it.
* `execution_engine`: The type of ExecutionEngine to use.
* `assets`: The dictionary containing the asset configurations.

The `assets` dictionary can contain the following keys and values:
* `table_name`: string that defines the `table_name` associated with the asset. If table_name is omitted, then the `table_name` defaults to the asset name.
* `schema_name`: optional string that defines the `schema` for the asset.
* `include_schema_name`: A `bool` that determines, "Should the `data_asset_name` include the `schema` as a prefix?"
* `splitter_method`: string that names the method used to split the target table into multiple `Batches`.
* `splitter_kwargs`: dictionary with keyword arguments to pass to `splitter_method`.
* `sampling_method`: string that names the method used to downsample within a target `Batch`.
* `sampling_kwargs`: dictionary with keyword arguments to pass to `sampling_method`.
* `batch_spec_passthrough`: dictionary with keys that will be added directly to `batch_spec`.
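Since asset configurations are plain dictionaries, a misspelled key is easy to miss. The sketch below checks an `assets` dictionary against the keys listed above. The helper `find_unknown_asset_keys` is hypothetical (it is not part of Great Expectations), and the allowed set is taken only from this appendix, so the actual connector may accept additional keys.

```python
# Keys listed in this appendix; the real connector may accept others (assumption).
ALLOWED_ASSET_KEYS = {
    "table_name",
    "schema_name",
    "include_schema_name",
    "splitter_method",
    "splitter_kwargs",
    "sampling_method",
    "sampling_kwargs",
    "batch_spec_passthrough",
}


def find_unknown_asset_keys(assets):
    """Return {asset_name: sorted list of keys not in ALLOWED_ASSET_KEYS}."""
    unknown = {}
    for asset_name, asset_config in assets.items():
        extra = sorted(set(asset_config) - ALLOWED_ASSET_KEYS)
        if extra:
            unknown[asset_name] = extra
    return unknown


assets = {
    "yellow_tripdata_sample_2020_full": {
        "table_name": "yellow_tripdata_sample_2020",
        "schema_name": "public",
    },
    "yellow_tripdata_sample_2020_by_year_and_month": {
        "table_name": "yellow_tripdata_sample_2020",
        "schema_name": "public",
        "spliter_method": "split_on_year_and_month",  # deliberate typo
        "splitter_kwargs": {"column_name": "pickup_datetime"},
    },
}

print(find_unknown_asset_keys(assets))
```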


For more information on `splitters` and `samplers` please consider the following documentation: [How to configure a DataConnector for splitting and sampling tables in SQL](https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/advanced/how_to_configure_a_dataconnector_for_splitting_and_sampling_tables_in_sql)

## Configuring Splitters at `DataConnector` and `Asset`-Level

For `ConfiguredAssetSqlDataConnectors`, the `splitter_method` and `splitter_kwargs` can be configured at the `DataConnector`-level or `Asset`-level. 

#### Configuration at `DataConnector`-level


Here is a configuration with the splitter method `split_on_year_and_month` configured at the `DataConnector`-level for a `DataConnector` with 2 `Assets`, `yellow_tripdata_sample_2020_by_year_and_month` and `yellow_tripdata_sample_2020`:

In [30]:
datasource_config = {
    "name": "taxi_multi_batch_sql_datasource",
    "class_name": "Datasource",
    "module_name": "great_expectations.datasource",
    "execution_engine": {
        "module_name": "great_expectations.execution_engine",
        "class_name": "SqlAlchemyExecutionEngine",
        "connection_string": CONNECTION_STRING,
    },
    "data_connectors": {
        "configured_data_connector_multi_batch_asset": {
            "class_name": "ConfiguredAssetSqlDataConnector",
            "splitter_method": "split_on_year_and_month",
            "splitter_kwargs": {
                "column_name": "pickup_datetime",
            },
            "assets": {
                "yellow_tripdata_sample_2020_by_year_and_month": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                },
                "yellow_tripdata_sample_2020": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                },
            },
        },
    },
}

data_context.test_yaml_config(yaml.dump(datasource_config))

Attempting to instantiate class from config...
	Instantiating as a Datasource, since class_name is Datasource
	Successfully instantiated Datasource


ExecutionEngine class name: SqlAlchemyExecutionEngine
Data Connectors:
	configured_data_connector_multi_batch_asset : ConfiguredAssetSqlDataConnector

	Available data_asset_names (2 of 2):
		yellow_tripdata_sample_2020 (3 of 5): [{'pickup_datetime': {'year': 2020, 'month': 5}}, {'pickup_datetime': {'year': 2020, 'month': 4}}, {'pickup_datetime': {'year': 2020, 'month': 3}}]
		yellow_tripdata_sample_2020_by_year_and_month (3 of 5): [{'pickup_datetime': {'year': 2020, 'month': 5}}, {'pickup_datetime': {'year': 2020, 'month': 4}}, {'pickup_datetime': {'year': 2020, 'month': 3}}]

	Unmatched data_references (0 of 0):[]



<great_expectations.datasource.new_datasource.Datasource at 0x7ff1c4564070>

As you can see, both `Assets`, `yellow_tripdata_sample_2020_by_year_and_month` **and** `yellow_tripdata_sample_2020`, have the splitter method applied to them, meaning they both have 12 Batches as a result of splitting by `year` and `month`.

#### Configuration at `DataConnector`-level **and** `Asset`-level

Next we have a similar example, but with a second `splitter_method`, `split_on_year_and_month_and_day`, configured at the `Asset`-level for the Asset `yellow_tripdata_sample_2020_by_year_and_month_and_day`. In this case, the `Asset`-level configuration will **override** the configuration at the `DataConnector`-level and produce 366 Batches as a result of splitting by `year`, `month` and `day`.

In [None]:
datasource_config = {
    "name": "taxi_multi_batch_sql_datasource",
    "class_name": "Datasource",
    "module_name": "great_expectations.datasource",
    "execution_engine": {
        "module_name": "great_expectations.execution_engine",
        "class_name": "SqlAlchemyExecutionEngine",
        "connection_string": CONNECTION_STRING,
    },
    "data_connectors": {
        "configured_data_connector_multi_batch_asset": {
            "class_name": "ConfiguredAssetSqlDataConnector",
            "splitter_method": "split_on_year_and_month",
            "splitter_kwargs": {
                "column_name": "pickup_datetime",
            },
            "assets": {
                "yellow_tripdata_sample_2020_by_year_and_month": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                },
                "yellow_tripdata_sample_2020_by_year_and_month_and_day": {
                    "table_name": "yellow_tripdata_sample_2020",
                    "schema_name": "public",
                    "splitter_method": "split_on_year_and_month_and_day",
                    "splitter_kwargs": {
                        "column_name": "pickup_datetime",
                    },
                },
            },
        },
    },
}

data_context.test_yaml_config(yaml.dump(datasource_config))

As you can see, `yellow_tripdata_sample_2020_by_year_and_month` and `yellow_tripdata_sample_2020_by_year_and_month_and_day` each have a different number of Batches resulting from their different `splitter` configurations. 

* `yellow_tripdata_sample_2020_by_year_and_month` has 12 Batches. 
* `yellow_tripdata_sample_2020_by_year_and_month_and_day` has 366 Batches.
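The override behaviour and the resulting Batch counts can be sketched in plain Python. The helper `effective_splitter` below is hypothetical and only illustrates "asset-level wins over connector-level"; it is not the resolution code used by Great Expectations. The counts assume the table holds at least one trip on every day of 2020, which is a leap year.

```python
from datetime import date, timedelta

# Connector-level defaults and asset-level settings, mirroring the config above.
connector_defaults = {
    "splitter_method": "split_on_year_and_month",
    "splitter_kwargs": {"column_name": "pickup_datetime"},
}
assets = {
    "yellow_tripdata_sample_2020_by_year_and_month": {},
    "yellow_tripdata_sample_2020_by_year_and_month_and_day": {
        "splitter_method": "split_on_year_and_month_and_day",
        "splitter_kwargs": {"column_name": "pickup_datetime"},
    },
}


def effective_splitter(asset_config, defaults):
    """Asset-level keys take precedence; otherwise fall back to the defaults."""
    return {
        key: asset_config.get(key, defaults.get(key))
        for key in ("splitter_method", "splitter_kwargs")
    }


for asset_name, asset_config in assets.items():
    print(asset_name, effective_splitter(asset_config, connector_defaults))

# Count the distinct splits for a table with a trip on every day of 2020.
start, end = date(2020, 1, 1), date(2021, 1, 1)
days_in_2020 = [start + timedelta(days=i) for i in range((end - start).days)]
month_batches = {(d.year, d.month) for d in days_in_2020}
day_batches = {(d.year, d.month, d.day) for d in days_in_2020}
print(len(month_batches), len(day_batches))
```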

# Loading Data into Postgresql Database

* The following code can be used to build the postgres database used in this notebook. It is included (and commented out) for reference.
* In order to load the data into a local `postgresql` database, please feel free to use the `docker-compose.yml` file available at `great_expectations/assets/docker/postgresql/`. 

### To spin up the `postgresql` database
* Have [Docker Desktop](https://www.docker.com/products/docker-desktop/) running locally.
* Navigate to `great_expectations/assets/docker/postgresql/`
* Type `docker-compose up`
* Then uncomment and run the following snippet

In [36]:
# from tests.test_utils import load_data_into_test_database
# from typing import List
# import sqlalchemy as sa
# import pandas as pd
# pg_hostname = os.getenv("GE_TEST_LOCAL_DB_HOSTNAME", "localhost")
# CONNECTION_STRING = f"postgresql+psycopg2://postgres:@{pg_hostname}/test_ci"
#
# data_paths: List[str] = [
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-01.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-02.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-03.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-04.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-05.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-06.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-07.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-08.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-09.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-10.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-11.csv",
#      "../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-12.csv",
# ]
#
#
# engine = sa.create_engine(CONNECTION_STRING)
# connection = engine.connect()
# table_name = "yellow_tripdata_sample_2020"
# res = connection.execute(f"DROP TABLE IF EXISTS {table_name}")
#
# for data_path in data_paths:
#     # This utility is not for general use. It is only to support testing.
#     load_data_into_test_database(
#         table_name="yellow_tripdata_sample_2020",
#         csv_path=data_path,
#         connection_string=CONNECTION_STRING,
#         load_full_dataset=True,
#         drop_existing_table=False,
#         convert_colnames_to_datetime=["pickup_datetime", "dropoff_datetime"]
#     )

Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-01.csv']
Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-02.csv']
Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-03.csv']
Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-04.csv']
Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-05.csv']
Adding to existing table yellow_tripdata_sample_2020 and adding data from ['../../../test_sets/taxi_yellow_tripdata_samples/yellow_tripdata_sample_2020-06.csv']
Adding to existing table yellow_tr