# Amazon Athena Notebook

This notebook demonstrates how to load documents from Amazon Athena using the `AthenaLoader` class.

>[Amazon Athena](https://aws.amazon.com/athena/) is a serverless, interactive analytics service built
>on open-source frameworks, supporting open-table and file formats. Athena provides a simplified,
>flexible way to analyze petabytes of data where it lives. Analyze data or build applications
>from an Amazon Simple Storage Service (S3) data lake and 30 data sources, including on-premises data
>sources or other cloud systems using SQL or Python. Athena is built on open-source Trino
>and Presto engines and Apache Spark frameworks, with no provisioning or configuration effort required.

Before running this notebook, follow the [instructions to set up an AWS account](https://docs.aws.amazon.com/athena/latest/ug/setting-up.html) and install the required Python library.

```bash
! pip install boto3
```
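
Before going further, it can help to confirm that the packages this notebook imports are actually installed. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the inclusion of `langchain_community` is an assumption based on the import used later in this notebook.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages this notebook appears to rely on (assumed from its import statements).
required = ["boto3", "langchain_community"]
missing = missing_packages(required)

if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are installed.")
```

If anything is reported as missing, install it with `pip` before running the remaining cells.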

> ⚠️ **Note**: Replace `my_database`, `my_bucket`, `my_table`, and `my_profile` with your own values.



In [None]:
import sys

def print_error_and_exit(message):
    """Print error message and exit the notebook."""
    print(f'[!] {message}\n', file=sys.stderr)
    sys.exit(1)

def check_aws_credentials():
    """Check if AWS credentials are set up correctly."""
    try:
        import boto3
        boto3.client('s3')
    except Exception as e:
        print_error_and_exit(f'Error initializing AWS SDK: {str(e)}')


In [None]:
check_aws_credentials()

from langchain_community.document_loaders.athena import AthenaLoader

In [None]:
database_name = "my_database"
s3_output_path = "s3://my_bucket/query_results/"
query = "SELECT * FROM my_table"
profile_name = "my_profile"

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
)

try:
    documents = loader.load()
    print("Documents loaded successfully:")
    print(documents)
except Exception as e:
    print_error_and_exit(f'Error loading documents: {str(e)}')
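
A malformed S3 output location is an easy mistake to make when replacing the placeholder values. As a rough sketch, the URI can be sanity-checked locally before the query is submitted. The helper `split_s3_uri` below is hypothetical and not part of `AthenaLoader` or boto3; it only checks the general `s3://bucket/prefix/` shape and does not verify that the bucket exists.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key_prefix), or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected a URI like s3://bucket/prefix/, got {uri!r}")
    # The path starts with "/", which is not part of the key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_uri("s3://my_bucket/query_results/")
print(f"bucket={bucket}, prefix={prefix}")
```

Calling `split_s3_uri(s3_output_path)` before constructing the loader surfaces a missing `s3://` scheme immediately, rather than after the query has been sent.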


## Example with metadata columns

In [None]:
database_name = "my_database"
s3_output_path = "s3://my_bucket/query_results/"
query = "SELECT * FROM my_table"
profile_name = "my_profile"
metadata_columns = ["_row", "_created_at"]

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
    metadata_columns=metadata_columns,
)

try:
    documents = loader.load()
    print("Documents with metadata loaded successfully:")
    print(documents)
except Exception as e:
    print_error_and_exit(f'Error loading documents with metadata: {str(e)}')
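
To make the effect of `metadata_columns` more concrete, here is a plain-Python illustration of the idea: columns named in `metadata_columns` go into a document's metadata, and the remaining columns form its text content. This is a sketch under assumptions, not the loader's actual implementation; the helper `split_row`, the sample row, and the `key: value` text layout are all hypothetical.

```python
def split_row(row, metadata_columns):
    """Split one result row (a dict) into (page_content, metadata)."""
    # Columns listed in metadata_columns are kept as structured metadata.
    metadata = {k: row[k] for k in metadata_columns if k in row}
    # Every other column is rendered into the text content (assumed layout).
    content = "\n".join(
        f"{k}: {v}" for k, v in row.items() if k not in metadata_columns
    )
    return content, metadata


# Hypothetical row, standing in for one record returned by the query.
row = {"_row": 1, "_created_at": "2024-01-01", "title": "Hello", "body": "World"}
content, metadata = split_row(row, ["_row", "_created_at"])

print(content)
print(metadata)
```

In the real output, the equivalent information is available on each loaded document through its `page_content` and `metadata` attributes.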
