Unable to easily mock Athena result CSV #7702

dmarra · 2024-05-17T18:49:32Z

We are able to mock Athena query results via:

   resp = requests.post(
        "http://motoapi.amazonaws.com/moto-api/static/athena/query-results",
        json=query_results,
    )
    assert resp.status_code == 201

But there is no way to mock the output of get_execution_results. this poses a few problems:

unable to have any control over request status (in cases where you want to test that retries are happening for instance)
unable to have any control over the output location

I am more interested in the latter. This would be a lot more useful if we could somehow assign results via a CSV, and have the query-results mocked from that. But in the simplest case, being able to control that datapoint to point to a file we have put to a mock bucket
would go a long way in the case where the code under test needs to access the file directly.

Here is an example of how I had to hack this into my tests:

from moto.athena import athena_backends
from moto.athena.models import Execution
from moto.core import DEFAULT_ACCOUNT_ID

@pytest.fixture
@mock_aws
def mock_athena_results():
    # place the mocked athena results in "s3"
    s3 = boto3.client("s3", region_name=os.getenv("AWS_DEFAULT_REGION"))
    bucket_name = 'test-bucket'
    key = f"query_results/output.csv"
  
    with open('some/csv/file.csv', "r") as f:
        contents = f.read()
        s3.put_object(Bucket=bucket_name, Key=key, Body=contents)

    # Hack moto to mock the execution results to have the correct path to our csv results
    exec = Execution(
        query="SELECT * FROM notused",
        config={
            "OutputLocation": f"s3://test-bucket/query_results/output.csv"
        },
        context={
            'Database': 'default'
        },
        workgroup={},
        execution_parameters={}
    )
    # moto tries to mock the file location to _look_ correct, but we need it to be an actual result
    exec.config['OutputLocation'] = f"s3://test-bucket/query_results/output.csv"

    # This is a hack to get moto to return the correct execution object by having a factory that
    # pretends to be a list of executions
    class MockExecutionFactory:
        def __get__(self, instance, owner):
            return [
                exec
            ]

        def __set__(self, instance, value):
            pass

        def __setitem__(self, key, value):
            pass

        def __getitem__(self, key):
            return exec

    athena_backends[DEFAULT_ACCOUNT_ID]["us-east-1"].executions = MockExecutionFactory()
    return

The above works, but its pretty janky. I suppose this is more of a feature request than anything. Or even better yet, having a way to control execution id (not with seeding, thats still too unpredictable and fragile) such that it can be explicitly set.

The text was updated successfully, but these errors were encountered:

bblommers · 2024-05-19T11:20:49Z

Hi @dmarra, thanks for raising this.

My assumption is that the application takes whatever OutputLocation is given, downloads the result from S3 and does whatever it needs to do with the result.

Would it work if Moto would actually write the configured query results to S3? Then the actual OutputLocation doesn't matter, and the only test setup is to configure the expected results via the static/athena/query-results endpoint.

dmarra · 2024-05-20T11:54:40Z

That could work! The only issue with that is that the Athena output JSON is.... perhaps a bit overengineered. It's a pain to mock many rows of data in that format. But that being said, if this is the easiest way to get the enhancement in, then I think its way better than nothing. Makes sense too, since it works with the current design.

dmarra · 2024-05-20T11:57:09Z

To clarify:

It would have to respect the bucket given as output (but the file name could be the auto-generated execution ID)
The output format is CSV. I don't think Athena has a way to change that, so it should be fine to just always have the output in CSV format in the bucket

bblommers added the enhancement label May 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to easily mock Athena result CSV #7702

Unable to easily mock Athena result CSV #7702

dmarra commented May 17, 2024 •

edited

bblommers commented May 19, 2024

dmarra commented May 20, 2024

dmarra commented May 20, 2024

Unable to easily mock Athena result CSV #7702

Unable to easily mock Athena result CSV #7702

Comments

dmarra commented May 17, 2024 • edited

bblommers commented May 19, 2024

dmarra commented May 20, 2024

dmarra commented May 20, 2024

dmarra commented May 17, 2024 •

edited