Lambda Execution Issues #33

Open
puzzlepeaches opened this issue Feb 4, 2024 · 0 comments

Hey there! Awesome library! I am running into some issues. I hope the community here can help me troubleshoot them. I am attempting to run hrequests in Lambda to interact with specific web pages when a function URL is called.

I am using the AWS SDK to push a Docker image similar to the following to ECR and deploy it to Lambda:

FROM mcr.microsoft.com/playwright/python:v1.34.0-jammy

# Include global arg in this stage of the build
ARG FUNCTION_DIR

RUN mkdir -p ${FUNCTION_DIR}

COPY app.py ${FUNCTION_DIR}

WORKDIR /app

COPY ./mytool/pyproject.toml ./mytool/poetry.lock /app/

COPY ./mytool/. /app

# Install dependencies using poetry
RUN pip install --no-cache-dir poetry awslambdaric aws-xray-sdk sh \
    && poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi

RUN python -m playwright install-deps
RUN python -m playwright install

WORKDIR ${FUNCTION_DIR}

ENTRYPOINT [ "/usr/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]

When the function URL is called, awslambdaric invokes the handler in an app.py file similar to the following:

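import json
import logging
import os
import subprocess

from aws_xray_sdk.core import xray_recorder

logger = logging.getLogger(__name__)

# validate_headers is a helper defined elsewhere in the app (not shown)


def handler(event, context):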
    logger.debug(msg=f"Initial event: {event}")

    headers = event["headers"]
    header_validation = validate_headers(headers)

    input = headers["x-input"]
    try:
        command = headers["x-command"].split()
        command.extend(input.split())
    except Exception as e:
        logger.error(msg=f"Error parsing command: {e}")
        return {
            "statusCode": 500,
            "body": f"Error parsing command: {e}",
        }

    parsed = []
    try:
        logger.debug(msg=f"Running command: {command}")

        # Set HOME=/tmp to avoid writing to the container filesystem
        # Set LD_LIBRARY_PATH to include /usr/lib64 to avoid issues with the AWS X-Ray daemon
        os.environ["HOME"] = "/tmp"
        os.environ["LD_LIBRARY_PATH"] = "/usr/lib64"

        results = subprocess.run(command, capture_output=True, text=True, env=os.environ.copy())
        logger.debug(msg=f"Results stdout: {results.stdout}")
        logger.debug(msg=f"Results stderr: {results.stderr}")
        logger.debug(msg=f"Command exited with code: {results.returncode}")

    except subprocess.TimeoutExpired as e:
        logger.error(msg=f"Command timed out: {e}")
        return {
            "statusCode": 408,  # HTTP status code for Request Timeout
            "body": json.dumps({
                "stdout": str(e.stdout),
                "stderr": str(e.stderr),
                "e": str(e),
                "error": "Command timed out"
            }),
        }
    except Exception as e:
        logger.error(msg=f"Error executing command: {e}")
        return {
            "statusCode": 500,
            "body": f"Error executing command: {e}",
        }

    try:
        for line in results.stdout.splitlines():
            parsed_json = json.loads(line)
            logger.debug(msg=f"Output: {parsed_json}")
            parsed.append(parsed_json)
    except Exception as e:
        logger.error(msg=f"Error parsing output: {e}")
        return {
            "statusCode": 500,
            "body": f"Error parsing output: {e}",
        }
    
    xray_recorder.end_segment()

    return {"statusCode": 200, "body": json.dumps(parsed)}
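Note that `subprocess.run` only raises `subprocess.TimeoutExpired` when a `timeout` argument is passed, so the 408 branch above cannot fire as written and a hung child will run until Lambda itself times out. A minimal sketch of one way to bound the child process follows. The helper name `run_with_deadline` and the safety margin are assumptions for illustration, not part of the original app; inside a handler the budget could be derived from `context.get_remaining_time_in_millis() / 1000`.

```python
import subprocess
import sys


def run_with_deadline(command, budget_seconds, margin_seconds=1.0):
    """Run command, stopping it shortly before the time budget runs out.

    Hypothetical helper: returns (completed_process, None) on a normal exit,
    or (None, TimeoutExpired) if the child exceeded the deadline.
    """
    # Leave a margin so the caller still has time to build a response.
    timeout = max(budget_seconds - margin_seconds, 0.1)
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        return completed, None
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the direct child before re-raising.
        return None, exc


if __name__ == "__main__":
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    result, timed_out = run_with_deadline(slow, budget_seconds=1.5)
    print("timed out:", timed_out is not None)
```

Only the direct child is killed on timeout; processes it spawned itself, such as a browser, may need separate cleanup.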

This app.py code calls a separate tool I have created that uses hrequests to navigate and interact with web pages. When the handler is invoked through the function URL, however, the tool's stderr contains the following error, raised from within hrequests:

Exception in thread Thread-1 (spawn_main):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hrequests/browser.py", line 128, in spawn_main
    asyncio.new_event_loop().run_until_complete(self.main())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/hrequests/browser.py", line 135, in main
    self.context = await self.client.new_context(
  File "/usr/local/lib/python3.10/dist-packages/hrequests/playwright_mock/playwright_mock.py", line 38, in new_context
    _browser = await context.new_context(
  File "/usr/local/lib/python3.10/dist-packages/hrequests/playwright_mock/context.py", line 6, in new_context
    context = await inst.main_browser.new_context(
  File "/usr/local/lib/python3.10/dist-packages/playwright/async_api/_generated.py", line 14154, in new_context
    await self._impl_obj.new_context(
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_browser.py", line 127, in new_context
    channel = await self._channel.send("newContext", params)
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 482, in wrap_api_call
    return await cb()
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 97, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Target page, context or browser has been closed
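This error generally means the browser process went away while Playwright was still sending it commands. In restricted containers, one frequently reported cause is Chromium exiting because of a small or missing `/dev/shm`; whether that applies here is unconfirmed. A minimal sketch of a standalone probe follows. It launches Chromium directly through Playwright, bypassing hrequests, to check whether the browser itself survives in the Lambda environment. The function names and the choice of flags are assumptions for illustration, not hrequests configuration.

```python
def lambda_chromium_args():
    """Chromium switches often used in restricted containers (an assumption here)."""
    return [
        "--no-sandbox",             # run without the Chromium sandbox
        "--disable-dev-shm-usage",  # write shared memory files to /tmp, not /dev/shm
        "--disable-gpu",            # no GPU is available in this environment
    ]


def probe_chromium(url="about:blank"):
    """Launch Chromium, open one page, and return the URL the page ended up on."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, args=lambda_chromium_args())
        try:
            context = browser.new_context()
            page = context.new_page()
            page.goto(url)
            return page.url
        finally:
            browser.close()


if __name__ == "__main__":
    print("page loaded:", probe_chromium())
```

If the probe fails in Lambda with the same "has been closed" error, the problem likely lies in the browser launch rather than in hrequests or the calling tool.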

Some notes on what has already been attempted:

  • The container image runs fine on my local system with similar resource allocations specified.
  • I can call my tool remotely, and it appears to run partially before hitting this exception.
  • I have increased the memory allocation of the Lambda function several times without success.
  • My tool always hits the configured Lambda timeout, no matter how high I set it, so I suspect this error occurs and then hangs the application entirely.
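To compare the working local run with the failing Lambda run, it may help to log a few environment facts at the start of the handler. A minimal sketch follows; the helper name `runtime_snapshot` and the selected fields are illustrative assumptions. `AWS_LAMBDA_FUNCTION_MEMORY_SIZE` is set by the Lambda runtime and will normally be absent locally.

```python
import json
import os
import shutil
import tempfile


def runtime_snapshot():
    """Collect environment facts that tend to differ between local Docker and Lambda."""
    tmp = tempfile.gettempdir()
    usage = shutil.disk_usage(tmp)
    return {
        "home": os.environ.get("HOME"),
        "ld_library_path": os.environ.get("LD_LIBRARY_PATH"),
        "lambda_memory_mb": os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE"),
        "tmp_dir": tmp,
        "tmp_free_mb": usage.free // (1024 * 1024),
        "tmp_writable": os.access(tmp, os.W_OK),
        "dev_shm_exists": os.path.isdir("/dev/shm"),
        "cpu_count": os.cpu_count(),
    }


if __name__ == "__main__":
    # Emit as JSON so the values are easy to find in the logs.
    print(json.dumps(runtime_snapshot(), indent=2))
```

Running the same snapshot locally and in Lambda gives a side-by-side view of writable paths, free space, and shared memory availability.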

I am not experienced with Playwright or headless browsers, so any help would be greatly appreciated. I understand this may not be directly related to hrequests, but I hope the community here is familiar enough with these frameworks to assist. Thanks!
