FeatureGroup().ingest() throws "OSError" - "Function not implemented" inside Lambda Function #2844

@makennedy626

Description

Describe the bug
This error is thrown when FeatureGroup().ingest() runs inside a Docker-based Lambda Function invoked as the first step of my State Machine. Please see below for additional information.

To reproduce

  1. Create a Lambda Function that uses a Docker image configured as described under System information below.
  2. In your function code, import FeatureGroup (from sagemaker.feature_store.feature_group import FeatureGroup) and call FeatureGroup().ingest().
  3. Test the function - I was able to reproduce the error in the Lambda Console via the Test tab.
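The traceback bottoms out in _multiprocessing.SemLock: the Lambda execution environment does not implement POSIX semaphores, so creating any multiprocessing lock fails with Errno 38. A minimal probe (semlock_supported is a hypothetical helper name, not part of the SDK) that distinguishes the two environments:

```python
import multiprocessing


def semlock_supported() -> bool:
    """Return True if the runtime can create a POSIX semaphore.

    multiprocessing.Lock() is backed by _multiprocessing.SemLock; on AWS
    Lambda it raises OSError: [Errno 38] Function not implemented.
    """
    try:
        multiprocessing.Lock()
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # True on a typical Linux/macOS host; False inside a Lambda container.
    print(semlock_supported())
```

Running this inside the same Docker Lambda image reproduces the failure condition without involving the SageMaker SDK at all.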

Expected behavior
Feature Group should successfully ingest the data.

Screenshots or logs

{
  "error": "OSError",
  "cause": {
    "errorMessage": "[Errno 38] Function not implemented",
    "errorType": "OSError",
    "requestId": <REDACTED>,
    "stackTrace": [
      "  File \"/var/task/app.py\", line 298, in lambda_handler\n    master()\n",
      "  File \"/var/task/app.py\", line 268, in master\n    <REDACTED_FEATURE_GROUP_NAME>.ingest(data_frame=df2, max_workers=1, wait=True)\n",
      "  File \"/var/task/sagemaker/feature_store/feature_group.py\", line 627, in ingest\n    manager.run(data_frame=data_frame, wait=wait, timeout=timeout)\n",
      "  File \"/var/task/sagemaker/feature_store/feature_group.py\", line 371, in run\n    self._run_multi_process(data_frame=data_frame, wait=wait, timeout=timeout)\n",
      "  File \"/var/task/sagemaker/feature_store/feature_group.py\", line 297, in _run_multi_process\n    self._processing_pool = ProcessingPool(self.max_processes, init_worker)\n",
      "  File \"/var/task/pathos/multiprocessing.py\", line 111, in __init__\n    self._serve()\n",
      "  File \"/var/task/pathos/multiprocessing.py\", line 123, in _serve\n    _pool = Pool(nodes)\n",
      "  File \"/var/task/multiprocess/pool.py\", line 191, in __init__\n    self._setup_queues()\n",
      "  File \"/var/task/multiprocess/pool.py\", line 343, in _setup_queues\n    self._inqueue = self._ctx.SimpleQueue()\n",
      "  File \"/var/task/multiprocess/context.py\", line 113, in SimpleQueue\n    return SimpleQueue(ctx=self.get_context())\n",
      "  File \"/var/task/multiprocess/queues.py\", line 345, in __init__\n    self._rlock = ctx.Lock()\n",
      "  File \"/var/task/multiprocess/context.py\", line 68, in Lock\n    return Lock(ctx=self.get_context())\n",
      "  File \"/var/task/multiprocess/synchronize.py\", line 168, in __init__\n    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)\n",
      "  File \"/var/task/multiprocess/synchronize.py\", line 63, in __init__\n    sl = self._semlock = _multiprocessing.SemLock(\n"
    ]
  }
}

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.70.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.9
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): Y - public.ecr.aws/lambda/python:3.9

Additional context

  1. No issues when running this locally in the debugger (outside a Docker container).
  2. I have seen similar errors related to _multiprocessing.SemLock in the kedro repository, where Lambda Functions connected via a State Machine hit the same failure; as far as I know they were unable to resolve or circumvent it, so a resolution here could be applicable / helpful to users of many different packages.
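A possible workaround, sketched here under the assumption that single-process throughput is acceptable: bypass FeatureGroup.ingest() (and its multiprocessing pool) and write rows directly with the Feature Store runtime's PutRecord API via boto3. The helpers below (row_to_record, ingest_rows) are hypothetical names for illustration; sagemaker-featurestore-runtime and put_record are real boto3 APIs.

```python
# Pool-free ingest sketch: no ProcessingPool, so no SemLock is ever created.
# row_to_record() and ingest_rows() are hypothetical helper names.

def row_to_record(row: dict) -> list:
    """Convert one row (column -> value) to the PutRecord payload shape."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in row.items()
        if value is not None  # skip empty values rather than send them
    ]


def ingest_rows(client, feature_group_name: str, rows: list) -> None:
    """Write rows one at a time via PutRecord; no multiprocessing involved."""
    for row in rows:
        client.put_record(
            FeatureGroupName=feature_group_name,
            Record=row_to_record(row),
        )


# Usage inside the Lambda handler (requires AWS credentials and a real
# feature group; feature group name below is a placeholder):
#   import boto3
#   client = boto3.client("sagemaker-featurestore-runtime")
#   ingest_rows(client, "my-feature-group", df2.to_dict(orient="records"))
```

This trades ingest()'s parallelism for compatibility with Lambda's restricted runtime, which may be an acceptable trade for small batches.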
