API triggered document ingestion on Windows causes permission error #1227

Closed
jaydavid opened this issue Nov 13, 2023 · 4 comments · Fixed by #1260 or #1280

Comments

@jaydavid

jaydavid commented Nov 13, 2023

When running locally on a Windows 11 machine, I am able to interact with the UI and upload files without issue.

However, when another application calls the API ingestion endpoint /v1/ingest, the request fails with a permission error on the temporary file that is created to facilitate the ingestion. A snippet of the error:

  File "C:\repos\privateGPT\private_gpt\server\ingest\ingest_router.py", line 38, in ingest
    ingested_documents = service.ingest(file.filename, file.file.read())
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\privateGPT\private_gpt\server\ingest\ingest_service.py", line 118, in ingest
    path_to_tmp.write_bytes(file_data)
  File "C:\Users\Jay\AppData\Local\Programs\Python\Python311\Lib\pathlib.py", line 1067, in write_bytes
    with self.open(mode='wb') as f:
         ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\Programs\Python\Python311\Lib\pathlib.py", line 1044, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Jay\\AppData\\Local\\Temp\\tmpxq8kqnc9'

Here is the demo code that I'm using to exercise the API. It runs on the Node.js side of a Next.js app.

    let file = await fileRepo.get(fileId);
    let formData = new FormData();
    formData.append("file", base64toBlob(file.data), file.name);
    console.log("ingesting file...");
    const response = await fetch("http://127.0.0.1:8001/v1/ingest", {
      method: "POST",
      body: formData,
    });

I can work around the issue locally with the following change to /private_gpt/server/ingest/ingest_service.py. I believe the error comes from the way Python handles temporary files on Windows: the file created by NamedTemporaryFile cannot be opened a second time by name while the original handle is still open.

Current:

                with tempfile.NamedTemporaryFile() as tmp:
                    path_to_tmp = Path(tmp.name)
                    if isinstance(file_data, bytes):
                        path_to_tmp.write_bytes(file_data)
                    else:
                        path_to_tmp.write_text(str(file_data))
                    documents = reader.load_data(path_to_tmp)
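
The behaviour described above can be checked in isolation. The following is a minimal sketch, not code from privateGPT: the helper name can_reopen_by_name is hypothetical, and it only probes whether the platform allows a second open by name while the NamedTemporaryFile handle is still held.

```python
import tempfile
from pathlib import Path


def can_reopen_by_name() -> bool:
    """Return True if an open NamedTemporaryFile can be reopened via its path."""
    with tempfile.NamedTemporaryFile() as tmp:
        try:
            # Same pattern as the "Current" snippet: a second open by name
            # while `tmp` still holds the file open.
            Path(tmp.name).write_bytes(b"probe")
        except PermissionError:
            return False
        return True


if __name__ == "__main__":
    print("reopen by name allowed:", can_reopen_by_name())
```

The Python documentation for tempfile notes that the name can be used for a second open on Unix but not on Windows while the file is still open, which would match the [Errno 13] in the traceback.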

Updated:

                with tempfile.NamedTemporaryFile(delete=False) as tmp:
                    tmp.close()
                    path_to_tmp = Path(tmp.name)
                    if isinstance(file_data, bytes):
                        path_to_tmp.write_bytes(file_data)
                    else:
                        path_to_tmp.write_text(str(file_data))
                    documents = reader.load_data(path_to_tmp)

I'm sure there are other implications to this change, and this is likely not a complete solution, but I wanted to at least get this issue tracked if this can help anyone else.
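
One implication of delete=False is that Python no longer removes the file automatically, so something has to delete it afterwards. A hedged sketch of one way to handle that follows. The function name write_to_closed_tmp is hypothetical, and read_bytes stands in for reader.load_data, which is not reproduced here.

```python
import tempfile
from pathlib import Path
from typing import Union


def write_to_closed_tmp(file_data: Union[bytes, str]) -> bytes:
    """Write file_data to a temp file, read it back, and always clean up."""
    # delete=False: Python will not remove the file for us on close.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    path_to_tmp = Path(tmp.name)
    try:
        # Release the handle first so the path can be reopened on Windows.
        tmp.close()
        if isinstance(file_data, bytes):
            path_to_tmp.write_bytes(file_data)
        else:
            path_to_tmp.write_text(str(file_data))
        # Stand-in for reader.load_data(path_to_tmp) from the issue.
        return path_to_tmp.read_bytes()
    finally:
        # Explicit cleanup, since delete=False disables automatic removal.
        path_to_tmp.unlink(missing_ok=True)
```

Closing before writing and unlinking in a finally block keeps the temp directory clean even if the reader raises.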

@pabloogc
Collaborator

Thanks for suggesting the fix, I will open a PR with it.

@pabloogc
Collaborator

@jaydavid can you check if #1260 solves it?

@jaydavid
Author

jaydavid commented Nov 17, 2023

@pabloogc

Unfortunately, the branch does not resolve the issue. Here's a copy of the console output I get when I interact via the API:

12:18:43.557 [INFO    ]            uvicorn.access - 127.0.0.1:53592 - "POST /v1/ingest HTTP/1.1" 500
12:18:43.557 [ERROR   ]             uvicorn.error - Exception in ASGI application
Traceback (most recent call last):
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\uvicorn\protocols\http\httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\fastapi\applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
    raise exc
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 20, in __call__
    raise e
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\fastapi\routing.py", line 273, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\fastapi\routing.py", line 192, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\Jay\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt-9Rr4HheO-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\strasz-repos\privateGPT\private_gpt\server\ingest\ingest_router.py", line 38, in ingest
    ingested_documents = service.ingest(file.filename, file.file.read())
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\strasz-repos\privateGPT\private_gpt\server\ingest\ingest_service.py", line 125, in ingest
    path_to_tmp.unlink()
  File "C:\Users\Jay\AppData\Local\Programs\Python\Python311\Lib\pathlib.py", line 1147, in unlink
    os.unlink(self)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Jay\\AppData\\Local\\Temp\\tmpe1n9k975'

If I add a tmp.close() at the beginning of the try block, the issue is resolved, but I am not sure whether this has implications outside of a Windows environment. I hope this info is helpful.
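
For anyone who wants to avoid platform-specific handle behaviour altogether, one alternative is to write into a temporary directory instead of a NamedTemporaryFile. This is only a sketch under assumptions, not the fix adopted in the linked PRs: load_via_tmp_dir is a hypothetical name, and read_bytes again stands in for reader.load_data.

```python
import tempfile
from pathlib import Path


def load_via_tmp_dir(file_name: str, file_data: bytes) -> bytes:
    """Write file_data into a fresh temp directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep only the final path component of the upload name;
        # fall back to a fixed name if it is empty.
        safe_name = Path(file_name).name or "upload"
        path_to_tmp = Path(tmp_dir) / safe_name
        # write_bytes opens and closes its own handle, so nothing stays open.
        path_to_tmp.write_bytes(file_data)
        # Stand-in for reader.load_data(path_to_tmp) from the issue.
        return path_to_tmp.read_bytes()
```

The directory and its contents are removed when the with block exits, and no file handle is left open at that point.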

@pabloogc
Collaborator

Interesting. Maybe the issue is caused by pathlib, and it works fine with a regular close.
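
The traceback above shows Path.unlink calling os.unlink, so one way to check the pathlib hypothesis is to try both calls while the handle is still open. The probe below is a hypothetical diagnostic, not project code. If both variants return the same value on a given machine, the open handle rather than pathlib would be the more likely cause.

```python
import os
import tempfile
from pathlib import Path


def unlink_while_open(use_pathlib: bool) -> bool:
    """Return True if the temp file could be deleted while its handle is open."""
    tmp = tempfile.NamedTemporaryFile(delete=False)
    path = Path(tmp.name)
    try:
        if use_pathlib:
            path.unlink()
        else:
            os.unlink(tmp.name)
        return True
    except PermissionError:
        return False
    finally:
        # Close the handle, then remove the file if it is still there.
        tmp.close()
        path.unlink(missing_ok=True)


if __name__ == "__main__":
    print("pathlib:", unlink_while_open(True))
    print("os.unlink:", unlink_while_open(False))
```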
