Upsert API fails with graph config while performing after /delete #435

Closed
akset2X opened this issue Feb 21, 2023 · 4 comments

@akset2X

akset2X commented Feb 21, 2023

I get the following error when I call the /upsert API on an index that uses a graph configuration.

#config.yml
# Index file path
path: ./tmp/index

# Allow indexing of documents
writable: True

# Embeddings index
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: True
  functions:
  - name: graph
    function: graph.attribute
  expressions:
  - name: category
    expression: graph(indexid, 'category')
  - name: topic
    expression: graph(indexid, 'topic')
  - name: topicrank
    expression: graph(indexid, 'topicrank')
  graph:
    limit: 15
    minscore: 0.1
    topics:
      categories:
      - Society & Culture
      - Science & Mathematics
      - Health
      - Education & Reference
      - Computers & Internet
      - Sports
      - Business & Finance
      - Entertainment & Music
      - Family & Relationships
      - Politics & Government

The "/index" API works fine, but I want to upsert into the index so that I can keep my existing embeddings. Is there a solution?

INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:57206 - "POST /batchsearch HTTP/1.1" 200 OK
INFO: 127.0.0.1:57206 - "POST /delete HTTP/1.1" 200 OK
INFO: 127.0.0.1:57206 - "POST /add HTTP/1.1" 200 OK
INFO: 127.0.0.1:57206 - "GET /upsert HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\fastapi\applications.py", line 270, in __call__
await super().__call__(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\applications.py", line 124, in __call__
await self.middleware_stack(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 706, in __call__
await route.handle(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\routers\embeddings.py", line 85, in upsert
application.get().upsert()
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\base.py", line 80, in upsert
super().upsert()
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 400, in upsert
self.embeddings.upsert(self.documents)
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\embeddings\base.py", line 180, in upsert
self.graph.upsert(Search(self, True))
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\graph\base.py", line 415, in upsert
self.infertopics()
File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\graph\base.py", line 541, in infertopics
topic = Counter(self.attribute(x, "topic") for x in ids).most_common(1)[0][0]
TypeError: 'NoneType' object is not iterable

I think this is related to #421.
The issue may be caused by the /delete API: when I ran /add and /upsert on a fresh batch of text content, it worked fine. But when I ran /delete on some of that content and then ran /add and /upsert again, I hit the error above.
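
The sequence described above can be written down as a small script. This is a minimal sketch, not a confirmed reproduction: the endpoint paths and HTTP methods are taken from the server log in this report, while the JSON payload shapes (a list of `{"id", "text"}` documents for /add, a list of ids for /delete) and the helper names `build_steps` and `send` are assumptions made for illustration.

```python
import json
import urllib.error
import urllib.request

# Address taken from the Uvicorn log in this report
BASE_URL = "http://127.0.0.1:8000"


def build_steps(first_batch, deleted_ids, second_batch):
    """Return the reported request sequence as (method, path, payload) tuples."""
    return [
        ("POST", "/add", first_batch),
        ("GET", "/upsert", None),          # reported to work on a fresh batch
        ("POST", "/delete", deleted_ids),  # assumed payload: list of ids
        ("POST", "/add", second_batch),
        ("GET", "/upsert", None),          # reported to fail with a 500
    ]


def send(method, path, payload=None, base_url=BASE_URL):
    """Send one request and return the HTTP status code."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # A 500 from the server is returned as a status, not raised
        return error.code
```

With the API server running, calling `send(*step)` for each tuple returned by `build_steps` should show which request in the sequence returns a 500.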

akset2X changed the title from "Upsert API fails with graph config" to "Upsert API fails with graph config while performing after /delete" on Feb 21, 2023
@davidmezzetti
Member

Ok, thank you for the additional context on this and linking to #421.

@akset2X
Author

akset2X commented Mar 2, 2023

Any update here?

Is there an option to configure topic generation to skip stopwords such as "has", "you", and "yourself", or a tagging option to keep only certain part-of-speech tags such as NN and JJ? If there is no such option for now, it would be great to get one in the future.

@davidmezzetti
Member

Sorry, it's on my list to look at this after the 5.4 release goes out.

davidmezzetti self-assigned this on Mar 10, 2023
davidmezzetti added this to the v5.5.0 milestone on Mar 10, 2023
@davidmezzetti
Member

I'll be checking in a fix for this shortly. There was a bug that occurred when deleting from the index brought the number of topics down to 0.
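
To illustrate the failure mode, the traceback ends in a `Counter(...)` built from `for x in ids`, which raises `TypeError: 'NoneType' object is not iterable` when `ids` is `None`. The sketch below shows one way such a call could be guarded. It is not the fix that was checked in to txtai: `most_common_topic` is a hypothetical helper, and `attribute` stands in for the graph's attribute lookup.

```python
from collections import Counter


def most_common_topic(ids, attribute):
    """Return the most frequent topic among ids, or None if there is nothing to count."""
    # Guard: None or an empty collection would otherwise raise
    # TypeError (iterating None) or IndexError (most_common(1)[0] on no data)
    if not ids:
        return None

    counts = Counter(attribute(x, "topic") for x in ids)
    return counts.most_common(1)[0][0]
```

Returning `None` for an empty input is a design choice of this sketch; a caller would then need to decide whether to skip the update or fall back to a default topic.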

Regarding stopwords, see the stopwords configuration option: https://neuml.github.io/txtai/embeddings/configuration/#topics
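
As a rough illustration of the stopwords option, the embeddings section of the config.yml in this report could be expressed as a Python dict with a `stopwords` list added. This is a sketch under assumptions: the placement of `stopwords` under `graph.topics` follows my reading of the linked configuration page and should be checked against it, and the word list is just the examples from the question above.

```python
# Python dict equivalent of the embeddings section of config.yml in this report
# (functions, expressions and categories omitted for brevity)
config = {
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "content": True,
    "graph": {
        "limit": 15,
        "minscore": 0.1,
        "topics": {
            # Assumed placement: additional stop words to ignore in topic names
            "stopwords": ["has", "you", "yourself"],
        },
    },
}
```

In config.yml the same setting would be a `stopwords` list nested under `topics`, next to `categories`.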

davidmezzetti added the bug label on Mar 10, 2023