I was trying to run 52_Build_RAG_pipelines_with_txtai.ipynb and was getting garbled output from the `Textractor`. (I have also found that the path to article.pdf must be an absolute path.)
Here is a sample of the output:
From this comment, it looks like if Tika is not working, it falls back to beautifulsoup, which is the case here (`textractor.checkjava()` is False). Would you expect the output from beautifulsoup to be useless like this?
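
For anyone trying to reproduce this, here is a minimal sketch of the two checks I ran by hand before calling the pipeline. It uses only the standard library and is not txtai's own logic: `java_available` and `absolute_pdf_path` are hypothetical helper names, and I am assuming that a missing `java` on the PATH is what makes `textractor.checkjava()` return False on my machine.

```python
import os
import shutil


def java_available(executable="java"):
    """Return True if the given executable can be found on PATH.

    Assumption: this only approximates whatever check txtai performs.
    """
    return shutil.which(executable) is not None


def absolute_pdf_path(path):
    """Resolve a possibly relative path against the current working directory."""
    return os.path.abspath(path)


if __name__ == "__main__":
    # If this prints False, Tika probably cannot start a Java process here
    print("java found:", java_available())
    # Pass this resolved path to the Textractor instead of a relative one
    print("pdf path:", absolute_pdf_path("article.pdf"))
```

If `java found` is False, installing a Java runtime and re-running the notebook should show whether the garbled text comes from the fallback parser or from something else.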