Unable to import files on MacOS #1252

Open
maxvamp12 opened this issue Jan 2, 2024 · 3 comments
@maxvamp12
maxvamp12 commented Jan 2, 2024

Environment settings and versions

Sonoma 14.3
Python 3.10.13
conda 23.7.4
h2ogpt-osx-m1-gpu : Nov 8, 2023 build
2021 M1 Max 16"
64GB Memory
TheBloke/Llama-2-7b-chat-fp16
./h2ogpt-osx-m1-gpu --user_path=/Volumes/Mac\ Development/AI/h2ogpt/h2ogpt-runtime/data/
chromadb v0.4.21
Chrome Version 120.0.6099.129 (Official Build) (arm64) MacOS

Package Version


accelerate 0.25.0
annotated-types 0.6.0
anyio 4.2.0
asgiref 3.7.2
backoff 2.2.1
bcrypt 4.1.2
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
chroma-hnswlib 0.7.3
chromadb 0.4.21
click 8.1.7
coloredlogs 15.0.1
Deprecated 1.2.14
exceptiongroup 1.2.0
faiss-cpu 1.7.4
fastapi 0.108.0
filelock 3.13.1
flatbuffers 23.5.26
fsspec 2023.12.2
google-auth 2.25.2
googleapis-common-protos 1.62.0
grpcio 1.60.0
h11 0.14.0
httptools 0.6.1
huggingface-hub 0.20.1
humanfriendly 10.0
idna 3.6
importlib-metadata 6.11.0
importlib-resources 6.1.1
Jinja2 3.1.2
kubernetes 28.1.0
MarkupSafe 2.1.3
mmh3 4.0.1
monotonic 1.6
mpmath 1.3.0
networkx 3.2.1
numpy 1.26.2
oauthlib 3.2.2
onnxruntime 1.16.3
opentelemetry-api 1.22.0
opentelemetry-exporter-otlp-proto-common 1.22.0
opentelemetry-exporter-otlp-proto-grpc 1.22.0
opentelemetry-instrumentation 0.43b0
opentelemetry-instrumentation-asgi 0.43b0
opentelemetry-instrumentation-fastapi 0.43b0
opentelemetry-proto 1.22.0
opentelemetry-sdk 1.22.0
opentelemetry-semantic-conventions 0.43b0
opentelemetry-util-http 0.43b0
overrides 7.4.0
packaging 23.2
pip 23.3.1
posthog 3.1.0
protobuf 4.25.1
psutil 5.9.7
pulsar-client 3.3.0
pyasn1 0.5.1
pyasn1-modules 0.3.0
pydantic 2.5.3
pydantic_core 2.14.6
PyPika 0.48.9
python-dateutil 2.8.2
python-dotenv 1.0.0
PyYAML 6.0.1
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
safetensors 0.4.1
setuptools 68.2.2
six 1.16.0
sniffio 1.3.0
starlette 0.32.0.post1
sympy 1.12
tenacity 8.2.3
tokenizers 0.15.0
torch 2.1.2
tqdm 4.66.1
typer 0.9.0
typing_extensions 4.9.0
urllib3 1.26.18
uvicorn 0.25.0
uvloop 0.19.0
watchfiles 0.21.0
websocket-client 1.7.0
websockets 12.0
wheel 0.41.2
wrapt 1.16.0
zipp 3.17.0

PYTHON PATH:
PYTHONPATH: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8
Path_1: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8
NLTK_DATA: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/nltk_data
PATH: /Users/xxxx/anaconda3/envs/h20/bin:/Users/xxxxxxxx/anaconda3/condabin:/opt/homebrew/anaconda3/bin:/usr/local/anaconda3/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/Library/Frameworks/Python.framework/Versions/3.10/bin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Applications/VMware Fusion.app/Contents/Public:/usr/local/share/dotnet:~/.dotnet/tools:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/Volumes/XBOX Work Space/Maven/apache-maven-3.8.3/bin:/Users/xxxxx/dotnet:/Users/xxxxxxx/Library/Application Support/JetBrains/Toolbox/scripts:/var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/poppler/bin/:/var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/poppler/lib/:/var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/Tesseract-OCR
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/src
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/iterators
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/gradio_utils
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/metrics
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/models
Path_3: /var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIJReen8/h2ogpt/.

### Description of the issue and things tried

Any time I try to import a document, I get this error:

/Volumes/Mac Development/AI/h2ogpt/h2ogpt-runtime/data/Teaching of the Mystics.pdf Exception: [Errno 2] No such file or directory: '/private/var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIHdUO3A/unstructured/nlp/english-words.txt'

Sometimes I instead get an error that the chromadb.telemetry module could not be loaded. The same happens with Faiss, even though both packages are installed:

No module named 'chromadb.telemetry.posthog'

I changed the user data directory to see whether any DB is ever created, and I checked for the file english-words.txt, which does seem to be missing. The file permissions all allow access. No databases appear in db_dir_UserData; DB folders do appear to be created in the db_nonusers directory, but there are no DB files under those db_nonusers subfolders.
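The manual check for english-words.txt described above can be scripted. This is a minimal sketch, not part of h2oGPT: `missing_files` is a hypothetical helper, and it assumes you pass it the temporary `_MEI...` directory printed in the error message along with the relative path the error complains about.

```python
from pathlib import Path


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as regular files under root."""
    root = Path(root)
    # Keep only the entries whose full path is not an existing file.
    return [rel for rel in relative_paths if not (root / rel).is_file()]
```

For example, `missing_files("/private/var/folders/7n/9837s8hx6gg3h7q2pyct_sth0000gn/T/_MEIHdUO3A", ["unstructured/nlp/english-words.txt"])` returns the path back if the file is absent. The temporary directory name changes between launches (the reports above show both `_MEIJReen8` and `_MEIHdUO3A`), so the check has to run while the app is still open and against the directory named in the current error.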

All of the actions are being performed in the Chrome GUI. I have not tried to use the CLI.

@Mathanraj-Sharma Mathanraj-Sharma added the MacOS (M1/M2) Issues related to MacOS M1/M2 label Jan 2, 2024
@pseudotensor
Collaborator

@Mathanraj-Sharma please help.

@maxvamp12
Author

I was a bit brain dead when writing the bug. YAY Winter colds!!!! Here are the repro steps. It was pretty straightforward.

I did move my user directory to somewhere that would not be impacted by file system permissions, hence the additional CLI arg. I did not see any files created in the temp folders.

I am running everything from the CLI, no Finder clicks, from the non-root volume folder "/Volumes/Mac Development/AI/h2ogpt/h2ogpt-runtime/"

Repro Steps:

  • Launch a conda instance for Python 3.10.
  • Execute the Mac binary with this command line: ./h2ogpt-osx-m1-gpu --user_path=/Volumes/Mac\ Development/AI/h2ogpt/h2ogpt-runtime/data/
  • Load a small Llama 7B model. I tried several; the effect was the same.
  • Use the upload box on the chat window to import a file.
  • Check the documents page for the error. You will see either No module named 'chromadb.telemetry.posthog' or the directory not found error.
  • Verify the english-words.txt file exists (it does not).
  • Verify permissions on all working folders the app can touch (including the list above):
    • changed to 777
    • tried running the app with sudo (no change)

I could not think of much more to try, other than verifying that the needed packages are installed (listed above). I can provide more logs if needed.
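A package showing up in the list above does not by itself mean a given submodule can be imported. Below is a small sketch for checking specific dotted module names with the standard library's `importlib.util.find_spec`; `check_modules` is a hypothetical helper, not something from h2oGPT or chromadb. It only reports on the interpreter it runs under, so running it in the conda environment says nothing certain about what the packaged binary sees (that the binary uses its own bundled modules is an assumption based on the `_MEI...` paths above).

```python
import importlib.util


def check_modules(names):
    """Map each dotted module name to True if an import spec can be found for it."""
    results = {}
    for name in names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec imports parent packages first; a missing or broken
            # parent package ends up here instead of returning None.
            results[name] = False
    return results
```

Calling `check_modules(["chromadb.telemetry.posthog", "faiss"])` from the conda environment would show whether those two names resolve there, which helps separate an environment problem from a packaging problem in the binary.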

@Mathanraj-Sharma
Member

@maxvamp12 could you please try with the latest artifacts https://github.com/h2oai/h2ogpt#macos-cpum1m2-with-full-document-qa-capability
