Can not install sciassist #32

Open
qolina opened this issue Oct 17, 2023 · 6 comments

Comments

@qolina
Collaborator

qolina commented Oct 17, 2023

Commands used

conda create --name assist python=3.8
conda activate assist
pip install sciassist

Error message

DEPRECATION: pytorch-lightning 1.7.7 has a non-standard dependency specifier torch>=1.9.*. pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at pypa/pip#12063
Installing collected packages: chardet, six, PyYAML, pyparsing, pyasn1, packaging, oauthlib, multiprocess, jinja2, idna, click, certifi, attrs, async-timeout, sentry-sdk, pyasn1-modules, responses
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
nbconvert 7.8.0 requires traitlets>=5.1, which is not installed.
Successfully installed PyYAML-6.0.1 async-timeout-4.0.3 attrs-23.1.0 certifi-2023.7.22 chardet-3.0.4 click-8.1.7 idna-2.8 jinja2-3.1.2 multiprocess-0.70.12.2 oauthlib-3.2.2 packaging-23.2 pyasn1-0.5.0 pyasn1-modules-0.3.0 pyparsing-3.1.1 responses-0.10.15 sentry-sdk-1.9.0 six-1.16.0

sciassist is not installed!
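
Rather than reading pip's log, one way to confirm whether the install completed is to ask Python for the installed distribution. This is a minimal sketch using only the standard library; the helper name installed_version is made up for illustration and is not part of SciAssist.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "sciassist" is the distribution name used in the pip command above.
    print("sciassist:", installed_version("sciassist"))
```

Running this inside the activated conda environment shows whether the package is visible to the interpreter that will actually be used.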

@dyxohjl666
Collaborator

dyxohjl666 commented Oct 17, 2023

I tested on mce, ecp, and NSCC:
mce: I got the same warnings as in your log, but SciAssist seems to work: you can import SciAssist in a Python console. (The latest PyTorch is incompatible with mce's GPU, so I pinned it to 1.12.0.)
ecp: no problems except the "DEPRECATION" warnings from pytorch-lightning. Importing SciAssist raises some "Import Error: No module named xx" errors. It seems the default Python version is 2.x, and all of the errors come from Linxiao's from transformers import *. I'm not sure whether this is related to the server's settings, but python3 -m pip install SciAssist works well.
NSCC: same as ecp.
Todo:

  • 1. Change Linxiao's code.

  • 2. Pin the torch version to 1.12.0 explicitly in requirements.txt. We may also add a note to the README reminding users to install a version compatible with their machine.

  1. pytorch-lightning 1.7 still works well in our toolkit. I don't recommend updating it now because we are not yet sure of the impact.

  2. I think there are some problems with the servers themselves, as many error files are in "/usr/share", and without a root account it is hard to find the causes.
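
The ecp and NSCC behaviour described above (a default Python 2.x being picked up) can be avoided by always invoking pip through the interpreter that will run SciAssist. The following is only a sketch of that idea; pip_install_command and require_python3 are hypothetical helpers, not SciAssist functions.

```python
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip command bound to one specific interpreter."""
    return [python, "-m", "pip", "install", package]


def require_python3(version_info=sys.version_info):
    """Fail early when the script is started by a Python 2 interpreter."""
    if version_info[0] < 3:
        raise RuntimeError("Python 3 is required; start this script with python3")


if __name__ == "__main__":
    require_python3()
    # Pass the list to subprocess.run(..., check=True) to perform the install.
    print(" ".join(pip_install_command("SciAssist")))
```

Binding pip to sys.executable gives the same effect as the python3 -m pip install SciAssist command that worked on ecp.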

@qolina
Collaborator Author

qolina commented Nov 29, 2023

With SciAssist 0.1.1

The mce server:

~$ lsb_release -a

Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04

~$ nvidia-smi

NVIDIA-SMI 470.199.02 Driver Version: 470.199.02 CUDA Version: 11.4

Installation:

conda create --name assist python=3.8
conda activate assist
pip install sciassist

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
requests-oauthlib 1.3.1 requires oauthlib>=3.0.0, which is not installed.
Successfully installed PyYAML-6.0.1 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 certifi-2023.11.17 chardet-3.0.4 click-8.1.7 exceptiongroup-1.2.0 idna-2.8 iniconfig-2.0.0 jinja2-3.1.2 lightning-utilities-0.10.0 multiprocess-0.70.12.2 numpy-1.24.4 packaging-23.2 pluggy-1.3.0 protobuf-3.20.3 pyparsing-3.1.1 pytest-7.4.3 python-magic-0.4.27 pytorch-lightning-2.0.9.post0 requests-2.22.0 responses-0.18.0 safetensors-0.4.1 sciassist-0.1.1 sentry-sdk-1.9.0 six-1.16.0 tomli-2.0.1 transformers-4.30.2 urllib3-1.25.11

Notes relative to the earlier comment (#32 (comment)): 1) torch does not appear in the install log above, but it is installed together with pytorch-lightning; torch.__version__ is '2.1.0+cu121'. 2) pytorch-lightning is a recent version, 2.0.9. 3) The oauthlib mentioned in the error is installed.

Try inference

from SciAssist import Summarization

# Placeholder input; replace with the text to summarize.
text = "..."
summarizer = Summarization(device="gpu")
res = summarizer.predict(text, type="str")
print(res)

Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback):
tokenizers>=0.13.3 is required for a normal functioning of this module, but found tokenizers==0.12.1.
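
The message above is a minimum-version requirement that the environment does not meet (tokenizers 0.12.1 found, 0.13.3 or later required). A small standard-library sketch of checking this up front follows; as_tuple and meets_minimum are illustrative helpers, and the simple parsing is an assumption that only holds for plain numeric versions like the ones in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def as_tuple(ver):
    """Turn '0.13.3' into (0, 13, 3); stops at the first non-numeric component."""
    parts = []
    for piece in ver.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True when the found version is at least the required minimum."""
    return as_tuple(found) >= as_tuple(minimum)


if __name__ == "__main__":
    try:
        found = version("tokenizers")
    except PackageNotFoundError:
        print("tokenizers is not installed")
    else:
        status = "ok" if meets_minimum(found, "0.13.3") else "too old"
        print("tokenizers", found, status)
```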

Change version

pip install pytorch-lightning==1.7.1

Inference again

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False

The installed torch build does not match this machine's CUDA setup, so it cannot detect CUDA.
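
The RuntimeError is raised because the code asks for a CUDA device while the installed torch build cannot see one. A defensive sketch that checks before choosing a device is shown here. pick_device is a hypothetical helper, and passing device="cpu" to SciAssist is an assumption, since only device="gpu" appears in this thread.

```python
def pick_device(requested="gpu"):
    """Return 'gpu' only when torch is importable and reports a usable CUDA device."""
    if requested != "gpu":
        return "cpu"
    try:
        import torch
    except ImportError:
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("device:", pick_device())
```

Falling back to CPU only hides the mismatch; the actual fix is to install a torch build that matches the machine.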

conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
Do 'pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113'

Inference now works correctly for both reference string parsing and summarization!

In summary, as you mentioned, torch must match the local machine (1.12.0 in our case), and pytorch-lightning 1.7.1 works with SciAssist.

I agree that we should recommend that users install a torch build suited to their machine before installing SciAssist.
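
To make "torch should match the local machine" concrete, the sketch below compares the CUDA tag in a torch version string with the CUDA version reported by nvidia-smi. It uses a deliberately conservative rule (the build's CUDA version must not exceed the driver's), which is an assumption for illustration; NVIDIA's real compatibility rules are more permissive in some cases. Both function names are hypothetical.

```python
import re


def build_cuda_version(torch_version):
    """Extract (major, minor) from a local tag such as '2.1.0+cu121'.

    Assumes a single-digit minor version (cu113 -> 11.3, cu121 -> 12.1).
    Returns None for builds without a '+cuXYZ' tag.
    """
    match = re.search(r"\+cu(\d{2,})$", torch_version)
    if match is None:
        return None
    digits = match.group(1)
    return int(digits[:-1]), int(digits[-1])


def driver_supports(torch_version, driver_cuda):
    """Conservative check: False when the build has no CUDA tag or exceeds the driver."""
    build = build_cuda_version(torch_version)
    if build is None:
        return False
    major, minor = driver_cuda.split(".")[:2]
    return build <= (int(major), int(minor))


if __name__ == "__main__":
    # Values reported for the mce server in this thread.
    print(driver_supports("2.1.0+cu121", "11.4"))
    print(driver_supports("1.12.0+cu113", "11.4"))
```

With the mce values, the default 2.1.0+cu121 build is rejected and the 1.12.0+cu113 build is accepted, which is consistent with what was observed above.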

Todo:
Try different Lightning versions on top of the correct torch.
Result: inference succeeds with Lightning 1.8.0, 1.9.0, 2.0.0, and 2.1.0.

Try on macOS and Windows.

@qolina
Collaborator Author

qolina commented Nov 29, 2023

Installation on MacOS 14.1.1

Install miniconda from https://docs.conda.io/projects/miniconda/en/latest/

Install torch

'pip3 install torch torchvision torchaudio'

Successfully installed MarkupSafe-2.1.3 certifi-2023.11.17 charset-normalizer-3.3.2 filelock-3.13.1 fsspec-2023.10.0 idna-3.6 jinja2-3.1.2 mpmath-1.3.0 networkx-3.1 numpy-1.24.4 pillow-10.1.0 requests-2.31.0 sympy-1.12 torch-2.1.1 torchaudio-2.1.1 torchvision-0.16.1 typing-extensions-4.8.0 urllib3-2.1.0

Install SciAssist

Successfully installed GitPython-3.1.40 PyPDF2-2.10.9 PyYAML-6.0.1 aiohttp-3.9.1 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 cffi-1.16.0 chardet-3.0.4 click-8.1.7 commonmark-0.9.1 cryptography-41.0.7 cycler-0.12.1 datasets-2.2.2 dill-0.3.4 docker-pycreds-0.4.0 exceptiongroup-1.2.0 fonttools-4.45.1 frozenlist-1.4.0 gitdb-4.0.11 huggingface-hub-0.19.4 hydra-core-1.3.2 idna-2.8 importlib-resources-6.1.1 iniconfig-2.0.0 joblib-1.3.2 kiwisolver-1.4.5 lightning-utilities-0.10.0 lxml-4.9.3 matplotlib-3.5.3 multidict-6.0.4 multiprocess-0.70.12.2 nltk-3.8.1 omegaconf-2.2.3 packaging-23.2 pandas-1.4.4 pathtools-0.1.2 pdfminer.six-20221105 pluggy-1.3.0 promise-2.3 protobuf-3.20.3 psutil-5.9.6 pyarrow-14.0.1 pycparser-2.21 pygments-2.17.2 pyparsing-3.1.1 pytest-7.4.3 python-dateutil-2.8.2 python-magic-0.4.27 pytorch-crf-0.7.2 pytorch-lightning-2.0.9.post0 pytz-2023.3.post1 regex-2023.10.3 requests-2.22.0 responses-0.18.0 rich-12.4.4 sacremoses-0.1.1 safetensors-0.4.1 sciassist-0.1.1 scikit-learn-1.3.2 scipy-1.10.1 seaborn-0.11.2 sentry-sdk-1.9.0 seqeval-1.2.2 setproctitle-1.3.3 shortuuid-1.0.11 six-1.16.0 smmap-5.0.1 soupsieve-2.5 threadpoolctl-3.2.0 tokenizers-0.13.3 tomli-2.0.1 torchcrf-1.1.0 torchmetrics-0.11.4 tqdm-4.66.1 transformers-4.30.2 urllib3-1.25.11 wandb-0.12.21 xxhash-3.4.1 yarl-1.9.3 zipp-3.17.0

Inference

Reference string parsing and summarization test passed!

Storage/Memory usage recording for base models

Miniconda cache 1.5G
Model checkpoints cache 2.7G
Memory: 803MB for reference string parsing, 1.3G for summarization

@qolina changed the title from "Can not pip install sciassist" to "Can not install sciassist" on Nov 29, 2023
@JavonTeo
Collaborator

Installation on WSL Ubuntu 22.04.1 LTS

~$ lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04

nvidia-smi

NVIDIA-SMI 535.112 Driver Version: 537.42 CUDA Version: 12.2

Installation (Python 3.10.12):

python3 -m venv SciAssist
source SciAssist/bin/activate
pip install SciAssist

Successfully installed GitPython-3.1.40 MarkupSafe-2.1.3 PyPDF2-2.10.9 PyYAML-6.0.1 aiohttp-3.9.1 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 certifi-2023.11.17 cffi-1.16.0 chardet-3.0.4 charset-normalizer-3.3.2 click-8.1.7 commonmark-0.9.1 cryptography-41.0.7 cycler-0.12.1 datasets-2.2.2 dill-0.3.4 docker-pycreds-0.4.0 exceptiongroup-1.2.0 filelock-3.13.1 fonttools-4.45.1 frozenlist-1.4.0 fsspec-2023.10.0 gitdb-4.0.11 huggingface-hub-0.19.4 hydra-core-1.3.2 idna-2.8 iniconfig-2.0.0 jinja2-3.1.2 joblib-1.3.2 kiwisolver-1.4.5 lightning-utilities-0.10.0 lxml-4.9.3 matplotlib-3.5.3 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.12.2 networkx-3.2.1 nltk-3.8.1 numpy-1.26.2 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 omegaconf-2.2.3 packaging-23.2 pandas-1.4.4 pathtools-0.1.2 pdfminer.six-20221105 pillow-10.1.0 pluggy-1.3.0 promise-2.3 protobuf-3.20.3 psutil-5.9.6 pyarrow-14.0.1 pycparser-2.21 pygments-2.17.2 pyparsing-3.1.1 pytest-7.4.3 python-dateutil-2.8.2 python-magic-0.4.27 pytorch-crf-0.7.2 pytorch-lightning-2.0.9.post0 pytz-2023.3.post1 regex-2023.10.3 requests-2.22.0 responses-0.18.0 rich-12.4.4 sacremoses-0.1.1 safetensors-0.4.1 sciassist-0.1.1 scikit-learn-1.3.2 scipy-1.11.4 seaborn-0.11.2 sentry-sdk-1.9.0 seqeval-1.2.2 setproctitle-1.3.3 setuptools-69.0.2 shortuuid-1.0.11 six-1.16.0 smmap-5.0.1 soupsieve-2.5 sympy-1.12 threadpoolctl-3.2.0 tokenizers-0.13.3 tomli-2.0.1 torch-2.1.1 torchcrf-1.1.0 torchmetrics-0.11.4 tqdm-4.66.1 transformers-4.30.2 triton-2.1.0 typing-extensions-4.8.0 urllib3-1.25.11 wandb-0.12.21 xxhash-3.4.1 yarl-1.9.3

setup_grobid

BUILD SUCCESSFUL in 54s
30 actionable tasks: 25 executed, 5 up-to-date
Grobid is installed.

run_grobid

environments/SciAssist/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
warnings.warn(
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
Grobid is running now.

Try Summarization and Reference String Parsing test on pdf

from SciAssist import Summarization
pipeline = Summarization()
res = pipeline.predict('examples_H01-1042.pdf', type="pdf", num_beams=4, num_return_sequences=2)
print(res["summary"])
from SciAssist import ReferenceStringParsing
ref_parser = ReferenceStringParsing()
res = ref_parser.predict("examples_H01-1042.pdf", type="pdf")
print(res)

environments/SciAssist/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
warnings.warn(
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
Loading the model...
...

summarization and rsp test passed!

Testing summary

Even though I did not install torch or pytorch-lightning before installing SciAssist, it still ran properly, so I believe users can run pip install SciAssist straight away. However, during testing I saw a FutureWarning from transformers and a separate message telling me to pip install xformers.

I tried pip install xformers:

Successfully installed torch-2.1.0 xformers-0.0.22.post7

The test still succeeds, but the FutureWarning is still shown (the xformers message no longer appears):

environments/SciAssist2/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
warnings.warn(
Loading the model...
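
If the FutureWarning is only noise, it can be filtered with Python's standard warnings module before SciAssist is imported. This is a sketch rather than a recommended fix; the message pattern is an assumption based on the text shown above, and it tolerates an optional quote character before GenerationMixin.

```python
import warnings


def silence_generation_mixin_warning():
    """Ignore the FutureWarning about importing GenerationMixin from generation_utils."""
    warnings.filterwarnings(
        "ignore",
        message=r"Importing .?GenerationMixin",
        category=FutureWarning,
    )


if __name__ == "__main__":
    # Call this before `from SciAssist import ...` so the filter is already active.
    silence_generation_mixin_warning()
    warnings.warn("Importing GenerationMixin from generation_utils is deprecated", FutureWarning)
    print("the matching warning was filtered")
```

Filtering on the message keeps other FutureWarnings visible, which is safer than ignoring the whole category.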

@qolina
Collaborator Author

qolina commented Nov 29, 2023

Thanks for testing a different Ubuntu version and CUDA version, and for testing grobid, which I forgot.

Description: Ubuntu 22.04.1 LTS
NVIDIA-SMI 535.112 Driver Version: 537.42 CUDA Version: 12.2

Installation (Python 3.10.12):

setup_grobid

I notice your machine has a recent CUDA version (12.2), which is compatible with the default PyTorch installed by 'pip install sciassist'. Errors occur when the machine has an older CUDA version and the installed PyTorch is not compatible with it. And yes, the pytorch-lightning version is not the cause of the errors.
I also get these warnings; we have been ignoring them so far.


@JavonTeo
Collaborator

Installation on Windows

nvidia-smi

NVIDIA-SMI 536.99 Driver Version: 536.99 CUDA Version: 12.2

Installation (Python 3.11.5)

Tried three variants (each block below is a separate attempt):

python -m venv .env
.env\Scripts\activate
pip install SciAssist

python -m venv .env
.env\Scripts\activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

python -m venv .env
.env\Scripts\activate
pip3 install torch torchvision torchaudio

Got the same error each time:

AppData\Local\Temp\pip-build-env-bcle4ruo\overlay\Lib\site-packages\setuptools\dist.py:674: SetuptoolsDeprecationWarning: The namespace_packages parameter is deprecated.
!!

          ********************************************************************************
          Please replace its usage with implicit namespaces (PEP 420).

          See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages for details.
          ********************************************************************************

  !!
    ep.load()(self, ep.name, value)

  Edit mplsetup.cfg to change the build options; suppress output with --quiet.

  BUILDING MATPLOTLIB
        python: yes [3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023,
                    13:26:23) [MSC v.1916 64 bit (AMD64)]]
      platform: yes [win32]
         tests: no  [skipping due to configuration]
        macosx: no  [Mac OS-X only]

running build_ext
Extracting /project/freetype/freetype2/2.6.1/freetype-2.6.1.tar.gz
Building freetype in build\freetype-2.6.1
msbuild build\freetype-2.6.1\builds\windows\vc2010\freetype.sln /t:Clean;Build /p:Configuration=Release;Platform=x64
error: command 'msbuild' failed: None
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for matplotlib
Failed to build matplotlib
ERROR: Could not build wheels for matplotlib, which is required to install pyproject.toml-based projects

I upgraded pip to 23.3.1 and setuptools to 69.0.2, but I still get the same error.
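
A possible explanation, not confirmed in this thread: the successful installs above used Python 3.8 and 3.10 and pulled in matplotlib 3.5.3, while this attempt uses Python 3.11.5, where pip appears to fall back to building matplotlib from source (the msbuild step that fails). Under that assumption, a guard like the following could warn users early. TESTED_PYTHONS is taken only from the environments reported in this issue, and is_tested_python is a hypothetical helper.

```python
import sys

# Python versions on which installation succeeded in this thread.
TESTED_PYTHONS = {(3, 8), (3, 10)}


def is_tested_python(version_info=sys.version_info):
    """Return True when the interpreter's major.minor version was tested in this thread."""
    return (version_info[0], version_info[1]) in TESTED_PYTHONS


if __name__ == "__main__":
    if not is_tested_python():
        print("Untested Python %d.%d; installation may fail" % sys.version_info[:2])
```

Creating the environment with one of these versions, for example conda create --name assist python=3.8 as earlier in this thread, would be the corresponding workaround to try.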
