This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

BERT Data Preprocessing #172

Open
JizeZhangCS opened this issue Aug 27, 2022 · 3 comments

@JizeZhangCS

🐛 Describe the bug

NVIDIA removed LDDL from the DeepLearningExamples tools on Aug 16, 2022. As a result, the guide at https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/bert/preprocessing no longer works, for the following reasons:

  1. `pip install git+https://github.com/NVIDIA/DeepLearningExamples.git#subdirectory=Tools/lddl` no longer works. Possible fixes are to use the new URL, i.e. `pip install git+https://github.com/NVIDIA/lddl.git`, or to take lddl from the historical version at https://github.com/NVIDIA/DeepLearningExamples/tree/29f5b7ab059025e4ead512e54037eddbdf740f19.
  2. After installing lddl, running `pip install boto3` produces a dependency version conflict; its effect on the rest of the process is unknown.
  3. In the preprocessing part, neither phase 1 nor phase 2 works. Details are given in a later comment.
  4. Switching the lddl source from the new URL to the historical version does not solve problem 3, and skipping the boto3 installation does not help either.

Environment

python=3.8
pytorch=1.12.1
cudatoolkit=10.2.89
cuda=10.2

@JizeZhangCS
Author

More details about problem 2.

Installing lddl first, then boto3:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
awscli 1.22.55 requires botocore==1.24.0, but you have botocore 1.27.61 which is incompatible.
awscli 1.22.55 requires s3transfer<0.6.0,>=0.5.0, but you have s3transfer 0.6.0 which is incompatible.

After installing boto3, reinstalling lddl:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
boto3 1.24.61 requires botocore<1.28.0,>=1.27.61, but you have botocore 1.24.0 which is incompatible.
boto3 1.24.61 requires s3transfer<0.7.0,>=0.6.0, but you have s3transfer 0.5.2 which is incompatible.
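
For anyone trying to see why neither install order works, here is a minimal sketch that encodes the version bounds quoted in the two pip errors above and checks the four versions pip reported against them. The constraint table is copied from those messages; the helper names (`parse`, `satisfies`, `unmet`) are made up for illustration, and the comparison is a simplified stand-in for pip's real version logic that only handles plain `X.Y.Z` versions.

```python
import operator

# Bounds copied from the pip error messages in this comment.
CONSTRAINTS = {
    "botocore": {
        "awscli 1.22.55": [("==", "1.24.0")],
        "boto3 1.24.61": [(">=", "1.27.61"), ("<", "1.28.0")],
    },
    "s3transfer": {
        "awscli 1.22.55": [(">=", "0.5.0"), ("<", "0.6.0")],
        "boto3 1.24.61": [(">=", "0.6.0"), ("<", "0.7.0")],
    },
}

OPS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}


def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, specs):
    """True if the version meets every (operator, bound) pair in specs."""
    v = parse(version)
    return all(OPS[op](v, parse(bound)) for op, bound in specs)


def unmet(package, version):
    """List the requirers whose bounds on `package` reject `version`."""
    return [
        requirer
        for requirer, specs in CONSTRAINTS[package].items()
        if not satisfies(version, specs)
    ]


if __name__ == "__main__":
    # The four versions pip reported as installed in the two scenarios.
    observed = [
        ("botocore", "1.27.61"),
        ("s3transfer", "0.6.0"),
        ("botocore", "1.24.0"),
        ("s3transfer", "0.5.2"),
    ]
    for package, version in observed:
        print(package, version, "rejected by", unmet(package, version))
```

Each of the four observed versions is rejected by exactly one of the two requirers, and the bounds for awscli 1.22.55 and boto3 1.24.61 do not overlap for either package. With these exact releases, no single botocore or s3transfer version can satisfy both.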

@JizeZhangCS
Author

JizeZhangCS commented Aug 27, 2022

More details about problem 3:
output.txt

@FrankLeeeee
Contributor

You can try running `pip install` with the `--no-dependencies` flag to ignore the dependency conflict.
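
Since that flag makes pip skip installing dependencies, it may help to check afterwards which versions actually ended up in the environment. The following is a minimal sketch using the standard-library `importlib.metadata` (available since Python 3.8). The helper name `installed_versions` is hypothetical, and the distribution names, `lddl` in particular, are assumptions based on the package names mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names are assumed from this thread; adjust as needed.
    packages = ["lddl", "boto3", "botocore", "s3transfer", "awscli"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing the printed botocore and s3transfer versions with the bounds in the pip errors earlier in the thread shows which of awscli or boto3 is left with an unsatisfied requirement.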
