Dockerfile cannot be rebuilt #26

Open
aqqosh opened this issue Jun 2, 2020 · 5 comments

Comments


aqqosh commented Jun 2, 2020

Good afternoon!

I was able to run your bot from the Docker image of the latest release without any problems. However, I would like to add a few instructions of my own to the Dockerfile: run the bot on a GPU, try plugging in the deeppavlov library with its odqa models, and so on.

When I simply try to build the image from the Dockerfile in the repository (without changing anything), I get this traceback:

Traceback (most recent call last):
  File "../ruchatbot/frontend/console_chatbot.py", line 11, in <module>
    from ruchatbot.bot.bot_profile import BotProfile
  File "/chatbot/ruchatbot/__init__.py", line 1, in <module>
    from .qa_machine import create_qa_bot
  File "/chatbot/ruchatbot/qa_machine.py", line 13, in <module>
    from .bot.simple_answering_machine import SimpleAnsweringMachine
  File "/chatbot/ruchatbot/bot/simple_answering_machine.py", line 18, in <module>
    from ruchatbot.bot.answer_builder import AnswerBuilder
  File "/chatbot/ruchatbot/bot/answer_builder.py", line 18, in <module>
    from ruchatbot.bot.xgb_yes_no_model import XGB_YesNoModel
ModuleNotFoundError: No module named 'ruchatbot.bot.xgb_yes_no_model'

Could you fix this so that the image can be built independently? Or suggest how to solve the problem?
Thanks.

Owner

Koziev commented Jun 2, 2020

Hi,

> Could you fix this so that the image can be built independently?

Thanks for the report. I forgot to commit some changes to the sources, which caused the error. I have committed them now. Please try building again.

There is one more nuance, though. Line 34 of the Dockerfile adds the contents of an archive of the ruword2tags repository to the image. So you need to clone that repo locally beforehand and pack it roughly like this:

tar -cvzf ruword2tags.tar.gz --exclude "ruword2tags/.git/*"  -C /home/inkoziev/github/ ruword2tags
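
For anyone who prefers to script this step, the same packing can be sketched in Python with the standard `tarfile` module. This is only an illustration: the function name `pack_repo` is hypothetical, and unlike the pattern in the `tar` command above, this sketch also skips the `.git` directory entry itself.

```python
import tarfile
from pathlib import Path


def pack_repo(repo_dir, archive_path):
    """Pack repo_dir into a gzip-compressed tar archive, leaving out .git."""
    repo_dir = Path(repo_dir)

    def drop_git(member):
        # Returning None tells tarfile to skip this entry (and, for a
        # directory, everything below it).
        return None if ".git" in Path(member.name).parts else member

    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname keeps "ruword2tags/..." as the top-level folder in the archive.
        archive.add(str(repo_dir), arcname=repo_dir.name, filter=drop_git)
    return str(archive_path)
```

With the path from the command above, the call would be `pack_repo("/home/inkoziev/github/ruword2tags", "ruword2tags.tar.gz")`.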

Author

aqqosh commented Jun 2, 2020

Thanks!
Now the error has changed:

Using TensorFlow backend.
2020-06-02 14:25:55 DEBUG    root  - Bot loading...
2020-06-02 14:25:55 INFO     Wordchar2VectorModel  - Loading Wordchar2VectorModel model files
2020-06-02 14:25:55 WARNING  tensorflow  - From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2020-06-02 14:25:55 WARNING  tensorflow  - From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2020-06-02 14:25:55 WARNING  tensorflow  - From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2020-06-02 14:25:55 WARNING  tensorflow  - From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3976: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

Traceback (most recent call last):
  File "../ruchatbot/frontend/console_chatbot.py", line 165, in <module>
    main()
  File "../ruchatbot/frontend/console_chatbot.py", line 91, in main
    text_utils.load_embeddings(w2v_dir=w2v_folder, wc2v_dir=models_folder)
  File "/chatbot/ruchatbot/bot/text_utils.py", line 68, in load_embeddings
    self.word_embeddings.load_models(w2v_dir)
  File "/chatbot/ruchatbot/bot/word_embeddings.py", line 35, in load_models
    self.wordchar2vector_model.load(models_folder)
  File "/chatbot/ruchatbot/bot/wordchar2vector_model.py", line 39, in load
    self.model.load_weights(weights_path)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/network.py", line 1157, in load_weights
    with h5py.File(filepath, mode='r') as f:
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 394, in __init__
    swmr=swmr)
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 170, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '../tmp/wordchar2vector.model', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
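
The traceback above only surfaces the first missing file, deep inside h5py. A small pre-flight check can report all missing model files at once before the bot starts loading. This is a minimal sketch: `missing_model_files` is a hypothetical helper, not part of ruchatbot, and the full list of required files is not given in this thread, so only `wordchar2vector.model` is known.

```python
from pathlib import Path


def missing_model_files(models_dir, required_names):
    """Return the names from required_names that are not files in models_dir."""
    models_dir = Path(models_dir)
    return [name for name in required_names
            if not (models_dir / name).is_file()]
```

For example, `missing_model_files("../tmp", ["wordchar2vector.model"])` returns an empty list once the weights file is in place.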

Where can I get wordchar2vector.model?

Owner

Koziev commented Jun 2, 2020

> Where can I get wordchar2vector.model?

The easiest way is to pull the model files out of the image. A bunch of other models will be needed too; everything is in the tmp folder.
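
One way to pull a folder out of an image is to create a container without starting it and copy from it with `docker cp`. The sketch below wraps those Docker CLI calls in Python. The helper name is hypothetical, the image name has to be supplied by the caller, and `/chatbot/tmp` is an assumption inferred from the paths in the tracebacks above, not something confirmed in this thread.

```python
import subprocess


def extract_dir_from_image(image, src_dir, dest_dir, run=subprocess.run):
    """Copy src_dir out of a Docker image via a temporary, never-started container."""
    # `docker create` prints the ID of the new (stopped) container.
    created = run(["docker", "create", image], check=True,
                  stdout=subprocess.PIPE, universal_newlines=True)
    container_id = created.stdout.strip()
    try:
        run(["docker", "cp", f"{container_id}:{src_dir}", dest_dir], check=True)
    finally:
        # Remove the temporary container even if the copy failed.
        run(["docker", "rm", container_id], check=True)
```

A call might look like `extract_dir_from_image(image_name, "/chatbot/tmp", "./tmp")`, where `image_name` is the tag of the released image. The `run` parameter exists only so the Docker calls can be replaced in a dry run.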

Author

aqqosh commented Jun 3, 2020

Great, the image built and started, and Vika is beginning to talk.
Some replies trigger a cuDNN error.

B:> Предлагаю заняться спортом
H:> Давай

2020-06-03 10:15:25 INFO     SimpleAnsweringMachine  - push_phrase interlocutor="test" phrase="Давай"
2020-06-03 10:15:26.117265: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-03 10:15:26.245631: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-03 10:15:26.611224: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-06-03 10:15:26.615839: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-06-03 10:15:26.619779: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-06-03 10:15:26.623461: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "../ruchatbot/frontend/console_chatbot.py", line 165, in <module>
    main()
  File "../ruchatbot/frontend/console_chatbot.py", line 161, in main
    bot.push_phrase(user_id, question)
  File "/chatbot/ruchatbot/bot/bot_personality.py", line 63, in push_phrase
    force_question_answering=self.force_question_answering)
  File "/chatbot/ruchatbot/bot/simple_answering_machine.py", line 814, in push_phrase
    interpreted_phrase = self.interpret_phrase(bot, session, question, internal_issuer)
  File "/chatbot/ruchatbot/bot/simple_answering_machine.py", line 288, in interpret_phrase
    if self.req_interpretation.require_interpretation(raw_phrase, self.text_utils):
  File "/chatbot/ruchatbot/bot/nn_req_interpretation.py", line 65, in require_interpretation
    y_pred = self.model.predict(x=X_batch, verbose=0)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[{{node shared_conv_2_1/convolution}}]]
         [[output/Softmax/_1151]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[{{node shared_conv_2_1/convolution}}]]
0 successful operations.
0 derived errors ignored.

Owner

Koziev commented Jun 3, 2020

> Some replies trigger a cuDNN error.

It looks like tensorflow-gpu is not compatible with the installed version of the cuDNN library.
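
To check a version-mismatch hypothesis like this, it helps to see which CUDA libraries TensorFlow actually loaded. The sketch below pulls the library names and cuDNN error codes out of log lines such as the ones posted above. `summarize_gpu_log` is a hypothetical helper, and it only extracts what the log says. Comparing the result against the versions a given tensorflow-gpu build expects is left to the reader.

```python
import re

# Patterns match the TensorFlow log messages quoted in this thread.
LIBRARY_RE = re.compile(r"Successfully opened dynamic library (\S+)")
CUDNN_ERROR_RE = re.compile(r"Could not create cudnn handle: (\w+)")


def summarize_gpu_log(lines):
    """Collect loaded CUDA library names and cuDNN error codes from log lines."""
    libraries, errors = set(), set()
    for line in lines:
        match = LIBRARY_RE.search(line)
        if match:
            libraries.add(match.group(1))
        match = CUDNN_ERROR_RE.search(line)
        if match:
            errors.add(match.group(1))
    return {"libraries": sorted(libraries), "cudnn_errors": sorted(errors)}
```

Fed the log from the previous comment, this reports `libcublas.so.10.0` and `libcudnn.so.7` as the loaded libraries and `CUDNN_STATUS_INTERNAL_ERROR` as the only error code.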
