databricks.proto error when importing MLflow after tensorflow #33
Comments
In theory, this line should have resolved your problem. https://github.com/databricks/mlflow/blob/master/mlflow/__init__.py#L5. I noticed that you were able to run the mlflow tutorial from https://github.com/databricks/mlflow/issues/18. Were you running on a different installation there? Could you let us know what the output of |
@andrewmchen Thank you for your answer. Indeed, I had tried to set this environment variable, but it did not resolve the error. I am working on the same installation. The difference is that I am not using any of the example code provided by MLflow, but instead trying to import MLflow in my own project. Here is the output of
|
I'm running into the same problem. @andrewmchen I believe the issue that fix is referring to is for fixing Not sure if this is any help, just my two cents. |
Actually, if I manually set |
@zegerhoogeboom, I'm surprised you need to manually set that environment variable. The intention is |
Setting the env variable before the import resolved the error for me as well:
import os
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
import mlflow
|
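The workaround above depends on the variable being set before anything imports protobuf. A minimal sketch of a guard for that follows. It assumes, as the import-order behaviour reported in this thread suggests, that protobuf picks its backend when it is first imported. The helper name `request_python_protobuf` and its parameters are hypothetical and not part of MLflow or protobuf.

```python
import os
import sys

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def request_python_protobuf(environ=None, loaded_modules=None):
    """Hypothetical helper: request the pure-Python protobuf backend.

    Returns True if the variable was set before any google.protobuf module
    was imported, and False if protobuf is already loaded, in which case
    setting the variable would most likely come too late (assumption).
    """
    environ = os.environ if environ is None else environ
    loaded_modules = sys.modules if loaded_modules is None else loaded_modules

    already_loaded = any(
        name == "google.protobuf" or name.startswith("google.protobuf.")
        for name in list(loaded_modules)
    )
    if already_loaded:
        return False

    environ[ENV_VAR] = "python"
    return True


if __name__ == "__main__":
    if request_python_protobuf():
        print("Variable set; import mlflow after this point.")
    else:
        print("protobuf was already imported; move this call earlier.")
```

Calling this at the very top of the entry-point script, before the keras or tensorflow imports, is the case the workaround is meant to cover.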
@jeanineharb, @zegerhoogeboom thanks for the feedback. What's the output of You can change the source of the installed mlflow by changing the file located at (exchanging the pyc extension for py).
|
Doing
at https://github.com/databricks/mlflow/blob/master/mlflow/__init__.py#L5 prints "python". |
I'm not sure how helpful this is but a colleague just had the same problem. He didn't have |
Did you import some submodule of mlflow before mlflow itself? Maybe that's the problem if that submodule does not cause the initialization code in mlflow/__init__.py to run. |
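For reference, Python runs a package's `__init__.py` before loading any of its submodules, so importing a submodule first should not skip the initialization code. The sketch below checks this with a throwaway package; the package name `demo_pkg_init_order` and the variable `DEMO_PKG_INIT_RAN` are made up for illustration and have nothing to do with MLflow.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def submodule_import_runs_package_init():
    """Import only a submodule and report what it saw from the package init."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg_init_order"
        pkg.mkdir()
        # The package init sets an environment variable, like mlflow/__init__.py does.
        (pkg / "__init__.py").write_text(
            'import os\nos.environ["DEMO_PKG_INIT_RAN"] = "yes"\n'
        )
        # The submodule records whether the variable was set when it was loaded.
        (pkg / "sub.py").write_text(
            'import os\nSEEN = os.environ.get("DEMO_PKG_INIT_RAN")\n'
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            os.environ.pop("DEMO_PKG_INIT_RAN", None)
            sub = importlib.import_module("demo_pkg_init_order.sub")
            return sub.SEEN
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg_init_order.sub", None)
            sys.modules.pop("demo_pkg_init_order", None)
            os.environ.pop("DEMO_PKG_INIT_RAN", None)


if __name__ == "__main__":
    print(submodule_import_runs_package_init())
```

If the submodule sees the variable, the package init ran first, which is consistent with the report that importing a submodule before mlflow does not reproduce the error.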
That seems plausible, but I can't replicate the issue by importing a submodule before mlflow itself. If I can provide any more information, I'm happy to help. |
@zegerhoogeboom would you be able to share a minimal code example to repro the problem? That'd help us a lot! |
Also, perhaps this diff can fix your problem? https://github.com/databricks/mlflow/pull/74/files Would be great if you can verify if it works. |
@andrewmchen I can't reproduce the issue "on-demand". Manually setting the environment variable definitely worked for me, but somehow it now also works if I remove it. Perhaps @jeanineharb is able to create a reproducible case? If we can't reproduce the issue, then perhaps it can be kept open for a while in case someone else bumps into the problem. If not, close it? |
@andrewmchen I have updated my installation to include the diff you proposed. However, it doesn't seem to fix the error. On the other hand, while trying to create a reproducible case, I changed the import order, and putting |
@jeanineharb Would you mind sharing the reproducible case? We'd like to fix the problem no matter what order you import mlflow in. |
@andrewmchen Finally managed to reproduce it:
from time import time
from keras.callbacks import TensorBoard
from keras.models import Sequential, model_from_json
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.constraints import maxnorm
from keras.applications import ResNet50, InceptionResNetV2, Xception, InceptionV3, DenseNet201
from keras import backend as K
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

import mlflow

class HelloWorld:
    def __init__(self, arg):
        mlflow.start_run()
        mlflow.log_param("class", "HelloWorld")
        mlflow.log_metric("time", time())

def main():
    h = HelloWorld()

if __name__ == '__main__':
    main()
Running this code produces the following error:
Using TensorFlow backend.
Traceback (most recent call last):
  File "helloworld.py", line 11, in <module>
    import mlflow
  File "/usr/local/lib/python3.5/dist-packages/mlflow-0.1.0-py3.5.egg/mlflow/__init__.py", line 4, in <module>
    import mlflow.projects as projects # noqa
  File "/usr/local/lib/python3.5/dist-packages/mlflow-0.1.0-py3.5.egg/mlflow/projects.py", line 18, in <module>
    from mlflow.entities.param import Param
  File "/usr/local/lib/python3.5/dist-packages/mlflow-0.1.0-py3.5.egg/mlflow/entities/param.py", line 2, in <module>
    from mlflow.protos.service_pb2 import Param as ProtoParam
  File "/usr/local/lib/python3.5/dist-packages/mlflow-0.1.0-py3.5.egg/mlflow/protos/service_pb2.py", line 20, in <module>
    import mlflow.protos.databricks_pb2 as databricks__pb2
  File "/usr/local/lib/python3.5/dist-packages/mlflow-0.1.0-py3.5.egg/mlflow/protos/databricks_pb2.py", line 27, in <module>
    dependencies=[google_dot_protobuf_dot_descriptor__pb2.DESCRIPTOR,scalapb_dot_scalapb__pb2.DESCRIPTOR,])
  File "/usr/local/lib/python3.5/dist-packages/google/protobuf/descriptor.py", line 829, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "databricks.proto":
databricks.proto: Import "scalapb/scalapb.proto" has not been loaded. |
Great! I can repro now. Thanks @jeanineharb and @zegerhoogeboom. |
Great! All props go to @jeanineharb! |
My pleasure, thank you for open-sourcing this great project! |
Should be fixed by https://github.com/databricks/mlflow/pull/74. |
The root cause here is that The fix is to make |
Fixed in master and in future |
When I try to import MLflow in my project, I get the following error:
Any idea where the issue may come from? Thank you!