[Python] A file with this name is already in the pool. #3002

Closed
rossengeorgiev opened this issue Apr 22, 2017 · 12 comments

@rossengeorgiev

This problem arises from the linux binary wheels for protobuf==3.2.0. Using the pure python implementation works fine.

In my case I have a proto file whose name collides with that of a proto file from another package. Importing both Python packages is not possible, as I get this error.

Steps to replicate

Directory tree:

./module1
./module1/messages.proto
./module1/messages_pb2.py
./module1/__init__.py
./module2
./module2/messages.proto
./module2/messages_pb2.py
./module2/__init__.py

./module1/messages.proto

syntax = "proto2";
package module1;

./module2/messages.proto

syntax = "proto2";
package module2;

Trying to import:

python -c "import module1.messages_pb2, module2.messages_pb2"
------
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "module2/messages_pb2.py", line 22, in <module>
    serialized_pb=_b('\n\x0emessages.proto\x12\x07module2')
  File "/home/vagrant/env/local/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 824, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "messages.proto":
  messages.proto: A file with this name is already in the pool.
@feifangit

I ran into this issue too. I had to give the .proto files unique file names.

@ndwhelan

Ran into this issue. If anyone finds this, I resolved it by running pip uninstall protobuf and then pip install --no-binary protobuf protobuf. That made sure that only the source was installed and, as mentioned above, for some reason things work fine in the pure Python version.

@trianta2

I found that doing

export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION='python'

before running my Python code resolved the issue. I got the flag from the bottom of https://developers.google.com/protocol-buffers/docs/reference/python-generated
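
The same switch can be flipped from inside Python, as long as it happens before google.protobuf is imported for the first time. A minimal sketch, assuming Python 3; the helper name active_backend is made up for illustration, and it reports None when protobuf is not installed:

```python
import os

# Must be set before the first import of google.protobuf:
# the backend is selected at import time.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


def active_backend():
    """Return the backend protobuf selected, or None if protobuf is missing.

    active_backend is a hypothetical helper, not part of protobuf.
    """
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print(active_backend())
```

As noted further down the thread, this only hides the name collision; it does not remove it.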

@xfxyjwf
Contributor

xfxyjwf commented Aug 28, 2017

The error (file already in pool) indicates an actual problem you need to fix in the .proto files. A simple solution is to put one of the .proto files in a subdirectory.

Right now you can hide the problem by switching to the pure python implementation (which doesn't check for conflicts), but that won't really solve the issue. You are going to run into the same problem if you use a different language, or if we fix the pure python implementation to check for conflicts as well.

@rossengeorgiev
Author

@xfxyjwf I think you are confused. The original issue cannot be fixed by doing anything to the protos, except renaming the duplicate message.

The pure python implementation uses the directory for namespacing, and that works. Just as described in the documentation https://developers.google.com/protocol-buffers/docs/proto#packages

The problem is that the C++ implementation doesn't do that, nor does it respect the package directive. This issue is not noted in the documentation at all. In fact, the documentation advises on using package in protos (see my example at the top), but that does not work in C++ implementation. So, clearly a bug.

@xfxyjwf
Contributor

xfxyjwf commented Aug 29, 2017

@rossengeorgiev Can you regenerate the *_pb2.py files by invoking:

$ protoc --python_out=. module1/messages.proto module2/messages.proto

? That should solve the issue you are having. Note that the python import path must match the file path you passed to protoc.
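
For a build script, the invocation above can be wrapped so that protoc always runs from the common root and every file is passed by its path relative to that root. This is only a sketch: build_protoc_command and compile_protos are hypothetical helper names, and it assumes protoc is on PATH when compile_protos is called.

```python
import os
import subprocess
from pathlib import Path


def build_protoc_command(root, proto_files):
    """Build the protoc command line, with each file relative to `root`."""
    rel_paths = []
    for proto in proto_files:
        rel = Path(os.path.relpath(proto, root)).as_posix()
        if rel == ".." or rel.startswith("../"):
            raise ValueError(f"{proto} is not inside {root}")
        rel_paths.append(rel)
    # Same shape as the command in the comment above.
    return ["protoc", "--python_out=.", *sorted(rel_paths)]


def compile_protos(root, proto_files):
    """Run protoc from `root` so descriptor names keep their directory prefix."""
    subprocess.run(build_protoc_command(root, proto_files), cwd=root, check=True)


print(build_protoc_command(
    "proj", ["proj/module1/messages.proto", "proj/module2/messages.proto"]))
```

Running from the root rather than from inside module1/ and module2/ is the point: the path given to protoc is what ends up as the file name in the generated descriptor.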

@rossengeorgiev
Author

rossengeorgiev commented Sep 23, 2017

I managed to figure out what caused the original issue. The clue was in the error. The descriptor name is derived from the relative path.

serialized_pb=_b('\n\x0emessages.proto\x12\x07module2')

That means I compiled the code by going into each directory and running protoc --python_out=. messages.proto. If I run it as you suggest, or even as protoc --python_out=. module1/messages.proto; protoc --python_out=. module2/messages.proto; it will work regardless. The descriptor name will always be different as it is derived from the relative path.

serialized_pb=_b('\n\x16module1/messages.proto\x12\x07module1')
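
To check which name a generated *_pb2.py file will register, the leading bytes of serialized_pb can be decoded by hand. In a serialized FileDescriptorProto, field 1 is the file name and field 2 is the package, both length-delimited. The following is a rough sketch, assuming Python 3 bytes; read_varint and descriptor_name_and_package are illustrative names, and the decoder deliberately stops at the first field it does not understand:

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def descriptor_name_and_package(serialized_pb):
    """Return (name, package) from the start of a serialized FileDescriptorProto.

    Only length-delimited fields 1 (name) and 2 (package) are read;
    parsing stops at the first other field. package is None if absent.
    """
    fields = {}
    pos = 0
    while pos < len(serialized_pb):
        tag, pos = read_varint(serialized_pb, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 2 or field_number not in (1, 2):
            break
        length, pos = read_varint(serialized_pb, pos)
        fields[field_number] = serialized_pb[pos:pos + length].decode("utf-8")
        pos += length
    return fields.get(1), fields.get(2)


# The two byte strings quoted in this comment:
print(descriptor_name_and_package(b"\n\x0emessages.proto\x12\x07module2"))
# -> ('messages.proto', 'module2')
print(descriptor_name_and_package(b"\n\x16module1/messages.proto\x12\x07module1"))
# -> ('module1/messages.proto', 'module1')
```

The first result has no directory prefix, which is why two such files collide in the pool even though their packages differ.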

In addition if we have duplicate messages like so:

./module1/messages.proto

syntax = "proto2";
message mymessage { }

./module2/messages.proto

syntax = "proto2";
message mymessage { }  

If we invoke protoc like so, we get an error and nothing gets compiled.

protoc --python_out=. module2/messages.proto module1/messages.proto
module1/messages.proto:3:9: "mymessage" is already defined in file "module2/messages.proto".

Alternatively, we could do:

 protoc --python_out=. module2/messages.proto
 protoc --python_out=. module1/messages.proto

This gives no error and compiles fine. However, importing both modules fails with the C++ protobuf package, while the pure Python one raises no error.

$ python -c "import module1.messages_pb2, module2.messages_pb2"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "module2/messages_pb2.py", line 22, in <module>
    serialized_pb=_b('\n\x16module2/messages.proto\"\x0b\n\tmymessage')
  File "/home/vagrant/env/local/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 824, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "module2/messages.proto":
  mymessage: "mymessage" is already defined in file "module1/messages.proto".

At this point we have two options. Use pure python protobuf package, or set package directive in our proto files. I think this covers everything. Thanks @xfxyjwf
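
The two failure modes in this thread can be pictured with a toy model. This is not protobuf's implementation; ToyPool and add_file are made-up names, and the model only assumes what the error messages above suggest: the pool rejects a second file with the same name, and rejects a message whose fully qualified name (package plus message name) is already registered.

```python
class ToyPool:
    """Toy stand-in for a descriptor pool (illustration only)."""

    def __init__(self):
        self.files = set()
        self.symbols = {}  # fully qualified message name -> defining file

    def add_file(self, name, package, messages):
        if name in self.files:
            raise TypeError(f"{name}: A file with this name is already in the pool.")
        full_names = [f"{package}.{m}" if package else m for m in messages]
        for full in full_names:
            if full in self.symbols:
                raise TypeError(
                    f'"{full}" is already defined in file "{self.symbols[full]}".')
        self.files.add(name)
        for full in full_names:
            self.symbols[full] = name


def try_add(pool, name, package, messages):
    """Return None on success, or the error text on a collision."""
    try:
        pool.add_file(name, package, messages)
    except TypeError as exc:
        return str(exc)
    return None


# Distinct file names, no package: the message names collide.
pool = ToyPool()
try_add(pool, "module1/messages.proto", "", ["mymessage"])
print(try_add(pool, "module2/messages.proto", "", ["mymessage"]))

# Distinct file names and distinct packages: both files load.
pool = ToyPool()
try_add(pool, "module1/messages.proto", "module1", ["mymessage"])
print(try_add(pool, "module2/messages.proto", "module2", ["mymessage"]))

# Same file name: the package does not help.
pool = ToyPool()
try_add(pool, "messages.proto", "module1", ["mymessage"])
print(try_add(pool, "messages.proto", "module2", ["mymessage"]))
```

In this model a package directive separates the message names but does nothing for two files that were both compiled as plain messages.proto.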

@kr-ish

kr-ish commented May 31, 2019

@rossengeorgiev could you clarify what you mean by "set package directive in our proto files"? I'm having this same issue and I'm trying to solve it without using the pure python protobuf package

@kr-ish

kr-ish commented May 31, 2019

I was working with TensorFlow protos that I was trying to compile and use outside of TensorFlow. There were a couple of issues, but the main one seems to have been that I had not uninstalled TensorFlow from my Python environment entirely: one of my services was using a TF service, which was causing the conflict. After I refactored that to not use TF and got rid of TF for good measure, I stopped seeing this issue.

felipesanches added a commit to googlefonts/lang that referenced this issue Feb 18, 2022
to workaround this kind of problem when using the module
on projects that also import `fonts_public_pb2.py`:
protocolbuffers/protobuf#3002
@brary

brary commented Jan 6, 2023

Hi, I hit this issue after adding a dependency and tried using --no-binary=protobuf. After adding it, however, my existing pipeline started failing. I am getting this exception on the same protobuf version I was using earlier (3.20.3):

Traceback (most recent call last):
 File "/tmp/tmpmgkm82oc/installed_wheels/48d30b6cca1d176ff8918486633b04f54a436a15435696d2baba61d4595d690c/protobuf-3.20.3-py2.py3-none-any.whl/google/protobuf/internal/python_message.py", line 1128, in MergeFromString
  if self._InternalParse(serialized, 0, length) != length:
 File "/tmp/tmpmgkm82oc/installed_wheels/48d30b6cca1d176ff8918486633b04f54a436a15435696d2baba61d4595d690c/protobuf-3.20.3-py2.py3-none-any.whl/google/protobuf/internal/python_message.py", line 1164, in InternalParse
  (tag_bytes, new_pos) = local_ReadTag(buffer, pos)
 File "/tmp/tmpmgkm82oc/installed_wheels/48d30b6cca1d176ff8918486633b04f54a436a15435696d2baba61d4595d690c/protobuf-3.20.3-py2.py3-none-any.whl/google/protobuf/internal/decoder.py", line 174, in ReadTag
  while buffer[pos] & 0x80:
TypeError: unsupported operand type(s) for &: 'bytes' and 'int' 

@dkbarn

dkbarn commented Jan 4, 2024

Years later and it's still unclear from this conversation what the proper solution is. Avoiding use of the binary wheels by forcing pip to install the pure python implementation is a workaround. What is the proper fix?

@rossengeorgiev said:

At this point we have two options. Use pure python protobuf package, or set package directive in our proto files. I think this covers everything.

Setting the package directive does not fix the issue. I currently have this setup:

./foo/messages.proto

syntax = "proto2";
package foo;

./bar/messages.proto

syntax = "proto2";
package bar;

And I still get this error:

TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "messages.proto":
  messages.proto: A file with this name is already in the pool.

@tlifschitz

Agreed @dkbarn, I am still having the same issue; this is definitely not solved. The pure python workaround is also obviously much slower at runtime. @jtattermusch @BSBandme
