Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only check filenames when end with .py in _CalledFromGeneratedFile() #4262

Merged
merged 5 commits into from Feb 9, 2018

Conversation

anandolee
Copy link
Contributor

@anandolee anandolee commented Jan 31, 2018

Cython is not using .py file. Only check filenames when end with .py in _CalledFromGeneratedFile()

Fixes #2896

@arthur-tacca
Copy link

Many thanks for creating this pull request. I have just tested it and unfortunately it doesn't fix the problem; I still get the error "Descriptors should not be created directly...". The problem is this piece of code earlier in that function:

if (frame->f_globals != frame->f_locals) {
  // Not at global module scope
  return false;
}

Unfortunately, when I was debugging this before it hadn't occured to me to print this out in the stack frame. Here is a typical stack with everything being tested for, with Cython (first is the topmost level of the stack):

In _CalledFromGeneratedFile, stacklevel = 0...
   [0] isGlob: 1, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/myprot_pb2.py
   [1] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [3] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [4] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [5] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [6] isGlob: 1, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/foo.py
   [7] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [8] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [9] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [10] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [11] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [12] isGlob: 1, !NULL: 1, fn is str: 1, fn: run.py
Checking filename: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/badprot.py

It seems to me that this check serves two purposes, so here are similar stacks from those two situations but WITHOUT Cython (where the check stands a chance of working):

(1) The check finds where a _pb2.py file has been renamed. If the check is bypassed this usually works but is presumably unsupported and certainly would break imports between generated files. Here is an example stack trace:

In _CalledFromGeneratedFile, stacklevel = 1...
   [0] isGlob: 0, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/protobuf-3.5.1-py3.5-linux-x86_64.egg/google/protobuf/descriptor.py
   [1] isGlob: 1, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/badprot.py
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [3] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [4] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [5] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [6] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [7] isGlob: 1, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/badfoo.py
   [8] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [9] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [10] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [11] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [12] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [13] isGlob: 1, !NULL: 1, fn is str: 1, fn: run.py
Checking filename: /home/arthur/envs/protocytest/lib/python3.5/site-packages/mypkg/badprot.py

(2) The check stops people manually creating Descriptor objects, which seems to be the original intention. This seems like an odd check to me because the constructor takes so many arguments it is basically impossible to do by accident. Anyway, here is an example stack trace:

In _CalledFromGeneratedFile, stacklevel = 1...
   [0] isGlob: 0, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/protobuf-3.5.1-py3.5-linux-x86_64.egg/google/protobuf/descriptor.py
   [1] isGlob: 0, !NULL: 1, fn is str: 1, fn: manual_create_desc.py
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: manual_create_desc.py
   [3] isGlob: 1, !NULL: 1, fn is str: 1, fn: manual_create_desc.py
Not at global scope

So we can NOT change the check to return true if we are not at global scope, because it will miss people manually creating Descriptor objects, which probably usually happens inside functions or methods.

So here are my suggestions:

  • Remove the check of global scope altogether, and rely on the filename check.
  • Change all the other return statements (e.g. filename not a string). in this function to return true . This indicates an exceptional situation, perhaps along the lines Cythoning.
  • EXCEPT you can't do this for the 7 characters check, because it misses e.g pb.py. I think this check needs to be a little more complicated:
    • If the filename length >= 3 characters and ends in .py:
    • AND if (the length <7 OR does not end with _pb2.py)
    • then return false (the only such return in the function)

Would that be reasonable?

Thanks again for your efforts on this and really sorry my suggestions aren't in the form of a pull request.

@anandolee
Copy link
Contributor Author

anandolee commented Feb 7, 2018

arthur-tacca@, Thanks for your investigate. But I am little confusing with what situation makes the Cython check fail.

The first stack example with Cython is checking myprot_pb2.py (stacklevel = 0, global) which should be passing.
The second WITHOUT Cython is checking badprot.py (stacklevel = 1, global) which should fail.
Third one WITHOUT Cython is checking manual_create_desc.py (stacklevel = 1, not global) which should fail.
They are all correct behaviors.

Are you able to give a stack example when Cython check fail? Stack frame for '_CalledFromGeneratedFile, stacklevel = 1' with Cython would help

@arthur-tacca
Copy link

Ugh, I'm really sorry. The text in my comment was correct, but that first stack trace was wrong – I must have pasted in the wrong one. That stack trace is where NOTHING is Cythoned! There is a top-level script foo.py, which imports a mypkg.foo , which imports mypkg.myprot_pb2, all of which are pure Python (you can see foo.py, mypkg/foo.py and mypkg/myprot_pb2.py in the stack trace). This is a correct usage, and passes the check, as you said.

Here are stack traces from that should work but fail. They come from a program run with a single import of a _pb2, but there are actually three calls to _CalledFromGeneratedFile that fail; I forced the result to true so I could see the later calls. It is the same as the situation I just described for the stack trace in my original comment, except that the two modules in mypkg are Cythoned.

In _CalledFromGeneratedFile, stacklevel = 1...
   [0] isGlob: 0, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/protobuf-3.5.1-py3.5-linux-x86_64.egg/google/protobuf/descriptor.py
   [1] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [3] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [4] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [5] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [6] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [7] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [8] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [9] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [10] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [11] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [12] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [13] isGlob: 1, !NULL: 1, fn is str: 1, fn: run.py
Not at global scope
Checking filename: <frozen importlib._bootstrap>

In _CalledFromGeneratedFile, stacklevel = 1...
   [0] isGlob: 0, !NULL: 1, fn is str: 1, fn: /home/arthur/envs/protocytest/lib/python3.5/site-packages/protobuf-3.5.1-py3.5-linux-x86_64.egg/google/protobuf/descriptor.py
   [1] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [3] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [4] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [5] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [6] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [7] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [8] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [9] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [10] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [11] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [12] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [13] isGlob: 1, !NULL: 1, fn is str: 1, fn: run.py
Not at global scope
Checking filename: <frozen importlib._bootstrap>

In _CalledFromGeneratedFile, stacklevel = 0...
   [0] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [1] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [2] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [3] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [4] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [5] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [6] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [7] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap_external>
   [8] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [9] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [10] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [11] isGlob: 0, !NULL: 1, fn is str: 1, fn: <frozen importlib._bootstrap>
   [12] isGlob: 1, !NULL: 1, fn is str: 1, fn: run.py
Not at global scope
Checking filename: <frozen importlib._bootstrap>

@anandolee
Copy link
Contributor Author

Thanks for the updated stack traces. I've changed the PR, can you help to test and review?

@arthur-tacca
Copy link

I notice that instead of removing the global check, you have kept it in, but moved after the check that the filename ends in .py. Good idea. The code looks good to me.

I have tested your most recent commit (7b44b6d) with a Cythoned _pb2.py (same situation as my earlier comments). The check passes and the import works i.e. your changes fix the problem. Thanks!

@anandolee
Copy link
Contributor Author

anandolee commented Feb 8, 2018

Hi haberman@,
Can you help to review if the PR is good to merge? arthur-tacca@ has tested it in Cython.

@@ -123,6 +119,10 @@ bool _CalledFromGeneratedFile(int stacklevel) {
PyErr_Clear();
return false;
}
if ((filename_size < 3) || (strcmp(&filename[filename_size - 3], ".py") != 0)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use C++ string manipulation instead? strcmp() scares me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code is using strcmp. I also searched the code base, strcmp has been used several times. If we want to use string manipulation instead, I'd prefer to in another change.

@amauryfa
Copy link

amauryfa commented Feb 9, 2018

LGTM. This check is only to prevent surprises, so that descriptors appear immutable.
I would not object to remove or disable this function completely, when code is compiled by Cython.

@anandolee anandolee merged commit e34ec60 into master Feb 9, 2018
@xfxyjwf xfxyjwf added the python label Jun 22, 2018
zhoutwo pushed a commit to zhoutwo/protobuf that referenced this pull request Apr 25, 2020
…rotocolbuffers#4262)

* Cython's stack does not have .py file name. Only check filenames when end with .py for _CalledFromGeneratedFile()
@fowles fowles deleted the cython_protobuf branch April 6, 2021 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants