Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default namespace not being used #233

Closed
rstuart opened this issue May 29, 2013 · 11 comments
Closed

Default namespace not being used #233

rstuart opened this issue May 29, 2013 · 11 comments

Comments

@rstuart
Copy link
Contributor

rstuart commented May 29, 2013

Consider this python program:

#!/usr/bin/python -W default
from spyne.application import Application
from spyne.decorator import rpc
from spyne.protocol.soap import Soap11
from spyne.model.primitive import Unicode
from spyne.model.complex import ComplexModel
from spyne.server.wsgi import WsgiApplication
from spyne.service import ServiceBase

class T(ComplexModel):
    i = Unicode

class S(ServiceBase):
    @rpc(T, _returns=T)
    def m(a): return a

def main():
    import wsgiref.simple_server
    server = wsgiref.simple_server.make_server('127.0.0.1', 8888, application)
    return server.serve_forever()

spyne_application = Application(
            [S], 'my_namespace',
            in_protocol=Soap11(validator='lxml'),
            out_protocol=Soap11())
application = WsgiApplication(spyne_application)

if __name__=='__main__':
    import sys; sys.exit(main())

Running it produces:

Traceback (most recent call last):
  File "so.py", line 31, in <module>
      out_protocol=Soap11())
  File "/usr/lib/python2.7/dist-packages/spyne/application.py", line 104, in __init__
      self.in_protocol.set_app(self)
  File "/usr/lib/python2.7/dist-packages/spyne/protocol/xml/_base.py", line 344, in set_app
      xml_schema.build_validation_schema()
  File "/usr/lib/python2.7/dist-packages/spyne/interface/xml_schema/_base.py", line 189, in build_validation_schema
      self.validation_schema = etree.XMLSchema(etree.parse(f))
  File "lxml.etree.pyx", line 2953, in lxml.etree.parse (src/lxml/lxml.etree.c:56204)
  File "parser.pxi", line 1555, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:82511)
  File "parser.pxi", line 1585, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:82832)
  File "parser.pxi", line 1468, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:81688)
  File "parser.pxi", line 1024, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:78735)
  File "parser.pxi", line 569, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74472)
  File "parser.pxi", line 650, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75363)
  File "parser.pxi", line 590, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74696)
  lxml.etree.XMLSyntaxError: xmlns:s0: Empty XML namespace is not allowed, line 1, column 73

Which is because in the generated tns.xsd xmlns:s0="" appears at line 1, column 269 (column 73 in the above error message is wrong):

<xs:schema xmlns:wsa="http://schemas.xmlsoap.org/ws/2003/03/addressing" xmlns:tns="my_namespace" xmlns:plink="http://schemas.xmlsoap.org/ws/2003/05/partner-link/" xmlns:xop="http://www.w3.org/2004/08/xop/include" xmlns:senc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:s0="" xmlns:s12env="http://www.w3.org/2003/05/soap-envelope/" xmlns:s12enc="http://www.w3.org/2003/05/soap-encoding/" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:senv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" targetNamespace="my_namespace" elementFormDefault="qualified">
  <xs:import namespace="" schemaLocation="s0.xsd"/>
  <xs:complexType name="mResponse">
    <xs:sequence>
      <xs:element name="mResult" type="s0:T" minOccurs="0" nillable="true"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="m"/>
  <xs:element name="m" type="tns:m"/>
  <xs:element name="mResponse" type="tns:mResponse"/>
</xs:schema>

The temporary directory spyne created (in my case /tmp/spynevMB5tX) wasn't cleaned up. That happened to be helpful for writing up the issue, but leaving junk lying around like that won't make me friends with my sysadmin. I guess it was deliberate, otherwise you just would have used lxml.etree.fromstring().

I also don't know enough about xml, spyne or SOAP to say whether producing an invalid WSDL like that is a reasonable thing to do, but it sure is annoying. As a suggestion, when no namespace is selected it would be real nice if you just used the default namespace passed to spyne.application.Application(). Or to put it another way, the above program behaved like this one, which does result in every complex type using the default namespace passed to Application():

#!/usr/bin/python -W default
from spyne.application import Application
from spyne.const.xml_ns import DEFAULT_NS
from spyne.decorator import rpc
from spyne.protocol.soap import Soap11
from spyne.model.primitive import Unicode
from spyne.model.complex import ComplexModel
from spyne.server.wsgi import WsgiApplication
from spyne.service import ServiceBase

class T(ComplexModel):
    __namespace__ = DEFAULT_NS
    i = Unicode

class S(ServiceBase):
    __namespace__ = DEFAULT_NS
    @rpc(T, _returns=T)
    def m(a): return a

def main():
    import wsgiref.simple_server
    server = wsgiref.simple_server.make_server('127.0.0.1', 8888, application)
    return server.serve_forever()

spyne_application = Application(
            [S], 'my_namespace',
            in_protocol=Soap11(validator='lxml'),
            out_protocol=Soap11())
application = WsgiApplication(spyne_application)

Figuring out this would do what I want took hours of reading the reference doco, and when that didn't help pouring through the source code. Unfortunately it didn't occur to me to look too closely at the examples. If I had, I may have noticed mention of the undocumented __namespace__ that appears in __namespace__ = 'spyne.examples.user_manager'. I think assigning something to __namespace__ is currently required to get a working ComplexModel type, so mentioning that you have to do it in the tutorials and defining what __namespace___ does in the doco would be helpful.

Apart from this namespace spyne has been a pleasure to use.

@plq
Copy link
Member

plq commented May 29, 2013

Hello,

First, thank you for your detailed report and constructive approach.

Let's look at this one by one:

  1. The temporary schema directory is only left when the schema generation fails exactly like your case. The reason is to help the developer look at the generated schema to see what's wrong with it. Such errors are not supposed to creep into production. I'd say that if your daemon restart fails, your problems are way bigger than a pissed-off sysadmin :)
  2. Yes, every type in Spyne has to have a [type] name and must belong to a namespace. It's mentioned in the first chapter of the manual[1](look at the note under "Application" bullet point) and not strictly an XML-related (but xml-inspired) requirement. The implicit assignment is always done to imported objects but not objects in the __main__ package because '__main__' is not really a namespace. That's why you need to assign explicit namespaces to all objects inside the '__main__' package. Assigning default namespace to all objects with no explicit namespace declaration could cause collisions. "Explicit is better than implicit", so I chose not to do it. That's not set in stone though -- My view is that Spyne belongs to all of us and I'm doing some kind of a "community service" by maintaining Spyne. So I don't think it's my place to ignore community's voice. But you need to persuade me :)
  3. In all honesty, I thought __namespace__ was documented. It's all over in the examples and the types chapter of the manual. But then again, it's the users' views that matter here, and if you think you can improve the documentation, send your pull requests my way.
  4. Instead of using spyne.const.xml_ns.DEFAULT_NS (which is an internal-use-only marker) you should've set e.g. DEFAULT_NS = 'http://example.com/kickass/ns' and used that as namespace value throughout your project. Again -- explicit is better than implicit.

All this said, here's how I'll close this issue report:

  1. Make sure no empty namespace goes to lxml.
  2. Document the behaviour of implicit namespace assignment. (the story with __module__ and when cls.__module__.startsiwth("_"))

Anything to add?

Best regards,

plq added a commit to plq/spyne that referenced this issue May 29, 2013
@plq
Copy link
Member

plq commented May 29, 2013

FYI, This is how your test case fails now:

Traceback (most recent call last):
  File "tc.py", line 26, in <module>
    out_protocol=Soap11())
  File "/home/plq/src/github/plq/spyne/spyne/application.py", line 100, in __init__
    self.interface = Interface(self)
  File "/home/plq/src/github/plq/spyne/spyne/interface/_base.py", line 61, in __init__
    self.populate_interface()
  File "/home/plq/src/github/plq/spyne/spyne/interface/_base.py", line 210, in populate_interface
    self.get_tns())
  File "/home/plq/src/github/plq/spyne/spyne/model/complex.py", line 642, in resolve_namespace
    v.resolve_namespace(v, default_ns, tags)
  File "/home/plq/src/github/plq/spyne/spyne/model/complex.py", line 621, in resolve_namespace
    ModelBase.resolve_namespace(cls, default_ns, tags)
  File "/home/plq/src/github/plq/spyne/spyne/model/_base.py", line 269, in resolve_namespace
    raise ValueError("You need to explicitly set %r.__namespace__" % cls)
ValueError: You need to explicitly set <class '__main__.T'>.__namespace__

@rstuart
Copy link
Contributor Author

rstuart commented May 30, 2013

(note to self: do not reply to github issues with email.)

On Wed, 2013-05-29 at 08:54 -0700, Burak Arslan wrote:

Hello,

First, thank you for your detailed report and constructive approach.

Let's look at this one by one:

 1. The temporary schema directory is only left when the schema
    generation fails exactly like your case. The reason is to help
    the developer look at the generated schema to see what's wrong
    with it. Such errors are not supposed to creep into
    production. I'd say that if your daemon restart fails, your
    problems are way bigger than a pissed-off sysadmin :)

Fair enough.

 1. Yes, every type in Spyne has to have a [type] name and must
    belong to a namespace. It's mentioned in the first chapter of
    the manual1 and not strictly an XML-related (but xml-inspired)
    requirement. The implicit assignment is always done to
    imported objects but not objects in the __main__ package
    because '__main__' is not really a namespace. That's why you
    need to assign explicit namespaces to all objects inside the
    '__main__' package. Assigning default namespace to all objects
    with no explicit namespace declaration could cause collisions.
    "Explicit is better than implicit", so I chose not to do it.
    That's not set in stone though -- My view is that Spyne
    belongs to all of us and I'm doing some kind of a "community
    service" by maintaining Spyne. So I don't think it's my place
    to ignore community's voice. But you need to persuade me :)

I am not familiar enough with SOAP or XML name spaces to understand this completely. So lets say we have:

my_package/
  x1.py:
    class Foo(ComplexModel): pass

  x2.py:
    class Foo(ComplexModel): pass

I presume the issue is that you need to generate unique XML names for these two Foo's, so you generate these targetNamespace's:

 my_package.x1, giving my_package.x1.Foo,
 my_package.x2, giving my_package.x2.Foo

and this thus two Foo's end up with XML different names.

Having now read a bit about XML namespaces, my_package.x1 isn't a very good XML namespace name. Namespaces are generally URL's. URL's have the nice property of being guaranteed to be unique. These generated generated ones have much less chance of being unique.

Still, the generated namespace names do serve a purpose. I take it they are required so you have to have something, and they are unique within the application which is all that matters. Or at least I assume those
two statements are true.

So lets look at two use cases. The first one is I do care about the namespace used. In this case explicitly setting __namespace__ for each class is the most general solution. It is a bit repetitive if you define several types in the one Python module that live in the same namespace. I thought you had solved that by using the appending the module name to the tns passed to Application. In retrospect I don't have a clue why I thought that. What happens with the Applications namespace is pretty clearly documented in the first chapter, as you
point out.

Still, being able to define a unique prefix for every namespace sounds like a nice short cut. If I could pass something like tns_prefix="http://my.domain.com/path/program" to Application(), and then have Foo() class above generate a namespace like "http://my.domain.com/path/program.my_package.x1" I could use your
namespace generation code, have a guaranteed unique namespace name, and avoid all the repetition.

The second use case is the Python programmer doesn't care about namespaces. I suspect this is the normal case. In that case you solution of just using the unadorned module name is fine, as you just need something unique with the application. In Python, the main program does have a name, "main". It is unique. Why not use it? The programmer by definition doesn't care what name you use, just as long at it works. It will work, won't it?

 1. In all honesty, I thought __namespace__ was documented. It's
    all over in the examples and the types chapter of the manual.
    But then again, it's the users' views that matter here, and if
    you think you can improve the documentation, send your pull
    requests my way.

You can't wriggle out of it that easily. I write doco too. Here is an example: http://www.stuart.id.au/russell/files/lrparsing/doc/html

So I am fully aware of how much effort it takes. __namespace__ only appears in examples - even in the chapter on types. Nowhere do you define what it does. That's would be fine for the tutorial if the programmer didn't have to know they exist in order to write working code. Right now he can't. You could fix that by using "main" for the main programs namespace but regardless, it's not fine for the reference documentation if you expect people to be able to use the feature without reading the source.

On Wed, 2013-05-29 at 09:05 -0700, Burak Arslan wrote:

FYI, This is how your test case fails now:

Traceback (most recent call last):
  File "zort.py", line 26, in <module>
    out_protocol=Soap11())
  File "/home/plq/src/github/plq/spyne/spyne/application.py", line 100, in __init__
    self.interface = Interface(self)
  File "/home/plq/src/github/plq/spyne/spyne/interface/_base.py", line 61, in __init__
    self.populate_interface()
  File "/home/plq/src/github/plq/spyne/spyne/interface/_base.py", line 210, in populate_interface
    self.get_tns())
  File "/home/plq/src/github/plq/spyne/spyne/model/complex.py", line 642, in resolve_namespace
    v.resolve_namespace(v, default_ns, tags)
  File "/home/plq/src/github/plq/spyne/spyne/model/complex.py", line 621, in resolve_namespace
    ModelBase.resolve_namespace(cls, default_ns, tags)
  File "/home/plq/src/github/plq/spyne/spyne/model/_base.py", line 269, in resolve_namespace
    raise ValueError("You need to explicitly set %r.__namespace__" % cls)
ValueError: You need to explicitly set <class '__main__.T'>.__namespace__

A definite improvement. As a programmer, on seeing that error my first step would be to go searching for __namespace__ in the reference doco to see what I should set it to. And when that failed, I would have ended up pretty much doing what I did anyway - dredging through the source code.

@plq
Copy link
Member

plq commented May 30, 2013

(github's email parsing used to be way better!)

On 05/30/13 04:22, rstuart wrote:

Still, the generated namespace names do serve a purpose. I take it they
are required so you have to have something, and they are unique within
the application which is all that matters. Or at least I assume those
two statements are true.

Correct. As you say, If you care about the namespaces you expose, you should set them as proper uris. If not, the generated ones are not that good, but the best we can do.

So lets look at two use cases. The first one is I do care about the
namespace used. In this case explicitly setting namespace for each
class is the most general solution. It is a bit repetitive if you
define several types in the one Python module that live in the same
namespace.

You can set a default namespace, see the last example before Array here: http://spyne.io/docs/2.10/manual/03_types.html#complex

Still, being able to define a unique prefix for every namespace sounds
like a nice short cut. If I could pass something like
tns_prefix="http://my.domain.com/path/program" to Application(), and
then have Foo() class above generate a namespace like
"http://my.domain.com/path/program.my_package.x1" I could use your
namespace generation code, have a guaranteed unique namespace name, and
avoid all the repetition.

Yes I could, but that's not really what is done in the XML world. People either don't care about namespaces or want to have very strict control over them. Also;

  1. The module in which that class is defined is pretty much irrelevant to the api user.
  2. People would confure tns_prefix with namespace prefixes.
  3. Any change to how I generate namespaces is going to break everybody's api. I mean, like, every single person who uses an api exposed by Spyne. To me, it's one of the most delicate piece of logic in this package.

So, no, that's not going to happen. That behaviour is pretty much set in stone now.

The second use case is the Python programmer doesn't care about
namespaces. I suspect this is the normal case. In that case you
solution of just using the unadorned module name is fine, as you just
need something unique with the application. In Python, the main program
does have a name, "main". It is unique. Why not use it? The
programmer by definition doesn't care what name you use, just as long at
it works. It will work, won't it?

Well... yes, it'll work. My arguments here are nothing more than "it's ugly" and "it's not a real namespace".

Ok, I agree that the average python programmer shouldn't have to learn all this. Initally at least. Especially if (s)he's not evaluating Spyne for a non-xml protocol.

I'll just replace '__main__' with the targetNamespace value. That'll make both of us happy, and won't break existing code as the current behaviour is to fail in an ugly fashion in that case anyway. Right?

  1. In all honesty, I thought __namespace__ was documented. It's
    all over in the examples and the types chapter of the manual.
    But then again, it's the users' views that matter here, and if
    you think you can improve the documentation, send your pull
    requests my way.

You can't wriggle out of it that easily.

But I wasn't trying to! Here's what happened though: I thought having a docstring for __namespace__ would be enough for it to appear in the reference documentation. I just checked and it's not there!

Here's the docstring: https://github.com/arskom/spyne/blob/master/spyne/model/_base.py#L101

I'm just saying that I need an outsider's perspective on making this more clear. So assuming I put __namespace__ back in the reference, how do you think I can clarify this more? Or, now that I have resolved the issue with '__main__', do I need to?

And hey, thanks a lot for your time.
Best regards,

@plq
Copy link
Member

plq commented May 30, 2013

@rstuart
Copy link
Contributor Author

rstuart commented May 30, 2013

You can set a default namespace, see the last example before Array here: http://spyne.io/docs/2.10/manual/03_types.html#complex

Hmm. Looks like I have an admission to make. When I started I was using 2.9, as that was what I could find packaged for Debian. So I read it's manual. I have since packaged 2.10.7 for Debian, but haven't re-read the 2.10.7 manual. I thought it would be much the same, but no, you are obviously working hard on it. That chapter on types didn't exist in 2.9.

Actually, I noticed you mentioned ordering in that type chapter. I wondered about that. Again I don't know much about SOAP, but SOAP calling its structures a "sequence" suggests the declared order is important. Which is an issue, because I can tell you from experience the order a dict is traversed is unpredictable. And I know in the metaclass you will be grabbing those field declarations from the passed dict, in the order some iterator returns them to you. This ordering can change between versions of Python. It has done so. Code I wrote persisted a dict to disk and compared the string it wrote to the current version to check if something had changed. After an upgrade it all broke even though the dict itself was identical. I also know that in C# and Java, they grab the WSDL and compile it, so if ordering is important and it changes bad things will happen.

In theory this is easily solved, you just ensure every field constructor involves a call of some kind, you never allow:

class MyType(Complex): field = Unicode

and always insist on:

class MyType(Complex): field = Unicode()

The call can then increment a global counter and store it in the type definition, and then you can use that counter to figure out the order the fields were declared in the class body.

Of course "easy" is a relative term, and it means breaking the API, particularly for ComplexModel where the constructor currently instantiates an object, not a type.

Ok, I agree that the average python programmer shouldn't have to learn all this. Initally at least. Especially if (s)he's not evaluating Spyne for a non-xml protocol.

I'll just replace 'main' with the targetNamespace value. That'll make both of us happy, and won't break existing code as the current behaviour is to fail in an ugly fashion in that case anyway. Right?

That would solve my main whinge completely.

But I wasn't trying to! Here's what happened though: I thought having a docstring for namespace would be enough for it to appear in the reference documentation. I just checked and it's not there!

Here's the docstring: https://github.com/arskom/spyne/blob/master/spyne/model/_base.py#L101

Ahh, OK.

I'm just saying that I need an outsider's perspective on making this more clear. So assuming I put namespace back in the reference, how do you think I can clarify this more? Or, now that I have resolved the issue with 'main', do I need to?

If I have a minor complaint about the wording I put in a pull request. It was a little hard to put in a pull request when the scaffolding wasn't there.

@plq
Copy link
Member

plq commented May 31, 2013

Hmm. Looks like I have an admission to make. When I started I was using 2.9, as that was what I could find packaged for Debian. So I read it's manual. I have since packaged 2.10.7 for Debian, but haven't re-read the 2.10.7 manual. I thought it would be much the same, but no, you are obviously working hard on it. That chapter on types didn't exist in 2.9.

apology accepted ;) don't hesitate to tell me if you figure anthing else is missing.

Actually, I noticed you mentioned ordering in that type chapter. I wondered about that. Again I don't know much about SOAP, but SOAP calling its structures a "sequence" suggests the declared order is important.

Wow, you're good :)

In the Python 2.x world, changing Python versions is not that frequent nowadays.

But the dict order gets shuffled when one adds a new field to an object as well. Contrary to what it seems, this is not guaranteed to be a backwards-compatible change (even when that field's an optional one) as schema validation (that's not a SOAP thing, but an Xml Schema thing) validates order of the elements as well. (Soft validation does not mind) So if your client(s) have your old WSDL cached, some requests will start failing for no apparent reason upon next deployment.

Whether you care or not depends mostly on the type of clients you serve though, so I don't want to make the lives of people who don't care about argument order harder for no apparent benefit to them.

In theory this is easily solved, you just ensure every field constructor involves a call of some kind, you never allow:

class MyType(Complex): field = Unicode

and always insist on:

class MyType(Complex): field = Unicode()

The call can then increment a global counter and store it in the type definition, and then you can use that counter to figure out the order the fields were declared in the class body.

This is a respectable idea which I'm not so sure initially about how it'd pan out, but would you mind not diverting much from the subject at hand? There is going to be a Python-3-only Spyne 3 (I can't wait to lay my hands on function annotations :)) and I intend to have an open discussion about what to change.

@plq
Copy link
Member

plq commented May 31, 2013

So ok, conclusion:

  1. Document the behaviour of implicit namespace assignment. (the story with __module__ and when cls.__module__.startsiwth("_"))
  2. Replace '__main__' as namespace value with whatever's passed as tns to the Application initializer.

Anything else to add?

@rstuart
Copy link
Contributor Author

rstuart commented May 31, 2013

Anything else to add?

Nope. So many words from me for such a little change. Sorry - I'll leave you in peace now.

plq added a commit to plq/spyne that referenced this issue Jun 9, 2013
@plq
Copy link
Member

plq commented Jun 9, 2013

So, now default ns is always used whenever an explicit ns is missing.

Re-open if you have anything else to say. I'll happily discuss it :)

@plq plq closed this as completed Jun 9, 2013
@plq
Copy link
Member

plq commented Jun 9, 2013

Especially regarding the docs!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants