-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XML parser refactoring in preparation for latest nif.xml #62
Conversation
Trying to get it not crash. Not functional, but at least it parses the whole xml.
expressions seem to decode now with tokens from the nif xml
bitfields are parsed as now bit structs, but attribs are not set.
Turns them into a list of dicts which are then applied in the expression class
Tokens are now applied to matching attributes from a structs attrs dict
Now functional enough to read a full nif with some modifications to nif.xml
Also raise error again if check fails.
backward compatible
- represent #T# by class `T`, which is just a dummy (same for #ARG#, but that's probably not needed) - refactor token replacement to make sure fixed tokens are always applied
Much more readable and concise.
Works with old nif.xml
The other day @neomonkeus asked me for my expression parser that I use for my nifxml linting and xml/docs/codegen but it's seeming like that isn't necessary anymore since you appear to be both capable and willing to deal with PyFFI. I was so put off by PyFFI I instead opted to create a flat, statically typed Python lib with my codegen that can read everything NifSkope can (more, weirdly, due to NifSkope bugs) and isn't a complete nightmare to write for and debug since it's not relying 99.9% on dynamic language features. Using Python features like dataclasses and Enum/Flag (or IntEnum/IntFlag for less strictness) also leaves the generated code extremely clean and navigable (though I can optionally generate non-dataclass versions). The enums let me catch dozens of missing things from the XML, like 15 undocumented Havok materials and maybe a dozen other enumeration values spread across random games, since I could configure the Enum class's Creating the entire lib and incrementally fixing codegen to read all the files nifxml supports only took a couple days too i.e. far less than the time to figure out how to force PyFFI to adjust to the new nifxml... which to me says something about how dense and esoteric PyFFI is. I've browsed through your commits and unfortunately PyFFI remains... difficult for me so I can't yet comment on the correctness of anything you've implemented from nifxml. There are actually some nifxml changes which were meant to go into my PR that neo merged without warning, one which I still haven't settled on. Specifically Speaking of non-NiTriShape meshes, since you're willing to deal with PyFFI you should really try getting NIFs beyond 20.3 fully supported. Starting in 20.5 they deprecated NiTriShape etc. and switched to NiMesh. Though I can't tell what is needed in PyFFI itself vs the Blender plugin. PyFFI seems to split the RTTI args for The second upcoming XML change is a "stop condition" for object/field serialization. This is required in FO76 for shader properties which effectively stop serialization if the For this particular stop condition I haven't settled on the exact expression because there are issues with "truthiness" of strings in languages other than Python. Most simply it can be
which results in my codegen as: def read(self, stream: BinaryStream, args: Optional[List[int]]=None):
super().read(stream)
if stream.bs_version >= 155 and self.name:
return
# Rest of function More rigorously it should be:
Where At the moment I'm not sure about the state of string literal support in any of the expression parsers, incl. my own and NifSkope's. I would probably have to add grammars for it in mine at least. I would prefer the "truthy string" method but then C++ code generation would require some special casing as As for attributes which haven't been implemented in PyFFI yet, I'll try to go over my nifxml commits and pick out the ones that are actually important. Attributes such as I've run out of wind for now but my next comment will be about PyFFI and field name/type "collisions" which are technically not collisions and if that is finally fully addressed yet. But it will take me a while to come up with the cases and how I deal with them statically. |
I merged onto the PR into develop to get a baseline for us to work with. If there if any additional work locally feel free to PR towards develop. I added you as a reviewer both here and on the nif.xml PR from develop to master in case there was indeed any additional work that you thought should be included, such as the above. Maybe we should talk it over there, but good to have an insight into what is left out. Having a full working implementation of your parse would still be a great reference. |
Hi Jon! One strange pyffi bug needs further investigation before I can continue updating to your new xml specs completely: niobjects or compounds with fields of the same <niobject name="NiAVObject" abstract="true" inherit="NiObjectNET" module="NiMain">
Abstract audio-visual base class from which all of Gamebryo's scene graph objects inherit.
<field name="Flags" type="uint" default="0x8000E" vercond="#BSVER# #GT# 26">snip</field>
<field name="Flags" type="ushort" ver1="3.0" vercond="#BSVER# #LTE# 26">Basic flags for AV objects.</field> I haven't been able to tell what is going wrong there. Parsing of tokens and expressions works fine. When either one or the other field is commented out, Nifs of the remaining type parse fine. It really appears to be down to the doubled field name, but the whole pyffi system is so obfuscated that I haven't managed to pinpoint the actual source of the issue. I would also appreciate your parsing library for reference. It seems like it could be a very helpful resource for upgrading pyffi. |
From what I see, there are a few problems with mutiple same-named attributes. First when you read/write a struct, you go through a filtered list. This filtered list doesn't return same named attributes. Next in _MetaStructBase when assigning all the attributes it just overrides the old one with the new one. I believe there was one other problem but it isn't coming to mind right now. |
This is almost verbatim how I've referred to the problem case previously so maybe you've already caught some of the previous discussion in tickets and commits, but in case not I'll try to cover all the previous ground. It's the issue I referred to at the end of my last comment that I ran out of wind to cover. :) I decided to explicitly allow these cases because often fields are inevitably the same but change position (or expand/narrow in type) between versions. I first implemented the When the types are not integral, I implemented a So, in a statically typed codegen situation, dealing with this is actually extremely trivial. You form the member type out of a union of the duplicate types, and for the serialization types you refer to the original field type. The member type union may simplify to one type if all types are convertible, in which case you pick the largest type. Statically, in Python this looks like: @dataclass
class NiAVObject(NiObjectNET):
flags: uint = 524302
# other fields
def read(self, stream: BinaryStream, args: Optional[List[int]]=None):
super().read(stream)
if stream.bs_version > 26:
self.flags = stream.read_uint()
if stream.version >= 0x3000000 and stream.bs_version <= 26:
self.flags = stream.read_ushort()
# rest of serialization Here are currently all the instances in the XML for integral unions:
For some of these names there are more than two fields. And for non-integral unions there is: BSVertexData ...And also Vertex Data on BSTriShape, but in mypy, Also note that for The rest of the explicit SO... If I were to guess what the issue is in PyFFI, I would say that you're missing the distinction between member type and serialization type in some way. Either the first or the last field that gets generated dynamically overwrites what the serialization type should be, and you end up reading the wrong size. Of course you're also not using static typing, so I'm not sure "member type" is exactly a thing in your case. You may be only storing one read type, etc. In my static lib example this would look like: @dataclass
class NiAVObject(NiObjectNET):
flags: uint = 524302
# other fields
def read(self, stream: BinaryStream, args: Optional[List[int]]=None):
super().read(stream)
if stream.bs_version > 26:
self.flags = stream.read_uint()
if stream.version >= 0x3000000 and stream.bs_version <= 26:
self.flags = stream.read_uint() # INCORRECT, SHOULD BE read_ushort
# rest of serialization There is definitely no going back from this type unioning, because pushing minor type/version/condition differences all the way to the fully qualified name that you use at runtime (and in API and other libs..) is rather severe and unmaintainable. However, whether we class this as a PyFFI bug or possibly needing more specification in nifxml I guess is now up for discussion. At the moment though I can't clearly define any scenarios where more explicit attributes would actually help PyFFI parse these semantics aside from mere convenience. Even in my parsing I have to go over the members once, collect the types by unique name, and then run functions on the types to see which type the member ends up being. During serialize, I pretend the fields were never unioned into the same member. I suppose this data could be baked into the XML but I don't see it helping PyFFI's issue. (P.S. I wrote all this before Tagnum replied so I haven't checked out his comment yet) |
Thanks!
At least this one I had suspected too and changed, but then rolled it back because something somewhere was breaking tests (not necessarily this change).
This seems to be the core problem. Not sure how we can fix that. Maybe we should go over the filtered list instead of |
Another issue here in StructBase, it also avoids dupes and if it didn't it would override previous attributes: # initialize attributes
for attr in self._attribute_list:
...
# assign attribute value
setattr(self, "_%s_value_" % attr.name, attr_instance) https://github.com/niftools/pyffi/blob/develop/pyffi/object_models/xml/struct_.py#L294 The problem is, we need the |
In fact I did not find that discussion haha, I just read your issues for the formalization and comments on your nifxml commits.
I completely agree.
Very elegant, makes me wonder why pyffi has historically evolved as a dynamic system rather than a simple static codegen...
That seems to be pretty much what's happening. Init overwrites attributes and read / write don't have access to the ones they need.
It is most definitely a pyffi bug.
I don't think we need any additional data in the nifxml. Pyffi should be able to deal with these duped fields with a slight recode. |
I have a few ideas, whether or not they will work is another thing. In relation to the blender_nif_plugin. I wouldn't mind getting rid of PyFFI altogether because of all the Now for pyffi, non-duplicate struct attributes just we just written but any duplicate structs could be processed at read/write. |
I wouldn't be opposed to using a static codegen instead of pyffi for the blender plugin, but I'm sure that would entail a lot of recoding work to make up for functions that had been delegated to pyffi. In the long run the whole nif project would surely benefit from using just one nif XML parsing solution. |
BTW at least as of right now I have done this for the strict float comparison (as hex bytes): Added to the tokens for <condexpr token="#FLT_MAX#" string="0x7F7FFFFF" /> ...and in the actual
Python and floats are... weird, and not really my area of expertise, but what I found works for my static lib prototype is replacing the float.fromhex('0x1.fffffe0000000p+127') Which this feels... weird as I said before. I am unsure if this value is platform specific or even binary specific. I am using 32-bit 3.7.4 in this example. This is what FLT_MAX read from the NIF files reported as when turned into hex, at least on said Python version. I have the sneaking suspicion the value could be different on 64-bit Python so it doesn't seem like a very portable method. In any case, At least as a token now you can more cleanly deal with it. |
@jonwd7 I think the |
Endianness doesn't really factor in here as the endianness varies only on the left side of the expression I went ahead and looked into it and the Python docs were actually helpful: And as I assumed the reason the hexadecimal value does not match true flt_max (7f7fffff) is that Python float is actually a double. |
- arg is now an expression - add HFloat, NiFixedString - inspect header and store bs_header for global access - extend Expressions
Alright, I think I solved the union / serialization issue, at least for reading. Taking baby steps here 🤣 |
pyffi/object_models/common.py
Outdated
:param stream: The stream to read from. | ||
:type stream: file | ||
""" | ||
self._value = struct.unpack(data._byte_order + 'e', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Struct half-float e
was added in 3.6. I don't know about support for previous python versions.
Python half-float is IEEE 754 which might not be the right type for nifs.
We spoke about it but sadly I can't find it (might've been in the dev channel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ummm, didn't mean to submit review...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw a PR for half float but it was implemented from the ground up, like with mantissa and stuff. I kinda want to avoid that if possible because it makes the whole thing even less maintainable. I guess we have to cut support for ancient versions at some point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mainly thinking that we target 3.7+ going forward.
For compatibility with older nif.xml
pyffi/object_models/xml/struct_.py
Outdated
else getattr(self, attr.arg) | ||
rt_type = attr.type_ if attr.type_ != type(None) else template | ||
rt_template = attr.template if attr.template != type(None) else template | ||
rt_arg = attr.arg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With arg as an expression (for vertex desc) I'm not sure the rt_arg logic is still needed. But I'm also fairly sure the new code is not correct. Maybe evaluate expression instead?
@niftools/pyffi-reviewer
Overview
Fixes Known Issues
n/a
Documentation
n/a
Testing
n/a
Manual
Blender plugin seems to work just fine with old nif.xml. New nif.xml has kinks to work out, but the basics are there.
Automated
n/a
Additional Information
n/a