Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for Win32 issue with MAX_PATH #1252

Closed
wants to merge 2 commits into from

Conversation

ZimM-LostPolygon
Copy link

Added -short-types option that allows shortening mangled descriptor type names. Useful on Windows for avoiding the 260 max path length limit.
This is a finished version of an old patch (https://sourceforge.net/p/swig/patches/325/)

@vadz
Copy link
Member

vadz commented May 7, 2018

Thanks!

Arguably this should be always used by default under Windows (perhaps with ~240 as the default limit?) or, at least, we should check for file errors and advise to use this option if they happen due to path being too long, what do you think?

@ZimM-LostPolygon
Copy link
Author

ZimM-LostPolygon commented May 7, 2018

Arguably this should be always used by default under Windows

Probably not, since SWIG should produce the same output given the same input, on any system.

Showing a helpful advice if file errors happens because path is too long sounds like a good idea. That said, I personally don't really want to dig that deep into SWIG code to implement it :)

@wsfulton
Copy link
Member

These are my 3 comments replicated from the SourceForge link:

  • SWIG's output is identical on all platforms given identical input, currently the patch breaks this.
  • The mangling of the name should be off by default and turned on via a command line option. The option would apply to all languages even though it would be most useful in Java and C#. I suggest the following option which will need to be implemented in a new patch, the output from swig -help would be:
    -shorttypes [n] - Shorten mangled descriptor type names to max size [n], n=100 if omitted.
  • The new option needs to be documented under Doc/Manual showing the effect with a scripting language and a non-scripting language, such as the file write failure that it would overcome for Java. SWIG.html is probably the best place and possibly a mention in the Java chapter too.

The 1st seems to be addressed. For the 2nd, the name isn't quite right (-shorttypes vs -short-types). For the 3rd, some proper documentation and explanation is needed.

I'd also like to know more about the algorithm/maths behind the hash calculation. It should also be documented in the code. Importantly, how does it deal with hash conflicts?

A few cosmetic changes are also required to the patch as it breaks the coding conventions.

@ojwb
Copy link
Member

ojwb commented May 16, 2018

Importantly, how does it deal with hash conflicts?

It doesn't...

If we're going to just assume hashes won't collide, we really need to use a crypto-grade hash that has a large key space and is designed to be collision resistant.

@ZimM-LostPolygon
Copy link
Author

For the 2nd, the name isn't quite right (-shorttypes vs -short-types).

Sure, that's easy to change. Although SWIG has a -external-runtime command line option, shouldn't it be -externalruntime then?

As for the hash collisions... Now that I think about it, a better way to guarantee uniqueness is to append a value of an incrementing counter instead of just string hashes.

BTW, a collision is also possible if someone decides to name a class something like SomeVeryLongClassName_09dcf8a.

I'll write the documentation as soon as I'll have some preliminarily approval on the code itself.

@wsfulton
Copy link
Member

-externalruntime is a quirk, the convention is to not have dashes.

I asked if it deals with hash conflicts because if it doesn't then the patch is fundamentally flawed. I don't know what the solution will be, but at minimum collisions should be detected for the code parsed. Quite how a user can deal with a reported collision needs a strategy for a user to workaround. I suspect that with some thought an algorithm could be devised that will avoid collisions with an extremely high probability. Hence the maths behind the algorithm should be documented.

An incrementing counter is not going to work because when the code changes (eg code re-order) the mangled name will change which affects the names of generated types (eg Java/C# intermediary class names) that a user can use. So each time the input code changes, there is a chance that the updated generated code will break existing code using the generated code.

@wsfulton
Copy link
Member

No response. Feel free to re-open with improvements to address all the concerns.

@wsfulton wsfulton closed this Dec 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants