New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use malloc instead of new to allocate in external C API function GetCurrentModDir #9

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@Brett208
Copy link
Member

Brett208 commented Dec 14, 2018

This fixes a legitimate bug in op2ext's external API. Perhaps we should role it into Outpost 2 before the next release of the game?

Also, not sure if we want to use static_cast there or c style cast. I think it is okay to call the static_cast here since the cast doesn't affect what is passed outside of the function and across DLL boundaries.

Closes #8

@Brett208 Brett208 requested a review from DanRStevens Dec 14, 2018

@DanRStevens
Copy link
Member

DanRStevens left a comment

Change looks good.

You're probably right about rolling this out quickly. Without this fix, we could get module crashes when modules try to clean up memory.

Edit: I think you're right about using static_cast here. It's fine. The implementation of op2ext uses C++, it's only the interface that is restricted to C. As the cast is contained within the method, and not part of the interface, it's valid to use all C++ features. Additionally, as static_cast is safer to use, and more searchable for validation, it should be preferred over the C-style cast.

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 16, 2018

I was thinking about this, and there may be another problem. The op2ext module is compiled separately from whatever client code that uses the returned string. As such we've crossed a module boundary between the point of malloc and the point of free. As the code in each module won't necessarily link to the same methods and global data structures, this can also result in undefined behaviour. In short, don't mix heap operations between modules.

The proper way to do this, is likely to provide a Free method, which the calling code could pass the string to after it is done with it. The above is the reason why most libraries with dynamically allocated resources also provide a corresponding Free* method.

This is also why a lot a APIs allow you to pass in a buffer to write into, which greatly simplifies memory management. However, then the called code is unable to resize the buffer, and so if the string is too big, it has to fail and somehow signal the error.

@Brett208

This comment has been minimized.

Copy link
Member Author

Brett208 commented Dec 18, 2018

op2ext's public API is a fair mess overall considering how small it is.

When we went with GetGameDir_s, we used the buffer + size concept. It might be nice to stick with that here for consistency, but that would require deprecating the current function.

Maybe creating a free function would keep from deprecating, which would be nice.

Dan, do you have a preference? I'm not knowledgeable enough to care either way as long as it works.

Will probably abandon this branch and create a new branch when we chose a way forward.

Thanks,
Brett

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 19, 2018

Memory allocations concerns are always a pain for API design with C/C++.

Two major approaches:

  1. Pass a buffer and size pair.
  • Calling code is free to use whatever memory allocation scheme they want
  • Side steps many memory allocation concerns, and allows for higher efficiency
  • Natural to use stack allocated local variables (or global/static variables)
  • Has to deal with buffer-is-too-small-but-otherwise-might-work errors (caller is unable to re-allocate)
  • Can result in complicated try again code paths
  • Or, can be worked around if there is a reasonable upper bound on the resource size
  • In the name of simplicity, (no try-again code paths), an artificial upper bound is often imposed
  1. Dynamically allocate resources, and pair with a Free method
  • Memory allocation and ownership concerns become part of the API design
  • Possibility of memory allocation failures
  • Requires a paired Free method to work around static/dynamic linking and separate heap issues
  • No complicated try again code paths
  • Caller must remember to call the corresponding Free (which the compiler can not reasonably verify)
  • No natural hard limit on the size of a resource

For Windows, paths have a soft upper bound of MAX_PATH (260) characters. It's not a hard limit though, and deeply nesting directory structures can cause path lengths to exceed that amount. Some APIs have a limit of MAX_PATH, whereas others have workarounds to handle longer paths.

My default preference is for the first method, though I have a strong bias for efficiency which is not particularly relevant here. What's annoying about the first is there is typically an artificially imposed upper bound on resource sizes, used to keep code simple.

Here, I'm tempted to use the Windows MAX_PATH, and settle on an artificially low upper bound.


I suppose there is another alternative, which is sometimes used. Have the buffer owned and managed by the library, but allow reading to it by returning a (possibly read-only) pointer into the buffer. The calling code is free to read/use/copy the contents of the buffer. It doesn't not own it though, and can not free it. The buffer may have it's contents cleared/rewritten/invalidated by future calls to library methods. This is mostly appropriate when calls to library methods occur in some set pattern where the library can be sure it has an opportunity to clean up it's own internal buffer at some point and there is no need to have multiple copies.

For instance, each call to GetModulePath might re-allocate the buffer, fill it with data, and leave it that way until a future call to GetModulePath, or the library is unloaded, and the library has hooked that event and free's any dynamically allocated buffer then.

One downside is the limit of a single buffer can cause re-entrancy issues in some cases. This is an issue if the same library method is called from two different routines, where one is nested inside the other. The inner method can invalidate the buffer used by the outer method. This can result in hard to understand bugs. The problem is made worse in a modular system where the inner method is dynamically determined, and may not be aware of the implementation of the outer method.

In the case of the OP2 module system, I'm not aware of any case where one module may depend on another module. They are generally independent of each other, with the only dependencies being between the modules of the game itself.

I'm kind of thinking this last option might actually be the most convenient.

@Brett208

This comment has been minimized.

Copy link
Member Author

Brett208 commented Dec 19, 2018

Okay, well I guess we could mirror the GetGameDir_s call and create a GetCurrentModDir_s function. Below is the GetGameDir_s syntax. This would allow calling larger than max_path although a user could start with max_path for the buffer size.

We could then mark the old function as deprecated.

// Retrieves the current absolute directory of the Outpost 2 executable with a trailing slash.
// If bufferSize is smaller than required to copy entire path, buffer is provided as much of path as possible.
// Returns 0 on success. Returns the required minimum size of the buffer on failure.
OP2EXT_API size_t GetGameDir_s(char* buffer, size_t bufferSize);

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 20, 2018

Alright, let's go with that.

@Brett208 Brett208 closed this Dec 22, 2018

@Brett208 Brett208 deleted the Use-Malloc-In-Function-GetCurrentModDir branch Dec 22, 2018

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 23, 2018

Hmm, you know, if the function is going to continue to exist in deprecated form, I think I might still recommend merging this. It's broken either way, though it's probably less broken with this merge. At the very least, the discussion as to why it's broken can become a permanent part of the project's history.

@Brett208 Brett208 restored the Use-Malloc-In-Function-GetCurrentModDir branch Dec 23, 2018

@Brett208

This comment has been minimized.

Copy link
Member Author

Brett208 commented Dec 23, 2018

I didn't voice this earlier, but my concern is if previous modules have employed the function knowing that it uses new instead of malloc, will we be causing harm in those DLL's memory management by switching it?

If someone wants to investigate, console modules that are probably worth reviewing would be cm1, cm2, and the tech tree mod that I forget the name of.

If you think it is still the right choice, then let's merge in.

Restoring branch until we make a choice.

-Brett

@Brett208 Brett208 reopened this Dec 23, 2018

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 24, 2018

It's hard to say how it would affect them without looking at both the source, and what the compiler did with it.

Though if they ignored the problem, and just leaked memory, then it really doesn't matter how we allocate it. That might very well be the case.

I'm uncertain where to find the source for those projects.

@Brett208

This comment has been minimized.

Copy link
Member Author

Brett208 commented Dec 24, 2018

Arklon might have the source to cm1 and cm2. sirbomber would have the source for the other one. Would be nice to have both in the repository.

I could load the modules and set a breakpoint in the function GetCurrentModDir within op2ext. Then by loading Outpost 2 and using the mod for a bit, it should tell us if they are loading using the function or not. I haven't tried this before, but if the breakpoint triggers, then VS may let me see the assembly associated with the call across the module sort of like what I provided when we were troubleshooting.

Below is the complete list of mods I know about. It would be really nice if someone could take the time to test and document them all. Some sort of a la carte menu for people to download and mod the game with. Unfortunately if I recall correctly, many are broken.

  • Color Mod 1 (CM1) - sirbomber/Arklon (console loaded)
  • Color Mod 2 (CM2) - sirbomber/Arklon (console loaded)
  • Multitek2 - sirbomber (Console loaded, depends on ART_PATH only)
  • NetFix - Hooman (.ini loaded)
  • NetHelper - Arklon (.ini loaded)
  • Renegades - EddyB and team
    • It uses the Console module loader to work. However, the installer mentions that it will install a patch to fix a bug in the console module loader (old name modmgr). No idea what the bug is though.
  • OP2DisasterlessMod - Arklon (Ini/Console)
  • OP2GPPatch - ZigZagJoe (Ini/Console)
  • OP2ScrollFix - ZigZagJoe (Ini/Console)
  • SSUtil2 (Screenshot Utility) - TH300 (Ini/Console)
  • Video Recorder (Arklon has source) Contains major bugs

After thinking about this some more, I think we should just leave the deprecated function alone. If someone properly cleaned up the new call, I don't want to break their work. If a new module is compiled, it needs to use the new function. Testing all these modules seems to be a lot of work for a deprecated function.

-Brett

@DanRStevens

This comment has been minimized.

Copy link
Member

DanRStevens commented Dec 24, 2018

If someone properly cleaned up the new call

Which they can't!

Though, it may have the appearance of working, and possibly do the right thing by accident.


Ehh, that would be a major pain to track down source and test things. Particularly since I haven't heard from some of those people in years. Last I spoke to Eddy, he'd lost access to the Renegades source. And the method, with or without the patch, is broken anyway.

As much as I don't like to just leave things, I think you might be right. This probably isn't worth the effort to pursue further. Let's just close this.

@DanRStevens DanRStevens deleted the Use-Malloc-In-Function-GetCurrentModDir branch Dec 30, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment