Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong assumptions about C types #823

Closed
oprypin opened this issue Jun 17, 2015 · 10 comments
Closed

Wrong assumptions about C types #823

oprypin opened this issue Jun 17, 2015 · 10 comments
Assignees

Comments

@oprypin
Copy link
Member

oprypin commented Jun 17, 2015

In Crystal's codebase wrong assumptions are made about C types. The type int is just mapped as Int32, etc. The same assumptions are recommended in the documentation (introduction).

Why am I saying this is wrong? There are very few, if any, C types that have their exact byte size specified in the standard.
What does the C++ standard state the size of int, long type to be?

Of course, there are some assumptions that are true for current major systems (say, Windows, Mac and Linux), so let's list them.
Signed-ness won't be mentioned here, because it should be obvious.

  • char is 1 byte
  • short is 2 bytes
  • int is 4 bytes
  • long long is 8 bytes
  • size_t is the same number of bits as the OS bit-ness (4 or 8 bytes)
  • long
    • Mac, Linux: same as size_t
    • Windows: 4 bytes
  • float is 4 bytes
  • double is 8 bytes

Reference
(Here are some of these things implemented in code)

I've definitely seen a few cases where one of size_t/long is mapped as one of Int32/Int64 in Crystal's codebase. So, many things are broken now, not just in perspective.

Short-term fix: make sure the definitions of SizeT and LongT are correct and always use these.

This will need to be done if we expect Crystal to work on Windows.


But, I don't think it's a good idea in general to directly use Crystal's IntXX types for C bindings, because that works only by coincidence with the current de-facto standards. So if Crystal wants to support another platform has has some different choices, not only will there be a ton of work required to fix up the Crystal compiler for them, but additionally (almost) every library that uses C bindings will break. You probably realize that this is not something that can be solved reasonably.

My suggestion is to simply provide aliases for every C numeric type, even if they are trivial. Trivial for now, that is, because they easily may change in the future. In addition, less thinking is required to use straightforwardly named aliases.
A great example of such aliases

The sooner this is done, the better.

@refi64
Copy link
Contributor

refi64 commented Jun 17, 2015

There's a worse example: getchar, which returns an int, is set to return a char.

@oprypin
Copy link
Member Author

oprypin commented Jun 17, 2015

You may think, why doesn't everything fall apart when things like this are used? Well... one function argument (not sure if last one or first one) is "allowed" to have wrong size. Same position in stack, little-endian architecture...

@asterite
Copy link
Member

I agree. We could have top-level aliases like CInt, CChar, CLong, etc., and even CString (aliased to CChar* or just UInt8*). We can mention them in the C bindings docs, and of course use them in the standard library. And as a plus, like @BlaXpirit says, it will make it easier to write C bindings because one wouldn't need to think "Ok, char* is UInt8*, int is Int32", etc, just put a "C" before it.

@waj thoughts?

@asterite
Copy link
Member

Yes, like the getchar mapping. I think that's a leftover from early days, I searched its uses and it's not used at all so we could remove it (and also putchar).

@jhass
Copy link
Member

jhass commented Jun 17, 2015

I'm not sure I'd introduce them to the toplevel, I wouldn't mind having to write LibC::Int, LibC::SizeT etc, after all it's intended for bindings only.

@vyp
Copy link
Contributor

vyp commented Jun 17, 2015

+1 for me. I also agree with @jhass too.

@bcardiff
Copy link
Member

+1

not polluting the toplevel seems right, yet, I would allow just Int SizeT inside a lib declaration.

May be we should avoid other types in a lib actually. It seems that every desired type used should be this C built-in types or declared in another lib.

@waj waj self-assigned this Aug 1, 2015
waj added a commit that referenced this issue Aug 1, 2015
@waj
Copy link
Member

waj commented Aug 1, 2015

I just added the type aliases (69c6254) and changed a few declarations. Change all the existing declarations might be a tedious job, so I suggest we fix them as needed.

I'll update soon the function definitions of LibGMP so we can support 32bit. PR for other libraries are welcome! :-)

@bgdncz
Copy link

bgdncz commented Aug 2, 2015

@waj Thanks! This is very useful for #1075

@asterite
Copy link
Member

After the above commit, in which I went through all libs to make sure they are mapped to the correct types, and this one I'm closing this.

@BlaXpirit thanks for brining this issue. I believe writing C bindings now will much easier and less error-prone.

akzhan added a commit to akzhan/crystal that referenced this issue Jun 26, 2017
…g two pointers.

ptrdiff_t is used for pointer arithmetic and array indexing, if negative values are possible.

Refs crystal-lang#823, crystal-lang#4589.
akzhan added a commit to akzhan/crystal that referenced this issue Jun 26, 2017
…g two pointers.

ptrdiff_t is used for pointer arithmetic and array indexing, if negative values are possible.

Refs crystal-lang#823, crystal-lang#4589.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants