Differentiate between signed and unsigned types #62

sitio-couto · 2023-04-18T21:34:16Z

CIR Integer types are built without any sign information:

clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp

Lines 394 to 396 in 0bbac72

    
    // FIXME: break this in s/u and also pass signed param.
    ResultType =
        Builder.getIntegerType(static_cast<unsigned>(Context.getTypeSize(T)));
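To make the FIXME concrete, here is a minimal standalone sketch of carrying signedness next to the width. It is not CIRGen code: `IntInfo` and `describeInt` are hypothetical names, and `std::numeric_limits` stands in for the query that the real converter would make against the clang AST.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical (width, signedness) pair; the snippet above only keeps the width.
struct IntInfo {
  unsigned width;  // storage size in bits
  bool isSigned;   // the information the FIXME asks to pass along
};

// Derive both pieces of information from a C++ integer type.
template <typename T>
IntInfo describeInt() {
  static_assert(std::numeric_limits<T>::is_integer, "integer types only");
  IntInfo info{static_cast<unsigned>(sizeof(T) * CHAR_BIT),
               std::numeric_limits<T>::is_signed};
  assert(info.width > 0);
  return info;
}
```

With only the width, `int` and `unsigned int` are indistinguishable; with the extra flag they are not.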

bcardosolopes · 2023-04-18T21:47:42Z

Yep, that's the status quo, and this is also related to #5.

Both are something we're gonna have to tackle sooner rather than later anyway, in case you wanna add them to your list!

sitio-couto · 2023-04-25T13:39:22Z

@bcardosolopes regarding #5, what is the level of granularity we are looking for?

Do a cir.int and a cir.float with arbitrary sizes suffice?
Or would a type-per-keyword scheme (cir.char, cir.short, ...) be preferable?

I suggest we mirror MLIR's built-in dialect types (single int type with arbitrary size and one float type per size), since it already tracks signedness, then add a few qualifier attributes to mark const and volatile types.

bcardosolopes · 2023-04-25T17:40:25Z

> Does a cir.int and cir.float with arbitrary sizes suffice? Or would a type-per-keyword scheme (cir.char, cir.short, ...) be preferable?

Tracking arbitrary sizes sounds good enough.

> I suggest we mirror MLIR's built-in dialect types (single int type with arbitrary size and one float type per size), since it already tracks signedness

Signedness is pretty important because we want code analysis writers to be able to detect things like integer overflows. If we can take advantage of the underlying in-tree primitive types to represent that, all we would need is an extra getter method for C/C++-specific signed/unsigned queries.
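As an illustration of why an analysis needs the signedness bit, here is a minimal sketch of an overflow check for addition. `IntTy` and `addOverflows` are hypothetical names, not part of CIR or MLIR, and the sketch is restricted to widths of at most 32 bits so the arithmetic stays exact in 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an integer type: bit width plus signedness.
struct IntTy {
  unsigned width;
  bool isSigned;
};

// Returns true if a + b is not representable in `ty`.
// Both operands are assumed to already be valid values of `ty`.
bool addOverflows(IntTy ty, std::int64_t a, std::int64_t b) {
  assert(ty.width >= 1 && ty.width <= 32);
  const std::int64_t one = 1;
  const std::int64_t min = ty.isSigned ? -(one << (ty.width - 1)) : 0;
  const std::int64_t max = ty.isSigned ? (one << (ty.width - 1)) - 1
                                       : (one << ty.width) - 1;
  const std::int64_t sum = a + b;
  return sum < min || sum > max;
}
```

The same operands give different answers depending on the type: 100 + 100 overflows a signed 8-bit integer but fits in an unsigned one.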

Implementing primitive types while tracking signedness would be step (1).

Step (2): we should also consider adding an optional clang::Type or similar (just like we keep RecordDecls around wrapped in an attribute for cir.struct). It's possible some analysis might wanna check uses of (or lack of) size_t, which we know is an alias for a primitive type, but we don't wanna create a new type for it.
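A rough sketch of the idea in step (2), assuming the extra information is nothing more than the name the type was written with. `IntTy`, `sameCanonical`, and `spelledAs` are hypothetical; the actual proposal is to attach a clang::Type, which carries far more than a name.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: a primitive integer type plus an optional source spelling.
struct IntTy {
  unsigned width;
  bool isSigned;
  std::string spelling;  // empty when no alias name was recorded
};

// For codegen, two types are interchangeable if width and signedness match.
bool sameCanonical(const IntTy &a, const IntTy &b) {
  return a.width == b.width && a.isSigned == b.isSigned;
}

// An analysis can still ask how the type was written in the source.
bool spelledAs(const IntTy &t, const std::string &name) {
  assert(!name.empty());
  return !t.spelling.empty() && t.spelling == name;
}
```

No new type is created for the alias: the two values compare equal canonically and differ only in the attached spelling.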

> then add a few qualifier attributes to mark const and volatile types.

This brings up an interesting point: qualifiers in clang are not part of the type. I believe the intent was to save memory by not creating an extra type for every qualifier variation (it possibly also helps implement deductions that drop qualifiers, etc.). We should probably handle qualifiers as part of step (2) or a new step (3), so we have some time to think and discuss while we make progress. Thoughts?
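For readers unfamiliar with that design: clang's QualType pairs a pointer to the underlying type with a small set of qualifier bits, so `int` and `const int` share one type object. A minimal standalone sketch of the same shape follows; all names are hypothetical and none of this is clang API.

```cpp
#include <cassert>

// Hypothetical unqualified type; one instance is shared by all qualified uses.
struct BaseType {
  unsigned width;
  bool isSigned;
};

enum Qualifier : unsigned { NoQuals = 0, Const = 1u << 0, Volatile = 1u << 1 };

// Qualifiers live next to the type pointer, not inside the type itself.
struct QualifiedType {
  const BaseType *type;
  unsigned quals;

  bool isConst() const { return (quals & Const) != 0; }
  bool isVolatile() const { return (quals & Volatile) != 0; }
  QualifiedType withConst() const { return {type, quals | Const}; }
  // Dropping qualifiers is a bit operation; no type lookup is needed.
  QualifiedType unqualified() const { return {type, NoQuals}; }
};

QualifiedType makeQualified(const BaseType *type, unsigned quals) {
  assert(type != nullptr);
  return {type, quals};
}
```

Adding or dropping `const` never allocates a new `BaseType`, which is the memory saving mentioned above.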

sitio-couto · 2023-04-25T19:35:38Z

@bcardosolopes,

Regarding step 1, using mlir::IntegerType and mlir::Float built-in types should suffice.

> If we can take advantage of the underlying in-tree primitive types to represent that, all we would need is an extra getter method for C/C++ specific signed/unsigned queries.

The mlir::IntegerType can track the width of an integer as well as whether it is signed/unsigned/signless (see sitio-couto@762cd50).

The built-in Floating point types seem to cover all C/C++ primitives as well.

With this in mind, should we use MLIR's built-in types for C/C++ primitives instead of custom CIR types?
I'm not sure how we would benefit from a custom cir.int/cir.float otherwise.

bcardosolopes · 2023-04-26T17:27:50Z

> The mlir::IntegerType can track the width of an integer as well as if it is signed/unsigned/signless (See sitio-couto@762cd50).
>
> The built-in Floating point types seem to cover all C/C++ primitives as well.

Yeah, I know, it's pretty attractive (I've been there).

> With this in mind, should we use MLIR's built-in type for C/C++ primitives instead of custom CIR types? I'm not sure how would we benefit from a custom cir.int/cir.float otherwise.

I still believe we should wrap them so we can customize as we see fit (e.g. adding CIR-specific attributes that won't be dropped by random passes) and shield the rest of CIR from in-tree MLIR changes. For example, if they decide at some point that the types should be part of a specific dialect, we only have to change our implementation in one place. It will also make our lives easier when adding qualifiers (be it by incorporating clang types with specific attributes or adding our own notion of qualifiers).

lanza · 2023-04-26T19:15:45Z

> I still believe we should wrap them so we can customize as we see fit (e.g. adding CIR specific attributes that won't be dropped by random passes) and hide the rest of CIR from MLIR in tree changes - example: if they decide at some point that the types should be part of a specific dialect, we only have to change our implementation in one specific place. It will also make our lives easier when adding qualifiers (be it by incorporating clang types with specific attributes or adding our own notion of qualifiers).

Yup. As a first principle, we want to avoid being coupled to upstream MLIR changes. Rebasing against MLIR is an absolute nightmare, and when they change the functionality of dialects/types/etc. we become exposed to surprise behavioral differences. I doubt mlir::IntegerType is changing much going forward, but AFAIK there's no guarantee that it won't.

At a high level, we use MLIR as an infrastructure for writing an IR and not as a tree of dialects that we can use.

It would be fine as an intermediate step if we used mlir's type, but that just pushes the work forward to some future date.
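The wrapping argument can be sketched in a few lines. This is a standalone model with hypothetical names: the `upstream` namespace stands in for a type owned by another project and is not the real mlir::IntegerType, and `sketch::IntType` is not the actual cir.int implementation.

```cpp
#include <cassert>

// Stand-in for a type owned by another project.
namespace upstream {
enum class Sign { Signless, Signed, Unsigned };
struct IntegerType {
  unsigned width;
  Sign sign;
};
}  // namespace upstream

// Hypothetical wrapper: the rest of the code base only sees this interface.
namespace sketch {
class IntType {
public:
  IntType(unsigned width, bool isSigned) : width_(width), signed_(isSigned) {
    assert(width > 0);
  }
  unsigned getWidth() const { return width_; }
  bool isSigned() const { return signed_; }
  bool isUnsigned() const { return !signed_; }

  // The single place that knows how the upstream type is spelled.
  upstream::IntegerType toUpstream() const {
    return {width_,
            signed_ ? upstream::Sign::Signed : upstream::Sign::Unsigned};
  }

private:
  unsigned width_;
  bool signed_;
};
}  // namespace sketch
```

If the upstream type moves or changes shape, only `toUpstream` has to follow; callers of `getWidth`, `isSigned`, and `isUnsigned` are unaffected.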

Updates CodeGen type converter and emitters to handle sign information of integer values. Lowering is also updated to convert const_arrays of signed types. Most tests were also updated since MLIR uses 's' and 'u' prefixes on integer types to identify their signedness. Fix llvm#62
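For context on those prefixes: in MLIR's textual IR a signless 32-bit integer is written `i32`, a signed one `si32`, and an unsigned one `ui32`. A minimal standalone sketch of that naming rule follows; the enum and function are hypothetical models, not MLIR API.

```cpp
#include <cassert>
#include <string>

// Models the three signedness semantics of MLIR's built-in integer type.
enum class Signedness { Signless, Signed, Unsigned };

// Builds the textual spelling: optional 's'/'u' prefix, then 'i', then the width.
std::string builtinIntName(unsigned width, Signedness s) {
  assert(width > 0);
  std::string prefix;
  if (s == Signedness::Signed)
    prefix = "s";
  else if (s == Signedness::Unsigned)
    prefix = "u";
  return prefix + "i" + std::to_string(width);
}
```

A test that previously matched `i32` for a C `int` would therefore need to match `si32` once sign information is tracked.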

sitio-couto · 2023-05-15T17:22:58Z

@bcardosolopes @lanza can you take a look at this draft:

#72

It implements a custom cir.int type and attribute to partially detach CIR from MLIR's built-in integers and track signedness information.

sitio-couto · 2023-05-24T18:10:03Z

#72 merged

Kuree mentioned this issue May 2, 2023

[CIR][Dialect] Add logical binop #68

Merged

sitio-couto closed this as completed May 24, 2023

This was referenced May 24, 2023

Create custom CIR floating point types. #78

Closed

Create a custom CIR void type #79

Closed

orbiri-ns mentioned this issue Apr 22, 2024

Global "NaN" Is Lowered to 0.0 During LLVM Conversion #559

Open
