Support casting to structures and data types along with sizeof operator in the script engine #321

Open
SinaKarvandi opened this issue Nov 23, 2023 · 5 comments
Labels
enhancement An enhancement to an existing feature feature New feature or request

Comments

@SinaKarvandi
Member

This is a highly needed feature: it would let HyperDbg users write scripts that use Microsoft symbols as well as driver and application symbols.

Previously I created the following file in the symbol parser:
https://github.com/HyperDbg/HyperDbg/blob/master/hyperdbg/symbol-parser/code/casting.cpp

HyperDbg's symbol parser already has the functions needed for casting. To keep things as simple as possible, I wrote a single function that takes a symbol name (e.g., a structure name) and a field name, and returns whether that field is a pointer or a structure, the size (sizeof) of the structure or the field, and the offset of the field from the start of the structure.

This is the link to this function:

SymCastingQueryForFiledsAndTypes(_In_ const char * StructName,

(Note that this functionality is already available in the symbol parser; it is wrapped in this form only to make testing the script engine simpler during the initial design.)
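To make the shape of such a query concrete, here is a minimal sketch in C. It is not the real symbol-parser function: `EXAMPLE_NODE`, `FIELD_INFO`, `FIELD_ROW`, and `query_field_sketch` are hypothetical names, and a static table built with `sizeof` and `offsetof` stands in for the actual symbol lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example type; stands in for a structure found in symbols. */
typedef struct _EXAMPLE_NODE
{
    uint32_t              Id;
    struct _EXAMPLE_NODE *Next;
} EXAMPLE_NODE;

/* Hypothetical result record: the three facts the script engine needs. */
typedef struct
{
    bool   is_pointer; /* true if the field holds an address to dereference */
    size_t size;       /* size of the field in bytes */
    size_t offset;     /* offset of the field from the start of the structure */
} FIELD_INFO;

typedef struct
{
    const char *struct_name;
    const char *field_name;
    FIELD_INFO  info;
} FIELD_ROW;

/* Assumption: a static table replaces the real symbol lookup in this sketch. */
static const FIELD_ROW g_rows[] = {
    {"EXAMPLE_NODE", "Id", {false, sizeof(uint32_t), offsetof(EXAMPLE_NODE, Id)}},
    {"EXAMPLE_NODE", "Next", {true, sizeof(EXAMPLE_NODE *), offsetof(EXAMPLE_NODE, Next)}},
};

/* Returns true and fills *out when the (structure, field) pair is known. */
static bool
query_field_sketch(const char *struct_name, const char *field_name, FIELD_INFO *out)
{
    assert(struct_name != NULL && field_name != NULL && out != NULL);

    for (size_t i = 0; i < sizeof(g_rows) / sizeof(g_rows[0]); i++)
    {
        if (strcmp(g_rows[i].struct_name, struct_name) == 0 &&
            strcmp(g_rows[i].field_name, field_name) == 0)
        {
            *out = g_rows[i].info;
            return true;
        }
    }
    return false; /* unknown structure or field */
}
```

A caller would pass a structure name and a field name, then use `is_pointer` to decide whether '->' is legal and `offset`/`size` to decide what to read.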

What needs to be implemented?

  1. The 'sizeof' operator. This operator is really useful for debugging and reverse engineering, so supporting it would be a valuable addition to the script engine.

  2. The cast operator. Since HyperDbg's script engine doesn't have any types, we could use a dedicated cast syntax. Assume the following structures:

typedef struct _UNICODE_STRING
{
    USHORT Length;        // +0x000
    USHORT MaximumLength; // +0x002
    PWSTR  Buffer;        // +0x008
} UNICODE_STRING, *PUNICODE_STRING;

typedef struct _STUPID_STRUCT1
{
    UINT32          Flag32;      // +0x000
    UINT64          Flag64;      // +0x008
    PVOID           Context;     // +0x010
    PUNICODE_STRING StringValue; // +0x018
} STUPID_STRUCT1, *PSTUPID_STRUCT1;

typedef struct _STUPID_STRUCT2
{
    UINT32          Sina32;        // +0x000
    UINT64          Sina64;        // +0x008
    PVOID           AghaaSina;     // +0x010
    PUNICODE_STRING UnicodeStr;    // +0x018
    PSTUPID_STRUCT1 StupidStruct1; // +0x020
} STUPID_STRUCT2, *PSTUPID_STRUCT2;

These structures are compiled with the following offsets:

        Local var @ 0x95fa18df58 Type AllocateStructForCasting::__l2::_STUPID_STRUCT2*
        0x00000260`d065ee30

           +0x000 Sina32           : 0x32
           +0x008 Sina64           : 0x64
           +0x010 AghaaSina        : 0x00000000`00000055 Void
           +0x018 UnicodeStr       : 0x00000260`d065ec70 AllocateStructForCasting::__l2::_UNICODE_STRING
              +0x000 Length           : 0x40
              +0x002 MaximumLength    : 0x40
              +0x008 Buffer           : 0x00000260`d065ecf0  "Goodbye I'm at stupid struct 2!"
           +0x020 StupidStruct1    : 0x00000260`d065eda0 AllocateStructForCasting::__l2::_STUPID_STRUCT1
              +0x000 Flag32           : 0x3232
              +0x008 Flag64           : 0x6464
              +0x010 Context          : 0x00000000`00000085 Void
              +0x018 StringValue      : 0x00000260`d065eb50 AllocateStructForCasting::__l2::_UNICODE_STRING
                 +0x000 Length           : 0x3c
                 +0x002 MaximumLength    : 0x3c
                 +0x008 Buffer           : 0x00000260`d065ebd0  "Hi come from stupid struct 1!"
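The offsets in the dump reflect 8-byte alignment of the 64-bit members and pointers. As a rough check, the sketch below redeclares the structures with standard C types in place of the Windows typedefs and prints the layout using `offsetof` and `sizeof`. The helper name `print_layout_sketch` is hypothetical, and the numbers only match the dump on a 64-bit target with natural alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard C stand-ins: USHORT -> uint16_t, PWSTR -> wchar_t *, PVOID -> void *. */
typedef struct _UNICODE_STRING
{
    uint16_t Length;
    uint16_t MaximumLength;
    wchar_t *Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

typedef struct _STUPID_STRUCT1
{
    uint32_t        Flag32;
    uint64_t        Flag64;
    void           *Context;
    PUNICODE_STRING StringValue;
} STUPID_STRUCT1, *PSTUPID_STRUCT1;

typedef struct _STUPID_STRUCT2
{
    uint32_t        Sina32;
    uint64_t        Sina64;
    void           *AghaaSina;
    PUNICODE_STRING UnicodeStr;
    PSTUPID_STRUCT1 StupidStruct1;
} STUPID_STRUCT2, *PSTUPID_STRUCT2;

/* Prints the offset of every STUPID_STRUCT2 field and returns sizeof(STUPID_STRUCT2). */
static size_t
print_layout_sketch(FILE *out)
{
    assert(out != NULL);

    fprintf(out, "+0x%03zx Sina32\n", offsetof(STUPID_STRUCT2, Sina32));
    fprintf(out, "+0x%03zx Sina64\n", offsetof(STUPID_STRUCT2, Sina64));
    fprintf(out, "+0x%03zx AghaaSina\n", offsetof(STUPID_STRUCT2, AghaaSina));
    fprintf(out, "+0x%03zx UnicodeStr\n", offsetof(STUPID_STRUCT2, UnicodeStr));
    fprintf(out, "+0x%03zx StupidStruct1\n", offsetof(STUPID_STRUCT2, StupidStruct1));
    fprintf(out, "sizeof(STUPID_STRUCT2) = 0x%zx\n", sizeof(STUPID_STRUCT2));

    return sizeof(STUPID_STRUCT2);
}
```

Calling `print_layout_sketch(stdout)` gives the kind of values a script-level 'sizeof' operator and field lookup would need to reproduce from symbols rather than from compiled C.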

A suggested grammar for handling casts is like this (feel free to discuss and change the casting grammar if it seems not suitable, but this is what came to my mind):

 Grammar :

      EXPRESSION-> cast < STRING0 > ( EXPRESSION ) STRING_SEQUENCE
      STRING_SEQUENCE->.STRING[1 - N] STRING_SEQUENCE
      STRING_SEQUENCE->'->' STRING STRING_SEQUENCE
      STRING_SEQUENCE->eps
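As a starting point for discussion, the grammar above could be recognized with a small hand-written parser along these lines. This is only a sketch under assumptions: all names (`CAST_EXPR`, `ACCESS_STEP`, `parse_cast_sketch`, the size limits) are hypothetical, each access step is treated as a single identifier, the inner expression is captured as raw text instead of being parsed, and no type checking ('.' versus '->') is done.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_STEPS 8
#define MAX_NAME  64

typedef struct
{
    bool is_arrow;       /* true for '->', false for '.' */
    char name[MAX_NAME]; /* field name */
} ACCESS_STEP;

typedef struct
{
    char        type_name[MAX_NAME]; /* text between '<' and '>' */
    char        inner[MAX_NAME];     /* raw text between '(' and ')' */
    ACCESS_STEP steps[MAX_STEPS];
    size_t      step_count;
} CAST_EXPR;

static const char *
skip_ws(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Copies an identifier ([A-Za-z0-9_]+) into dst; returns the new cursor or NULL. */
static const char *
read_ident(const char *p, char *dst)
{
    size_t n = 0;
    while (isalnum((unsigned char)*p) || *p == '_')
    {
        if (n + 1 >= MAX_NAME)
            return NULL;
        dst[n++] = *p++;
    }
    dst[n] = '\0';
    return n > 0 ? p : NULL;
}

/* Parses: cast < TYPE > ( EXPR ) { .NAME | ->NAME }, optionally ending in ';'. */
static bool
parse_cast_sketch(const char *text, CAST_EXPR *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    const char *p = skip_ws(text);
    if (strncmp(p, "cast", 4) != 0)
        return false;
    p = skip_ws(p + 4);
    if (*p != '<')
        return false;
    p = read_ident(skip_ws(p + 1), out->type_name);
    if (p == NULL)
        return false;
    p = skip_ws(p);
    if (*p != '>')
        return false;
    p = skip_ws(p + 1);
    if (*p != '(')
        return false;
    p++;

    /* Capture the inner expression up to the matching ')'. */
    size_t n     = 0;
    int    depth = 1;
    for (; *p != '\0'; p++)
    {
        if (*p == '(')
            depth++;
        else if (*p == ')' && --depth == 0)
            break;
        if (n + 1 >= MAX_NAME)
            return false;
        out->inner[n++] = *p;
    }
    if (*p != ')' || n == 0)
        return false;
    out->inner[n] = '\0';
    p = skip_ws(p + 1);

    /* STRING_SEQUENCE: zero or more '.' or '->' accesses. */
    while (*p == '.' || (p[0] == '-' && p[1] == '>'))
    {
        if (out->step_count == MAX_STEPS)
            return false;
        ACCESS_STEP *s = &out->steps[out->step_count];
        s->is_arrow    = (*p == '-');
        p = read_ident(skip_ws(p + (s->is_arrow ? 2 : 1)), s->name);
        if (p == NULL)
            return false;
        out->step_count++;
        p = skip_ws(p);
    }
    return *p == '\0' || *p == ';';
}
```

The parsed `type_name` and each step's `name` are what would be handed to the symbol query to obtain offsets and pointer information.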

As a result, HyperDbg's script engine should use the above function and support these statements (note that the value of the test cases is commented at the next line):


        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->Sina32;
        // my_var = 0x32

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->Sina64;
        // my_var = 0x64

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->AghaaSina;
        // my_var = 0x55

        my_var =  cast<PSTUPID_STRUCT2>(@rcx).Unknownnnnnn;
        // Error because Unknownnnnnn not found

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->Unknownnnnnn;
        // Error because Unknownnnnnn not found

        my_var =  cast<PSTUPID_STRUCT2>(@rcx).Sina32;
        // Error because PSTUPID_STRUCT2 is pointer, should use '->'

        my_var =  cast<PSTUPID_STRUCT2>(*@rcx).Sina32;
        // Error because PSTUPID_STRUCT2 is pointer, should use '->'

        my_var =  cast<STUPID_STRUCT2>(*@rcx).Sina32;
        // my_var = 0x32

        my_var =  cast<STUPID_STRUCT2>(*@rcx).Sina64;
        // my_var = 0x64

        my_var =  cast<STUPID_STRUCT2>(*@rcx).AghaaSina;
        // my_var = 0x55

        my_var =  cast<STUPID_STRUCT2>(*@rcx).UnicodeStr->MaximumLength;
        // my_var = 0x40

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->UnicodeStr->MaximumLength;
        // my_var = 0x40

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->StupidStruct1->Flag32;
        // my_var = 0x3232

        my_var =  cast<PSTUPID_STRUCT2>(@rcx)->StupidStruct1->Flag64;
        // my_var = 0x6464

        my_var = cast<STUPID_STRUCT2>(*@rcx).StupidStruct1->StringValue->MaximumLength;
        // my_var = 0x3c

        printf("Result is : %ws\n", cast<PSTUPID_STRUCT2>(@rcx)->UnicodeStr->Buffer );
        // Result is : Goodbye I'm at stupid struct 2!

        printf("Result is : %ws\n", cast<PSTUPID_STRUCT2>(@rcx)->StupidStruct1->StringValue->Buffer );
        // Result is : Hi come from stupid struct 1!

Please note that the only thing that matters to the script engine is the offsets. Nothing needs to be parsed in the kernel (VMX-root mode): the user-mode parser resolves everything, and once the offsets are determined, the resulting dereferences and offsets are sent to VMX-root mode.
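One way to picture "only the offsets matter" is to lower an access chain into a flat list of reads, each described by an offset and a size. The sketch below is an illustration under assumptions, not HyperDbg's actual intermediate format: `READ_STEP`, `read_field`, and `eval_chain_sketch` are hypothetical names, and memory is read from the current process, whereas the debugger would read the debuggee's memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One resolved access: read `size` bytes at (current address + offset). */
typedef struct
{
    size_t offset; /* field offset inside the current structure */
    size_t size;   /* number of bytes to read: 1, 2, 4, or 8 */
} READ_STEP;

/* Reads an unsigned field of the given size and widens it to 64 bits. */
static uint64_t
read_field(const unsigned char *base, size_t offset, size_t size)
{
    switch (size)
    {
    case 1: { uint8_t v;  memcpy(&v, base + offset, sizeof(v)); return v; }
    case 2: { uint16_t v; memcpy(&v, base + offset, sizeof(v)); return v; }
    case 4: { uint32_t v; memcpy(&v, base + offset, sizeof(v)); return v; }
    case 8: { uint64_t v; memcpy(&v, base + offset, sizeof(v)); return v; }
    default:
        assert(!"unsupported field size");
        return 0;
    }
}

/*
 * Applies the steps in order. The value read by one step becomes the base
 * address for the next, so every step except the last must read a pointer.
 */
static uint64_t
eval_chain_sketch(uint64_t start, const READ_STEP *steps, size_t count)
{
    assert(steps != NULL || count == 0);

    uint64_t current = start;
    for (size_t i = 0; i < count; i++)
    {
        const unsigned char *base = (const unsigned char *)(uintptr_t)current;
        current = read_field(base, steps[i].offset, steps[i].size);
    }
    return current;
}
```

With the offsets from the dump above, `cast<PSTUPID_STRUCT2>(@rcx)->UnicodeStr->MaximumLength` would lower to two steps, `{0x018, 8}` followed by `{0x002, 2}`, starting from the value of `@rcx`.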

Feel free to discuss this feature and possible modifications to it.

@SinaKarvandi
Member Author

@xmaple555 If you find free time, it would be best if you could help us with this, but if it needs too much work or if it's too hard to be done, no worries, I'll find another way. 🙂

@SinaKarvandi SinaKarvandi added feature New feature or request enhancement An enhancement to an existing feature labels Nov 23, 2023
@xmaple555
Member

@xmaple555 If you find free time, it would be best if you could help us with this, but if it needs too much work or if it's too hard to be done, no worries, I'll find another way. 🙂

It is not difficult to add types for variables in the script engine. I can make the script engine more like the C language, but I will fix the current bugs first.

@SinaKarvandi
Member Author

Sure, if it can handle the types, that would be great and make the implementation cleaner since we won't have to use the 'cast' keyword anymore. Let me know once anything needs to be done on my side.

@LukeTheEngineer

Open to another contributor?

@SinaKarvandi
Member Author

Open to another contributor?

HyperDbg welcomes contributions from everyone, but as @xmaple555 is actively working on the script engine development, please make sure to coordinate with him to avoid overlapping efforts.
