example: Create sleigh-lift example program #9

Merged: pgoodman merged 19 commits into master from alex/sleigh-example on Nov 5, 2021

@tetsuo-cpp tetsuo-cpp commented Oct 20, 2021

Closes #1

  • Build in CI
  • Use more stringent compiler warnings
  • Make this part of the sleigh CMake project instead of a separate one
  • Figure out what's wrong with the remill-lift example
  • Figure out whether the excessive Pcode is incorrect

@tetsuo-cpp tetsuo-cpp marked this pull request as draft October 20, 2021 15:12

tetsuo-cpp commented Oct 20, 2021

This still needs to be cleaned up but I wanted to check that it's more or less what you were expecting. I've modelled it after remill-lift.

At the moment, it works like this:

tetsuo@Alexs-MacBook-Pro build % ./sleigh-lift disassemble ~/Build/install/share/sleigh/Processors/x86/data/languages/x86-64.sla 4881ecc00f0000
0x00000000: SUB RSP,0xfc0
tetsuo@Alexs-MacBook-Pro build % ./sleigh-lift pcode ~/Build/install/share/sleigh/Processors/x86/data/languages/x86-64.sla 4881ecc00f0000
(register,0x200,1) = INT_LESS (register,0x20,8) (const,0xfc0,8)
(register,0x20b,1) = INT_SBORROW (register,0x20,8) (const,0xfc0,8)
(register,0x20,8) = INT_SUB (register,0x20,8) (const,0xfc0,8)
(register,0x207,1) = INT_SLESS (register,0x20,8) (const,0x0,8)
(register,0x206,1) = INT_EQUAL (register,0x20,8) (const,0x0,8)
(unique,0x12c00,8) = INT_AND (register,0x20,8) (const,0xff,8)
(unique,0x12c80,1) = POPCOUNT (unique,0x12c00,8)
(unique,0x12d00,1) = INT_AND (unique,0x12c80,1) (const,0x1,1)
(register,0x202,1) = INT_EQUAL (unique,0x12d00,1) (const,0x0,1)

I don't know why there's so much Pcode for that instruction.

I noticed that the disassembly doesn't match for the Remill example here. It looks like this for me:

tetsuo@Alexs-MacBook-Pro build % ./sleigh-lift disassemble ~/Build/install/share/sleigh/Processors/x86/data/languages/x86-64.sla c704ba01
0x00000000: MOV word ptr [SI],0x1ba

I'll have to look into that also.

pgoodman commented

> I don't know why there's so much Pcode for that instruction.

I think that's the p-code for SUB RSP,0xfc0? If so, then the complexity is due to x86 flags computations, in which case that p-code looks absolutely fine to me :-)
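
To illustrate why a single SUB expands to nine operations, here is a minimal C++ sketch that mirrors the listing above op by op. The names `SubFlags`, `SignBit`, and `Sub64WithFlags` are hypothetical, and the mapping of each output to an x86 flag (CF, OF, SF, ZF, PF) is inferred from the operations themselves rather than stated in this thread, so treat it as an assumption.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical container for the values the p-code listing writes.
// The flag names are an assumption inferred from the operations.
struct SubFlags {
  uint64_t result;  // new value of the destination register
  bool cf;          // carry: unsigned borrow
  bool of;          // overflow: signed borrow
  bool sf;          // sign: result is negative
  bool zf;          // zero: result is zero
  bool pf;          // parity: low byte has an even number of set bits
};

inline bool SignBit(uint64_t v) { return (v >> 63) != 0; }

// Mirrors the nine p-code operations for a 64-bit subtraction.
inline SubFlags Sub64WithFlags(uint64_t lhs, uint64_t rhs) {
  SubFlags out{};
  out.cf = lhs < rhs;                                  // INT_LESS
  const uint64_t diff = lhs - rhs;
  // Signed borrow: operands differ in sign and the result's sign
  // differs from the left operand's sign.
  out.of = (SignBit(lhs) != SignBit(rhs)) &&
           (SignBit(diff) != SignBit(lhs));            // INT_SBORROW
  out.result = diff;                                   // INT_SUB
  out.sf = SignBit(diff);                              // INT_SLESS result, 0
  out.zf = (diff == 0);                                // INT_EQUAL result, 0
  const uint64_t low_byte = diff & 0xff;               // INT_AND
  const auto ones = std::bitset<8>(low_byte).count();  // POPCOUNT
  out.pf = ((ones & 1) == 0);                          // INT_AND, INT_EQUAL
  return out;
}
```

Calling `Sub64WithFlags(rsp, 0xfc0)` yields the new stack pointer plus the five flag values, one per register write in the listing; the three `unique` temporaries correspond to the parity computation.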

Overall this seems good. Can you make some convenience functions for automatically discovering the locations of these SLA files? Remill has the same problem, and I have some search paths hard-coded [1], some of which are CMake variables. Probably you could do a better job of what I did in Remill by using CMake's ability to configure a file, substituting @ variables.

[1] https://github.com/lifting-bits/remill/blob/master/lib/BC/Util.cpp#L497-L548

@pgoodman pgoodman left a comment

This looks good. A few superficial edits requested.

Otherwise, do you have a sense of how extensible the assembly printers are? Do they break the assembly down into any kind of structure, such as tokens or trees, or would I need to assemble then disassemble to get that?

tetsuo-cpp commented

> Otherwise, do you have a sense of how extensible the assembly printers are? Do they break the assembly down into any kind of structure, such as tokens or trees, or would I need to assemble then disassemble to get that?

It doesn't seem to provide a tree structure. You just get an address, an instruction mnemonic and then a "body" which is basically just the arguments.
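
As a sketch of what that means for a consumer: the three pieces can be stored in a small record, and any finer structure has to be recovered from the body string by hand. The types and functions below (`DisassembledInstruction`, `FormatInstruction`, `SplitOperands`) are hypothetical stand-ins and not SLEIGH types. The comma-splitting rule is a naive assumption that will not suit every architecture's syntax.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Stand-in record for the three pieces the printer provides.
struct DisassembledInstruction {
  uint64_t address;
  std::string mnemonic;
  std::string body;  // the arguments, as one unstructured string
};

// Renders the record in the same shape as the tool output shown earlier,
// e.g. "0x00000000: SUB RSP,0xfc0".
inline std::string FormatInstruction(const DisassembledInstruction& insn) {
  std::ostringstream os;
  os << "0x" << std::hex << std::setw(8) << std::setfill('0') << insn.address
     << ": " << insn.mnemonic;
  if (!insn.body.empty()) {
    os << ' ' << insn.body;
  }
  return os.str();
}

// Naive operand split: breaks the body on commas that are not inside
// square brackets and trims surrounding spaces from each piece.
inline std::vector<std::string> SplitOperands(const std::string& body) {
  std::vector<std::string> operands;
  std::string current;
  int depth = 0;
  auto flush = [&] {
    const auto first = current.find_first_not_of(' ');
    if (first != std::string::npos) {
      const auto last = current.find_last_not_of(' ');
      operands.push_back(current.substr(first, last - first + 1));
    }
    current.clear();
  };
  for (char c : body) {
    if (c == '[') {
      ++depth;
    } else if (c == ']' && depth > 0) {
      --depth;
    }
    if (c == ',' && depth == 0) {
      flush();
    } else {
      current.push_back(c);
    }
  }
  flush();
  return operands;
}
```

For the two bodies that appear in this thread, `RSP,0xfc0` and `word ptr [SI],0x1ba`, the split yields two operands each. Anything richer, such as distinguishing registers from immediates, would need a real tokenizer on top of this.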

@tetsuo-cpp tetsuo-cpp marked this pull request as ready for review November 3, 2021 12:09

tetsuo-cpp commented Nov 3, 2021

> Can you make some convenience functions for automatically discovering the locations of these SLA files?

This isn't trivial, so I'd prefer to handle it in a follow-up change (I've opened #12 to track it). I'm guessing this should be exposed in the header that we discussed above, since other projects will want this too.

The problem is that the .sla files aren't checked in. They're generated by feeding .slaspec files into the sleigh_opt compiler, and each one is written alongside its .slaspec file, so they end up in different directories. So I really need some platform-independent way of copying all the .sla files into a single directory within the installation and then passing that directory into a CMake variable so that it can be used in the code.

I think the right thing to do is to stop using sleigh_opt in batch mode and instead have a foreach that loops over spec_file_list and invokes the compiler on each one. And use get_filename_component on the .slaspec to put together a desired filepath to write the .sla file to.
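
The path mapping itself would live in CMake, but the intended result can be illustrated in C++ for clarity. This is only a sketch of the mapping, under the assumption that each .sla file keeps the base name of its .slaspec file and lands in one shared directory. `SlaOutputPath` is a hypothetical name.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical mapping: take the file name of a .slaspec path, swap its
// extension for .sla, and place the result directly under `out_dir`.
inline fs::path SlaOutputPath(const fs::path& slaspec,
                              const fs::path& out_dir) {
  fs::path name = slaspec.filename();
  name.replace_extension(".sla");
  return out_dir / name;
}
```

With this rule, a spec at `Processors/x86/data/languages/x86-64.slaspec` maps to `x86-64.sla` inside the chosen output directory. One consequence worth checking is that two spec files with the same base name in different processor directories would collide.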

pgoodman commented Nov 3, 2021

> I think the right thing to do is to stop using sleigh_opt in batch mode and instead have a foreach that loops over spec_file_list and invokes the compiler on each one. And use get_filename_component on the .slaspec to put together a desired filepath to write the .sla file to.

Try to use std::filesystem::path when/where possible. We have some repos that have a FindFilesystem.cmake to make sure we always find it; check the Dr. Lojekyll repo.

tetsuo-cpp commented Nov 4, 2021

> Try to use std::filesystem::path when/where possible. We have some repos that have a FindFilesystem.cmake to make sure we always find it; check the Dr. Lojekyll repo.

Sounds good.

I was referring more to the CMake code that runs on installation though. At the moment the generated .sla files are littered throughout the installation so I need to fix this so they're all in one directory. Then I can point the C++ helper at the directory containing the .sla files and use the std::filesystem stuff to find them.

Though, this isn't really related to sleigh-lift so I'd prefer to just get this in and follow up with the header + helpers to find .sla files.

@pgoodman pgoodman left a comment

Let's pull out that one "megaheader" include thing with the pragmas and call it a PR :-)

@pgoodman pgoodman merged commit 66c4f8b into master Nov 5, 2021
@pgoodman pgoodman deleted the alex/sleigh-example branch November 5, 2021 15:00
Successfully merging this pull request may close these issues.

Add example program