Refactor documentation. #63

Closed
wants to merge 6 commits into from
38 changes: 38 additions & 0 deletions .github/workflows/actions.yml
@@ -77,6 +77,44 @@ jobs:
        with:
          name: ddisasm-build-${{ matrix.os }}-${{ matrix.compiler }}
          path: build

  test-documentation:
    runs-on: ubuntu-latest
    permissions:
      packages: read
    strategy:
      matrix:
        os: [focal]
        compiler: [g++, clang++]
    needs: docker
    env:
      BUILD_TYPE: Release
    container: ${{ needs.docker.outputs.image_path }}${{ matrix.os }}:${{ needs.docker.outputs.image_tag }}
    steps:
      - name: Install capstone, gtirb, gtirb-pprinter
        run: |
          curl https://download.grammatech.com/gtirb/files/apt-repo/conf/apt.gpg.key | apt-key add -
          echo "deb https://download.grammatech.com/gtirb/files/apt-repo ${{ matrix.os }} unstable" >> /etc/apt/sources.list
          apt-get update
          apt-get -y install libcapstone-dev=1:5.0.0-gtdev libgtirb libgtirb-dev libgtirb-pprinter libgtirb-pprinter-dev gtirb-pprinter
      - name: Checkout ddisasm
        uses: actions/checkout@v3
      - name: Build
        run: |
          mkdir build
          cd build
          cmake -DCMAKE_CXX_COMPILER=${{ matrix.compiler }} -DCMAKE_BUILD_TYPE=${BUILD_TYPE} -DLIEF_ROOT=/usr/ ..
          make
      - name: Upload artifacts
        uses: actions/upload-artifact@v3
        with:
          name: ddisasm-build-${{ matrix.os }}-${{ matrix.compiler }}
          path: build
      - name: build documentation
        run: |
          cd doc/
          make html

  test:
    runs-on: ubuntu-latest
    permissions:
10 changes: 10 additions & 0 deletions .gitignore
@@ -1,3 +1,11 @@
#documentation

!doc/source/3-REFERENCES/src_docs/
src/datalog/


#builds

*.o
*.a

@@ -40,6 +48,8 @@ examples/**/*.gtirb
examples/**/*.exp
examples/**/*.lib

#cache

/gtirb
/gtirb-pprinter
/libehp
32 changes: 16 additions & 16 deletions doc/Makefile
@@ -1,16 +1,16 @@
-all: ddisasm.html ddisasm.1.gz
-
-datalog-docs:
-	python3 build_index.py
-	python3 -m sphinx . datalog-docs
-clean:
-	rm -f *.html *.gz
-
-%.html: %.md
-	pandoc -s -t html $< -o $@
-
-%.md.tmp: %.md
-	pandoc -s -t man $< -o $@
-
-%.1.gz: %.md.tmp
-	gzip -9 < $< > $@
+SPHINXOPTS ?=
+SPHINXBUILD ?= sphinx-build
+SOURCEDIR = source
+BUILDDIR = build
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+%: Makefile
+	python3 build_index.py
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+clean:
+	rm -f *.html *.gz
7 changes: 4 additions & 3 deletions doc/build_index.py
@@ -7,7 +7,8 @@

DDISASM_ROOT = Path(__file__).resolve().parent.parent

-DL_DOCS = DDISASM_ROOT / "doc" / "src_docs"
+DL_DOCS = DDISASM_ROOT / "doc" / "source" / "3-REFERENCES" / "src_docs"
+CSV_DOCS = DDISASM_ROOT / "doc" / "source" / "3-REFERENCES" / "src_docs"


ARCHITECTURES = [
@@ -28,7 +29,7 @@ def build_main_index() -> None:
         glob.glob(f"{DDISASM_ROOT}/src/datalog/**/*.dl", recursive=True)
     ):
         dl_file = dl_file[len(f"{DDISASM_ROOT}/src/datalog/") : -len(".dl")]
-        print(f"creating {dl_file} in /doc/src_docs/")
+        print(f"creating {dl_file} in /doc/source/3-REFERENCES/src_docs/")
         source_doc_page = (DL_DOCS / dl_file).with_suffix(".rst")
         if not source_doc_page.parent.exists():
             source_doc_page.parent.mkdir(parents=True, exist_ok=True)
@@ -61,7 +62,7 @@ def build_dependecy_graph() -> None:
             dependencies[edge.get_source().replace('"', "")].add(
                 edge.get_destination().replace('"', "")
             )
-    with open(DL_DOCS / "dependencies.csv", mode="w") as f:
+    with open(CSV_DOCS / "dependencies.csv", mode="w") as f:
         for src in sorted(dependencies):
             for dest in sorted(dependencies[src]):
                 print(src, dest, file=f)
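
The path handling in `build_index.py` maps each Datalog source under `src/datalog/` to a reStructuredText page under `doc/source/3-REFERENCES/src_docs/`. The following is a minimal sketch of that mapping, not the code from this pull request: it uses `Path.relative_to` instead of string slicing, and `doc_page_for`, the `/tmp/ddisasm` checkout path and the `arch/x64.dl` file name are hypothetical names chosen for illustration.

```python
from pathlib import Path


def doc_page_for(root: Path, dl_file: Path) -> Path:
    """Return the .rst page that mirrors a .dl file (hypothetical helper)."""
    # Path of the Datalog file relative to <root>/src/datalog
    relative = dl_file.relative_to(root / "src" / "datalog")
    # Same relative path under the generated-docs directory, with .dl -> .rst
    docs = root / "doc" / "source" / "3-REFERENCES" / "src_docs"
    return (docs / relative).with_suffix(".rst")


# Hypothetical checkout location and source file, for illustration only
root = Path("/tmp/ddisasm")
print(doc_page_for(root, root / "src" / "datalog" / "arch" / "x64.dl"))
```

Using `relative_to` raises `ValueError` when the file is not under `src/datalog/`, which makes a misplaced input visible instead of silently producing a wrong path.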
@@ -0,0 +1,52 @@
Chapter 1: Disassemble and reassemble the Linux ls command
==========================================================


Introduction
------------

We are going to disassemble the ls command on Linux. We will work on an x86-64 Linux ELF binary.

Chapter 1: disassemble
-----------------------

cp "$(which ls)" new-ls
ddisasm ./new-ls --asm ls.s
Collaborator:

We are actually moving away from this approach. We recommend generating a GTIRB file first:

ddisasm ./new-ls --ir ls.gtirb

And then using gtirb-pprinter. Its -b option knows how to reassemble the binary (including which library dependencies to link).

Collaborator:

I can update the example myself if that is easier once the other issues are resolved.

Contributor Author:

Yes, of course, feel free to update the example yourself. Just mention my nick as one of the authors in the commits.

Contributor Author:

@aeflores Could you give me the commands that I should put in the doc, please?



Chapter 1: reassemble
-----------------------

Reverse engineering the ls libraries.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To tell the compiler (such as gcc) which libraries to link when rebuilding the program, we first list the shared libraries that the binary depends on.


readelf --dynamic ./new-ls
Dynamic section at offset 0x21a58 contains 28 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libselinux.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]

Let's focus on `Shared library: [libselinux.so.1]`. It tells us that ls links against the selinux library. Let's install it.

```bash
sudo apt install -y selinux-utils libselinux1-dev
```
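
The `NEEDED` entries in the `readelf` output above can also be collected with a short script, which is handy when a binary depends on many libraries. This is a minimal sketch, assuming the output has the same layout as the excerpt above; `needed_libraries` is a hypothetical helper and the sample text is copied from that excerpt.

```python
import re
from typing import List

# Matches lines such as: 0x...01 (NEEDED) Shared library: [libc.so.6]
NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[(?P<name>[^\]]+)\]")


def needed_libraries(readelf_output: str) -> List[str]:
    """Return the shared library names listed as NEEDED (hypothetical helper)."""
    return [match.group("name") for match in NEEDED.finditer(readelf_output)]


# Sample copied from the readelf excerpt in this tutorial
sample = """\
Dynamic section at offset 0x21a58 contains 28 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libselinux.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
"""

print(needed_libraries(sample))  # ['libselinux.so.1', 'libc.so.6']
```

To run it on a real binary, the sample string could be replaced by the captured output of `readelf --dynamic ./new-ls`, for example through `subprocess.run(..., capture_output=True, text=True).stdout`.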

Compile back!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

gcc -nostartfiles ls.s -o ls-out -l selinux


Run!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

./ls-out
myfile hey.png hello.out


Congratulations!

@@ -0,0 +1,110 @@
Contribute to ddisasm documentation
===================================

Introduction
------------

ddisasm follows the template of the four types of documentation:
1-tutorials (to get started)
2-how-to (solving real-life problems)
3-references (the autogenerated API)
4-explanations (for the concepts)
Collaborator:

Since we don't have content for 4, I would leave it out for now.

Contributor Author:

I was going to provide the full "four types of documentation" template. This part is about the context of the project.

  • What we can do with reverse engineering in general, not specifically any CTF. Some laws about reverse engineering, so that it can be used safely.
  • Alternatives such as the radare2, Ghidra and IDA Pro disassemblers. I will describe the bugs in r2 as well as the inability to reconstruct a full binary in Ghidra and IDA Pro. I will also mention that ddisasm is slow and consumes a lot of RAM.

As you mentioned, I have not written this yet. Can I write it, or should I remove the chapter?


Build Documentation
-------------------

Introduction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`ddisasm` documentation is built with `sphinx`.

Most of the dependencies are Python packages, so install them with `pip` inside a `virtualenv` to keep them isolated from the system.

Documentation building dependencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Debian
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

```bash
sudo wget https://souffle-lang.github.io/ppa/souffle-key.public -O /usr/share/keyrings/souffle-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/souffle-archive-keyring.gpg] https://souffle-lang.github.io/ppa/ubuntu/ stable main" | sudo tee /etc/apt/sources.list.d/souffle.list
sudo apt update
sudo apt install souffle -y


sudo apt install python3 virtualenv -y &&
virtualenv -p python3 venv3-ddisasm &&
source venv3-ddisasm/bin/activate &&
pip install networkx sphinx sphinx_rtd_theme pydot
```
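
Before building, it can help to confirm that the Python packages installed above are importable in the active virtualenv. This is a minimal sketch, assuming each import name matches its pip package name; `missing_modules` is a hypothetical helper and is not part of ddisasm.

```python
import importlib.util

# Packages installed by the pip command above (import names assumed identical)
REQUIRED = ["networkx", "sphinx", "sphinx_rtd_theme", "pydot"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All documentation dependencies are importable.")
```

Run it with the virtualenv activated; any name it reports can be installed with `pip install`.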

Windows
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Building on Windows requires many dependencies that are Linux-oriented and can only be compiled from source, which makes it much more complicated than building on Linux. For this reason, we strongly recommend building the documentation on Linux rather than on Windows.

Install python3 and pip.

Open a PowerShell tab and run:

```powershell
python -m venv venv3-ddisasm
.\venv3-ddisasm\Scripts\activate
pip install networkx sphinx sphinx_rtd_theme
```

macOS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TODO


Build documentation from source
--------------------------------------
cd doc
make html


Build documentation
--------------------------------------

To generate the API reference pages with Sphinx, run `sphinx-apidoc -o doc/source/3-REFERENCES/src_docs/ ./`. You can then build the documentation with:

```bash
make html
```

The output is in the folder `doc/build/html`. Congratulations! You have built the documentation from source.


When in doubt, feel free to run `make help`:

```bash
$ make help
Sphinx v6.1.3
Please use `make target' where target is one of
html to make standalone HTML files
dirhtml to make HTML files named index.html in directories
singlehtml to make a single large HTML file
pickle to make pickle files
json to make JSON files
htmlhelp to make HTML files and an HTML help project
qthelp to make HTML files and a qthelp project
devhelp to make HTML files and a Devhelp project
epub to make an epub
latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter
latexpdf to make LaTeX and PDF files (default pdflatex)
latexpdfja to make LaTeX files and run them through platex/dvipdfmx
text to make text files
man to make manual pages
texinfo to make Texinfo files
info to make Texinfo files and run them through makeinfo
gettext to make PO message catalogs
changes to make an overview of all changed/added/deprecated items
xml to make Docutils-native XML files
pseudoxml to make pseudoxml-XML files for display purposes
linkcheck to check all external links for integrity
doctest to run all doctests embedded in the documentation (if enabled)
coverage to run coverage check of the documentation (if enabled)
clean to remove everything in the build directory
```
9 changes: 9 additions & 0 deletions doc/source/2-HOW-TO/CASE1-CRYPTANALYSIS-CTF.rst
@@ -0,0 +1,9 @@
Case 1: Reverse Engineering a cryptanalysis CTF with DDisasm
============================================================================================================================================================


Introduction
------------

Subsection
~~~~~~~~~~
13 changes: 13 additions & 0 deletions doc/source/2-HOW-TO/CASE2-MODDING-PWNISLAND.rst
@@ -0,0 +1,13 @@
Case 2: Modding the video game pwn island
=========================================


Introduction
------------

Modding is the art of modifying a video game or other software *without its source code* in order to improve or customize it. Some copyright owners consider it malicious, while free software advocates see it as an act of freedom. Let's stay out of that debate and focus on the technique.

To avoid any copyright issue, we will work on a video game that was made to be reversed in order to teach reverse engineering. Its name is pwn island.

Subsection
~~~~~~~~~~
1 change: 1 addition & 0 deletions doc/source/3-REFERENCES/src_docs/.gitkeep
@@ -0,0 +1 @@

44 changes: 44 additions & 0 deletions doc/source/4-EXPLANATIONS/alternatives.rst
@@ -0,0 +1,44 @@
gdb
================
Although GDB is a debugger meant for programmers, often driven from an IDE, and not a reverse engineering tool, some people use it to disassemble and debug compiled programs. In the opinion of the ddisasm dev, using gdb for reversing is a mistake. However, because few other tools interoperate with the pwntools library, you will need gdb to exploit memory corruption vulnerabilities with pwntools. Work is in progress in pwntools to handle radare2. I strongly discourage using gdb outside of development work unless you have a specific reason.

Use gdb if you need to exploit memory corruption vulnerabilities and pwntools offers no other tooling.
Use ddisasm for reverse engineering.


radare2 / rizin
================

Radare2 and Rizin are two command-line reverse engineering frameworks. Unlike gdb, they are designed as reverse engineering tools rather than as developer tools. With them you can patch binaries, view graphs, and do more that is not possible with gdb, which is not made for reverse engineering. There is even some rivalry between the developers of the two projects over which one has fewer bugs. Broadly, the r2 developers focus on features while Rizin claims to focus on testing. Rizin currently has fewer features than radare2.

Because their philosophy is to have zero dependencies, they sometimes have bugs. Both projects have 14 years of active development behind them, which greatly reduces the number of potential bugs.

Use Radare2 / Rizin if you need to:

#. Patch a program in place, provided the binary on disk has enough dead space (not comfortable).
#. Debug a program with a graph view.
#. Decompile a program.
#. Use all kinds of reversing tactics.

Use Radare2 specifically if you need to:

#. Reverse an exotic architecture.

Use Rizin specifically if you need to:

#. Reverse a common architecture with absolutely reliable tooling.

ddisasm
==================
ddisasm is probably the most accurate disassembler. Its disassembly is accurate enough that you can literally reassemble the output with a compiler toolchain.
ddisasm does not provide reversing or debugging tools by itself.


The best way in my opinion: ddisasm with another framework
===========================================================

#. Disassemble your program with ddisasm.
#. Debug the produced AT&T assembly language with radare2 / rizin.
#. Document the labels and the function names, add code comments, and why not even add unit tests by hand to the assembly produced by ddisasm.
#. Once everything is documented, debug your program in an IDE with the gdb debugger.

You should now have fully documented assembly code. Congratulations! You could even rewrite the assembly functions in C, using the interoperability between C and assembly language.

Unlike r2, ddisasm is not a reverse engineering framework: in theory it only disassembles. You can still reverse engineer with it by combining it with other tools, as described previously.
