# Binary files: the basics
A binary file is usually defined as any file that is not a text file (i.e. a file whose content is not meant to be read directly by a human).
Binary files can be compiled programs, but also images, videos, sound files and other data formats.

## Structure :
Binary files have different structures depending on their format, but they all boil down to a sequence of bytes (groups of eight bits, i.e. binary digits).
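
To make the "sequence of bytes" idea concrete, here is a minimal Python sketch. The function name `describe_bytes` and the sample bytes are ours, chosen only for illustration; each byte is shown in decimal, hexadecimal and as eight bits.

```python
def describe_bytes(data: bytes) -> list:
    """Return one line per byte: its offset, decimal value, hex value and bits."""
    lines = []
    for offset, byte in enumerate(data):
        # Iterating over a bytes object yields integers between 0 and 255.
        lines.append(f"offset {offset}: dec {byte:3d}  hex {byte:02x}  bits {byte:08b}")
    return lines


# Four arbitrary example bytes (hypothetical data, not taken from a real file).
for line in describe_bytes(bytes([72, 105, 0, 255])):
    print(line)
```

The same function works on the content of any file opened in binary mode (`open(path, "rb").read()`).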

Some binary files contain headers (blocks of metadata) that a computer program uses to interpret the data in the file.

The header often contains a signature or magic number that identifies the format. For instance, a `GIF` file can contain multiple images, and headers are used to identify and describe each block of image data. The leading bytes of the header contain text like `GIF87a` or `GIF89a`, which identifies the binary as a `GIF` file.
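
As a rough sketch of how a program could use such a signature, the Python snippet below checks whether some bytes start with one of the two `GIF` signatures mentioned above. The names `GIF_SIGNATURES` and `looks_like_gif` are hypothetical, and a matching signature only suggests the format, it does not prove the rest of the file is valid.

```python
# The two signatures mentioned in the text, as raw bytes.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def looks_like_gif(data: bytes) -> bool:
    """Return True if the data starts with one of the GIF signatures."""
    # bytes.startswith accepts a tuple of candidate prefixes.
    return data.startswith(GIF_SIGNATURES)


# Made-up data: a GIF signature followed by a few arbitrary bytes.
print(looks_like_gif(b"GIF89a" + b"\x01\x00\x01\x00"))
print(looks_like_gif(b"Hello world!\n"))
```

For a real file, reading the first six bytes is enough: `open(path, "rb").read(6)`.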

If a binary file does not contain any headers, it may be called a **flat binary file**.

### Examples of signatures (a.k.a. "magic numbers") :

(taken from [Wikipedia](https://en.wikipedia.org/wiki/List_of_file_signatures))

| Hex signature | ISO 8859-1 | Offset | Extension | Description |
| --- | --- | --- | --- | --- |
| `23 21` | `#!` | 0 |  | Script or data to be passed to the program following the shebang (`#!`) |
| `53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00` | `SQLite format 3` | 0 | sqlitedb sqlite db | SQLite Database |
| `FF D8 FF E0` | `ÿØÿà` | 0 | jpg | `JPEG` raw or in the `JFIF` or `Exif` file format |
| `7F 45 4C 46` | `␡ELF` | 0 | | [Executable and Linkable Format](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) |
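
The table above can be turned into a small lookup, sketched here in Python. The list `SIGNATURES`, the function `identify` and the description strings are our own illustration; only the hex values come from the table, and all four signatures sit at offset 0.

```python
# (signature bytes, short description), longest signature first.
SIGNATURES = [
    (bytes.fromhex("53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00"), "SQLite database"),
    (bytes.fromhex("FF D8 FF E0"), "JPEG image"),
    (bytes.fromhex("7F 45 4C 46"), "ELF file"),
    (bytes.fromhex("23 21"), "script with a shebang"),
]


def identify(data: bytes) -> str:
    """Return a description for the first signature found at offset 0."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


# Made-up leading bytes, for illustration only.
print(identify(b"#!/bin/sh\necho hello\n"))
print(identify(b"\x7fELF\x02\x01\x01\x00"))
```

On a real file, passing the first 16 bytes (`open(path, "rb").read(16)`) covers the longest signature in this list.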

## The simple file (Hello world) we'll use in these notebooks :
We'll use the following C code :

In [None]:
%%writefile helloworld.c
#include <stdio.h>
int main(){
    printf("Hello world!\n");
    return 0;
}

...which we compile like this :

In [None]:
%%script bash --no-raise-error
gcc helloworld.c -o helloworld

## Viewing binary files :
Displaying a binary file as text usually gives gibberish. A hex editor (or a hex viewer) can be used to handle such files :

### Example with a simple hello world :

First, let's try to display the compiled binary with `cat` :

In [None]:
%%script bash --no-raise-error
cat helloworld

And with `cat -e`, which makes non-printing characters visible and marks each end of line with `$` :

In [None]:
%%script bash --no-raise-error
cat -e helloworld
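
One way to see why `cat` prints gibberish is to count how many bytes are printable at all. The Python sketch below does that; `printable_ratio` is a hypothetical helper, and "printable" here means ASCII `0x20` to `0x7E` plus tab, line feed and carriage return, which is our own simplification.

```python
def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII or common whitespace."""
    if not data:
        return 1.0
    printable = sum(
        1 for b in data
        if 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)
    )
    return printable / len(data)


# A text-like sample and a made-up binary-like sample.
print(printable_ratio(b"Hello world!\n"))
print(printable_ratio(bytes.fromhex("7F 45 4C 46 02 01 01 00")))
```

Running it on the compiled file (`printable_ratio(open("helloworld", "rb").read())`) and on `helloworld.c` should show a clear difference between the two.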

#### This isn't really clear, right? Here are some tools that can help :

##### For viewing :
###### hexdump
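`hexdump` can display our binary in a more readable way. With the `-C` option, each line shows an offset, the bytes in hexadecimal and their ASCII rendering :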

In [None]:
%%script bash --no-raise-error
hexdump -C helloworld
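
To show what such a viewer does, here is a simplified Python sketch of a `hexdump -C`-like display. It is not a reimplementation of the real tool: the function name `hexdump` is ours, and the layout leaves out details of the real output (the extra gap in the middle of each line, the final offset line, the squeezing of repeated lines).

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as lines of: offset, hex bytes, printable ASCII."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as a dot, as hex viewers usually do.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(hex_width)}  |{text_part}|")
    return "\n".join(lines)


print(hexdump(b"Hello world!\n"))
```

Calling `hexdump(open("helloworld", "rb").read())` gives a view comparable to the command above.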

###### xxd
So does `xxd`, here with the `-b` option, which displays each byte as eight bits :

In [None]:
%%script bash --no-raise-error
xxd -b helloworld

And here with its default hexadecimal display :

In [None]:
%%script bash --no-raise-error
xxd helloworld

##### For editing :
###### With Emacs :
The Emacs manual has a section on [editing binary files](https://www.gnu.org/software/emacs/manual/html_node/emacs/Editing-Binary-Files.html).

Open the file in Emacs and activate Hexl mode with `M-x hexl-mode`.

###### With vim :
See this [Vi and Vim Stack Exchange post about editing binaries with Vim](https://vi.stackexchange.com/q/2232).

###### With VS Code :

###### With Notepad :
![image.png](attachment:f52416dd-6776-4666-8b67-40b179a15cb3.png)


## Executable files :
Executable files contain machine code, i.e. instructions meant to be executed directly by the CPU.

## Object files :

## Various executable file formats exist :

- [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) : 
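
As an illustration of the ELF format listed above, the Python sketch below decodes a few fields from the first 20 bytes of an ELF header. The function name `parse_elf_header_start` and the returned dictionary are our own choices; the offsets (class at byte 4, byte order at byte 5, type at byte 16, machine at byte 18) follow the ELF header layout. The sample header is hand-built for the example, not read from a real file.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def parse_elf_header_start(header: bytes) -> dict:
    """Decode word size, byte order, type and machine from an ELF header."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields starting at offset 16.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "endianness": {1: "little", 2: "big"}.get(ei_data),
        "type": e_type,        # e.g. 2 = executable, 3 = shared object
        "machine": e_machine,  # e.g. 0x3E = x86-64
    }


# Hand-built sample: 64-bit, little-endian, type 3, machine 0x3E.
sample = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + b"\x03\x00\x3e\x00"
print(parse_elf_header_start(sample))
```

Assuming `helloworld` was compiled on a system that produces ELF files (Linux, for instance), `parse_elf_header_start(open("helloworld", "rb").read(20))` should work on it as well.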