In [134]:
%run -i ../python/common.py
UC_SKIPTERMS=True
%run -i ../python/ln_preamble.py

# UC-SLS Lecture 20 : Using LibC to access the OS and escape the confines of our process
- Preliminaries
  - libraries
  - Standard library : `libc.a[.so]`
- Address Space management:
  - dynamic memory for data items: `malloc` and `free`
  - more powerful control using `mmap` and `munmap`
- I/O
  - low-level file descriptor based: `open`, `read`, `write`, `close`
  - Formatted and Buffered IO: `fopen`, `fread`, `fwrite`, `fclose`, `fprintf`/`printf`,
  `fscanf`/`scanf`, `fgetc`/`getc`/`getchar`, `fgets`/`gets`, `fputs`/`puts`, `fputc`/`putc`/`putchar`, `fgetpos`, `fsetpos`, `fseek`, `fflush`

In [136]:
# setup for libc examples
appdir=os.getenv('HOME')
appdir=appdir + "/libc"
#print(movdir)
output=runTermCmd("[[ -d " + appdir + " ]] &&  rm -rf "+ appdir + 
             ";mkdir " + appdir + 
             ";cp ../src/Makefile ../src/cexp.c ../src/badmsgcode.c ../src/dynmem.c ../src/dynmemsyscalls.c " + appdir)

display(Markdown('''
- create a directory `mkdir libc; cd libc`
- copy examples
- add a `Makefile` to automate assembling and linking
    - we are going to run the commands by hand this time to highlight the details
- normally you would want to track everything in git
'''))
TermShellCmd("ls " + appdir)


- create a directory `mkdir libc; cd libc`
- copy examples
- add a `Makefile` to automate assembling and linking
    - we are going to run the commands by hand this time to highlight the details
- normally you would want to track everything in git


$ ls /home/jovyan/libc
badmsgcode.c  cexp.c  dynmem.c  dynmemsyscalls.c  Makefile
$ 


## Overview

<center>
<img src="../images/LibC-001.png" >
</center>

<center>
<img src="../images/LibC-002.png" >
</center>

<center>
<img src="../images/LibC-003.png" >
</center>

## Preliminaries 
### SDKs 

Developers package and distribute "native" code for a specific computer and OS as a collection of documentation, header files, and libraries.  We often call this collection a Software Development Kit (SDK).  The functions and types defined in an SDK are often referred to as an Application Programming Interface (API).  

#### Documentation

Developers provide documentation that explains the API in terms of the functions and types that their code provides for your use.  Traditionally on UNIX systems this comprises a set of man pages.  

**man complex**

In [68]:
TermShellCmd("man complex", noposttext=True, markdown=False)

$ man complex
COMPLEX(7)                 Linux Programmer's Manual                COMPLEX(7)

NNAAMMEE
       complex - basics of complex mathematics

SSYYNNOOPPSSIISS
       ##iinncclluuddee <<ccoommpplleexx..hh>>

DDEESSCCRRIIPPTTIIOONN
       Complex  numbers  are  numbers of the form z = a+b*i, where a and b are
       real numbers and i = sqrt(-1), so that i*i = -1.

       There are other ways to represent that number.  The pair (a,b) of  real
       numbers  may be viewed as a point in the plane, given by X- and Y-coor‐
       dinates.  This same point may also be described by giving the  pair  of
       real  numbers (r,phi), where r is the distance to the origin O, and phi
       the angle between the X-axis and the line Oz.  Now z =  r*exp(i*phi)  =
       r*(cos(phi)+i*sin(phi)).

       The basic operations are defined on z = a+b*i and w = c+d*i as:

       aaddddiittiioonn:: zz++ww == ((aa++cc)

**man cexp**

In [69]:
TermShellCmd("man cexp", noposttext=True, markdown=False)

$ man cexp
CEXP(3)                    Linux Programmer's Manual                   CEXP(3)

NNAAMMEE
       cexp, cexpf, cexpl - complex exponential function

SSYYNNOOPPSSIISS
       ##iinncclluuddee <<ccoommpplleexx..hh>>

       ddoouubbllee ccoommpplleexx cceexxpp((ddoouubbllee ccoommpplleexx _z));;
       ffllooaatt ccoommpplleexx cceexxppff((ffllooaatt ccoommpplleexx _z));;
       lloonngg ddoouubbllee ccoommpplleexx cceexxppll((lloonngg ddoouubbllee ccoommpplleexx _z));;

       Link with _-_l_m.

DDEESSCCRRIIPPTTIIOONN
       These  functions  calculate  e  (2.71828...,  the base of natural loga‐
       rithms) raised to the power of _z.

       One has:

           cexp(I * z) = ccos(z) + I * csin(z)

VVEERRSSIIOONNSS
       These functions first appeared in glibc in version 2.1.

AATTTTRR

**Documentation tells us**
    
0. How to use the functions, macros and types of an SDK
1. Which header files we must include in our source to call particular functions
2. Which libraries we must include when we link

#### Header files

As we discussed, to generate assembly for a call to a function the compiler must have
1. A declaration for the function 
2. Definitions for all types it requires
3. and possibly preprocessor macros 

The header files of a library provide these things so that your code can compile with calls to the library's functions. Remember that we use the preprocessor directive `#include <file>` to substitute the contents of `<file>` into our own source.

#### Libraries

Libraries are a new kind of file for us. They are "archives" of object files.

There are two main types:
1. Static archive eg. Linux: `libm.a` 
2. Dynamic (shared) library eg. Linux: `libm.so` 

Static linking requires the static library and dynamic linking requires the dynamic one.

##### Linker and Libraries
 
We can add a library when we link our executable by passing the right parameters to the linker.  

eg.  `-l<name>` will tell the linker to use objects from `lib<name>.a` or `lib<name>.so` if it needs to

Specifically, the library file contains a table of contents listing all the object files and the symbols defined in those objects.  If your object files reference a symbol that you do not define, the linker will look for it in the table of contents.  If there is an object file in the library that defines the symbol, the linker will include that object file as if you had specified it as one of your object files.

#### Example

In [128]:
display(Markdown('<font size="6rem">' + FileCodeBox(
    file=appdir + "/cexp.c", 
    lang="c", 
    title="<b>C: cexp.c",
    h="100%", 
    w="100%"
)))
TermShellCmd("[[ -a cexp ]] && rm cexp;make cexp", cwd=appdir, prompt='', noposttext=True)
TermShellCmd("./cexp", cwd=appdir, noposttext=True)

<font size="6rem"><b>C: cexp.c
<div style="width:100%; height:100%; font-size:inherit; overflow: auto;" >


``` c
#include <math.h> /* for atan */
#include <stdio.h>
#include <complex.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
  double y=1.0;
  if (argc>1) y=atof(argv[1]);
  double pi = 4 * atan(y);
  double complex z = cexp(I * pi);
  printf("%f + %f * i\n", creal(z), cimag(z));
}

```


</div>



gcc  -g  -Os -static -Xlinker -Map=cexp.map cexp.c -o cexp -lm

$ ./cexp
-1.000000 + 0.000000 * i



**How do the preprocessor and linker know where to find things?**

The compiler driver, in our case `gcc`, passes parameters to both the preprocessor and linker
- `-I <dir>` tells preprocessor to look for header files in `<dir>` 
   - several standard directories are specified by default
     - eg. `-I/usr/include`

In [71]:
TermShellCmd("ls -la /usr/include/math.h /usr/include/stdio.h /usr/include/complex.h", noposttext=True)

$ ls -la /usr/include/math.h /usr/include/stdio.h /usr/include/complex.h
-rw-r--r-- 1 root root  7164 Dec 16  2020 /usr/include/complex.h
-rw-r--r-- 1 root root 46404 Dec 16  2020 /usr/include/math.h
-rw-r--r-- 1 root root 29950 Dec 16  2020 /usr/include/stdio.h



- We can look at these files if we want to see the details 

- `-L <dir>` similarly tells the linker to look for libraries in `<dir>`
  - several standard directories are specified by default
       - eg. `-L/usr/lib/x86_64-linux-gnu`

**The linker map file lets us see all the .o's that got linked in and where they came from**


In [72]:
TermShellCmd("head -20 cexp.map ", cwd=appdir, noposttext=True)

$ head -20 cexp.map 
Archive member included to satisfy reference by file (symbol)

/usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan.o)
                              /tmp/ccQxfA00.o (atan)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(s_cexp.o)
                              /tmp/ccQxfA00.o (cexp)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(mpa.o)
                              /usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan.o) (__mp_dbl)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(mpatan.o)
                              /usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan.o) (__mpatan)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(mpsqrt.o)
                              /usr/lib/x86_64-linux-gnu/libm-2.31.a(mpatan.o) (__mpsqrt)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan-fma.o)
                              /usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan.o) (__atan_fma)
/usr/lib/x86_64-linux-gnu/libm-2.31.a(mpa-fma.o)
                              /usr/lib/x86_64-linux-gnu/libm-2.31.a(s_atan-fma.o) (__dbl_mp_fma)
/

**`ar` is a tool for working with static libraries**
see `man ar` for details

We can use it to list the table of contents of the `.o` in the archive

In [73]:
TermShellCmd("ar t /usr/lib/x86_64-linux-gnu/libm-2.31.a | head", cwd=appdir, noposttext=True)

$ ar t /usr/lib/x86_64-linux-gnu/libm-2.31.a | head
s_lib_version.o
s_matherr.o
s_signgam.o
fclrexcpt.o
fgetexcptflg.o
fraiseexcpt.o
fsetexcptflg.o
ftestexcept.o
fegetround.o
fesetround.o



We can even use it to extract a member (just like the linker does when linking)

In [74]:
TermShellCmd("ar x /usr/lib/x86_64-linux-gnu/libm-2.31.a s_cexp.o; ls -l s_cexp.o", cwd=appdir, noposttext=True)

$ ar x /usr/lib/x86_64-linux-gnu/libm-2.31.a s_cexp.o; ls -l s_cexp.o
-rw-r--r-- 1 jovyan root 3792 Dec  6 16:06 s_cexp.o



**And no surprise it is an object file like the kind we have been creating**

In [75]:
TermShellCmd("objdump -d -Mintel s_cexp.o | head -20", cwd=appdir, noposttext=True)

$ objdump -d -Mintel s_cexp.o | head -20

s_cexp.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <__cexp>:
   0:	f3 0f 1e fa          	endbr64 
   4:	53                   	push   rbx
   5:	66 0f 28 d0          	movapd xmm2,xmm0
   9:	66 0f 28 e9          	movapd xmm5,xmm1
   d:	66 0f 28 c1          	movapd xmm0,xmm1
  11:	66 0f 28 f2          	movapd xmm6,xmm2
  15:	48 83 ec 40          	sub    rsp,0x40
  19:	f3 0f 7e 1d 00 00 00 	movq   xmm3,QWORD PTR [rip+0x0]        # 21 <__cexp+0x21>
  20:	00 
  21:	64 48 8b 04 25 28 00 	mov    rax,QWORD PTR fs:0x28
  28:	00 00 
  2a:	48 89 44 24 38       	mov    QWORD PTR [rsp+0x38],rax
  2f:	31 c0                	xor    eax,eax



**What about `_start`, `atof` and `fprintf`?**

- Where did they come from?
   - Let's look for them in the map file
      - the map file tells us not only what file a symbol came from but also where it ends up being placed in the memory image of our executable

In [76]:
TermShellCmd("grep -B 1 ' _start$' cexp.map", cwd=appdir, noposttext=True)

$ grep -B 1 ' _start$' cexp.map
 .text          0x0000000000401c30       0x35 /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crt1.o
                0x0000000000401c30                _start



In [77]:
TermShellCmd("grep -B 1 ' atof$' cexp.map", cwd=appdir, noposttext=True)

$ grep -B 1 ' atof$' cexp.map
 .text          0x000000000041f460        0xb /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libc.a(atof.o)
                0x000000000041f460                atof



In [78]:
TermShellCmd("grep -B 1 ' fprintf$' cexp.map", cwd=appdir, noposttext=True)

$ grep -B 1 ' fprintf$' cexp.map
 .text          0x000000000047ae00       0xb7 /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libc.a(fprintf.o)
                0x000000000047ae00                fprintf



#### Defaults

- C compiler driver ensures that we always link against a set of standard object files and libraries
- But the core one is `libc` -- The C standard library!


## C Standard Library `libc` (`-lc`)

The C standard library was developed at the same time as the core language
- There are standards a C compiler and C standard library implementation can conform to
  - eg. https://www.iso.org/standard/17782.html
- The GNU C compiler `gcc` has its associated GNU libc `glibc`
  - https://www.gnu.org/software/libc/manual/html_node/index.html
  - These are the standards it conforms to
    - https://www.gnu.org/software/libc/manual/html_node/Standards-and-Portability.html#Standards-and-Portability

### Overview

- The C Standard library is very large and provides many categories of routines

- We will only consider a small fraction of its functionality
  1. Dynamic Memory Management 
  2. Basic overview of IO

### Dynamic Memory Management
- https://www.gnu.org/software/libc/manual/html_mono/libc.html#Memory-Concepts
- https://www.gnu.org/software/libc/manual/html_mono/libc.html#Dynamic-Memory-Allocation
- The C language has no built-in support for dynamically allocated variables
  - other than automatics (function local variables and function parameters)
- Must use system calls to add and remove memory from the process
- Must use pointers to track it

#### Two categories

LibC provides routines that give a programmer the ability to dynamically allocate memory 
- The Memory Allocator (https://www.gnu.org/software/libc/manual/html_mono/libc.html#The-GNU-Allocator)
  - It calls the OS system calls for the programmer 
    - There is a lot of subtlety to implementing a high performance memory allocator
  - Basic idea is 
    1. The allocator routines in libc, `malloc` and `free`, are called by the application code
    2. These routines allocate large chunks of memory from the OS
    3. They then break these large chunks down, handing out pieces as requested by `malloc` calls
    4. And coalesce pieces back into the chunks when `free` is called
    5. If large requests are made, the libc routines call the OS to create separate mappings for them
    6. Similarly, when these large requests are freed, the mappings are immediately returned to the OS
- Directly calling `mmap` or `brk`

#### Main calls
- https://www.gnu.org/software/libc/manual/html_node/Summary-of-Malloc.html#Summary-of-Malloc

```
void *malloc (size_t size)
// Allocate a block of size bytes. See Basic Allocation.

void free (void *addr)
// Free a block previously allocated by malloc. See Freeing after Malloc.

void *realloc (void *addr, size_t size)
// Make a block previously allocated by malloc larger or smaller, possibly by copying it to a new location. See Changing Block Size.

void *reallocarray (void *ptr, size_t nmemb, size_t size)
// Change the size of a block previously allocated by malloc to nmemb * size bytes as with realloc. See Changing Block Size.

void *calloc (size_t count, size_t eltsize)
//Allocate a block of count * eltsize bytes using malloc, and set its contents to zero. See Allocating Cleared Space.

void *valloc (size_t size)
// Allocate a block of size bytes, starting on a page boundary. See Aligned Memory Blocks.

void *aligned_alloc (size_t alignment, size_t size)
// Allocate a block of size bytes, starting on an address that is a multiple of alignment. See Aligned Memory Blocks.

int posix_memalign (void **memptr, size_t alignment, size_t size)
// Allocate a block of size bytes, starting on an address that is a multiple of alignment. See Aligned Memory Blocks.

void *memalign (size_t boundary, size_t size)
// Allocate a block of size bytes, starting on an address that is a multiple of boundary. See Aligned Memory Blocks.

int mallopt (int param, int value)
// Adjust a tunable parameter. See Malloc Tunable Parameters.

int mcheck (void (*abortfn) (enum mcheck_status status))
// Tell malloc to perform occasional consistency checks on dynamically allocated memory, and to call abortfn when an inconsistency is found. See Heap Consistency Checking.

struct mallinfo2 mallinfo2 (void)
// Return information about the current dynamic memory usage. See Statistics of Malloc.
```

### Explore with an example

#### man malloc

In [79]:
TermShellCmd("man 3 malloc", noposttext=True, markdown=False)

$ man 3 malloc
MALLOC(3)                  Linux Programmer's Manual                 MALLOC(3)

NNAAMMEE
       malloc, free, calloc, realloc - allocate and free dynamic memory

SSYYNNOOPPSSIISS
       ##iinncclluuddee <<ssttddlliibb..hh>>

       vvooiidd **mmaalllloocc((ssiizzee__tt _s_i_z_e));;
       vvooiidd ffrreeee((vvooiidd _*_p_t_r));;
       vvooiidd **ccaalllloocc((ssiizzee__tt _n_m_e_m_b,, ssiizzee__tt _s_i_z_e));;
       vvooiidd **rreeaalllloocc((vvooiidd _*_p_t_r,, ssiizzee__tt _s_i_z_e));;
       vvooiidd **rreeaallllooccaarrrraayy((vvooiidd _*_p_t_r,, ssiizzee__tt _n_m_e_m_b,, ssiizzee__tt _s_i_z_e));;

   Feature Test Macro Requirements for glibc (see ffeeaattuurree__tteesstt__mmaaccrrooss(7)):

       rreeaallllooccaarrrraayy():
 

**What does this code do?**
- What do you think will happen?

In [133]:
display(Markdown("<font size='6.5em'>" + FileCodeBox(
    file=appdir + "/dynmem.c", 
    lang="c", 
    title="<b>C: dynmem.c",
    h="100%", 
    w="100%"
)))
TermShellCmd("[[ -a dynmem ]] && rm dynmem;make dynmem", cwd=appdir, prompt='', noposttext=True)

<font size='6.5em'><b>C: dynmem.c
<div style="width:100%; height:100%; font-size:inherit; overflow: auto;" >


``` c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
// Use debugger to explore what happens
int main(int argc, char **argv)
{
  char *cptr;
  int n = 4096;

  cptr = malloc(n);
  memset(cptr, 0xaa, n);
  free(cptr);
  return 0;
}

```


</div>



gcc -Werror  -g dynmem.c -o dynmem



In [81]:
display(showDT())

<b>Debug</b>

- demonstrate `strace ./dynmem`
  - what do you expect to see?
- use the debugger 
  - use the `jump` command and `set var n=X` to execute several malloc calls
  - explore `/proc/<pid>/maps`
  - set breakpoints on `malloc`, `sbrk`, `brk`
    - `disass`: in `brk` it is easy to see the `syscall` instruction
    - break on the `syscall` instruction and look at the maps
    - use `where` to show call chains
- then add the code below and use `strace`

```c
#include <stdlib.h>
#include <unistd.h>

// Use debugger to explore what happens                                            

int
main(int argc, char **argv)
{
  char *cptr;
  char c;

  read(0, &c, 1);

  int n = 4096;
  cptr = malloc(n);
  read(0, &c, 1);

  n = 1024 * 1024  * 1024;
  cptr = malloc(n);
  read(0, &c, 1);

  n = 1024 * 1024  * 1024;
  cptr = malloc(n);
  read(0, &c, 1);

  free(cptr);
  return 0;
}
```

#### LibC typically provides C wrappers for OS system calls and support for making syscalls directly in C


**libc provides C wrappers for system calls**

- Wrappers expose C function interface for system calls of the OS
- You can look up the C version and simply call it like any other C function
  - implementation in libc takes care of all the assembly stuff for you
    - putting parameters in the right registers
    - filling in the system call number

**man 2 syscalls**

In [82]:
TermShellCmd("man 2 syscalls|head -80", noposttext=True, markdown=False)

$ man 2 syscalls|head -80
SYSCALLS(2)                Linux Programmer's Manual               SYSCALLS(2)

NAME
       syscalls - Linux system calls

SYNOPSIS
       Linux system calls.

DESCRIPTION
       The system call is the fundamental interface between an application and
       the Linux kernel.

   System calls and library wrapper functions
       System calls are generally not invoked directly, but rather via wrapper
       functions in glibc (or perhaps some other library).  For details of di‐
       rect invocation of a system call, see intro(2).  Often, but not always,
       the  name of the wrapper function is the same as the name of the system
       call that it invokes.  For example, glibc contains a  function  chdir()
       which invokes the underlying "chdir" system call.

       Often the glibc wrapper function is quite thin, doing little work other
       than copying arguments to the right registers before invoking the  sys‐
       tem  call, 

**man 2 brk**

In [83]:
TermShellCmd("man 2 brk", noposttext=True, markdown=False)

$ man 2 brk
BRK(2)                     Linux Programmer's Manual                    BRK(2)

NNAAMMEE
       brk, sbrk - change data segment size

SSYYNNOOPPSSIISS
       ##iinncclluuddee <<uunniissttdd..hh>>

       iinntt bbrrkk((vvooiidd **_a_d_d_r));;

       vvooiidd **ssbbrrkk((iinnttppttrr__tt _i_n_c_r_e_m_e_n_t));;

   Feature Test Macro Requirements for glibc (see ffeeaattuurree__tteesstt__mmaaccrrooss(7)):

       bbrrkk(), ssbbrrkk():
           Since glibc 2.19:
               _DEFAULT_SOURCE ||
                   (_XOPEN_SOURCE >= 500) &&
                   ! (_POSIX_C_SOURCE >= 200112L)
           From glibc 2.12 to 2.19:
               _BSD_SOURCE || _SVID_SOURCE ||
                   (_XOPEN_SOURCE >= 500) &&
                   ! (_POSIX_C_SOURCE >= 200112L)
           Before glibc 2.12:
               _BSD_SOURCE || _SVID_SOURCE || _XOPEN_SOURCE

**glibc also has support for using the syscall instruction in C**

**man 2 syscall**

In [84]:
TermShellCmd("man 2 syscall", noposttext=True, markdown=False)

$ man 2 syscall
SYSCALL(2)                 Linux Programmer's Manual                SYSCALL(2)

NNAAMMEE
       syscall - indirect system call

SSYYNNOOPPSSIISS
       ##iinncclluuddee <<uunniissttdd..hh>>
       ##iinncclluuddee <<ssyyss//ssyyssccaallll..hh>>   /* For SYS_xxx definitions */

       lloonngg ssyyssccaallll((lloonngg _n_u_m_b_e_r,, ......));;

   Feature Test Macro Requirements for glibc (see ffeeaattuurree__tteesstt__mmaaccrrooss(7)):
       ssyyssccaallll():
           Since glibc 2.19:
               _DEFAULT_SOURCE
           Before glibc 2.19:
               _BSD_SOURCE || _SVID_SOURCE

DDEESSCCRRIIPPTTIIOONN
       ssyyssccaallll()  is  a  small  library  function that invokes the system call
       whose assembly language interface has the  specified  _n_u_m_b_e_r  with  the
       specified  arguments.  Employing

**In the following example we will call brk directly rather than letting malloc call it**

Do not do this in real code: moving the break behind malloc's back can corrupt its bookkeeping. It is shown here only as an example.


In [138]:
display(Markdown('<font size="6rem">' + FileCodeBox(
    file=appdir + "/dynmemsyscalls.c", 
    lang="c", 
    title="<b>C: dynmemsyscalls.c",
    h="100%", 
    w="107em"
)))
TermShellCmd("[[ -a dynmemsyscalls ]] && rm dynmemsyscalls;make dynmemsyscalls", cwd=appdir, prompt='', noposttext=True)

<font size="6rem"><b>C: dynmemsyscalls.c
<div style="width:107em; height:100%; font-size:inherit; overflow: auto;" >


``` c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>   /* For SYS_xxx definitions */

int
main(int argc, char **argv)
{
  char *cptr;
  int n = 4096;

  cptr = malloc(n);
  memset(cptr, 0xaa, n);
  free(cptr);

  cptr = sbrk(4096);
  memset(cptr, 0xaa, 4096);

  cptr = (void *) syscall(12, 0); // hardcode syscall number
  syscall(SYS_brk, cptr + 4096);  // use constant from header
  memset(cptr, 0xaa, 4096);

  return 0;
}

```


</div>



gcc -Werror  -g dynmemsyscalls.c -o dynmemsyscalls



#### You can use `mmap` and `munmap` in your C programs to manage your own  dynamic address space mapping


**man mmap**

It is very powerful but also has a lot of parameters 

In [86]:
TermShellCmd("man 2 mmap|head -40", noposttext=True, markdown=False)

$ man 2 mmap|head -40
MMAP(2)                    Linux Programmer's Manual                   MMAP(2)

NAME
       mmap, munmap - map or unmap files or devices into memory

SYNOPSIS
       #include <sys/mman.h>

       void *mmap(void *addr, size_t length, int prot, int flags,
                  int fd, off_t offset);
       int munmap(void *addr, size_t length);

       See NOTES for information on feature test macro requirements.

DESCRIPTION
       mmap()  creates a new mapping in the virtual address space of the call‐
       ing process.  The starting address for the new mapping is specified  in
       addr.   The  length argument specifies the length of the mapping (which
       must be greater than 0).

       If addr is NULL, then the kernel chooses the (page-aligned) address  at
       which to create the mapping; this is the most portable method of creat‐
       ing a new mapping.  If addr is not NULL, then the kernel takes it as  a
       hint about where

### Under the covers `malloc`, `free` and friends
- are calling `brk`, `mmap` and `munmap` 
- you and your code do not need to deal with the details
  - rather you get a simpler interface that is consistent across CPU types and OSes
    - get me `n` bytes of memory and return a pointer to it 
    - give back memory at this address

#### Important points about dynamic memory

What is wrong with this code?

In [118]:
display(Markdown('<font size="6rem">' + FileCodeBox(
    file=appdir + "/badmsgcode.c", 
    lang="c", 
    title="<b>C: badmsgcode.c",
    h="100%", 
    w="100%"
)))
TermShellCmd("[[ -a badmsgcode.o ]] && rm badmsgcode.o;make badmsgcode.o", cwd=appdir, prompt='', noposttext=True)

<font size="6rem"><b>C: badmsgcode.c
<div style="width:100%; height:100%; font-size:inherit; overflow: auto;" >


``` c
#include <stdlib.h>
#include <assert.h>

// waits for request message to arrive: returns length in bytes and updates integer pointed to
// by idPtr with id of request
int getRequest(int *idPtr);
// read the data of request with id into memory pointed to by buffer
void readRequestData(int id, char *buffer);
// process request with id and data in memory pointed to by buffer, frees buffer when done
void processRequest(int id, char *buffer);

int
main(int argc, char **argv)
{
  int n;
  int *id_ptr;
  char *msg_buffer;

  // my server loop
  while (1) {
    id_ptr = malloc(sizeof(int));
    assert(id_ptr != 0);
    n = getRequest(id_ptr);
    msg_buffer = malloc(n);
    assert(msg_buffer != 0);
    readRequestData(*id_ptr, msg_buffer);
    processRequest(*id_ptr, msg_buffer);
  }
  // should never get here
  exit(0);
}

```


</div>



gcc  -g -Werror -c  badmsgcode.c -o badmsgcode.o



- this is why higher level languages have a lot of code to hide the details of dynamic memory
  - reference counting
  - garbage collectors

## Layers

What we see is a general pattern of layering

- Assembly is at the bottom
- OS kernel provides an assembly interface to its routines 
  - x86 Linux system calls are invoked via the `syscall` instruction with the right register values
- lowest layer of libc routines provides wrappers
- The other libc routines build on these to add more functionality, eg. `malloc` and `free`
  - thus programmers and C applications are isolated from the details of OS syscalls
    - they simply need to learn libc, which is ported to each computer/OS pair

## IO is no different

### Lowest layer is wrappers for core UNIX I/O routines

- `open`, `close`, `read`, `write`, `dup`, `mkdir`, `exec`, `mmap`

### A layer up is buffered and formatted I/O

- `fopen`, `fclose`, `fread`, `fwrite`
- `fprintf` (`printf`), `fscanf` (`scanf`)
