Skip to content

Latest commit

 

History

History
executable file
·
2034 lines (1621 loc) · 70.2 KB

README.md

File metadata and controls

executable file
·
2034 lines (1621 loc) · 70.2 KB

Lab 4: Virtual Memory (and User Program)

Hints for this Lab

  1. Character sequencial order
    • VMware virtual machine and normal PC is using Little-Endian
    • In Nachos simulator processor is using Big-Endian
    • Be careful when using WordToMachine() and ShortToMachine() for proper transition
  2. How program is put in address space
    • Nachos described this in bin/noff.h
      • Struct segment represents a segmet of a program
      • Struct noffHeader define the code section, initialized data section and uninitialized data section
  3. Related source code
    • code/machine/machine.h
    • code/machine/machine.cc
    • code/machine/translate.h
    • code/machine/translate.cc
    • code/userprog/addrspace.h
    • code/userprog/addrspace.cc
    • code/userprog/exception.cc

There is a micro called HOST_IS_BIG_ENDIAN. The CheckEndian() in code/machine/machine.cc will check the format by using a array.

The exception supported by Nachos is defined in code/machine/machine.cc and code/machine/machine.h Note that page fault and no TLB entry shared PageFaultException depends on using linear page table or TLB.

// Textual names of the exceptions that can be generated by user program
// execution, for debugging.
static char* exceptionNames[] = { "no exception", "syscall", 
				"page fault/no TLB entry", "page read only",
				"bus error", "address error", "overflow",
				"illegal instruction" };
enum ExceptionType { NoException,           // Everything ok!
		     SyscallException,      // A program executed a system call.
		     PageFaultException,    // No valid translation found
		     ReadOnlyException,     // Write attempted to page marked 
					    // "read-only"
		     BusErrorException,     // Translation resulted in an 
					    // invalid physical address
		     AddressErrorException, // Unaligned reference or one that
					    // was beyond the end of the
					    // address space
		     OverflowException,     // Integer overflow in add or sub.
		     IllegalInstrException, // Unimplemented or reserved instr.
		     
		     NumExceptionTypes
};

The register is defined in code/machine/machine.h

Nachos implement all the thirty-two MIPS R2/3000 Registers. Additionally, with 8 more special purpose register for better simulation and debug usage.

// User program CPU state.  The full set of MIPS registers, plus a few
// more because we need to be able to start/stop a user program between
// any two instructions (thus we need to keep track of things like load
// delay slots, etc.)

// MIPS GPRs
#define StackReg	29	// User's stack pointer
#define RetAddrReg	31	// Holds return address for procedure calls

#define NumGPRegs	32	// 32 general purpose registers on MIPS

// Nachos SPRs
#define HiReg		32	// Double register to hold multiply result
#define LoReg		33
#define PCReg		34	// Current program counter
#define NextPCReg	35	// Next program counter (for branch delay) 
#define PrevPCReg	36	// Previous program counter (for debugging)
#define LoadReg		37	// The register target of a delayed load.
#define LoadValueReg 	38	// The value to be loaded by a delayed load.
#define BadVAddrReg	39	// The failing virtual address on an exception

#define NumTotalRegs 	40

The MIPS registers

Register Number Conventional Name Usage
$0 $zero Hard-wired to 0
$1 $at Reserved for pseudo-instructions
2−3 v0, v1 Return values from functions
4−7 a0−a3 Arguments to functions - not preserved by subprograms
8−15 t0−t7 Temporary data, not preserved by subprograms
16−23 s0−s7 Saved registers, preserved by subprograms
24−25 t8−t9 More temporary registers, not preserved by subprograms
26−27 k0−k1 Reserved for kernel. Do not use.
$28 $gp Global Area Pointer (base of global data segment)
$29 $sp Stack Pointer
$30 $fp Frame Pointer
$31 $ra Return Address

When define USER_PROGRAM (Compile code/userprog) the Machine will be create in code/threads/system.h

#ifdef USER_PROGRAM
#include "machine.h"
extern Machine* machine;	// user program memory and registers
#endif

And in code/threads/system.cc

void
Initialize(int argc, char **argv)
{
    // ...

#ifdef USER_PROGRAM
    bool debugUserProg = FALSE;	// single step user program
#endif

    // ...

#ifdef USER_PROGRAM
    if (!strcmp(*argv, "-s"))
        debugUserProg = TRUE;
#endif

    // ...

#ifdef USER_PROGRAM
    machine = new Machine(debugUserProg);	// this must come first
#endif
}

Debug using m for machine emulation (including exception) and a for address spaces. Use -s to debug user program in single step. (this will show machine registers including PC and assembly instructions)

I. TLB exception handling

Currently, Nachos memory management is using software simulate the TLB mechanism. Its working principle, exception handling, replacement algorithm is similar with paging memory management.

Exercise 1: Trace code

Read the code/userprog/progtest.cc, understand the procedure of Nachos executing user program and memory menagement related point

Read the following code, understand current Nachos TLB technique and Address Binding (地址轉換) mechanism

  • code/machine/machine.h
  • code/machine/machine.cc
  • code/userprog/exception.cc

Executing User Program

The code/userprog/progtest.cc is used to put all the test to simulate a user space program.

The entry point is in code/threads/main.cc. When using -x to execute Nachos, it will call StartProcess(), and it is defined in code/userprog/progtest.cc

In StartProcess() it will do three main step.

  1. Open the executable file

    • using FileSystem::Open() which is defined in code/filesys/filesys.cc
  2. Create a address space for the executable. Assign to current thread and then initial the register.

    space = new AddrSpace(executable);
    currentThread->space = space;
    
    space->InitRegisters();		// set the initial register values
    space->RestoreState();		// load page table register
  3. Finally, call Machine::Run() to run the user program. And it's defined in code/machine/mipssim.cc

TLB Technique and Address Binding

In code/machine/machine.cc there is a macro called USE_TLB that controls whether to use TLB. (which should be enabled while compiling. I'll mention in Exercise 2)

The main memory is also defined in code/machine/machine.cc as mainMemory as an array of char with the size of MemorySize.

The initial TLB class TranslationEntry is declared in code/machine/translate.h. The TLB object has been used in code/userprog/addrspace.h as *pageTable and in code/machine/machine.h as *tlb (read-only pointer) and *pageTable. => Shared with Pate Table and TLB!

As mention, machine has a pointer to the "page table of the current user process" and a pointer to "system TLB".

The memory translation is defined in code/machine/translate.cc as Machine::Translate() (but is declared in code/machine/machine.h). The procedure is as following.

  1. First get vpn and offset

    vpn = (unsigned) virtAddr / PageSize;
    offset = (unsigned) virtAddr % PageSize;
  2. There are two cases: use linear page table or use TLB (not both at the same time "in original implementation assumption")

    • If use page table

      if (vpn >= pageTableSize) {
          DEBUG('a', "virtual page # %d too large for page table size %d!\n", virtAddr, pageTableSize);
          return AddressErrorException;
      } else if (!pageTable[vpn].valid) {
          DEBUG('a', "virtual page # %d too large for page table size %d!\n", virtAddr, pageTableSize);
          return PageFaultException;
      }
      
      entry = &pageTable[vpn];
    • If use TLB

      for (entry = NULL, i = 0; i < TLBSize; i++)
          if (tlb[i].valid && (tlb[i].virtualPage == vpn)) {
          entry = &tlb[i];			// FOUND!
          break;
          }
      
      if (entry == NULL) {				// not found
              DEBUG('a', "*** no valid TLB entry found for this virtual page!\n");
              return PageFaultException;		// really, this is a TLB fault,
                          // the page may be in memory,
                          // but not in the TLB
      }
  3. Read-only error detection

  4. Get the page frame number

    pageFrame = entry->physicalPage
  5. Check if the page frame is valid

  6. Combine to get the physical address!

    *physAddr = pageFrame * PageSize + offset

The Machine::Translate() is called when using Machine::ReadMem() and Machine::WriteMem(). When the exception is raised it will called Machine::RaiseException(). In it will switch to System Mode and call the ExceptionHandler() in code/userprog/exception.cc. And then it will call Machine::ReadRegister(2) to get the type. But in initial state, this only handle the SC_Halt type SyscallException. (The system call types are defined in code/userprog/syscall.h)

Memory Related Definition

In code/machine/machine.h.

// Definitions related to the size, and format of user memory

#define PageSize 	SectorSize 	// set the page size equal to
					// the disk sector size, for
					// simplicity

#define NumPhysPages    32
#define MemorySize 	(NumPhysPages * PageSize)
#define TLBSize		4		// if there is a TLB, make it small

In code/machine/disk.h.

#define SectorSize 		128	// number of bytes per disk sector

So we can found that the total memory size MemorySize is 32 × 128 Bytes = 4KB.

Exercise 2: TLB MISS exception handling

Modify the ExceptionHandler() in code/userprog/exception.cc. Makes Nachos able to handle TLB exception

(When TLB exception, Nachos will throw PageFaultException. Reference code/machine/machine.cc)

To use TLB must define the USE_TLB macro that I've mentioned in Exercise 1.

So add -DUSE_TLB in code/userprog/Makefile

DEFINES = -DUSE_TLB

I was decided to change the original ExceptionHandler() in code/userprog/exception.cc to use switch case instead of if else. But I want to make as little modification as possible at this point. So maybe wait until implementing system call.

How Nachos Fetch Instruction

In code/machine/mipssim.cc

Machine::Run() {
    ...

    while (1) {
        OneInstruction(instr);

        ...
    }
}
Machine::OneInstruction(Instruction *instr)
{
    ...

    // Fetch instruction
    if (!machine->ReadMem(registers[PCReg], 4, &raw))
        return;			// exception occurred
    instr->value = raw;
    instr->Decode();

    ...

    // Compute next pc, but don't install in case there's an error or branch.
    int pcAfter = registers[NextPCReg] + 4;

    ...


    // Now we have successfully executed the instruction.

    ...

    // Advance program counters.
    registers[PrevPCReg] = registers[PCReg];	// for debugging, in case we
						// are jumping into lala-land
    registers[PCReg] = registers[NextPCReg];
    registers[NextPCReg] = pcAfter;

If Nachos fail fetch instruction (most likely to be Page Fault Exception). Then it won't execute PC+4 because machine->ReadMem() will return false.

Thus we only need to update TLB in exception handler, and Nachos will execute same instruction and try to translate again.

The BadVAddr Register in MIPS

This register (its name stands for Bad Virtual Address) will contain the memory address where the exception has occurred. An unaligned memory access, for instance, will generate an exception and the address where the access was attempted will be stored in BadVAddr.

Page Fault

There are two cases will lead to page fault.

  1. TLB miss
    • When doing either Machine::ReadMem() or Machine::WriteMem() in code/machine/translate.cc and TLB fail. It will pass addr and call Machine::RaiseException() in code/machine/machine.cc. And then it will preserve this address in BadVAddrReg

      registers[BadVAddrReg] = badVAddr;
    • Thus all we need to do is to read the address from the BadVAddrReg and then calculate the vpn in ExceptionHandler() in code/userprog/exception.cc. And there are two cases

      1. There is empty entry in TLB. Insert it into TLB.
      2. TLB is full. Replace with some algorithm which will be implement in Exercise 3.
  2. Page table failed
    • For now this won't happen because all the page frame are loaded in the memory.
      • When new AddrSpace(executable) in code/userprog/progtest.cc the StartProcess() function. While initializing a address space will check ASSERT(numPages <= NumPhysPages). (Thus the total user program is limited in 4KB) => So page table won't fail at this point.
      • The problem will be mentioned again and will be solved in Exercise 7 which will implement the demend paging technique.

Disable the ASSERT(tlb == NULL || pageTable == NULL); in code/machine/translate.cc to let TLB exist with the page table.

The way to seperate the exception caused by TLB or origianl linear page table is to check if(machine->tlb == NULL). Just like the default Nachos did in the Machine::Translate in code/machine/translate.cc.

In code/userprog/exception.cc append the ExceptionHandler() as the following:

void
ExceptionHandler(ExceptionType which)
{
    // Lab4: Page Fault Handling
    if (which == PageFaultException) {
        if (machine->tlb == NULL) { // linear page table page fault
            DEBUG('m', "=> Page table page fault.\n");
            // In current Nachos this shouldn't happen
            // because physical page frame == virtual page number
            // (can be found in AddrSpace::AddrSpace in userprog/addrspace.cc)
            // On the other hand, in our Lab we won't use linear page table at all
            ASSERT(FALSE);
        } else { // TLB miss (no TLB entry)
            // Lab4 Exercise2
            DEBUG('m', "=> TLB miss (no TLB entry)\n");
            int BadVAddr = machine->ReadRegister(BadVAddrReg); // The failing virtual address on an exception
            TLBMissHandler(BadVAddr);
        }
        return;
    }

    // System Call
    ...
}

Test Exercise 2

In this exercise, I test the TLB with only 2 entry to simplify the test. There is a reason that we couldn't simplify test with TLB with only 1 entry. Because in Machine::OneInstruction() it will always need to fetch the instruction using Machine::ReadMem(). But in some operation it will need to use Machine::WriteMem(). If we fail on the second Machine::Translate(), it will return False. And re-fetch the instruction again. And thus override the TLB miss handling that we've just made for the second one. This will end up cause the infinity loop.

Here is the simplified TLBMissHandler()

int TLBreplaceIdx = 0;

void
TLBMissHandler(int virtAddr)
{
    unsigned int vpn;
    vpn = (unsigned) virtAddr / PageSize;

    // ONLY USE FOR TESTING Lab4 Exercise2
    // i.e. assume TLBSize = 2
    machine->tlb[TLBreplaceIdx] = machine->pageTable[vpn];
    TLBreplaceIdx = TLBreplaceIdx ? 0 : 1;
}

Test using the executable test program in code/test. (If the file is too big will get Assertion failed: line 81, file "../userprog/addrspace.cc")

Using docker (build with cd Lab/Lab0_BuildNachos; ./build_subdir_nachos.sh userprog)

docker run nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d am -x nachos/nachos-3.4/code/test/halt

Here is the result:

Initializing address space, num pages 10, size 1280
Initializing code segment, at 0x0, size 256
Initializing stack register to 1264
Starting thread "main" at time 10
Reading VA 0x0, size 4
        Translate 0x0, read: *** no valid TLB entry found for this virtual page!
Exception: page fault/no TLB entry
=> TLB miss (no TLB entry)
Reading VA 0x0, size 4
        Translate 0x0, read: phys addr = 0x0
        value read = 0c000034
At PC = 0x0: JAL 52
Reading VA 0x4, size 4
        Translate 0x4, read: phys addr = 0x4
        value read = 00000000
At PC = 0x4: SLL r0,r0,0
Reading VA 0xd0, size 4
        Translate 0xd0, read: *** no valid TLB entry found for this virtual page!
Exception: page fault/no TLB entry
=> TLB miss (no TLB entry)
Reading VA 0xd0, size 4
        Translate 0xd0, read: phys addr = 0xd0
        value read = 27bdffe8
At PC = 0xd0: ADDIU r29,r29,-24
Reading VA 0xd4, size 4
        Translate 0xd4, read: phys addr = 0xd4
        value read = afbf0014
At PC = 0xd4: SW r31,20(r29)
Writing VA 0x4ec, size 4, value 0x8
        Translate 0x4ec, write: *** no valid TLB entry found for this virtual page!
Exception: page fault/no TLB entry
=> TLB miss (no TLB entry)
Reading VA 0xd4, size 4
        Translate 0xd4, read: phys addr = 0xd4
        value read = afbf0014
At PC = 0xd4: SW r31,20(r29)
Writing VA 0x4ec, size 4, value 0x8
        Translate 0x4ec, write: phys addr = 0x4ec
Reading VA 0xd8, size 4
        Translate 0xd8, read: phys addr = 0xd8
        value read = afbe0010
At PC = 0xd8: SW r30,16(r29)
Writing VA 0x4e8, size 4, value 0x0
        Translate 0x4e8, write: phys addr = 0x4e8
Reading VA 0xdc, size 4
        Translate 0xdc, read: phys addr = 0xdc
        value read = 0c000030
At PC = 0xdc: JAL 48
Reading VA 0xe0, size 4
        Translate 0xe0, read: phys addr = 0xe0
        value read = 03a0f021
At PC = 0xe0: ADDU r30,r29,r0
Reading VA 0xc0, size 4
        Translate 0xc0, read: phys addr = 0xc0
        value read = 03e00008
At PC = 0xc0: JR r0,r31
Reading VA 0xc4, size 4
        Translate 0xc4, read: phys addr = 0xc4
        value read = 00000000
At PC = 0xc4: SLL r0,r0,0
Reading VA 0xe4, size 4
        Translate 0xe4, read: phys addr = 0xe4
        value read = 0c000004
At PC = 0xe4: JAL 4
Reading VA 0xe8, size 4
        Translate 0xe8, read: phys addr = 0xe8
        value read = 00000000
At PC = 0xe8: SLL r0,r0,0
Reading VA 0x10, size 4
        Translate 0x10, read: *** no valid TLB entry found for this virtual page!
Exception: page fault/no TLB entry
=> TLB miss (no TLB entry)
Reading VA 0x10, size 4
        Translate 0x10, read: phys addr = 0x10
        value read = 24020000
At PC = 0x10: ADDIU r2,r0,0
Reading VA 0x14, size 4
        Translate 0x14, read: phys addr = 0x14
        value read = 0000000c
At PC = 0x14: SYSCALL
Exception: syscall
Shutdown, initiated by user program.
Machine halting!

And it shows that the TLB mechanism worked!

If use original linear page table

Initializing address space, num pages 10, size 1280
Initializing code segment, at 0x0, size 256
Initializing stack register to 1264
Starting thread "main" at time 10
Reading VA 0x0, size 4
        Translate 0x0, read: phys addr = 0x0
        value read = 0c000034
At PC = 0x0: JAL 52
Reading VA 0x4, size 4
        Translate 0x4, read: phys addr = 0x4
        value read = 00000000
At PC = 0x4: SLL r0,r0,0
Reading VA 0xd0, size 4
        Translate 0xd0, read: phys addr = 0xd0
        value read = 27bdffe8
At PC = 0xd0: ADDIU r29,r29,-24
Reading VA 0xd4, size 4
        Translate 0xd4, read: phys addr = 0xd4
        value read = afbf0014
At PC = 0xd4: SW r31,20(r29)
Writing VA 0x4ec, size 4, value 0x8
        Translate 0x4ec, write: phys addr = 0x4ec
Reading VA 0xd8, size 4
        Translate 0xd8, read: phys addr = 0xd8
        value read = afbe0010
At PC = 0xd8: SW r30,16(r29)
Writing VA 0x4e8, size 4, value 0x0
        Translate 0x4e8, write: phys addr = 0x4e8
Reading VA 0xdc, size 4
        Translate 0xdc, read: phys addr = 0xdc
        value read = 0c000030
At PC = 0xdc: JAL 48
Reading VA 0xe0, size 4
        Translate 0xe0, read: phys addr = 0xe0
        value read = 03a0f021
At PC = 0xe0: ADDU r30,r29,r0
Reading VA 0xc0, size 4
        Translate 0xc0, read: phys addr = 0xc0
        value read = 03e00008
At PC = 0xc0: JR r0,r31
Reading VA 0xc4, size 4
        Translate 0xc4, read: phys addr = 0xc4
        value read = 00000000
At PC = 0xc4: SLL r0,r0,0
Reading VA 0xe4, size 4
        Translate 0xe4, read: phys addr = 0xe4
        value read = 0c000004
At PC = 0xe4: JAL 4
Reading VA 0xe8, size 4
        Translate 0xe8, read: phys addr = 0xe8
        value read = 00000000
At PC = 0xe8: SLL r0,r0,0
Reading VA 0x10, size 4
        Translate 0x10, read: phys addr = 0x10
        value read = 24020000
At PC = 0x10: ADDIU r2,r0,0
Reading VA 0x14, size 4
        Translate 0x14, read: phys addr = 0x14
        value read = 0000000c
At PC = 0x14: SYSCALL
Exception: syscall
Shutdown, initiated by user program.
Machine halting!

Exercise 3: Replacement algorithm

Implement at least two replacement algorithm, compare the replacement times between two algorithm.

There are preparations before test the algorithms

  1. TLB Miss Rate

    To show the TLB Miss Rate. I've declare TLBMissCount and TranslateCount in code/machine/machine.h and initialize in code/machine/translate.cc.

    • When call the Machine::Translate() the TranslateCount++.
    • When raise exception in Machine::ReadMem() or Machine::WriteMem() the TLBMissCount++.

    In the end, calculate the TLB Miss Rate when system halt (in code/machine/exception.cc). This will do only when enable TLB.

  2. Custom user program

    I have made a customized user program to test the miss rate.

    Because code/test/halt is too short to observe the TLB Miss. But the other program is too large for default Nachos (I'll use them after finish demand paging in Exercise 7).

    code/test/fibonacci.c

    /* fibonacci.c
    *	Simple program to test the TLB miss rate (Lab 4) that calculate the fibonacci series
    */
    
    #include "syscall.h"
    
    #define N 20
    
    int
    main()
    {
        int i;
        int result[N];
        result[0] = 0;
        result[1] = 1;
        for (i = 2; i < N; i++)
        {
            result[i] = result[i-1] + result[i-2];
        }
        Exit(result[N-1]);
    }

    And add the following things in code/test/Makefile

    all: ... fibonacci
    
    ...
    
    # For Lab4: Test the TLB Miss Rate
    fibonacci.o: fibonacci.c
        $(CC) $(CFLAGS) -c fibonacci.c
    fibonacci: fibonacci.o start.o
        $(LD) $(LDFLAGS) start.o fibonacci.o -o fibonacci.coff
        ../bin/coff2noff fibonacci.coff fibonacci

    This will need to compile the whole project but I've done that and copy the binary/executable file to local. So it's totally fine to just compile the userprog subdirectory using my docker build script.

  3. Switch for each algorithm

    I have used some define as the switch for choosing the replacement algorithm. And this will also manage some global variables, etc.

    • TLB_FIFO
    • TLB_CLOCK
    • TLB_LRU (TODO)

And then I'll test the program without other verbose information. Enable debug flag T to show the TLB handling message (miss rate)

docker run nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d T -x nachos/nachos-3.4/code/test/fibonacci

Alternatively, you can build the docker with

cd Lab/Lab0_BuildNachos;
./build_modified_nachos.sh

and execute it. (use -d S to see the return value of Exit system call (to check the calculation result))

docker run nachos nachos/nachos-3.4/code/userprog/nachos -d T -x nachos/nachos-3.4/code/test/fibonacci

FIFO

void
TLBAlgoFIFO(TranslationEntry page)
{
    int TLBreplaceIdx = -1;
    // Find the empty entry
    for (int i = 0; i < TLBSize; i++) {
        if (machine->tlb[i].valid == FALSE) {
            TLBreplaceIdx = i;
            break;
        }
    }
    // If full then move everything forward and remove the last one
    if (TLBreplaceIdx == -1) {
        TLBreplaceIdx = TLBSize - 1;
        for (int i = 0; i < TLBSize - 1; i++) {
            machine->tlb[i] = machine->tlb[i+1];
        }
    }
    // Update TLB
    machine->tlb[TLBreplaceIdx] = page;
}

Result:

TLBSize = 2

TLB Miss: 87, TLB Hit: 764, Total Instruction: 851, Total Translate: 938, TLB Miss Rate: 10.22%
Machine halting!

Default TLBSize (= 4)

TLB Miss: 6, TLB Hit: 818, Total Instruction: 824, Total Translate: 830, TLB Miss Rate: 0.73%

Clock

Because the Nachos TLB (the class TranslationEntry in code/machine/translate.h) already has use (and dirty) bit.

We can implement the alternative second chance replacement algorithm by the clock algorithm.

The brief description of second chance replacement algorithm:

  • It is designed to avoid the problem of throwing out a heavily used page in FIFO
  • Introduce R bit (i.e. use bit)
    • 0: page is both old and unused, so it is replaced immediately
    • 1: the bit is cleared, the page is put onto the end of the list of TLB (and its load time is updated as though it had just arrived in memory)
int TLBreplaceIdx = 0; // When using TLB_CLOCK, this is circular pointer

void
TLBAlgoClock(TranslationEntry page)
{
    // Find the next one
    // if used then clear to 0 and continue find the next one.
    // until find the one that is not used.
    while (1) {
        TLBreplaceIdx %= TLBSize;
        if (machine->tlb[TLBreplaceIdx].valid == FALSE) {
            break;
        } else  {
            if (machine->tlb[TLBreplaceIdx].use) {
                // Found the entry is recently used
                // clear the R bit and find next
                machine->tlb[TLBreplaceIdx].use = FALSE;
                TLBreplaceIdx++;
            } else {
                // Evict the entry
                break;
            }
        }
    }

    // Update TLB
    machine->tlb[TLBreplaceIdx] = page;
    machine->tlb[TLBreplaceIdx].use = TRUE;
}

Result:

TLB Miss: 6, TLB Hit: 818, Total Instruction: 824, Total Translate: 830, TLB Miss Rate: 0.73%

LRU

TODO

Conclusion

In the fibonacci user program we can found that. Because in the scenario there is no chance to "throwing out a heavily used TLB" since the memory access is quite average. Thus the performance of FIFO and Clock algorithm are the same.

II. Paging Memory Management

In the current Nachos, The member variable AddrSpace* space used in Class Thread use TranslationEntry* pageTable to manage memory. When application program is starting up it will initialize it; When context switch, it will also do reserve and resume. (Such that Class Machine::TranslationEntry* pageTable will always pointing at the current running Thread's page table)

As mention in TLB Technique and Address Binding we know that system has the pointer to current thread page table. But we may have multiple address space (one for each thread) in the future.

We have seen in Executing User Program. We will call AddrSpace::RestoreState() to assign the current thread page table to machine page table pointer.

Exercise 4: Global data structure for memory management

Impelement a global data structure for memory allocation and recycle, and record the current memory usage status

e.g. Free Linked List (空閒鏈表)

Linked List

e.g. Bitmap(位圖)

Bitmap

I have used some define as the switch for choosing the data structure for memory management. And just enable in code/userprog/Makefile

  • USE_BITMAP
  • USE_LINKED_LIST (TODO)

Bitmap

I've divide the memory up into allocation unit with NumPhysPages and thus use unsigned int to store the bitmap.

Corresponding to each allocation unit is a bit in the bitmap:

  • zero (0) only if the unit is free
  • one (1) only if the unit is occupied

Add the following data structure in class Machine in code/machine/machine.h

class Machine {
  public:
  
    ...

    unsigned int bitmap; // This can record 32 allocation units (sizeof(int)*8 = 32). Current NumPhysPages is 32 too.
    int allocateFrame(void); // Find a empty allocation unit to put physical page frames
    void freeMem(void); // Free current page table physical page frames
}

And implement Machine::allocateFrame and Machine::freeMem in code/machine/machine.cc

  • Machine::allocateFrame is used to find an empty physical page frame to allocate

    //----------------------------------------------------------------------
    // Machine::allocateFrame
    //   	Find a free physical page frame to allocate memory.
    //      If not found, return -1.
    //----------------------------------------------------------------------
    
    int
    Machine::allocateFrame(void)
    {
        int shift;
        for (shift = 0; shift < NumPhysPages; shift++) {
            if (!(bitmap >> shift & 0x1)) { // found empty bit
                bitmap |= 0x1 << shift; // set the bit to used
                DEBUG('M', "Allocate physical page frame: %d\n", shift);
                return shift;
            }
        }
        DEBUG('M', "Out of physical page frame!\n", shift);
        return -1;
    }
  • Machine::freeMem is used when we want to release the page frames occupied by current page table

TODO: Nachos actually has Bitmap class in code/userprog/bitmap.cc and code/userprog/bitmap.h. If the physical memory need to be extend (current unsigned int can't represent) then switch to this.

Free Linked List

TODO

Other Modification

  1. We need to modify the default memory allocation from linear allocation to our own logic when initializing AddrSpace in code/userprog/addrspace.cc.

    • Bitmap

      pageTable[i].physicalPage = machine->allocateFrame();
  2. Exit system call

  3. We also need to free the memory and clear the record in our data structure.

    • Bitmap

      void AddressSpaceControlHandler(int type)
      {
          if (type == SC_Exit) {
      
              ...
      
      #ifdef USER_PROGRAM
              if (currentThread->space != NULL) {
      #ifdef USE_BITMAP
                  machine->freeMem(); // ONLY USE FOR TEST Lab4 Exercise4
      #endif
                  delete currentThread->space;
                  currentThread->space = NULL;
              }
      #endif
      
              ...
      
          }
      }

Result

I've add M for memory management debug flag

docker run nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d M -x nachos/nachos-3.4/code/test/halt
  • Bitmap

    Allocate physical page frame: 0
    Allocate physical page frame: 1
    Allocate physical page frame: 2
    Allocate physical page frame: 3
    Allocate physical page frame: 4
    Allocate physical page frame: 5
    Allocate physical page frame: 6
    Allocate physical page frame: 7
    Allocate physical page frame: 8
    Allocate physical page frame: 9
    Bitmap after allocate: 000003FF
    Free physical page frame: 0
    Free physical page frame: 1
    Free physical page frame: 2
    Free physical page frame: 3
    Free physical page frame: 4
    Free physical page frame: 5
    Free physical page frame: 6
    Free physical page frame: 7
    Free physical page frame: 8
    Free physical page frame: 9
    Bitmap after freed: 00000000
    Machine halting!

Exercise 5: Support multi-threads

In the current Nachos, only single Thread can exist in memory. We need to break this restriction.

Catch up how we execute user program.

And there is additional define in code/threads/thread.h. We can assign each thread a specific address space.

class Thread {

    ...

  private:

    ...

#ifdef USER_PROGRAM // Lab4: Multi-thread user program
// A thread running a user program actually has *two* sets of CPU registers -- 
// one for its state while executing user code, one for its state 
// while executing kernel code.

    int userRegisters[NumTotalRegs];	// user-level CPU register state

  public:
    void SaveUserState();		// save user-level register state
    void RestoreUserState();		// restore user-level register state

    AddrSpace *space;			// User code this thread is running.
#endif

};

And the Thread::SaveUserState() and Thread::RestoreUserState() (manipulate CPU register)

Correspond to AddrSpace::SaveState() and AddrSpace::RestoreState() (manipulate memory (i.e. system page table and TLB))

//----------------------------------------------------------------------
// Thread::SaveUserState
//	Save the CPU state of a user program on a context switch.
//
//	Note that a user program thread has *two* sets of CPU registers -- 
//	one for its state while executing user code, one for its state 
//	while executing kernel code.  This routine saves the former.
//----------------------------------------------------------------------

void
Thread::SaveUserState()
{
    for (int i = 0; i < NumTotalRegs; i++)
	userRegisters[i] = machine->ReadRegister(i);
}

//----------------------------------------------------------------------
// Thread::RestoreUserState
//	Restore the CPU state of a user program on a context switch.
//
//	Note that a user program thread has *two* sets of CPU registers -- 
//	one for its state while executing user code, one for its state 
//	while executing kernel code.  This routine restores the former.
//----------------------------------------------------------------------

void
Thread::RestoreUserState()
{
    for (int i = 0; i < NumTotalRegs; i++)
	machine->WriteRegister(i, userRegisters[i]);
}

In /code/threads/scheduler.cc, when occur context switch, the scheduler will invoke both Thread::SaveUserState(), Thread::RestoreUserState() and AddrSpace::SaveState(), AddrSpace::RestoreState()

void
Scheduler::Run (Thread *nextThread)
{
    Thread *oldThread = currentThread;
    
#ifdef USER_PROGRAM			// ignore until running user programs 
    if (currentThread->space != NULL) {	// if this thread is a user program,
        currentThread->SaveUserState(); // save the user's CPU registers
	currentThread->space->SaveState();
    }
#endif

    ...

    // Context Switch to nextThread

    ...

#ifdef USER_PROGRAM
    if (currentThread->space != NULL) {		// if there is an address space
        currentThread->RestoreUserState();     // to restore, do it.
	currentThread->space->RestoreState();
    }
#endif
}

The Exit Syscall

As the Nachos comment said, machine->Run() never return, the address space exits by doing the syscall "exit".

This will also used in Exercise 4 to clean up the memory management data structure and the other previous Exercises which used to return the result value.

The defnition of Exit in the code/userprog/syscall.h said that. And there are also some address space related system call.

  • Exit
  • Exec
  • Join
/* Address space control operations: Exit, Exec, and Join */

/* This user program is done (status = 0 means exited normally). */
void Exit(int status);

/* A unique identifier for an executing user program (address space) */
typedef int SpaceId;

/* Run the executable, stored in the Nachos file "name", and return the
 * address space identifier
 */
SpaceId Exec(char *name);

/* Only return once the the user program "id" has finished.  
 * Return the exit status.
 */
int Join(SpaceId id);

The Exit system call: user process quits with status returned.

The kernel handles an Exit system call by

  1. destroying the process data structures and thread(s)
  2. reclaiming any memory assigned to the process (i.e. The memory management data structure)
  3. arranging to return the exit status value as the result of the Join on this process, if any.
    • (The Join related part will be completed in Lab6, we will now show the exit status value in debug message)

The following code is define in code/userprog/syscall.h and implement in code/userprog/exception.cc.

I've made some system call handler function for further preparation. Each handling a type of system call.

void AddressSpaceControlHandler(int type); // Exit, Exec, Join
void FileSystemHandler(int type); // Create, Open, Write, Read, Close
void UserLevelThreadsHandler(int type); // Fork, Yield

Because system call need to return to the next instruction (unlike the page fault exception). Thus we need to advance/increase the PC Registers. I've made this a function too.

//----------------------------------------------------------------------
// IncrementPCRegs
// 	Because when Nachos cause the exception. The PC won't increment
//  (i.e. PC+4) in Machine::OneInstruction in machine/mipssim.cc.
//  Thus, when invoking a system call, we need to advance the program
//  counter. Or it will cause the infinity loop.
//----------------------------------------------------------------------

void IncrementPCRegs(void) {
    // Debug usage
    machine->WriteRegister(PrevPCReg, machine->ReadRegister(PCReg));

    // Advance program counter
    machine->WriteRegister(PCReg, machine->ReadRegister(NextPCReg));
    machine->WriteRegister(NextPCReg, machine->ReadRegister(NextPCReg)+4);
}

And here is the Exit system call.

The TODOs is prepared for the furtuer Lab (maybe)

//----------------------------------------------------------------------
// AddressSpaceControlHandler
// 	Handling address space control related system call.
//  1. Exit
//  2. Exec
//  3. Join
//----------------------------------------------------------------------

void AddressSpaceControlHandler(int type)
{
    if (type == SC_Exit) {

        PrintTLBStatus(); // TLB debug usage

        int status = machine->ReadRegister(4); // r4: first arguments to functions

        currentThread->setExitStatus(status);
        if (status == 0) {
            DEBUG('S', COLORED(GREEN, "User program exit normally. (status 0)\n"));
        } else {
            DEBUG('S', COLORED(FAIL, "User program exit with status %d\n"), status);
        }

        // TODO: release children

#ifdef USER_PROGRAM
        if (currentThread->space != NULL) {
#ifdef USE_BITMAP
            machine->freeMem(); // ONLY USE FOR TEST Lab4 Exercise4
#endif
            delete currentThread->space;
            currentThread->space = NULL;
        }
#endif
        // TODO: if it has parent, then set this to zombie and signal
        currentThread->Finish();
    }
}

Build the Scenario

I've add StartTwoThread() in code/userprog/progtest.cc. And user can invoke this by using -X flag that I've add the functionality in code/threads/main.cc.

This is the hardest part so far. Because it's hard to understand how the simulator (i.e. Machine) work when executing user program that Machine::Run will run in infinity loop.

Thus for now we need to make sure every user program exit properly.

On the top level. Just after use -X to execute user program. The difference between original StartProcess is that this will create two threads using the same user program.

//----------------------------------------------------------------------
// StartTwoThread
// 	Run a user program.  Open the executable, load it into
//	memory, create two copy of the thread and jump to it.
//  (ps. use -X filename, detail information is in thread/main.c)
//----------------------------------------------------------------------

void
StartTwoThread(char *filename)
{
    OpenFile *executable = fileSystem->Open(filename);

    if (executable == NULL) {
	    printf("Unable to open file %s\n", filename);
	    return;
    }

    Thread *thread1 = CreateSingleThread(executable, 1);
    Thread *thread2 = CreateSingleThread(executable, 2);

    delete executable;			// close file

    thread1->Fork(UserProgThread, (void*)1);
    thread2->Fork(UserProgThread, (void*)2);

    currentThread->Yield();
}

When creating the thread, I've made the thread priority greater than main thread. And assign the address space that created from the executable to the thread's space.

//----------------------------------------------------------------------
// CreateSingleThread
// 	Run a user program.  Open the executable, load it into
//	memory, create a copy of it and return the thread.
//----------------------------------------------------------------------

Thread*
CreateSingleThread(OpenFile *executable, int number)
{
    printf("Creating user program thread %d\n", number);

    char ThreadName[20];
    sprintf(ThreadName, "User program %d", number);
    Thread *thread = new Thread(strdup(ThreadName), -1);

    AddrSpace *space;
    space = new AddrSpace(executable);
    thread->space = space;

    return thread;
}

And for each user program thread, because the thread will become currentThread. We need to initialize the machine register (use AddrSpace::InitRegisters) and load the page table (use AddrSpace::RestoreState). Then we're ready to go!

//----------------------------------------------------------------------
// UserProgThread
// 	A basic user program thread.
//----------------------------------------------------------------------

void
UserProgThread(int number)
{
    printf("Running user program thread %d\n", number);
    currentThread->space->InitRegisters();		// set the initial register values
    currentThread->space->RestoreState();		// load page table register
    currentThread->space->PrintState(); // debug usage
    machine->Run();	// jump to the user progam
    ASSERT(FALSE);			// machine->Run never returns;
                // the address space exits
                // by doing the syscall "exit"
}

Modify Address Space for Context Switch

When context switch, the TLB will fail because of using different address space.

But at this moment this will have no influence, because another thread will not interrupt or happen context switch when one thread is running.

//----------------------------------------------------------------------
// AddrSpace::SaveState
// 	On a context switch, save any machine state, specific
//	to this address space, that needs saving.
//
//	For now, nothing!
//----------------------------------------------------------------------

void AddrSpace::SaveState()
{
#ifdef USE_TLB // Lab4: Clean up TLB
    DEBUG('T', "Clean up TLB due to Context Switch!\n");
    for (int i = 0; i < TLBSize; i++) {
        machine->tlb[i].valid = FALSE;
    }
#endif
}

Test Exercise 5

I've made a simple code/test/exit.c user program to test.

/* halt.c
 *	Simple program to test multi-thread user program (Lab 4)
 */

#include "syscall.h"

int
main()
{
    int i;
    for (i = 0; i < 100; i++) {
        // do nothing
    }
    Exit(87);
}

And here is the result. (Debug message S for syscall, T for TLB, t for thread)

$ docker run nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d STt -X nachos/nachos-3.4/code/test/exit
Creating user program thread 1
Creating user program thread 2
Forking thread "User program 1" with func = 0x80502b7, arg = 1
Putting thread User program 1 on ready list.
Forking thread "User program 2" with func = 0x80502b7, arg = 2
Putting thread User program 2 on ready list.
Yielding thread "main"
Putting thread main on ready list.
Switching from thread "main" to thread "User program 1"
Running user program thread 1
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       2       1       0       0       0
3       3       1       0       0       0
4       4       1       0       0       0
5       5       1       0       0       0
6       6       1       0       0       0
7       7       1       0       0       0
8       8       1       0       0       0
9       9       1       0       0       0
10      10      1       0       0       0
=================================
TLB Miss: 4, TLB Hit: 1322, Total Instruction: 1326, Total Translate: 1330, TLB Miss Rate: 0.30%
User program exit with status 87
Finishing thread "User program 1"
Sleeping thread "User program 1"
Switching from thread "User program 1" to thread "User program 2"
Running user program thread 2
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       11      1       0       0       0
1       12      1       0       0       0
2       13      1       0       0       0
3       14      1       0       0       0
4       15      1       0       0       0
5       16      1       0       0       0
6       17      1       0       0       0
7       18      1       0       0       0
8       19      1       0       0       0
9       20      1       0       0       0
10      21      1       0       0       0
=================================
TLB Miss: 4, TLB Hit: 2647, Total Instruction: 2651, Total Translate: 2655, TLB Miss Rate: 0.15%
User program exit with status 87
Finishing thread "User program 2"
Sleeping thread "User program 2"
Switching from thread "User program 2" to thread "main"
Now in thread "main"
Deleting thread "User program 2"
Finishing thread "main"
Sleeping thread "main"
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!

Ticks: total 2104, idle 0, system 60, user 2044

Improvement:

  • User space thread should be able to switch to each other when executing (maybe running with -rs or -rr).
  • Seems that the memory didn't load as we expect. Because of the original memory is loaded sequentially. But actually we need to load it on what the bitmap allocate to us. (in AddrSpace::AddrSpace)
  • Understand how Machine::Run work (why run multiple time (for each thread))

Exercise 6: Missing page interrupt handling

Based on TLB mechanism exception handling and the replacement algorithm. Implement missing page interrupt handler and page replacement algorithm.

(The TLB exception mechanism is loading the page in memory from memory to TLB. Thus, missing page handler is to load new page from disk to memory)

We need to generate page fault, that is the page in page table is invalid (unloaded) and we load it from disk to memory.

Virtual Memory

So I use a file to act as "virtual memory" that contain the file's code and data segment (if any).

And here is the modification on creating address space (AddrSpace::AddrSpace) (define DEMAND_PAGING to enable the code, see more explanaiton here)

AddrSpace::AddrSpace(OpenFile *executable)
{
    ...

    // Create a virtual memory with the size that the executable file need.
    DEBUG('a', "Demand paging: creating virtual memory!\n");
    bool success_create_vm = fileSystem->Create("VirtualMemory", size);
    ASSERT_MSG(success_create_vm, "fail to create virtual memory");

    // Initialize page table
    pageTable = new TranslationEntry[numPages];
    for (i = 0; i < numPages; i++) {
        ...

        pageTable[i].valid = FALSE;

        ...
    }

    bzero(machine->mainMemory, MemorySize);

    DEBUG('a', "Demand paging: copy executable to virtual memory!\n");

    OpenFile *vm = fileSystem->Open("VirtualMemory");

    char *virtualMemory_temp;
    virtualMemory_temp = new char[size];
    for (i = 0; i < size; i++)
        virtualMemory_temp[i] = 0;

    if (noffH.code.size > 0) {
        DEBUG('a', "\tCopying code segment, at 0x%x, size %d\n",
              noffH.code.virtualAddr, noffH.code.size);
        executable->ReadAt(&(virtualMemory_temp[noffH.code.virtualAddr]),
                           noffH.code.size, noffH.code.inFileAddr);
        vm->WriteAt(&(virtualMemory_temp[noffH.code.virtualAddr]),
                    noffH.code.size, noffH.code.virtualAddr*PageSize);
    }
    if (noffH.initData.size > 0) {
        DEBUG('a', "\tCopying data segment, at 0x%x, size %d\n",
              noffH.initData.virtualAddr, noffH.initData.size);
        executable->ReadAt(&(virtualMemory_temp[noffH.initData.virtualAddr]),
                           noffH.initData.size, noffH.initData.inFileAddr);
        vm->WriteAt(&(virtualMemory_temp[noffH.initData.virtualAddr]),
                    noffH.initData.size, noffH.initData.virtualAddr*PageSize);
    }
    delete vm; // Close the file
}

Here is the explination of the changes

  1. pageTable[i].valid = FALSE; means we don't load any physical segment from disk to page. (you can see the first page will page fault on later test). But it's okay to load the fist few pages at first to reduce the page fault rate.
  2. I've created a file call "VirtualMemory". This will be created at where the nachos executable at (the root directory, in this case code/userprog/).
    • Copy the code and data segment from "executable" to "vm"

Note that, when the position offset of "executable" is noffH.code.inFileAddr, where we need to write in virtual memory is noffH.code.virtualAddr*PageSize. Because in real file there is a header. But in virtual memory, there are only code and data segments!

(If miss this will mess the instruction decode, and probabily will cause infinity loop!)

Missing Page Handling

In the TLB exercise, I've left a space for Missing Page page fault in code/userprog/exception.cc.

When we failed to find a valid page, it will cause missing page page fault.

void
TLBMissHandler(int virtAddr)
{
    unsigned int vpn;
    vpn = (unsigned) virtAddr / PageSize;

    // Find the Page
    TranslationEntry page = machine->pageTable[vpn];
    if (!page.valid) { // Lab4: Demand paging
        DEBUG('m', COLORED(WARNING, "\t=> Page miss\n"));
        page = PageFaultHandler(vpn);
    }

    ...

And here is how the PageFaultHandler do:

  1. Find an empty space in memory with machine->allocateFrame() (this is implemented by Exercise 4)
  2. Load the page frame from disk to memory
    • If memory out of space then find a page to replace using NaivePageReplacement. This will be explain in the Exercise 7
TranslationEntry
PageFaultHandler(int vpn)
{
    // Get a Memory space (page frame) to allocate
    int pageFrame = machine->allocateFrame(); // ppn
    if (pageFrame == -1) { // Need page replacement
        pageFrame = NaivePageReplacement(vpn);
    }
    machine->pageTable[vpn].physicalPage = pageFrame;

    // Load the Page from virtual memory
    DEBUG('a', "Demand paging: loading page from virtual memory!\n");
    OpenFile *vm = fileSystem->Open("VirtualMemory"); // This file is created in userprog/addrspace.cc
    ASSERT_MSG(vm != NULL, "fail to open virtual memory");
    vm->ReadAt(&(machine->mainMemory[pageFrame*PageSize]), PageSize, vpn*PageSize);
    delete vm; // close the file

    // Set the page attributes
    machine->pageTable[vpn].valid = TRUE;
    machine->pageTable[vpn].use = FALSE;
    machine->pageTable[vpn].dirty = FALSE;
    machine->pageTable[vpn].readOnly = FALSE;

    currentThread->space->PrintState(); // debug with -d M to show bitmap
}

III. Lazy-loading (i.e. Demand Paging)

Notice that in code/userprog/Makefile it enable micros DEFINES = -DFILESYS_NEEDED -DFILESYS_STUB on default. That we need to use "disk" to build the scenario to make the "page fault"

I've also made the -DDEMAND_PAGING to enable demand paging when we need it.

Use -d s to enable single step debug

docker run -it nachos_userprog nachos/nachos-3.4/code/userprog/nachos -s -d amM -x nachos/nachos-3.4/code/test/halt

Exercise 7: Loading page on demand

Nachos allocate memory must be completed once the user program is loaded into memory. Thus the user program size is strictly restrict to be lower than 4KB. Implement a lazy-loading algorithm that load the page from disk to memory if and only if the missing page exception occur.

Naive Page Replacement

This is the continus part of the last Experiment

  1. Find an non-dirty page frame to replace.
  2. If not found, then replace a dirty page and write back to disk.
  3. Return the page frame number when founded or after replacement.
int
NaivePageReplacement(int vpn)
{
    int pageFrame = -1;
    for (int temp_vpn = 0; temp_vpn < machine->pageTableSize, temp_vpn != vpn; temp_vpn++) {
        if (machine->pageTable[temp_vpn].valid) {
            if (!machine->pageTable[temp_vpn].dirty) {
                pageFrame = machine->pageTable[temp_vpn].physicalPage;
                break;
            }
        }
    }
    if (pageFrame == -1) { // No non-dirty entry
        for (int replaced_vpn = 0; replaced_vpn < machine->pageTableSize, replaced_vpn != vpn; replaced_vpn++) {
            if (machine->pageTable[replaced_vpn].valid) {
                machine->pageTable[replaced_vpn].valid = FALSE;
                pageFrame = machine->pageTable[replaced_vpn].physicalPage;

                // Store the page back to disk
                OpenFile *vm = fileSystem->Open("VirtualMemory");
                ASSERT_MSG(vm != NULL, "fail to open virtual memory");
                vm->WriteAt(&(machine->mainMemory[pageFrame*PageSize]), PageSize, replaced_vpn*PageSize);
                delete vm; // close file
                break;
            }
        }
    }
    return pageFrame;
}

Test with user program

Halt user program

$ docker run -it nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d TM -x nachos/nachos-3.4/code/test/halt
Allocate physical page frame: 0
=== Address Space Information ===
numPages = 10
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       0       0       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
Current Bitmap: 00000001
=================================
Allocate physical page frame: 1
=== Address Space Information ===
numPages = 10
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
Current Bitmap: 00000003
=================================
Allocate physical page frame: 2
=== Address Space Information ===
numPages = 10
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       2       1       0       0       0
Current Bitmap: 00000007
=================================
TLB Miss: 6, TLB Hit: 11, Total Instruction: 17, Total Translate: 23, TLB Miss Rate: 35.29%
Machine halting!

Ticks: total 28, idle 0, system 10, user 18

Exit user program

$ docker run -it nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d TM -x nachos/nachos-3.4/code/test/exit
Allocate physical page frame: 0
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       0       0       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
10      0       0       0       0       0
Current Bitmap: 00000001
=================================
Allocate physical page frame: 1
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
10      0       0       0       0       0
Current Bitmap: 00000003
=================================
Allocate physical page frame: 2
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       0       0       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
10      2       1       0       0       0
Current Bitmap: 00000007
=================================
Allocate physical page frame: 3
=== Address Space Information ===
numPages = 11
VPN     PPN     valid   RO      use     dirty
0       0       1       0       0       0
1       1       1       0       0       0
2       3       1       0       0       0
3       0       0       0       0       0
4       0       0       0       0       0
5       0       0       0       0       0
6       0       0       0       0       0
7       0       0       0       0       0
8       0       0       0       0       0
9       0       0       0       0       0
10      2       1       0       0       0
Current Bitmap: 0000000F
=================================
TLB Miss: 8, TLB Hit: 1319, Total Instruction: 1327, Total Translate: 1335, TLB Miss Rate: 0.60%
Free physical page frame: 0
Free physical page frame: 1
Free physical page frame: 3
Free physical page frame: 2
Bitmap after freed: 00000000
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!

Ticks: total 1038, idle 0, system 10, user 1028

Challenges

Challenge 1

Add SUSPENDED state for Thread. And implement the switching between SUSPENDED, READY and BLOCKED

TODO

Challenge 2

The defect of Hierarchical Paging (multi-level page table) is the page table size is in direct ratio to virtual address space. To reduce the consumption on physical page table. Implement inverted page table on Nachos.

Inverted Page Table

As the research, the inverted page table is just like the Bitmap + Page Table. This is not a normal page table that is under the AddrSpace of each user program thread. Instead, it should be under machine, just like the bitmap.

Define INVERTED_PAGETABLE to test this challenge.

Modify the Structure

Add the thread ID attribute to page table entry in code/machine/translate.h

class TranslationEntry {

    ...

#ifdef INVERTED_PAGETABLE // Lab4: Inverted Page Table
    int threadId;
#endif
};

Because we only use the inverted page table under machine. Just modify the original context switch routine in code/userprog/addrspace.cc.

void AddrSpace::RestoreState()
{
// If using inverted page table, because there is only one page table (for a machine)
// So we won't need the address space page table
#ifndef INVERTED_PAGETABLE
    machine->pageTable = pageTable;
    machine->pageTableSize = numPages;
#endif
}

We'll share the same function name with bitmap but with different implementation in code/machine/machine.cc

int
Machine::allocateFrame(void)
{
    for (int i = 0; i < NumPhysPages; i++) {
        if (!pageTable[i].valid) {
            return i;
        }
    }
    ASSERT_MSG(FALSE, "Out of physical page frame! Current inverted page table don't support demand paging!")
    return -1;
}
void
Machine::freeMem(void)
{
    for (int i = 0; i < NumPhysPages; i++) {
        if (pageTable[i].threadId == currentThread->getThreadId()) {
            pageTable[i].valid = FALSE;
            DEBUG('M', "Free physical page frame: %d\n", i);
        }
    }
    DEBUG('M', "Freed the memory hold by thread \"%s\".\n", currentThread->getName());
}

Initializing the inverted page table

Just like bitmap, when we start/create the machine, we need to initialize it. (in code/machine/machine.cc)

Machine::Machine(bool debug)
{
#if USE_BITMAP && INVERTED_PAGETABLE
    ASSERT_MSG(FALSE, "we must have either a Bitmap or a Inverted Page Table, but not both!");
#endif

    ...

    pageTable = new TranslationEntry[NumPhysPages];
    // Initialize Inverted Page Table
    for (i = 0; i < NumPhysPages; i++) {
        pageTable[i].physicalPage = i;
        pageTable[i].virtualPage = i;
        pageTable[i].valid = FALSE;
        pageTable[i].dirty = FALSE;
        pageTable[i].readOnly = FALSE;
        pageTable[i].threadId = -1;
    }
    pageTableSize = MemorySize;

    ...
}

we don't need virtualPage actually.

And when we execute user program, we need to load the executable into memory. (in code/userprog/addrspace.cc)

AddrSpace::AddrSpace(OpenFile *executable)
{
    ...

    for (i = 0; i < numPages; i++) {
        machine->pageTable[i].physicalPage = machine->allocateFrame(); // Currently don't support demand paging
        machine->pageTable[i].valid = TRUE;
        machine->pageTable[i].use = FALSE;
        machine->pageTable[i].dirty = FALSE;
        machine->pageTable[i].readOnly = FALSE;

        machine->pageTable[i].threadId = currentThread->getThreadId(); // The additional part of inverted page table
    }
    DEBUG('M', "Initialized memory for thread \"%s\".\n", currentThread->getName());

    ...
}

Translate the Address

Finally, we need to translate the "physical page number + thread id" to the memory.

I've built on the original code. So the "vpn" variable is actually means "ppn".

ExceptionType
Machine::Translate(int virtAddr, int* physAddr, int size, bool writing)
{
    ...

    if (tlb == NULL) {
        if (vpn >= pageTableSize) {
            DEBUG('a', "virtual page # %d too large for page table size %d!\n",
                  virtAddr, pageTableSize);
            return AddressErrorException;
        } else if (!pageTable[vpn].valid) {
            DEBUG('a', "virtual page # %d is invalid!\n");
            return PageFaultException;
        } else if (!(pageTable[vpn].threadId == currentThread->getThreadId())) {
            ASSERT_MSG(FALSE, "A thread is accessing other thread's address space!");
        }
        entry = &pageTable[vpn];
    }

    ...
}

Pure inverted index page table test on user program

I use the exit user program to test and the pages is allocated and freed as we expected!

$ docker run -it nachos_userprog nachos/nachos-3.4/code/userprog/nachos -d MS -x nachos/nachos-3.4/code/test/exit
Initialized memory for thread "main".
User program exit with status 87
Free physical page frame: 0
Free physical page frame: 1
Free physical page frame: 2
Free physical page frame: 3
Free physical page frame: 4
Free physical page frame: 5
Free physical page frame: 6
Free physical page frame: 7
Free physical page frame: 8
Free physical page frame: 9
Free physical page frame: 10
Freed the memory hold by thread "main".
No threads ready or runnable, and no pending interrupts.
Assuming the program completed.
Machine halting!

Ticks: total 1030, idle 0, system 10, user 1020

TODO: with TLB, Demand Paging

Trouble Shooting

Copy file from the running docker container

Because I want to pick out the built user test program. That I don't want to compile the whole project every time.

docker cp 987fea7d8235:/nachos/nachos-3.4/code/test/fibonacci Nachos/nachos-3.4/code/test

Hexadecimal value in C

unsigned char hexVar = 0xAB;

printf("%x\n", hexVar);
printf("%X\n", hexVar);

ANSI Escape Code

  • \033 ESC character (ASCII 27) + [
  • SGR parameters (this can have multiple value seperated by ;)
  • End with m

Example:

  • \033[31m: red
  • \033[1;31m: bright red
  • \033[30;47m: get black letters on white background
  • \033[0m: reset all attributes

Link

Char* vs. Char[]

stack smashing detected Error

The string length is greater than the array size.

Improvable Todos

Resources

CSC546 - Operating Systems

  • Lecture 5
    • Write
    • Read
    • DEBUG
    • Address Spaces and Address Translation
    • Page Tables (versus TLB)
    • User Program Context Switches (Again)
    • Nachos Object File Format (Noff)
    • Loading User Programs into Mips Memory
  • Lecture 6 static char arrays
    • OpenFile (again)
    • Console I/O
    • Nachos (C++) Semaphore class
    • Virtual Memory
    • BitMap
    • Adding a C++ class to Nachos
    • CoreMap
    • Exceptions: machine/machine.h
    • PageFaultException

Book

  • Nachos Study Book Part III Storage Management
    • Ch6 Memory Management
      • Ch6.1 From Programs To Address Space
      • Ch6.4 MIPS Simulator
      • Ch6.5 Nachos User Programs
      • Ch6.6 Address Space of User Process in Nachos
      • Ch6.7 Memory Management in Nachos
      • Ch6.8 From Thread to User Process
    • Ch7 Implementation of System Calls
      • Ch7.3 Exception and Trap
    • Ch8 Virtual Memory

Example

TLB Related

Other

Another

seems didn't consider TLB

Yet Another

Multi-thread Related