# MIPS Assembly Disassembler

## Overview

This Jupyter notebook contains a MIPS assembly language disassembler, originally written in **2013** as part of the "Introduction to Computer Science" course at UFSC (Federal University of Santa Catarina). The disassembler takes MIPS machine code instructions in hexadecimal format, converts them to 32-bit binary, and decodes them into the corresponding MIPS assembly language instructions.
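
The first step, turning a hexadecimal word into its 32-bit binary form, can be sketched as below. The helper name `hex_to_binary` and its range check are illustrative assumptions and are not part of the notebook's code.

```python
def hex_to_binary(word_hex):
    """Convert a hex string such as '0x21280007' into a 32-character bit string."""
    # With base 16, int() accepts the string with or without a '0x' prefix
    value = int(word_hex, 16)
    if not 0 <= value < 2**32:
        raise ValueError(f"not a 32-bit word: {word_hex}")
    # '032b' pads the binary form with leading zeros up to 32 digits
    return format(value, "032b")


print(hex_to_binary("0x21280007"))  # 00100001001010000000000000000111
```

Padding to a fixed width matters because the decoder locates each field by its position in the bit string.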

## Background

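The program was developed during the course to demonstrate how machine code instructions are decoded. It covers the three MIPS instruction formats (R-type, I-type, and J-type) and supports a subset of the instruction set: `add`, `addu`, `sub`, `subu`, `and`, `nor`, `or`, `slt`, `sll`, `srl`, `addi`, `andi`, `ori`, `slti`, `beq`, `bne`, `lw`, `sw`, and `j`. For more information, see https://en.wikipedia.org/wiki/MIPS_architecture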
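
As a rough illustration of the three formats, the sketch below extracts every field from a 32-bit word with shifts and masks. The names `split_fields` and `sign_extend` are hypothetical; the notebook's own code slices a bit string instead of using integer arithmetic.

```python
def split_fields(word):
    """Return the fields of all three MIPS formats for one 32-bit word."""
    return {
        "opcode": (word >> 26) & 0x3F,   # all formats: top 6 bits
        "rs": (word >> 21) & 0x1F,       # R-type and I-type
        "rt": (word >> 16) & 0x1F,       # R-type and I-type
        "rd": (word >> 11) & 0x1F,       # R-type only
        "shamt": (word >> 6) & 0x1F,     # R-type only, unsigned
        "funct": word & 0x3F,            # R-type only
        "immediate": word & 0xFFFF,      # I-type only, raw 16 bits
        "target": word & 0x3FFFFFF,      # J-type only, raw 26 bits
    }


def sign_extend(value, bits):
    """Interpret a raw field of the given width as a two's-complement number."""
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


r_type = split_fields(0x01F18024)  # and $s0, $t7, $s1
print(r_type["rs"], r_type["rt"], r_type["rd"], r_type["funct"])  # 15 17 16 36

i_type = split_fields(0x8D14FFFF)  # lw $s4, -1($t0)
print(sign_extend(i_type["immediate"], 16))  # -1
```

Which fields are meaningful depends on the opcode: an opcode of 0 selects the R-type layout, and the `funct` field then identifies the operation.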

## Usage

To use the disassembler, run the code cell and follow the menu prompts. MIPS machine code instructions can be typed in directly or read from a file, one instruction per line, each written in hexadecimal with a `0x` prefix (for example `0x2A560000`). The program converts each hexadecimal instruction to binary, decodes it into a human-readable MIPS assembly instruction, and prints the result in the cell output.
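
The input format can be checked before decoding. The sketch below assumes one instruction per line, written as `0x` followed by exactly 8 hexadecimal digits; `HEX_WORD` and `parse_lines` are hypothetical names, and this stricter validation is not what the notebook's code does.

```python
import re

# Assumed line format: "0x" followed by exactly 8 hexadecimal digits
HEX_WORD = re.compile(r"0x[0-9A-Fa-f]{8}")


def parse_lines(lines):
    """Turn text lines into a list of 32-bit instruction words."""
    words = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if not HEX_WORD.fullmatch(text):
            raise ValueError(
                f"line {number}: expected 0x plus 8 hex digits, got {text!r}"
            )
        words.append(int(text, 16))
    return words


sample = ["0x00011020\n", "\n", "0x21280007"]
print(parse_lines(sample))
```

Because `parse_lines` accepts any iterable of strings, an open file object can be passed to it directly.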

## Note

This code was written in **2013** during my "Introduction to Computer Science" course at UFSC (Federal University of Santa Catarina). It reflects where I was in my learning at that time, not my current Python skills. It is presented here largely as originally written.

In [2]:
def get_register_name(register_index):
    register_names = {
        0: "$zero", 1: "$at", 2: "$v0", 3: "$v1", 4: "$a0", 5: "$a1", 6: "$a2", 7: "$a3",
        8: "$t0", 9: "$t1", 10: "$t2", 11: "$t3", 12: "$t4", 13: "$t5", 14: "$t6", 15: "$t7",
        16: "$s0", 17: "$s1", 18: "$s2", 19: "$s3", 20: "$s4", 21: "$s5", 22: "$s6", 23: "$s7",
        24: "$t8", 25: "$t9", 26: "$k0", 27: "$k1", 28: "$gp", 29: "$sp", 30: "$fp", 31: "$ra"
    }
    return register_names.get(register_index, "NONEXISTENT")

def two_comp(val, bits):
    # Interpret val as a two's-complement number that is `bits` wide
    if val & (1 << (bits - 1)):
        val -= 1 << bits
    return val

def get_shift_amount(shift_binary):
    # shamt is an unsigned 5-bit field (0 to 31), so no sign conversion applies
    return str(int(shift_binary, 2))

def get_immediate_value(immediate_binary):
    imm = two_comp(int(immediate_binary, 2), len(immediate_binary))
    return str(imm)

def is_type_r(instruction_binary):
    # After the 6-bit opcode: rs(5) rt(5) rd(5) shamt(5) funct(6)
    fields = instruction_binary[6:]
    funct_code = fields[20:]

    rd = get_register_name(int(fields[10:15], 2))
    rs = get_register_name(int(fields[0:5], 2))
    rt = get_register_name(int(fields[5:10], 2))

    shift_int = get_shift_amount(fields[15:20])

    if funct_code.endswith("100000"):
        return f"add {rd}, {rs}, {rt}"
    elif funct_code.endswith("100001"):
        return f"addu {rd}, {rs}, {rt}"
    elif funct_code.endswith("100010"):
        return f"sub {rd}, {rs}, {rt}"
    elif funct_code.endswith("100011"):
        return f"subu {rd}, {rs}, {rt}"
    elif funct_code.endswith("100100"):
        return f"and {rd}, {rs}, {rt}"
    elif funct_code.endswith("100111"):
        return f"nor {rd}, {rs}, {rt}"
    elif funct_code.endswith("100101"):
        return f"or {rd}, {rs}, {rt}"
    elif funct_code.endswith("101010"):
        return f"slt {rd}, {rs}, {rt}"
    elif funct_code.endswith("000000"):
        return f"sll {rd}, {rt}, {shift_int}"
    elif funct_code.endswith("000010"):
        return f"srl {rd}, {rt}, {shift_int}"
    else:
        return "ERROR: Instruction not defined in the requested instruction set"


def is_type_i(instruction_binary, memory_access, signed=True):
    # Expects the 26 bits after the opcode: rs(5) rt(5) immediate(16)
    rs = get_register_name(int(instruction_binary[0:5], 2))
    rt = get_register_name(int(instruction_binary[5:10], 2))

    imm_bin = instruction_binary[10:]
    # andi and ori zero-extend the immediate; the other instructions sign-extend it
    imm_int = get_immediate_value(imm_bin) if signed else str(int(imm_bin, 2))

    if memory_access:
        return f"{rt}, {imm_int}({rs})"
    return f"{rt}, {rs}, {imm_int}"

def is_type_j(instruction_binary):
    # Returns the raw 26-bit target field in hex (not shifted into a byte address)
    return hex(int(instruction_binary, 2))

def decode_opcode(instruction_binary):
    # opcode -> (mnemonic, memory_access, signed_immediate) for the I-type instructions
    i_type = {
        "001000": ("addi", False, True),
        "001100": ("andi", False, False),
        "000100": ("beq", False, True),
        "000101": ("bne", False, True),
        "100011": ("lw", True, True),
        "001101": ("ori", False, False),
        "001010": ("slti", False, True),
        "101011": ("sw", True, True),
    }
    opcode, rest = instruction_binary[:6], instruction_binary[6:]

    if opcode == "000000":
        return is_type_r(instruction_binary)
    elif opcode == "000010":
        return f"j {is_type_j(rest)}"
    elif opcode in i_type:
        mnemonic, memory_access, signed = i_type[opcode]
        return f"{mnemonic} {is_type_i(rest, memory_access, signed)}"
    else:
        return "ERROR: Instruction not defined in the requested instruction set"

choice = ""

while choice != "0":
    print("Disassembler MIPS v1.0\nMENU:\n1. Read from keyboard\n2. Read from a file\n0. Exit\n")
    choice = input("")

    if choice == "1":
        # A separate variable keeps the typed instruction from being read as a menu choice
        code = input("Enter an instruction code in hexadecimal (must start with 0x)\t")
        while len(code) > 10:
            code = input("Enter a valid instruction code (e.g. 0x2A560000)\t")

        ins_hexa = code.replace("0x","")
        ins_bin = bin(int(ins_hexa, 16))[2:].zfill(32)

        print(f"Instruction in binary: {ins_bin}")
        print(f"Instruction in assembly: {decode_opcode(ins_bin)}\n\n")

    elif choice == "2":
        nome = input("Enter the file name: (e.g. instrucoes.txt)\n")
        # The with block closes the file once the loop is done
        with open(nome, "r") as file:
            for line in file:
                # Drop the line ending and keep "0x" plus 8 hex digits
                inst_line = line.strip()[:10]
                if not inst_line:
                    continue  # skip blank lines
                ins_hexa = inst_line.replace("0x","")
                ins_bin = bin(int(ins_hexa, 16))[2:].zfill(32)

                print(f"Instruction read: {inst_line}")
                print(f"Instruction in binary: {ins_bin}")
                print(f"Instruction in assembly: {decode_opcode(ins_bin)}\n\n")

Disassembler MIPS v1.0
MENU:
1. Read from keyboard
2. Read from a file
0. Exit



Instruction read: 0x00011020
Instruction in binary: 00000000000000010001000000100000
Instruction in assembly: add $v0, $zero, $at


Instruction read: 0x21280007
Instruction in binary: 00100001001010000000000000000111
Instruction in assembly: addi $t0, $t1, 7


Instruction read: 0x01F18024
Instruction in binary: 00000001111100011000000000100100
Instruction in assembly: and $s0, $t7, $s1


Instruction read: 0x32CB0002
Instruction in binary: 00110010110010110000000000000010
Instruction in assembly: andi $t3, $s6, 2


Instruction read: 0x1002FFFE
Instruction in binary: 00010000000000101111111111111110
Instruction in assembly: beq $v0, $zero, -2


Instruction read: 0x16930008
Instruction in binary: 00010110100100110000000000001000
Instruction in assembly: bne $s3, $s4, 8


Instruction read: 0x0800000C
Instruction in binary: 00001000000000000000000000001100
Instruction in assembly: j 0xc


Instruction read: 0x8D14FFFF
Instruction in binary: 10001101000101001111111111111111
Instruction in assembly: lw $s4, -1($t0)


Instruction read: