# QTM 350 - Data Science Computing

## Assignment 01 - Computational Literacy
### Due 11 September 2024

### Instructions

This assignment evaluates your understanding of topics covered in the first two weeks of class, including binary and hexadecimal number systems, ASCII encoding, and programming language fundamentals.

You must complete this assignment individually. While you may use available resources such as notes, books, and AI tools, you are expected to submit original work. Please acknowledge all resources used, including input from classmates and AI. If you are unsure about permissible resources or proper acknowledgement, please consult the instructor.

Present your solutions clearly and systematically, showing your problem-solving process. Please ensure that any code is well-commented.

### Submission

Please submit your solutions as either a single Jupyter notebook or a Quarto PDF file. Follow the instructions provided in each section carefully. Submit your completed assignment to Canvas or via email (danilo.freire@emory.edu) by Wednesday, September 11, at 11:59 PM.

### Question 01

Convert the decimal number 53 to binary. Show your work.

In [102]:
# initialize variable for 53
decimal = 53

def DecimalToBinary(decimal:int, showwork:bool) -> str:
    """
    Takes a decimal value and returns its binary representation,
    printing each step to show work

    Params:
        decimal: integer value to be converted
        showwork: bool which indicates whether to print work
    """

    # list to keep track of bits for binary representation
    binary = []

    # track bit number
    i = 0
    # divide until 0
    while decimal > 0:
        # print iterations to show work
        if showwork:
            print(f"{decimal} / 2 = {decimal // 2}, remainder of {decimal % 2}, bit #{i} is {decimal % 2}")
        # insert bit to beginning of list
        binary.insert(0, str(decimal % 2))
        # divide current value
        decimal //= 2
        # next bit
        i += 1

    # convert our list of bits to a string
    binary = ''.join(binary)
    return binary

# save binary representation of 53
binary = DecimalToBinary(decimal, True)

print(f"Binary representation of 53: {binary}")


53 / 2 = 26, remainder of 1, bit #0 is 1
26 / 2 = 13, remainder of 0, bit #1 is 0
13 / 2 = 6, remainder of 1, bit #2 is 1
6 / 2 = 3, remainder of 0, bit #3 is 0
3 / 2 = 1, remainder of 1, bit #4 is 1
1 / 2 = 0, remainder of 1, bit #5 is 1
Binary representation of 53: 110101
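
As a cross-check on the repeated-division method above, Python's built-in `bin()` and `format()` should give the same bit string. This is a minimal sketch rather than part of the original solution; the variable names `value`, `via_bin`, and `via_format` are illustrative.

```python
# Cross-check: built-ins that convert an int to its binary digits
value = 53

via_bin = bin(value)[2:]          # bin() returns a string prefixed with '0b'; slice it off
via_format = format(value, "b")   # the 'b' format spec gives the digits with no prefix

print(via_bin, via_format)
```

Both approaches agree with the manual result, which suggests the division loop handles the bit order correctly.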


### Question 02

Convert the binary number 1011001 to decimal. Show your work.

In [105]:
# initialize binary number as str
binary = "1011001"

def BinaryToDecimal(binary: str, showwork:bool) -> int:
    """
    Takes a string representing a binary number
    and returns its decimal counterpart as an int

    Params:
        binary: str value to be converted
        showwork: bool which indicates whether to print work
    """

    # number of bits in binary number
    bit_num = len(binary) - 1
    # initialize var for decimal result
    decimal = 0

    # iterate over bits
    for bit in binary:
        # value to add
        to_add = int(bit) * 2**bit_num
        # print step to show work
        if showwork:
            print(f"adding {bit} * {2} ** {bit_num} = {to_add} to {decimal}")
        # add to decimal value
        decimal += to_add
        # next bit index
        bit_num -= 1

    return decimal

decimal = BinaryToDecimal(binary, True)
print(f"decimal representation of {binary}: {decimal}")

adding 1 * 2 ** 6 = 64 to 0
adding 0 * 2 ** 5 = 0 to 64
adding 1 * 2 ** 4 = 16 to 64
adding 1 * 2 ** 3 = 8 to 80
adding 0 * 2 ** 2 = 0 to 88
adding 0 * 2 ** 1 = 0 to 88
adding 1 * 2 ** 0 = 1 to 88
decimal representation of 1011001: 89
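
The positional sum above can be cross-checked with the built-in `int()`, which accepts a base as its second argument. The sketch below is illustrative only; `bits`, `via_int`, and `via_sum` are names introduced here, not part of the original solution.

```python
# Cross-check: parse a binary string with int(), and recompute the positional sum
bits = "1011001"

via_int = int(bits, 2)  # interpret the string as a base-2 number

# rightmost bit has weight 2**0, the next 2**1, and so on
via_sum = sum(int(b) * 2**i for i, b in enumerate(reversed(bits)))

print(via_int, via_sum)
```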


### Question 03

What is the hexadecimal representation of the RGB colour (128, 64, 255)? Explain your answer.

In [131]:
RGB = (128, 64, 255)

print(f"The hexadecimal representation of {RGB} is #8040FF.\n\
We can convert each of the 3 RGB components to 8-bit binary, split it into\n\
two 4-bit groups, and convert each group to one hexadecimal digit:\n\
{RGB[0]} becomes {DecimalToBinary(RGB[0], False).zfill(8)}; \
1000 becomes 8 and 0000 becomes 0, so 128 is 80 in hexadecimal\n\
{RGB[1]} becomes {DecimalToBinary(RGB[1], False).zfill(8)}; \
0100 becomes 4 and 0000 becomes 0, so 64 is 40 in hexadecimal\n\
{RGB[2]} becomes {DecimalToBinary(RGB[2], False).zfill(8)}; \
both 1111 groups become F, so 255 is FF in hexadecimal\n\
Putting these three hexadecimal pairs together, the resulting hex code is #8040FF.")

The hexadecimal representation of (128, 64, 255) is #8040FF.
We can convert each of the 3 RGB components to 8-bit binary, split it into
two 4-bit groups, and convert each group to one hexadecimal digit:
128 becomes 10000000; 1000 becomes 8 and 0000 becomes 0, so 128 is 80 in hexadecimal
64 becomes 01000000; 0100 becomes 4 and 0000 becomes 0, so 64 is 40 in hexadecimal
255 becomes 11111111; both 1111 groups become F, so 255 is FF in hexadecimal
Putting these three hexadecimal pairs together, the resulting hex code is #8040FF.
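
The same conversion can be sketched with Python's format specification, where `02X` means "uppercase hexadecimal, zero-padded to two digits". This is an illustrative cross-check, not the method required by the assignment; `rgb`, `hex_code`, and `high_low` are names introduced here.

```python
# Cross-check: format each channel as a two-digit uppercase hex pair
rgb = (128, 64, 255)

hex_code = "#" + "".join(f"{channel:02X}" for channel in rgb)

# divmod by 16 yields the two hex digits of a channel as (high, low) values
high_low = [divmod(channel, 16) for channel in rgb]

print(hex_code)
print(high_low)
```

Here `divmod(128, 16)` gives `(8, 0)`, matching the digit pair `80` derived from the 4-bit groups.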


### Question 04

Convert the hexadecimal colour #2A9F3B to its RGB components. Show your steps.

In [175]:
hex = '2A9F3B'

def HexColorToRGB(hex:str, showwork:bool) -> tuple:
    """
    Takes a hex code for a color and returns its rgb
    representation in the form of a tuple

    Params:
        hex: 6 character long str hex code to be converted
        showwork: boolean indicating whether to show work or not
    
    """

    # initialize result
    RGB = []
    
    # define dictionary to look up binary representations of hex digits
    HexToBinary = {
    '0': '0000', '1': '0001', '2': '0010', '3': '0011',
    '4': '0100', '5': '0101', '6': '0110', '7': '0111',
    '8': '1000', '9': '1001', 'A': '1010', 'B': '1011',
    'C': '1100', 'D': '1101', 'E': '1110', 'F': '1111'
    }

    # iterate over pairs of hex digits (one pair per color channel)
    i = 1
    while i < 6:
        # 8-bit binary string for this pair, built from the lookup table
        bits = HexToBinary[hex[i-1]] + HexToBinary[hex[i]]
        # reuse BinaryToDecimal (defined in Question 02) for the channel value
        value = BinaryToDecimal(bits, False)
        # print step to show work
        if showwork:
            print(hex[i-1:i+1], 'becomes', bits, 'which is then converted to', value)
        RGB.append(value)
        i += 2
    return tuple(RGB)


rgb = HexColorToRGB('2A9F3B', True)

print(hex, 'becomes', rgb)

    

2A becomes 00101010 which is then converted to 42
9F becomes 10011111 which is then converted to 159
3B becomes 00111011 which is then converted to 59
2A9F3B becomes (42, 159, 59)
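
As a cross-check, each two-digit pair can also be parsed directly in base 16, since a pair `XY` has the value `X * 16 + Y`. The sketch below is illustrative; `hex_code`, `rgb_from_int`, and `rgb_from_bytes` are names introduced here.

```python
# Cross-check: parse each pair of hex digits with int(..., 16)
hex_code = "2A9F3B"

rgb_from_int = tuple(int(hex_code[i:i + 2], 16) for i in range(0, 6, 2))

# bytes.fromhex decodes the whole string into three bytes at once
rgb_from_bytes = tuple(bytes.fromhex(hex_code))

print(rgb_from_int, rgb_from_bytes)
```

For example, `2A` is 2 * 16 + 10 = 42, which matches the first component above.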


### Question 05

Using the coin representation system described in the lecture (c[quarters][dimes][nickels][pennies]), convert $1.37 to coin representation. Explain your reasoning.

In [204]:
def DollarToCoins(USD:float) -> str:
    """
    Takes a dollar amount and converts it to the
    coin representation taught in class (str)

    Params:
        USD: float of a dollar amount to convert
    """
    # work in whole cents to avoid floating-point rounding errors
    cents = round(USD * 100)

    # greedy: take as many of each coin as possible, largest first
    q, cents = divmod(cents, 25)
    d, cents = divmod(cents, 10)
    n, cents = divmod(cents, 5)
    p = cents
    return f"c{q}{d}{n}{p}"

print(f"$1.37 in our coin system can be represented as {DollarToCoins(1.37)}\n\
The thought process was to use as many quarters as possible first, then\n\
dimes, then nickels, then pennies, so as to use the fewest coins possible:\n\
5 quarters ($1.25) + 1 dime ($0.10) + 0 nickels + 2 pennies ($0.02) = $1.37.")

$1.37 in our coin system can be represented as c5102
The thought process was to use as many quarters as possible first, then
dimes, then nickels, then pennies, so as to use the fewest coins possible:
5 quarters ($1.25) + 1 dime ($0.10) + 0 nickels + 2 pennies ($0.02) = $1.37.
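
One way to sanity-check the result is to decode the coin string back into cents. The helper below is a hypothetical sketch: `coins_to_cents` and `COIN_VALUES` are names introduced here, and it assumes each coin count is a single digit, which holds for `c5102` but is not guaranteed by the notation in general.

```python
# Hypothetical decoder for the c[quarters][dimes][nickels][pennies] notation.
# Assumption: every count is a single digit (true for c5102).
COIN_VALUES = (25, 10, 5, 1)  # cents per quarter, dime, nickel, penny

def coins_to_cents(code: str) -> int:
    """Return the total value in cents of a coin string such as 'c5102'."""
    digits = code[1:]  # drop the leading 'c'
    return sum(int(count) * value for count, value in zip(digits, COIN_VALUES))

total = coins_to_cents("c5102")  # 5*25 + 1*10 + 0*5 + 2*1
print(total)
```

Working in whole cents keeps the arithmetic exact, which floating-point dollar amounts do not guarantee.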


### Question 06

What is the Unicode representation of the word "Emory"? Use the Unicode table provided in the lecture to find out.

In [231]:
print("Using the provided table, Emory in Unicode can be represented as: \\u0045\\u006d\\u006f\\u0072\\u0079")

Using the provided table, Emory in Unicode can be represented as: \u0045\u006d\u006f\u0072\u0079
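
The table lookup can be cross-checked with the built-in `ord()`, which returns a character's Unicode code point as an integer. This is an illustrative sketch; `word`, `escapes`, and `round_trip` are names introduced here.

```python
# Cross-check: build the \uXXXX escapes from each character's code point
word = "Emory"

# ord() gives the code point; 04x formats it as four lowercase hex digits
escapes = "".join(f"\\u{ord(ch):04x}" for ch in word)

# decoding the escape sequences should give back the original word
round_trip = escapes.encode("ascii").decode("unicode_escape")

print(escapes)
print(round_trip)
```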


### Question 07

Explain the difference between ASCII and Unicode. Why was Unicode developed, and what advantages does it offer over ASCII?

In [227]:
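print("ASCII is a 7-bit encoding with only 128 characters: unaccented English letters, digits,\n\
punctuation, and control codes. Unicode was developed because ASCII cannot represent accented\n\
letters or non-Latin scripts, and the many incompatible encodings created to fill that gap made\n\
exchanging text unreliable. Unicode assigns a unique code point to each character across the\n\
world's writing systems, plus symbols and emojis, and its first 128 code points match ASCII,\n\
so existing ASCII text remains valid.")

ASCII is a 7-bit encoding with only 128 characters: unaccented English letters, digits,
punctuation, and control codes. Unicode was developed because ASCII cannot represent accented
letters or non-Latin scripts, and the many incompatible encodings created to fill that gap made
exchanging text unreliable. Unicode assigns a unique code point to each character across the
world's writing systems, plus symbols and emojis, and its first 128 code points match ASCII,
so existing ASCII text remains valid.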
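
The difference can be seen by trying to encode an accented word. The sketch below is illustrative only; the sample word and the variable names are assumptions introduced here, and UTF-8 is used as one common Unicode encoding.

```python
# Illustration: ASCII cannot encode an accented letter, UTF-8 can
text = "caf\u00e9"  # 'café', written with an escape for the accented e

utf8_bytes = text.encode("utf-8")  # the accented letter takes two bytes in UTF-8

try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False  # code point U+00E9 is outside ASCII's 0-127 range

# plain ASCII text produces identical bytes under both encodings
same_bytes = "Emory".encode("ascii") == "Emory".encode("utf-8")

print(len(utf8_bytes), ascii_ok, same_bytes)
```

The last comparison shows the backward compatibility: text that uses only ASCII characters is already valid UTF-8.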


### Question 08

Describe the Von Neumann architecture and its significance in modern computing. What is the Von Neumann bottleneck, and how does it affect computer performance?

In [235]:
print("The Von Neumann architecture was a turning point for computing because it is built around the\n\
stored-program design: instructions and data are kept in the same memory, so a computer can be\n\
reprogrammed by loading new instructions rather than being rewired. A computer following the Von\n\
Neumann architecture contains a central processing unit, made up of a control unit that fetches\n\
and decodes instructions and an arithmetic logic unit that executes them, a memory unit that\n\
stores both data and instructions, and input/output devices. Almost all modern general-purpose\n\
computers still follow this design. The Von Neumann bottleneck is that instructions and data\n\
travel between memory and the CPU over the same shared pathway, one transfer at a time. Because\n\
the CPU is much faster than memory, it often sits idle waiting for data, which limits performance.")

The Von Neumann architecture was a turning point for computing because it is built around the
stored-program design: instructions and data are kept in the same memory, so a computer can be
reprogrammed by loading new instructions rather than being rewired. A computer following the Von
Neumann architecture contains a central processing unit, made up of a control unit that fetches
and decodes instructions and an arithmetic logic unit that executes them, a memory unit that
stores both data and instructions, and input/output devices. Almost all modern general-purpose
computers still follow this design. The Von Neumann bottleneck is that instructions and data
travel between memory and the CPU over the same shared pathway, one transfer at a time. Because
the CPU is much faster than memory, it often sits idle waiting for data, which limits performance.
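
The idea of instructions and data sharing one memory can be sketched with a toy simulator. Everything here is hypothetical and invented for illustration: the four opcodes, the tuple encoding, and the names `memory`, `run`, `acc`, and `pc` do not correspond to any real machine. The access counter is only a rough stand-in for traffic on the shared CPU-memory path.

```python
# Toy stored-program machine (hypothetical instruction set, for illustration only).
# Instructions and data live in the SAME memory list.
memory = [
    ("LOAD", 5),     # address 0: acc = memory[5]
    ("ADD", 6),      # address 1: acc = acc + memory[6]
    ("STORE", 7),    # address 2: memory[7] = acc
    ("HALT", None),  # address 3: stop
    None,            # address 4: unused
    40,              # address 5: data
    2,               # address 6: data
    0,               # address 7: result is written here
]

def run(memory):
    """Fetch, decode, and execute until HALT; return (accumulator, memory accesses)."""
    acc = 0       # accumulator register
    pc = 0        # program counter: address of the next instruction
    accesses = 0  # every fetch and every data read/write uses the same memory
    while True:
        opcode, operand = memory[pc]  # fetch the instruction
        accesses += 1
        pc += 1
        if opcode == "LOAD":
            acc = memory[operand]
            accesses += 1
        elif opcode == "ADD":
            acc += memory[operand]
            accesses += 1
        elif opcode == "STORE":
            memory[operand] = acc
            accesses += 1
        elif opcode == "HALT":
            return acc, accesses

result, accesses = run(memory)
print(result, accesses)
```

In this sketch, four instruction fetches and three data transfers all go through the same memory, which is the kind of contention the bottleneck describes.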


### Question 09

Compare and contrast low-level and high-level programming languages. Give two examples of each and explain when you might choose to use one over the other.

In [245]:
print("Low-level and high-level languages both let programmers give instructions to a computer,\n\
and both must ultimately be translated into binary machine code, which is hard for humans to\n\
read but is what the hardware executes. Low-level languages are closer to machine code, and\n\
are consequently more complicated and harder to learn than high-level languages. High-level\n\
languages also hide hardware details, which makes them far more portable across systems.\n\n\
Machine code and assembly (for example, x86 assembly) are examples of low-level languages.\n\
They give direct control over hardware and can be very fast and efficient, so they are used\n\
where speed and performance are critical, such as device drivers and embedded systems.\n\n\
Some high-level languages are Java and C++. These are excellent when development time and\n\
readability matter more than fine-grained hardware control, such as data analysis or web\n\
development, as they are much easier to write, learn, and understand.")

Low-level and high-level languages both let programmers give instructions to a computer,
and both must ultimately be translated into binary machine code, which is hard for humans to
read but is what the hardware executes. Low-level languages are closer to machine code, and
are consequently more complicated and harder to learn than high-level languages. High-level
languages also hide hardware details, which makes them far more portable across systems.

Machine code and assembly (for example, x86 assembly) are examples of low-level languages.
They give direct control over hardware and can be very fast and efficient, so they are used
where speed and performance are critical, such as device drivers and embedded systems.

Some high-level languages are Java and C++. These are excellent when development time and
readability matter more than fine-grained hardware control, such as data analysis or web
development, as they are much easier to write, learn, and understand.


### Question 10

Discuss the concept of abstraction in computer science, using the representation of images in computers as an example. How does this abstraction process impact data analysis and predictive modelling in image-related tasks?

In [249]:
print("Abstraction is a central theme in computer science. It involves hiding unnecessary\n\
detail and exposing only what is essential for the task at hand. This is the concept\n\
behind higher-level languages, as it allows programmers to focus on elements separate\n\
from hardware details and other low-level processes. Using images as an example, we see\n\
that at the lowest level, images are simply a grid of pixels, each encoded as an RGB tuple\n\
representing a color. At higher levels, image recognition software that uses convolutional\n\
neural networks (CNNs) can identify objects in images by detecting recurring patterns\n\
in pixels. Abstraction pushes the boundary of what a computer can interpret, allowing for\n\
increased efficiency in analysis and prediction. For example, in the past, linear regression\n\
required a far more manual and tedious process; now, it can be as simple as importing\n\
a library in Python. Of course there are cons, such as oversimplification, or using a model\n\
despite the data not fitting the assumptions.")

Abstraction is a central theme in computer science. It involves hiding unnecessary
detail and exposing only what is essential for the task at hand. This is the concept
behind higher-level languages, as it allows programmers to focus on elements separate
from hardware details and other low-level processes. Using images as an example, we see
that at the lowest level, images are simply a grid of pixels, each encoded as an RGB tuple
representing a color. At higher levels, image recognition software that uses convolutional
neural networks (CNNs) can identify objects in images by detecting recurring patterns
in pixels. Abstraction pushes the boundary of what a computer can interpret, allowing for
increased efficiency in analysis and prediction. For example, in the past, linear regression
required a far more manual and tedious process; now, it can be as simple as importing
a library in Python. Of course there are cons, such as oversimplification, or using a model
despite the d