In [1]:
!pip3 install otter-grader



In [5]:
# Initialize Otter
import otter
grader = otter.Notebook("assignment1.ipynb")

# DS 453 / 653: Programming Assignment 1

**Due date**: Thursday, January 25 at 8pm on [Gradescope](https://www.gradescope.com/courses/710247).

_You must follow the Academic Code of Conduct and Collaboration Policy stated in the course syllabus at all times while working on this assignment._

This assignment contains 5 questions, each worth 1 point. You must receive at least 4 points to pass the assignment.

To begin, please execute the code block below:

In [3]:
import otter
grader = otter.Notebook()

**Question 1 (Integer to Hex):** Write a function `int_byte_to_hex` that receives a Python integer in the range 0 to 255, and returns a Python string corresponding to the same number in base 16 / hexadecimal format.

As background: remember that a _byte_ of information contains 8 bits, so it can be used to represent a number between 0 and 255 (inclusive). In crypto, we often convert between different representations of a byte of information. One common representation that we will use often in this class is hexadecimal format, in which each symbol (0-9 or a-f) corresponds to a number between 0 and 15, and a byte can be represented with exactly two symbols.
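
As a quick illustration of the background above, the sketch below splits a byte into its two 4-bit halves and uses Python's built-in `format` to show the two-symbol hex form. The helper name `split_nibbles` is made up for this example and is not part of the assignment.

```python
def split_nibbles(value):
    """Return the (high, low) 4-bit halves of a byte value in 0..255."""
    return value >> 4, value & 0x0F

for value in (10, 65, 161, 255):
    high, low = split_nibbles(value)
    # format(..., "02x") pads to exactly two lowercase hex symbols
    print(value, (high, low), format(value, "02x"))

# int(..., 16) parses a hex string back into an integer
assert int("a1", 16) == 161
```

Each half is a number between 0 and 15, which is why one hex symbol per half is enough.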

In [4]:
def int_byte_to_hex(the_input):
    """ Take a single byte, given as a Python integer, and return a string of its
    hexadecimal value. This string should use lowercase and should always be
    exactly two characters long. Make sure you pad the beginning with a 0 if
    necessary, and make sure the string does NOT start with '0x'.

    Example test cases:

        255 -> "ff"
        10 -> "0a"
        65 -> "41"
        161 -> "a1"
    """
    assert(type(the_input) == int)               # the input is an integer
    assert(the_input >= 0 and the_input <= 255)  # between 0 and 255, inclusive

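    # Hex symbols for the digit values 10 through 15
    # (named hex_letters to avoid shadowing the built-in map)
    hex_letters = {10:"a", 11:"b", 12:"c", 13:"d", 14:"e", 15:"f"}

    # First digit --> d1
    d1 = the_input // 16
    if d1 > 9:
        d1 = hex_letters[d1]

    # Second digit --> d2
    d2 = the_input % 16
    if d2 > 9:
        d2 = hex_letters[d2]

    res = str(d1) + str(d2)
    assert not res.startswith("0x")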

    return res

As you will see throughout this course, crypto code can be incredibly precise and finicky. For this reason, most programming questions will be accompanied by **test cases**, which are example inputs and outputs to a function that allow you to check whether your code is working properly.

In this question, we provide test cases in human-readable format in the comment above, and we also provide unit tests for you to use below.
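
If you would like to run the human-readable test cases yourself, one possible approach is to turn them into assertions, as in the sketch below. The function `candidate` is a stand-in written only for this example; replace it with your own function when checking your work.

```python
def candidate(value):
    # Stand-in implementation for this example; swap in your own function
    return format(value, "02x")

# The four example test cases listed for Question 1
test_cases = {255: "ff", 10: "0a", 65: "41", 161: "a1"}

for given, expected in test_cases.items():
    actual = candidate(given)
    assert actual == expected, f"{given}: expected {expected!r}, got {actual!r}"

print("all test cases passed")
```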

In [5]:
grader.check("q1")

**Question 2 (Binary string to Hex):** Write a function `string_to_hexstring` that receives a Python bytestring as input, and returns the hexadecimal representation of the same bytestring.

As a hint: the `binascii` library includes a function that will do this for you. Take a look at its documentation here: https://docs.python.org/3/library/binascii.html

In [3]:
def string_to_hexstring(the_input):
    """ Take in a string (byte object), and return the hex string of the bytes
    corresponding to it. While the hexlify() command will do this for you,
    we ask that you instead solve this question by combining the methods
    you have written so far in this assignment.

    Example test case: b"puzzle" -> b"70757a7a6c65"
    """
    import binascii
    
    return binascii.b2a_hex(the_input)

Once again, we provide several unit tests for you. Note that in Python, bytestrings _must_ begin with the letter `b` before the quotation marks at the start of a string. Remember **always** to do this when creating bytestrings in this course! (Otherwise, you can find yourself stuck with an annoying error that is difficult to debug.)

The prefix `b` changes the type of the Python variable from `str` to `bytes`. We will usually prefer to use `bytes` in this course.
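
To make the distinction concrete, here is a small sketch of converting between the two types with `str.encode` and `bytes.decode`. It assumes UTF-8 as the text encoding, which is a choice made for this example rather than a course requirement.

```python
text = "puzzle"              # type str
raw = text.encode("utf-8")   # type bytes
assert raw == b"puzzle"
assert raw.decode("utf-8") == text

# Indexing a bytes object gives an integer, not a one-character string
first = raw[0]
print(first)  # 112, the byte value of "p"

# Mixing the two types raises a TypeError
try:
    "puzzle" + b"puzzle"
except TypeError as err:
    print("TypeError:", err)
```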

In [7]:
print(type("puzzle"), type(b"puzzle"))

<class 'str'> <class 'bytes'>


In [6]:
grader.check("q2")

**Question 3 (Length of bytestring vs hexadecimal string):** Note from the previous question that the length of a Python bytestring is different from the length of its hexadecimal encoding! Let's explore that in more detail in this question.

Write a function `hex_length` that receives an integer length, constructs a bytestring of the given length, and measures the length of the resulting hexadecimal string. (We've actually done the first part for you.)

In [23]:
def hex_length(input_length):
    """ Construct a test_bytestring of the provided input_length,
        convert it to hexadecimal format, and then output
        the length of the resulting hex string.

    Example test case: 6 -> 12
    """
    test_bytestring = input_length * b'\x00'
    print(test_bytestring)

    import binascii
    test_hexstring = binascii.b2a_hex(test_bytestring)
    print(test_hexstring)

    return len(test_hexstring)

In [24]:
hex_length(5)

b'\x00\x00\x00\x00\x00'
b'0000000000'


10

This time, we have provided only one publicly visible test. But you should try to run this function on many inputs, and observe the resulting pattern.
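
One way to try many inputs is a short loop like the sketch below. To stay self-contained, it uses the built-in `bytes.hex()` method rather than the `hex_length` function above, and the chosen lengths are arbitrary.

```python
# Compare each bytestring length with the length of its hex encoding
lengths = [0, 1, 2, 6, 16, 100]
hex_lengths = []

for n in lengths:
    test_bytestring = n * b"\x00"
    hex_lengths.append(len(test_bytestring.hex()))
    print(n, "->", hex_lengths[-1])
```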

In [10]:
grader.check("q3")

**Question 4 (Hex to Base64):** Write a function `hexstring_to_base64` that receives a hexadecimal encoding of a bytestring, and returns the Base64 encoding of the same bytestring.

(Note: this is actually the first challenge in the [Cryptopals](https://cryptopals.com/sets/1/challenges/1) crypto coding challenges. We might see some other challenges from Cryptopals later in the semester.)

This time, you might find the `base64` library to be useful. Whenever you explore a new Python library in this course, it is always a good idea to take a few minutes to read its documentation. Here's the documentation for `base64`: https://docs.python.org/3/library/base64.html
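
As a small warm-up with the library, the sketch below uses `base64.b64encode` and `base64.b64decode` on a short bytestring. The sample input `b"puzzle"` is borrowed from Question 2 and is only for illustration.

```python
import base64

raw = b"puzzle"
encoded = base64.b64encode(raw)      # bytes in, bytes out
decoded = base64.b64decode(encoded)  # decoding reverses the encoding
print(encoded)  # b'cHV6emxl'
assert decoded == raw

# Base64 packs every 3 input bytes into 4 output characters,
# so this 6-byte input becomes 8 characters
assert len(encoded) == 8
```

Note that both functions work on `bytes`, not on `str`, which matches the course's preference for bytestrings.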

In [11]:
import base64
import binascii

def hexstring_to_base64(the_input):
    """ Convert the_input from hex to base64 format.

    Example test cases:
    b'1de965a3ef96a2b95d' -> b'Hello++World'
    b'49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d' -> b'SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t'
    """
    # Convert the hexstring input into bytestring
    bytestr = binascii.a2b_hex(the_input)

    # Return the base-64 encoding of the bytestring
    return base64.b64encode(bytestr)

In [12]:
grader.check("q4")

**Question 5 (One-time pad):** Convert the inputs `hexMessage` and `hexKey` from hex to bytestrings, take the XOR of the two strings, and convert the result back to hex form.

In more detail: this procedure performs our first cryptographic function, a _one-time pad_. The one-time pad takes two inputs, typically called the `message` and the `key`, and returns an output called the `ciphertext`. We will explore the one-time pad in more detail in a future week. For now, all that matters is that the `ciphertext` is defined as `message` XOR `key` -- where the XOR is done on a byte-by-byte basis, and the XOR is done over the raw bytestrings.
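
To see what a byte-by-byte XOR means, here is a minimal pure-Python sketch. The helper `xor_bytes` and the sample values are hypothetical, chosen only to illustrate the operation; this is not a library function.

```python
def xor_bytes(left, right):
    """XOR two equal-length bytestrings byte by byte."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))

message = b"\x0f\xf0\xaa"
key = b"\xff\xff\x55"
ciphertext = xor_bytes(message, key)
print(ciphertext.hex())  # f00fff

# XORing with the same key a second time recovers the message
assert xor_bytes(ciphertext, key) == message
```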

However, it is often convenient for the inputs and outputs of a function to be hexadecimal representations of strings. The reason for that is simple: raw bytestrings might not always be printable, but hex strings always are. Consider for example the following 4-byte string.

In [13]:
example_str = b'\x00\x01\x02\x03'
print("The length of example_str is:\t\t\t\t",         len(example_str))
print("If we try to print it, the output is difficult to parse:", example_str)
print("The hex representation is easy to read and copy:\t", string_to_hexstring(example_str))

The length of example_str is:				 4
If we try to print it, the output is difficult to parse: b'\x00\x01\x02\x03'
The hex representation is easy to read and copy:	 b'00010203'


For this reason, we will encode our inputs in the hexadecimal representation. Your code must convert them to bytestrings first, and then perform the XOR function. There is a Python function in the `Crypto.Util` module that performs the string XOR for you.

(If you are running this code on your own machine rather than Google Colab: in order to use this package, you must install the `pycrypto` package on your computer using a command such as `pip3 install pycrypto`.)

In [14]:
!pip3 install pycrypto



In [15]:
from Crypto.Util.strxor import strxor
import base64
import binascii

def one_time_pad(hexMessage, hexKey):
    """ Convert both inputs from hexadecimal to raw bytestrings,
        take the xor of each byte of their strings, convert the
        result back into hexadecimal format, and output it.

        You can assume as a precondition that the two inputs have the same length.
        We provide some test cases below.
    """
    assert(len(hexMessage) == len(hexKey))
    bytestrMessage = binascii.a2b_hex(hexMessage)
    bytestrKey = binascii.a2b_hex(hexKey)
    res = strxor(bytestrMessage, bytestrKey)
    return binascii.b2a_hex(res)

In [16]:
grader.check("q5")

## Submitting the Assignment

Congratulations on completing the first assignment! Here's how to submit it and receive credit.

**Documenting collaborators, sources, and AI tools:** In accordance with the collaboration policy, use the space below to report if you used any resources to complete this homework assignment, aside from the lecture notes and the course textbooks/videos. Specifically, please report:

1. Names of all classmates you worked with, and a short description of the work that you performed together.
2. All written materials that you used, such as books or websites (besides the lecture notes or textbooks). Please include links to any web-based resources, or citations to any physical works.
3. All code that you used from other sources. In particular, if you used an AI tool, then you must include the entire exchange with the AI tool, as per the [CDS Generative AI Assistance Policy](https://www.bu.edu/cds-faculty/culture-community/gaia-policy/).

_Your response:_

1. N/A

2. N/A

3. N/A

**Sending to Gradescope:**

After completing the assignment:

- If you did the assignment on Colab, download it in `.ipynb` format.
- If you did the assignment locally on your machine, find the `.ipynb` file in your working directory.

Then submit only the `.ipynb` file to Coding Assignment 1 on Gradescope. The autograder may take a while to check your work.

In [2]:
from hashlib import sha256
from binascii import hexlify

redacted = 'h'
stri = redacted.encode('utf-8')
print(stri)

hashed = sha256(stri).digest()
print(hashed)

hashedhex = hexlify(hashed)
print(hashedhex)

result = hexlify(hashedhex).decode('utf-8')
print(result)

b'h'
b'\xaa\xa9@&d\xf1\xa4\x1f@\xeb\xbcR\xc9\x99>\xb6j\xeb6f\x02\x95\x8f\xdf\xaa(;q\xe6M\xb1#'
b'aaa9402664f1a41f40ebbc52c9993eb66aeb366602958fdfaa283b71e64db123'
61616139343032363634663161343166343065626263353263393939336562363661656233363636303239353866646661613238336237316536346462313233
