## Analysis and Refactoring of ```test_scripts``` Code

As part of the challenge, I analyzed the provided sample code to identify security issues, departures from best practice, and other areas for improvement. This notebook presents my findings and a refactored solution in Python.

--- 

### 1. Security Audit and Critique of the Original Code
The shell script ```create_hashes_functions.sh``` has several vulnerabilities and deficiencies.

Critical Security Risk (```eval```): The script uses the ```eval``` command, which executes its argument as shell code. If an attacker can influence that input, the result is command injection.

Inefficiency and External Dependency: The script relies on external commands such as ```openssl``` and ```base64```, so every operation spawns a new subprocess and the script only works where those tools are installed.

Lack of Modularity and Maintainability: The code is repetitive and monolithic, making it difficult to read and maintain.
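
To make the ```eval``` concern concrete, here is a minimal Python sketch; it is not code from the original script. It treats a hostile-looking string purely as data: ```hashlib``` hashes the bytes without involving a shell, and ```shlex.quote``` shows what safe quoting would look like if a shell call were unavoidable. The helper name ```sha256_hex``` and the sample payload are illustrative assumptions.

```python
import hashlib
import shlex


def sha256_hex(untrusted: str) -> str:
    """Hash the input as raw bytes; nothing is ever passed to a shell."""
    return hashlib.sha256(untrusted.encode("utf-8")).hexdigest()


# A string that would be dangerous if it were spliced into an eval'd command.
payload = "hello; rm -rf /tmp/demo"

# Safe: the payload is only data to the hash function.
digest = sha256_hex(payload)

# If a shell command had to be built, quoting keeps the payload a single argument.
quoted = shlex.quote(payload)

print(digest)
print(quoted)
```

The design point is that the refactored approach never needs quoting at all, because the input never reaches a command line.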

---

### 2. Environment Preparation
To run the refactored solution in full, two third-party Python libraries are needed:

```python
# Install the third-party libraries used for BLAKE3 hashing and Base58 encoding.
# In a notebook, a leading '!' runs the rest of the line as a shell command.
!pip install blake3 base58 -q
```

### 3. Refactored Solution in Python

This section presents the refactored Python code, which is safer, more modular, and portable. It uses native Python libraries like hashlib and base64, which eliminates the need for external subprocesses.

### Code Explanation: Refactored Hash Generation Script
This script refactors the original ```create_hashes_functions.sh``` shell script into a more secure, efficient, and maintainable Python solution. It replaces external command-line tool calls with native Python libraries and functions.

---

### How the Code Works

Library Imports and Error Handling

* Purpose: This block imports the necessary modules and gracefully handles cases where optional third-party libraries (```blake3```, ```base58```) are not installed.

* Functionality: It uses a ```try...except ImportError``` block for each optional library. If a library isn't found, it sets the corresponding variable to ```None``` and prints a warning message. This prevents the script from crashing and allows it to run with partial functionality.
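
The same pattern can be factored into a small helper, sketched below under the assumption that a warning plus a ```None``` return value is the desired fallback. The name ```optional_import``` and the deliberately bogus package name are hypothetical and exist only for illustration.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        print(f"Warning: the '{module_name}' library is not installed.")
        return None


# A standard-library module is always importable.
json_module = optional_import("json")

# A made-up name stands in for a missing third-party package.
missing = optional_import("surely_not_an_installed_package_xyz")
```

Callers can then branch on ```if module is None``` exactly as the notebook does for ```blake3``` and ```base58```.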

```generate_hashes``` Function

* Purpose: This is the core function that generates various hashes and encodings from an input string.

* Workflow:

1. Input Encoding: It first converts the input string into bytes using ```.encode('utf-8')```, which is a necessary step for cryptographic hashing.

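2. Standard Hashes: It uses Python's ```hashlib``` module to compute MD5, SHA1, SHA256, SHA512, and RIPEMD160 in-process, which avoids the cost of launching an external command for each hash. RIPEMD160 is requested through ```hashlib.new('ripemd160', ...)``` and is only available when the underlying OpenSSL build provides it.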

3. Encodings: It uses the ```base64``` module to perform Base64 encoding natively.

4. Optional Hash and Encoding: It checks whether the ```blake3``` and ```base58``` libraries were imported. If they were, it uses them to compute the BLAKE3 hash and the Base58 encoding; otherwise, it stores a message stating that the library is unavailable.

5. Return Value: The function returns a dictionary that maps each hash or encoding name to its result, which keeps the output structured and easy to iterate over.
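
Because some algorithms requested through ```hashlib.new``` depend on the local OpenSSL build, a defensive lookup can be useful. The following is a minimal sketch, assuming that returning ```None``` for an unsupported algorithm is acceptable; the helper name ```hex_digest_or_none``` is hypothetical.

```python
import hashlib


def hex_digest_or_none(algorithm: str, data: bytes):
    """Return the hex digest, or None if hashlib does not support the algorithm."""
    try:
        return hashlib.new(algorithm, data).hexdigest()
    except ValueError:
        # hashlib.new raises ValueError for unsupported or disabled algorithms.
        return None


sample = "abc".encode("utf-8")

sha256_result = hex_digest_or_none("sha256", sample)
unknown_result = hex_digest_or_none("no-such-algorithm", sample)

# Whether 'ripemd160' works depends on the OpenSSL build on this machine.
ripemd_result = hex_digest_or_none("ripemd160", sample)
print(sha256_result, unknown_result, ripemd_result)
```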

```create_csv``` Function

* Purpose: This function safely and correctly writes the hash results to a CSV file.

* Workflow:

1. Error Check: It first checks whether the ```hash_results``` dictionary is empty and, if so, prints a message and returns without creating a file.

2. Header and Data Preparation: It creates a ```header``` list from the dictionary's keys and a ```data_row``` list from its values. It also prepends an "Original string" column. As called in this notebook, that column receives the placeholder text ```Your input string here``` rather than the actual input.

3. File Handling: It uses Python's standard ```with open(...)``` block, which ensures the file is automatically closed, even if an error occurs.

4. CSV Module: It uses the ```csv``` module's ```writer``` object. This is a critical improvement over the original script's manual string concatenation, because the writer quotes fields that contain commas, quote characters, or newlines, preventing formatting errors and data corruption.

5. Writing: It uses ```writer.writerow()``` to write both the header and the data row to the file.

6. User Feedback: It prints a message confirming that the file has been created.
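
The difference between manual concatenation and ```csv.writer``` can be seen in a small in-memory sketch. The sample row is made up for illustration, and ```io.StringIO``` stands in for a real file.

```python
import csv
import io

# One field contains a comma and another contains double quotes.
row = ["a,b", 'say "hi"', "plain"]

# Manual concatenation: the embedded comma silently creates an extra column.
naive_line = ",".join(row)
naive_fields = naive_line.split(",")

# csv.writer quotes and escapes fields as needed.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written_line = buffer.getvalue()

# Reading the line back recovers the original three fields.
parsed_fields = next(csv.reader(io.StringIO(written_line)))

print(len(naive_fields), len(parsed_fields))
```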

---

### Benefits of the Refactored Code

* Enhanced Security: By using native libraries and removing the dangerous ```eval``` command, the script is no longer vulnerable to command injection attacks.

* Improved Efficiency: It avoids the overhead of creating new sub-processes for each hashing operation, resulting in faster execution.

* Increased Portability: The core of the script relies on Python's standard library, so it runs on any system with a Python interpreter, regardless of which external command-line tools are installed. Only the BLAKE3 hash and the Base58 encoding need third-party packages, and the script still runs without them.

* Higher Maintainability: The code is more modular, readable, and easier to debug, thanks to clear function definitions, docstrings, and a structured data flow. It also follows Python's style conventions (PEP 8).

```python
import base64
import csv
import hashlib

# Optional third-party libraries: fall back to None so the rest of the
# script still runs when they are missing.
try:
    import blake3
except ImportError:
    blake3 = None
    print("Warning: The 'blake3' library is not installed. Some hash functions may not work.")

try:
    import base58
except ImportError:
    base58 = None
    print("Warning: The 'base58' library is not installed. Some encoding functions may not work.")


def generate_hashes(input_string):
    """
    Generates various hashes and encodings for a given input string.

    Args:
        input_string (str): The text string to process.

    Returns:
        dict: A dictionary with the hash names as keys and the results as values.
    """
    hash_results = {}
    input_bytes = input_string.encode('utf-8')

    # Hashes from the standard hashlib module
    hash_results["MD5 hash"] = hashlib.md5(input_bytes).hexdigest()
    hash_results["SHA1 hash"] = hashlib.sha1(input_bytes).hexdigest()
    hash_results["SHA256 hash"] = hashlib.sha256(input_bytes).hexdigest()
    hash_results["SHA512 hash"] = hashlib.sha512(input_bytes).hexdigest()

    # RIPEMD160 is provided by OpenSSL and may be disabled in some builds.
    try:
        hash_results["RIPEMD160 hash"] = hashlib.new('ripemd160', input_bytes).hexdigest()
    except ValueError:
        hash_results["RIPEMD160 hash"] = "ripemd160 not available in this OpenSSL build"

    # Encodings
    hash_results["Base64 encoding"] = base64.b64encode(input_bytes).decode('utf-8')

    # Hashes and encodings from third-party libraries (if available)
    if blake3:
        hash_results["Blake3 hash"] = blake3.blake3(input_bytes).hexdigest()
    else:
        hash_results["Blake3 hash"] = "blake3 library not available"

    if base58:
        hash_results["Base58 encoding"] = base58.b58encode(input_bytes).decode('utf-8')
    else:
        hash_results["Base58 encoding"] = "base58 library not available"

    return hash_results


def create_csv(csv_file, hash_results, original_string="Your input string here"):
    """
    Writes the hash results to a CSV file.

    Args:
        csv_file (str): The name of the CSV file to create.
        hash_results (dict): The dictionary with the hash results.
        original_string (str): Value for the "Original string" column.
            Defaults to a placeholder; pass the real input to record it.
    """
    if not hash_results:
        print("No hash results to write to CSV.")
        return

    header = ["Original string"] + list(hash_results.keys())
    data_row = [original_string] + list(hash_results.values())

    # newline='' lets the csv module control line endings itself.
    with open(csv_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerow(data_row)

    print(f"\nCSV file created: {csv_file}")
```

### 4. Demonstration and Results
This final section of the notebook demonstrates how to use the functions defined above. It runs the complete workflow on a sample string, from hash generation to CSV output.

---

### How the Code Works

Test Text String (```input_text```)

* Purpose: This line defines the string that will be processed. It acts as the test input to demonstrate that the ```generate_hashes``` function works correctly.

Generate the Hashes (```results = generate_hashes(input_text)```)

* Purpose: This function call executes the core logic of the script. It passes the ```input_text``` string to the ```generate_hashes``` function, which computes all the specified hashes and encodings.

* Output: The returned dictionary containing all the hash results is stored in the ```results``` variable.

Print the Results

* Purpose: This block of code provides immediate feedback to the user by printing the generated hashes directly to the console.

* Functionality: It iterates through the ```results``` dictionary and prints each hash name and its corresponding value in a clear, formatted list. This allows for a quick visual check of the output without needing to open a file.

Create the CSV File (```create_csv("refactored_hashes.csv", results)```)

* Purpose: This line calls the ```create_csv``` function to save the results in a structured format.

* Functionality: It passes the desired filename (```refactored_hashes.csv```) and the ```results``` dictionary to the function, which creates a new file on the local system in comma-separated format that any spreadsheet program can open. A successful run shows that the entire workflow, from computation to file output, works as intended.

```python
# Test text string
input_text = "The company's home assignment is interesting."

# Generate the hashes
results = generate_hashes(input_text)

# Print the results
print("Hash Results:")
for key, value in results.items():
    print(f"- {key}: {value}")

# Create the CSV file
create_csv("refactored_hashes.csv", results)
```

Hash Results:
- MD5 hash: 8ee0a1933adc156d1de42de8ce80507c
- SHA1 hash: 253da8490160d4bcad4120a9f2d410996d6c3a57
- SHA256 hash: 405897a95b98c34a797a39a7e72a68c437d13fbfc2a8fad70944f308db8f2d4a
- SHA512 hash: a1417afb139b9358467be1541f9b655d191cedaa65b28fd03ca4ae8dfde83b9c128867ed20d45279201b72bca7a684e52f9a7cc352d6240bd5b6a5fa71787c3f
- RIPEMD160 hash: 2e34060ce1a2762130e4c195ccad4701bef08c2b
- Base64 encoding: VGhlIGNvbXBhbnkncyBob21lIGFzc2lnbm1lbnQgaXMgaW50ZXJlc3Rpbmcu
- Blake3 hash: 5737c635db64cb9e4666faf019a0b41d50ac0b441c8dc6307bd3666cfdee7ff3
- Base58 encoding: 368hQnHiV8SKYx7HhEHEVu3Q8Ze8SSVwFyeoBGyqw1j42TDb6L6UH4YgtG8HpZ

CSV file created: refactored_hashes.csv
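
As a follow-up check, the file can be read back with ```csv.DictReader``` to confirm that the header and the data row line up. The sketch below is self-contained: it writes a small stand-in file to a temporary directory instead of relying on ```refactored_hashes.csv```, and the helper name ```read_hash_csv```, the file name, and the sample values are all illustrative assumptions.

```python
import csv
import os
import tempfile


def read_hash_csv(path):
    """Return the CSV rows as a list of dicts keyed by the header names."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_hashes.csv")

    # Write a stand-in file with the same layout: one header row, one data row.
    with open(demo_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Original string", "MD5 hash"])
        writer.writerow(["hello, world", "placeholder-digest"])

    rows = read_hash_csv(demo_path)

for name, value in rows[0].items():
    print(f"- {name}: {value}")
```

Pointing ```read_hash_csv``` at the real output file would list each column next to its stored value in the same way.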
