**Student Name:** Sneha Tamang

**University ID:** 2358894

# Task 3
**# Password Cracking using CUDA (30% - 100 marks)**

Using a similar concept as question 2, you will now crack passwords using CUDA. As a kernel function cannot use the crypt library, you will be given an encryption function instead which will generate a password for you.  Your program will take in an encrypted password and decrypt it using many threads on the GPU. CUDA allows multidimensional thread configurations so your kernel function (which runs on the GPU) will need to be modified according to how you call your function.

**Generate encrypted password in the kernel function (using CudaCrypt function) to be compared to original encrypted password (25 marks)**

**Allocating the correct amount of memory on the GPU based on input data. Memory is freed once used (15 marks)**

**Program works with multiple blocks and threads – the number of blocks and threads will depend on your kernel function. You will not be penalised if your program only works with a set number of blocks and threads however, your program must use more than one block (axis is up to you) and more than one thread (axis is up to you) (40 marks)**

**Decrypted password sent back to the CPU and printed (20 marks)**



In [None]:
!nvidia-smi

Wed Jan  8 15:57:08 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0


In [None]:
%%writefile Encrypt.cu
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda_runtime.h>

// CUDA function to perform encryption using custom rules
__device__ char* CudaCrypt(char* rawPassword) {
    char *newPassword = (char *) malloc(sizeof(char) * 11);

    newPassword[0] = rawPassword[0] + 2;
    newPassword[1] = rawPassword[0] - 2;
    newPassword[2] = rawPassword[0] + 1;
    newPassword[3] = rawPassword[1] + 3;
    newPassword[4] = rawPassword[1] - 3;
    newPassword[5] = rawPassword[1] - 1;
    newPassword[6] = rawPassword[2] + 2;
    newPassword[7] = rawPassword[2] - 2;
    newPassword[8] = rawPassword[3] + 4;
    newPassword[9] = rawPassword[3] - 4;
    newPassword[10] = '\0';

    // Apply limits for lowercase and numbers
    for (int i = 0; i < 10; i++) {
        if (i >= 0 && i < 6) {  // Lowercase letters
            if (newPassword[i] > 122) {
                newPassword[i] = (newPassword[i] - 122) + 97;
            } else if (newPassword[i] < 97) {
                newPassword[i] = (97 - newPassword[i]) + 97;
            }
        } else {  // Digits (0-9)
            if (newPassword[i] > 57) {
                newPassword[i] = (newPassword[i] - 57) + 48;
            } else if (newPassword[i] < 48) {
                newPassword[i] = (48 - newPassword[i]) + 48;
            }
        }
    }

    return newPassword;
}

// CUDA kernel to apply CudaCrypt function on a password
__global__ void encrypt_kernel(char *inputPassword, char *encryptedPassword) {
    int index = blockIdx.x * blockDim.x + threadIdx.x;

    // Assume we are only processing 4 characters for simplicity here
    if (index < 1) {
        char rawPassword[4];
        rawPassword[0] = inputPassword[0];
        rawPassword[1] = inputPassword[1];
        rawPassword[2] = inputPassword[2];
        rawPassword[3] = inputPassword[3];

        char* encrypted = CudaCrypt(rawPassword);

        // Copy the encrypted password back to global memory
        for (int i = 0; i < 10; i++) {
            encryptedPassword[i] = encrypted[i];
        }
    }
}

int main(int argc, char *argv[]) {
    // Ensure a password is provided as argument
    if (argc < 2) {
        printf("Usage: %s <password>\n", argv[0]);
        return 1;
    }

    // Allocate memory for the output encrypted password
    char encryptedPassword[11];  // 10 characters + null terminator

    // Allocate device memory
    char *d_inputPassword, *d_encryptedPassword;
    cudaMalloc((void**)&d_inputPassword, sizeof(char) * 4);  // Assuming password is 4 characters
    cudaMalloc((void**)&d_encryptedPassword, sizeof(char) * 11);

    // Copy the input password to device memory
    cudaMemcpy(d_inputPassword, argv[1], sizeof(char) * 4, cudaMemcpyHostToDevice);

    // Launch kernel
    encrypt_kernel<<<1, 1>>>(d_inputPassword, d_encryptedPassword);
    cudaDeviceSynchronize();

    // Copy the result back to host memory
    cudaMemcpy(encryptedPassword, d_encryptedPassword, sizeof(char) * 11, cudaMemcpyDeviceToHost);

    // Print the result (in plain text)
    printf("Encrypted Password: %s\n", encryptedPassword);

    // Free device memory
    cudaFree(d_inputPassword);
    cudaFree(d_encryptedPassword);

    return 0;
}


Writing Encrypt.cu


In [None]:
!nvcc Encrypt.cu -o cudacrup

In [None]:
!./cudacrup hp22

Encrypted Password: jfismo4062


In [12]:
%%writefile passwordCrack.cu
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// CUDA functions for string copying and comparison, designed for GPU usage
// We cannot use standard C functions like strcpy() or strcmp() in GPU code

// Function to copy a string from source to destination on the GPU
__device__ char * copyStr(char *dest, const char *src){
  int i = 0;
    // Copy each character from the source to the destination until the null terminator is reached
  do {
    dest[i] = src[i];
  }
  while (src[i++] != 0); // Increment index until null terminator is found
  return dest; // Return pointer to the destination string
}

// Function to compare two strings on the GPU
__device__ bool compareStr(const char *strA, const char *strB, unsigned len = 11){
  unsigned i = 0;
  // Loop through the characters of both strings to compare them
  while (i < len) {
    if (strA[i] != strB[i]) {
      return false;  // Return false immediately if characters don't match
    }
    i++;
  }
  return true;  // Return true if all characters match
}

// Kernel function for encrypting the password
__device__ void cudaCrypt(char *newPassword, const char *rawPassword) {
  // Encrypt the generated password
  newPassword[0] = rawPassword[0] + 2;
  newPassword[1] = rawPassword[0] - 2;
  newPassword[2] = rawPassword[0] + 1;
  newPassword[3] = rawPassword[1] + 3;
  newPassword[4] = rawPassword[1] - 3;
  newPassword[5] = rawPassword[1] - 1;
  newPassword[6] = rawPassword[2] + 2;
  newPassword[7] = rawPassword[2] - 2;
  newPassword[8] = rawPassword[3] + 4;
  newPassword[9] = rawPassword[3] - 4;
  newPassword[10] = '\0';

  // Apply range checks to ensure characters stay within certain limits

  for (int i = 0; i < 10; i++) {
    if (i >= 0 && i < 6) {
      // Ensure characters remain in the lowercase 'a' to 'z' range
      if (newPassword[i] > 122) {
        newPassword[i] = (newPassword[i] - 122) + 97;  // Wrap around if greater than 'z'
      } else if (newPassword[i] < 97) {
        newPassword[i] = (97 - newPassword[i]) + 97;   // Wrap around if less than 'a'
      }
    } else {
      // Ensure characters remain in the digit '0' to '9' range
      if (newPassword[i] > 57) {
        newPassword[i] = (newPassword[i] - 57) + 48; // Wrap around if greater than '9'
      } else if (newPassword[i] < 48) {
        newPassword[i] = (48 - newPassword[i]) + 48;  // Wrap around if less than '0'
      }
    }
  }
}

// Kernel function to try all possible combinations for cracking the password
__global__ void findPassword(char *D_chars, char *D_digits, char *D_encPwd, bool *D_passwordFound) {
  char rawPassword[4];
  char newPassword[11];  // To hold the encrypted password

  // Generate a password combination using grid and thread indices
  // The grid's blockIdx.x and blockIdx.y will determine which characters are picked from D_chars
  // The thread's threadIdx.x and threadIdx.y will determine which digits are picked from D_digits
  rawPassword[0] = D_chars[blockIdx.x]; // character based on blockIdx.x
  rawPassword[1] = D_chars[blockIdx.y]; // character based on blockIdx.y
  rawPassword[2] = D_digits[threadIdx.x]; // digit based on threadIdx.x
  rawPassword[3] = D_digits[threadIdx.y]; // digit based on threadIdx.y

  // Encrypt the generated password using the cudaCrypt function
  cudaCrypt(newPassword, rawPassword);

  // Compare the generated encrypted password with the target encrypted password
  if (compareStr(newPassword, D_encPwd)) {
    // Password matched, set flag to true
    *D_passwordFound = true;
    copyStr(D_encPwd, rawPassword);  // Store the decrypted password
  }
}

int main(int argc, char **argv) {
  // Define possible characters and digits for password cracking
  char H_availableChars[26] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'};
  char H_availableDigits[26] = {'0','1','2','3','4','5','6','7','8','9'};
  char H_newPassword[11];  // For user input
  char * H_decryptedPwd = (char *)malloc(sizeof(char) * 11);  // Ensure we allocate enough space
  bool * H_passwordFound = (bool *)malloc(sizeof(bool));  // Flag for password found

  // Get the encrypted password from the user, ensuring it's not empty
  while (1) {
    printf("Enter the Encrypted password: ");
    fgets(H_newPassword, sizeof(H_newPassword), stdin);
    // Remove newline character if any
    if (H_newPassword[strlen(H_newPassword) - 1] == '\n') {
      H_newPassword[strlen(H_newPassword) - 1] = '\0';
    }

    if (strlen(H_newPassword) > 0) {
      break;  // Exit loop if password is not empty
    } else {
      printf("The encryption cannot be empty. Please write your encrypted password: \n");
    }
  }

  // Allocate memory on GPU (Device memory)
  char * D_chars, * D_digits, * D_encPwd;
  bool * D_passwordFound;
  cudaMalloc((void**)&D_chars, sizeof(char) * 26);
  cudaMemcpy(D_chars, H_availableChars, sizeof(char) * 26, cudaMemcpyHostToDevice);

  cudaMalloc((void**)&D_digits, sizeof(char) * 26);
  cudaMemcpy(D_digits, H_availableDigits, sizeof(char) * 26, cudaMemcpyHostToDevice);

  cudaMalloc((void**)&D_encPwd, sizeof(char) * 11);  // Use correct length for encrypted password
  cudaMemcpy(D_encPwd, H_newPassword, sizeof(char) * 11, cudaMemcpyHostToDevice);

  // Allocate memory for passwordFound flag
  cudaMalloc((void**)&D_passwordFound, sizeof(bool));
  cudaMemcpy(D_passwordFound, H_passwordFound, sizeof(bool), cudaMemcpyHostToDevice);

  // Start measuring time
  cudaEvent_t startTime, endTime;
  float elapsedTime;
  cudaEventCreate(&startTime);
  cudaEventCreate(&endTime);
  cudaEventRecord(startTime);

  // Launch the kernel to crack the password
  findPassword<<<dim3(26, 26, 1), dim3(10, 10, 1)>>>(D_chars, D_digits, D_encPwd, D_passwordFound);
  cudaDeviceSynchronize();  // Wait for the kernel to finish execution

  cudaEventRecord(endTime);
  cudaEventSynchronize(endTime);
  cudaEventElapsedTime(&elapsedTime, startTime, endTime);

  // Copy the flag back to host memory
  cudaMemcpy(H_passwordFound, D_passwordFound, sizeof(bool), cudaMemcpyDeviceToHost);
  cudaMemcpy(H_decryptedPwd, D_encPwd, sizeof(char) * 11, cudaMemcpyDeviceToHost);  // Correct length

  if (*H_passwordFound) {
    printf("\nPassword Found!");
    printf("\nDecrypted Password: %s\n", H_decryptedPwd);
  } else {
    printf("\nPassword not found.\n");
  }

  printf("Time taken: %fs\n", elapsedTime / 1000);

  // Clean up
  free(H_decryptedPwd);
  free(H_passwordFound);
  cudaFree(D_chars);
  cudaFree(D_digits);
  cudaFree(D_encPwd);
  cudaFree(D_passwordFound);
  cudaEventDestroy(startTime);
  cudaEventDestroy(endTime);

  return 0;
}

Overwriting passwordCrack.cu


In [13]:
!nvcc passwordCrack.cu -o password_cracker

In [15]:
!./password_cracker

Enter the Encrypted password: jfismo4062

Password Found!
Decrypted Password: hp22
Time taken: 0.000184s


CUDA program ***passwordCrack.cu*** simulates password cracking through brute force, specifically for a password encryption scheme.

**Overview**

The program works as follows:

* Takes an encrypted password as input from the user.

* Uses CUDA to generate all possible combinations of lowercase letters (a-z) and digits (0-9).

* Encrypts each combination using a custom algorithm and compares it to the input password.

* If a match is found, the program outputs the decrypted password.

* Measures and displays the time taken for the process.

**How it works**

**1. User Input:**

* The user provides an encrypted password (a string of 10 characters generated using a specific encryption logic).
* The program ensures the input is valid (not empty) and stores it for further processing.

**2. Prepare Data for GPU:**

* The program defines all possible lowercase letters (a to z) and digits (0 to 9).
* These characters and the encrypted password are copied to the GPU's memory. This is necessary because the GPU operates independently of the CPU.

**3. Password Generation and Encryption:**

* The password to be cracked has a specific format: two letters followed by two digits (e.g., ab12).
* The program uses the GPU to generate all possible combinations of such passwords. Each combination is then encrypted using the same logic used to create the user's encrypted password.

**4. Comparison**:

* The encrypted version of each generated password is compared with the user's encrypted password.
* If a match is found, the corresponding raw password is saved, and a flag (passwordFound) is set to indicate success.

**5. Parallel Execution:**

* The GPU's threads handle different combinations simultaneously. For example:
* One thread might check aa00, another aa01, and so on.
This parallel processing greatly speeds up the task, as thousands of combinations can be checked simultaneously.

**6. Result Collection:**

Once the kernel (the part of the program that runs on the GPU) finishes execution, the results (whether the password was found and what it is) are copied back to the CPU's memory.

**7. Output:**

If the password is found, it is displayed to the user.
The program also calculates and displays the time taken to crack the password.


**Functions and GPU Kernels:**

***copyStr():***

A device function to copy a string from one location to another. It is used to copy strings between device memory locations.

***compareStr():***

A device function that compares two strings for equality. It compares character by character up to a specified length (default is 11 characters).


***cudaCrypt():***

A device function that performs encryption on a raw password. It encrypts the password by modifying each character using simple arithmetic operations (adding/subtracting values) and ensures the characters remain within certain ranges (lowercase 'a' to 'z' for letters and '0' to '9' for digits). It does this for 10 characters of the password.

***findPassword():***

 A kernel function that generates all possible combinations of characters and digits for the password, encrypts each combination using the cudaCrypt function, and compares it with a target encrypted password. If the password matches, it sets a flag (D_passwordFound) and stores the decrypted password in the device memory.

**2. Host Code (Main Program):**

***Password Input:***

The user is prompted to enter an encrypted password. The program ensures that the password is not empty and reads it into H_newPassword.

***Memory Allocation:***

Memory is allocated on both the host (CPU) and device (GPU) for necessary data:
H_availableChars and H_availableDigits hold the set of possible characters and digits to try for cracking the password. D_chars, D_digits, D_encPwd, and D_passwordFound hold the corresponding device versions of these arrays.

***Kernel Launch:***

The findPassword kernel is launched with a grid size of 26x26 (for 26 characters in the alphabet), and each thread block attempts to generate a potential password by picking a combination of characters and digits. The grid and thread indices are used to select characters and digits for each password attempt.

***Time Measurement:***

The program uses CUDA events (cudaEvent_t) to measure the time taken to execute the kernel and crack the password.

***Result Handling:***

If the password is found, the program prints the encrypted and decrypted passwords; otherwise, it reports that the password was not found. The time taken to crack the password is also displayed.

**CUDA Memory Management and Synchronization**

***Memory Allocation:***

The program allocates memory on the GPU for the character set, digits, encrypted password, and password found flag.

***Memory Copy:***

The data is transferred between the host and the device using cudaMemcpy.

***Kernel Execution:***

The findPassword kernel is launched on the GPU, where each thread attempts a password combination.

***Synchronization:***

The program uses cudaDeviceSynchronize() to ensure that the kernel finishes execution before proceeding.



**Prerequisites:**

* CUDA-enabled GPU.

* CUDA Toolkit installed.


## **To obtain encrypted version of the password.**

**Compilation:**

In [None]:
!nvcc Encrypt.cu -o Encrypt

**Execution:**

In [None]:
!./Encrypt <password>

Replace <password> with 2 letters and 2 digits for example: ab00

**Example**

## **To obtain the decrypted password.**














**Compilation:**

In [None]:
!nvcc passwordCrack.cu -o passwordCrack

**Execution:**

In [None]:
!./passwordCrack

Enter the encrypted password when prompted.