# Regex Explanation for Function Counting

These snippets count the function definitions in a source file, using a different pattern for each programming language. Let me explain each regex pattern in detail:

## 1. C/C++ Function Counter



In [None]:
grep -E '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]+[[:space:]]+[A-Za-z_][A-Za-z0-9_]+[[:space:]]*\([^)]*\)[[:space:]]*\{' "$source_file" | wc -l



Breaking this down:
- `^` - Start of a line
- `[[:space:]]*` - Zero or more whitespace characters (spaces, tabs, and other whitespace)
- `[A-Za-z_][A-Za-z0-9_]+` - Return type (a letter or underscore followed by one or more letters, digits, or underscores, so at least two characters)
- `[[:space:]]+` - One or more whitespace characters (separating return type and function name)
- `[A-Za-z_][A-Za-z0-9_]+` - Function name (same pattern as the return type, so a single-character name such as `f` is not matched)
- `[[:space:]]*` - Zero or more whitespace characters
- `\(` - Opening parenthesis (escaped with backslash)
- `[^)]*` - Any characters that are not closing parenthesis (function parameters)
- `\)` - Closing parenthesis
- `[[:space:]]*` - Zero or more whitespace characters
- `\{` - Opening curly brace

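This matches C/C++ function definitions like: `int main(int argc, char* argv[]) {`. Because it expects a single-word return type and the opening brace on the same line, definitions such as `static int helper(void) {` or ones with the brace on the next line are not counted.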
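To see what the pattern does and does not count, it can be run against a small sample. This is a minimal sketch: the sample file and the variable names `sample`, `pattern`, and `c_count` are made up for illustration and are not part of the original script.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
int main(int argc, char* argv[]) {
    return 0;
}
static int helper(void) {
    return 1;
}
int f(void) {
    return 2;
}
void log_message(const char *msg)
{
}
EOF

# Same pattern as in the command above, stored in a variable for readability.
pattern='^[[:space:]]*[A-Za-z_][A-Za-z0-9_]+[[:space:]]+[A-Za-z_][A-Za-z0-9_]+[[:space:]]*\([^)]*\)[[:space:]]*\{'

c_count=$(grep -E "$pattern" "$sample" | wc -l)
echo "$c_count"   # 1: only main is counted

rm -f "$sample"
```

In this sample, `helper` is skipped because of the extra `static` word, `f` because its name has only one character, and `log_message` because its brace is on the next line.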

## 2. Java Function Counter



In [None]:
grep -E '^[[:space:]]*([A-Za-z_$][A-Za-z0-9_$<>]*[[:space:]]+)+[A-Za-z_$][A-Za-z0-9_$]*[[:space:]]*\([^)]*\)[[:space:]]*\{' "$source_file" | wc -l



This is more complex:
- `^[[:space:]]*` - Start of line plus whitespace
- `([A-Za-z_$][A-Za-z0-9_$<>]*[[:space:]]+)+` - This captures modifiers and return types:
  - Starts with letter/underscore/dollar sign
  - Followed by letters/numbers/underscores/dollar signs/angle brackets (for generics)
  - Followed by whitespace
  - The `()+` means this pattern repeats one or more times (for multiple modifiers)
- `[A-Za-z_$][A-Za-z0-9_$]*` - Method name
- `[[:space:]]*\([^)]*\)[[:space:]]*\{` - Same as C/C++: whitespace, parameters in parentheses, whitespace, opening brace

This matches Java methods like: `public static void main(String[] args) {`
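The Java pattern can be tried the same way. The sketch below uses a made-up class (`Demo`) and hypothetical variable names (`sample`, `java_pattern`, `java_count`); it only illustrates which lines the regex accepts.

```shell
# Hypothetical Java source, created only for this demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
public class Demo {
    public static void main(String[] args) {
        if (args.length > 0) {
            System.out.println(args[0]);
        } else if (args.length == 0) {
            System.out.println("none");
        }
    }

    private List<String> names() {
        return null;
    }
}
EOF

# Same pattern as in the command above.
java_pattern='^[[:space:]]*([A-Za-z_$][A-Za-z0-9_$<>]*[[:space:]]+)+[A-Za-z_$][A-Za-z0-9_$]*[[:space:]]*\([^)]*\)[[:space:]]*\{'

java_count=$(grep -E "$java_pattern" "$sample" | wc -l)
echo "$java_count"   # 2: main and names

rm -f "$sample"
```

The class declaration has no parameter list, and the `if` / `} else if` lines do not have a name directly before the parenthesis, so none of them are counted here.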

## 3. Python Function Counter



In [None]:
grep -c '^[[:space:]]*def ' "$source_file"



Much simpler:
- `^` - Start of line
- `[[:space:]]*` - Zero or more whitespace characters
- `def ` - The literal string "def " with a space

This matches Python function definitions like: `def calculate_average(numbers):`
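Because the pattern only looks at the start of the line, it also counts methods and nested functions, but not `async def`. A small sketch with a made-up sample file (the names `sample` and `py_count` are illustrative):

```shell
# Hypothetical Python source, created only for this demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
def top_level():
    def nested():
        pass

async def fetch():
    pass

class Greeter:
    def greet(self):
        pass

# def commented_out():
EOF

py_count=$(grep -c '^[[:space:]]*def ' "$sample")
echo "$py_count"   # 3: top_level, nested, greet

rm -f "$sample"
```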

The `-c` flag makes `grep` print the number of matching lines itself, so there is no need to pipe the output to `wc -l` as the other two commands do.

All these patterns are executed via `grep` to find lines matching function definitions, and the count becomes the value of the `fc` variable.
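How the script chooses between the three commands is not shown here. One plausible arrangement, offered only as a sketch, is a `case` on the file extension. The function name `count_functions` and the assumption that extensions are lowercase are both hypothetical.

```shell
# count_functions is a hypothetical wrapper; the original dispatch logic is not shown.
count_functions() {
    source_file=$1
    c_re='^[[:space:]]*[A-Za-z_][A-Za-z0-9_]+[[:space:]]+[A-Za-z_][A-Za-z0-9_]+[[:space:]]*\([^)]*\)[[:space:]]*\{'
    java_re='^[[:space:]]*([A-Za-z_$][A-Za-z0-9_$<>]*[[:space:]]+)+[A-Za-z_$][A-Za-z0-9_$]*[[:space:]]*\([^)]*\)[[:space:]]*\{'
    case "$source_file" in
        *.c|*.cpp) grep -E "$c_re" "$source_file" | wc -l ;;
        *.java)    grep -E "$java_re" "$source_file" | wc -l ;;
        *.py)      grep -c '^[[:space:]]*def ' "$source_file" ;;
        *)         echo 0 ;;   # unknown extension: report zero functions
    esac
}

# Demonstration with a made-up two-function Python file.
demo=$(mktemp -d)
printf 'def a():\n    pass\ndef b():\n    pass\n' > "$demo/script.py"
fc=$(count_functions "$demo/script.py")
echo "$fc"   # 2
rm -r "$demo"
```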

# Bash Parameter Expansion for Filename Parsing

This code extracts the student name and ID from the filename using Bash parameter expansion techniques. Let me explain each operation:

## 1. Get Base Filename



In [None]:
filename=$(basename "$zipfile")



The `basename` command extracts just the filename without the directory path. For example, if `$zipfile` is `/path/to/John_Doe_1234567_submission_2105221.zip`, `filename` becomes `John_Doe_1234567_submission_2105221.zip`.

## 2. Extract Student Name



In [None]:
student_name="${filename%%_*}"



This uses the `%%` parameter expansion, which removes the **longest** suffix matching the pattern `_*`:
- `%%_*` means "delete everything from the first underscore to the end"
- For `John_Doe_1234567_submission_2105221.zip`, this gives `John` (only the first name, because the name itself contains an underscore)
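The difference between the longest and the shortest match is easiest to see side by side. In this short sketch the variable names `longest` and `shortest` are illustrative only:

```shell
filename="John_Doe_1234567_submission_2105221.zip"

# %% removes the longest suffix matching _* (from the first underscore on).
longest="${filename%%_*}"
# %  removes the shortest suffix matching _* (from the last underscore on).
shortest="${filename%_*}"

echo "$longest"    # John
echo "$shortest"   # John_Doe_1234567_submission
```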

## 3. Extract Student ID (two steps)



In [None]:
student_id="${filename##*_}"

First, we extract the last part after the final underscore:
- `##*_` means "delete everything from the beginning up to the last underscore"
- For `John_Doe_1234567_submission_2105221.zip`, this gives `2105221.zip`

Then, we remove the file extension:


In [None]:
student_id="${student_id%.zip}"

- `%` is another parameter expansion; it removes the **shortest** suffix matching the pattern
- `%.zip` means "delete `.zip` from the end"
- `2105221.zip` becomes `2105221`
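The prefix operators mirror the suffix ones: `##` removes the longest matching prefix and `#` the shortest. A brief sketch of both, followed by the extension removal (the names `after_last` and `after_first` are illustrative):

```shell
filename="John_Doe_1234567_submission_2105221.zip"

# ## removes the longest prefix matching *_ (up to the last underscore).
after_last="${filename##*_}"
# #  removes the shortest prefix matching *_ (up to the first underscore).
after_first="${filename#*_}"
# %.zip then strips the literal extension from the end.
student_id="${after_last%.zip}"

echo "$after_last"    # 2105221.zip
echo "$after_first"   # Doe_1234567_submission_2105221.zip
echo "$student_id"    # 2105221
```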

## Example

For a zip file named `Carol_Danvers_1000001_submission_2105221.zip`:
- `student_name` becomes `Carol` (everything before the first underscore, so `Danvers` is dropped)
- `student_id` becomes `2105221` (everything after last underscore, minus .zip)

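This parsing assumes every zip file follows the same naming pattern: the name comes before the first underscore, and the ID sits between the last underscore and `.zip`.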
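If a filename might not follow the pattern, a guard before the expansions can catch it. The following is a sketch under that assumption only: `parse_submission` is a hypothetical helper, and the check (`*_*.zip`) is the loosest shape the expansions need, not a rule taken from the original script.

```shell
# parse_submission is a hypothetical helper, not part of the original script.
parse_submission() {
    filename=$(basename "$1")
    case "$filename" in
        *_*.zip)
            student_name="${filename%%_*}"
            student_id="${filename##*_}"
            student_id="${student_id%.zip}"
            ;;
        *)
            # No underscore or no .zip extension: refuse to guess.
            echo "unexpected filename: $filename" >&2
            return 1
            ;;
    esac
}

parse_submission "/path/to/Carol_Danvers_1000001_submission_2105221.zip"
echo "$student_name $student_id"   # Carol 2105221
```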

# Finding Code Files with the `find` Command

This code snippet searches for programming source files within a temporary directory where a student's submission ZIP file has been extracted. Let me break it down:



In [None]:
codefile=$(find "$tmpdir" -type f \
    \( -iname '*.c' -o -iname '*.cpp' -o -iname '*.java' -o -iname '*.py' \) \
    | sort | head -n1)



## Command Components

1. **Starting the search**: `find "$tmpdir" -type f`
   - Begins searching in the temporary directory (`$tmpdir`)
   - `-type f` restricts the search to files only (excluding directories, symlinks, etc.)

2. **File type filter**: `\( -iname '*.c' -o -iname '*.cpp' -o -iname '*.java' -o -iname '*.py' \)`
   - The `\(` and `\)` are escaped parentheses to group the conditions
   - `-iname` performs case-insensitive filename matching
   - `-o` represents the logical OR operator
   - This searches for files with extensions:
     - `.c` (C source files)
     - `.cpp` (C++ source files)
     - `.java` (Java source files)
     - `.py` (Python source files)
   
3. **Processing the results**: `| sort | head -n1`
   - `sort` orders the found paths lexicographically by their full path (the exact collation depends on the locale)
   - `head -n1` selects only the first file from the sorted list

4. **Variable assignment**: The entire output is captured in the `codefile` variable
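The components above can be exercised on a throwaway directory. This sketch uses a made-up layout (`a/zeta.JAVA`, `b/alpha.c`, `readme.txt`) and an illustrative variable `relative`; it shows the case-insensitive match and that sorting applies to the whole path rather than the file name.

```shell
# Hypothetical directory layout, created only for this demonstration.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a" "$tmpdir/b"
touch "$tmpdir/a/zeta.JAVA" "$tmpdir/b/alpha.c" "$tmpdir/readme.txt"

codefile=$(find "$tmpdir" -type f \
    \( -iname '*.c' -o -iname '*.cpp' -o -iname '*.java' -o -iname '*.py' \) \
    | sort | head -n1)

# Strip the temporary prefix so the result is easy to read.
relative="${codefile#"$tmpdir"/}"
echo "$relative"   # a/zeta.JAVA

rm -r "$tmpdir"
```

Here `zeta.JAVA` is picked even though `alpha.c` comes first by file name, because the directory `a/` sorts before `b/`.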

## Purpose in the Script

This command ensures the script:
1. Recursively searches through all subdirectories in the extracted submission
2. Finds source files with any of the four listed extensions
3. Selects exactly one file (the first path in sorted order) for further processing
4. Handles different programming languages uniformly

If no matching file is found, the `codefile` variable will be empty, which the script checks for later with `if [ -z "$codefile" ]`.
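The empty-result case can be reproduced with a submission that holds no source file. In this sketch the directory contents, the `status` variable, and its messages are illustrative assumptions; only the `find` pipeline and the `-z` test come from the script.

```shell
# Hypothetical submission that contains no recognised source file.
tmpdir=$(mktemp -d)
touch "$tmpdir/report.pdf"

codefile=$(find "$tmpdir" -type f \
    \( -iname '*.c' -o -iname '*.cpp' -o -iname '*.java' -o -iname '*.py' \) \
    | sort | head -n1)

if [ -z "$codefile" ]; then
    # find printed nothing, so the command substitution is empty.
    status="no source file found"
else
    status="found $codefile"
fi
echo "$status"   # no source file found

rm -r "$tmpdir"
```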