# CSCI 1360 - Assignment 3
## Question 1

- Total points: 100
- For each part, write your solution in the given space between the two Python comment lines: **Begin your solution** and **End your solution**.
- Do not change the code outside the two comment lines.

### Part A (Functions), 30 points

An account registration web page uses Python to validate the submitted user name and to generate a random password. A single function that implements both features is needed. Create that function and name it `register_account`. The function should accept two arguments, as follows:

- First, the user name to be validated.
- Second, the length of the password to be generated.

The function should check whether the length of the provided user name is greater than 5. If it is, the function should generate a password made of random digits, with the length of the password given by the second argument. If the second argument is not specified in the function call, use a default length of 10.

The function should return a tuple of two items. First, a Boolean value indicating if the user name is valid or not. Second, a string of the generated password or an empty string when the user name is not valid.
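
The mechanics the description relies on (a default argument value, a tuple return value, and a random digit) can be sketched as below. This is only an illustration: `check_length` and its `minimum` parameter are hypothetical names that are not part of the assignment.

```python
import random


def check_length(text, minimum=3):
    """Hypothetical helper: report whether `text` is longer than `minimum`.

    `minimum` has a default value, so callers may leave it out.
    Returns a tuple: (is_long_enough, upper-cased text or '').
    """
    if len(text) > minimum:
        return (True, text.upper())
    return (False, '')


# The second argument is omitted, so `minimum` falls back to 3.
ok, shout = check_length('hello')

# The second argument is given explicitly and overrides the default.
ok_strict, shout_strict = check_length('hello', 10)

# random.randint includes both end points, so this is a single digit 0-9.
digit = random.randint(0, 9)
```

Unpacking the returned tuple into two names, as done here, is optional; indexing the result with `[0]` and `[1]` works equally well.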

#### Grading Rubric
- Function header (5 points)
- Correct random number generation (5 points)
- Correct condition (5 points)
- Correct return (5 points)
- Passes test cases (10 points)


In [None]:

# Begin your solution below this line

import random

def register_account(username, password_length=10):
    # A user name is valid only if it is longer than 5 characters
    if len(username) > 5:
        # Build the password one random digit (0-9) at a time
        password = ''
        for _ in range(password_length):
            password += str(random.randint(0, 9))

        return (True, password)
    else:
        return (False, '')


# End your solution, do not change anything below this line



In [None]:
# These assert lines help in testing your solution
# However, the test cases used during grading are not
# limited to the ones used below.

assert register_account('abc') == (False, '')
call_result = register_account('abcdef')
assert call_result[0] == True
assert len(call_result[1]) > 9 
import re
password = re.search('[0-9]{10}', call_result[1])
assert password != None
call_result = register_account('abcdef', 2)
password = re.search('[0-9]{2}', call_result[1])
assert password != None

### Part B (File I/O), 40 + 15 points

Comma-separated values (CSV) is a text file format that uses commas to separate values. A CSV file stores tabular data (numbers and text) in plain text, where each line of the file typically represents one data record. [Read more on Wikipedia](https://en.wikipedia.org/wiki/Comma-separated_values)
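
As a minimal sketch of the format, the snippet below splits a small in-memory CSV sample into a header and records. The sample data and its column names are invented for illustration and are unrelated to the assignment's dataset.

```python
# Hypothetical two-record sample; the column names are invented for illustration.
sample = "id,name,score\n1,Ada,90\n2,Alan,85\n"

lines = sample.splitlines()

# The first line holds the column names, the remaining lines hold the records.
header = lines[0].split(',')
records = [line.split(',') for line in lines[1:]]

# Every value comes back as a string, so numbers must be converted explicitly.
scores = [float(record[2]) for record in records]
mean_score = sum(scores) / len(scores)
```

Splitting on commas is enough for simple files like this one. It does not handle quoted values that contain commas; Python's standard `csv` module covers that case.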

You are provided with a sample of an airline dataset in CSV format, in a file named [Airline_Dataset.csv](Airline_Dataset.csv), originally downloaded from [Kaggle](https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset/). You need to create a function that takes one keyword argument named `path`. The argument represents the path to the dataset file. Your function should perform the following:

- The function should have the name `mean_age`.
- Read the file, line by line. Each line represents information about a passenger. Remember to skip the first line since it contains the column names.
- Find the mean age of the passengers by transforming each line's content into a structured format. You can do that using the split function ( [Read more about split](https://docs.python.org/3/library/stdtypes.html#str.split) ). The split function returns a list of the string's parts. Remember, values in the file are separated by commas.
- Make your function return the mean value. Only printing the value does not fulfil this requirement.
- Make your function robust to errors: when an error occurs, print a message and return `0`.
- **BONUS (15 points)** For each passenger who is 60 years old or older, create a folder named with the passenger ID. Inside the folder, create a file named with the passenger's full name and ending with the extension **.txt**. Inside the file, write the passenger's age. Note: you do not need to submit the folders and files generated in this step. Also, your solution should run on any OS.
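
One way to create a folder and a file in a way that works on any OS is sketched below. It is only an illustration under stated assumptions: `write_note` and the sample values are hypothetical, and the example works inside a temporary directory so that nothing is left behind.

```python
import os
import tempfile


def write_note(base_dir, folder_name, file_stem, text):
    """Hypothetical helper: create base_dir/folder_name/file_stem.txt holding `text`."""
    # os.path.join inserts the path separator of the current OS.
    folder = os.path.join(base_dir, folder_name)
    # exist_ok=True avoids an error when the folder already exists.
    os.makedirs(folder, exist_ok=True)
    file_path = os.path.join(folder, file_stem + '.txt')
    with open(file_path, 'w') as handle:
        handle.write(text)
    return file_path


with tempfile.TemporaryDirectory() as tmp:
    created = write_note(tmp, 'P001', 'Jane Doe', '64')
    # A second call does not raise, because the existing folder is accepted.
    write_note(tmp, 'P001', 'Jane Doe', '64')
    with open(created) as handle:
        stored = handle.read()
    created_name = os.path.basename(created)
```

Building paths by concatenating strings with `/` or `\` ties the code to one OS, which is why `os.path.join` is used here. `os.mkdir` would also create the folder, but it raises `FileExistsError` when the code is run a second time.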

#### Grading Rubric

- Function header (5 points)
- Robust to errors  (10 points)
- Correct file read (10 points)
- Correct split (5 points)
- Correct return value (5 points)
- Passes test cases (5 points)


In [None]:
# Begin your solution below this line

import os

def mean_age(path):
    try:
        with open(path, "r") as file_object:
            lines_of_text = file_object.readlines()

        total = 0
        # Skip the first line, which holds the column names
        for line in lines_of_text[1:]:
            items = line.split(',')
            age = float(items[4])
            total += age

            # Bonus: passengers who are 60 or older get their own folder
            if age >= 60:
                # exist_ok avoids an error when the cell is run again
                os.makedirs(items[0], exist_ok=True)
                file_path = os.path.join(items[0], items[1] + ' ' + items[2] + '.txt')
                with open(file_path, "w") as passenger_file:
                    passenger_file.write(items[4])

        return total / (len(lines_of_text) - 1)

    except Exception as e:
        print('Try again. ' + str(e))
        return 0

# End your solution, do not change anything below this line

In [None]:
# These assert lines help in testing your solution
# However, the test cases used during grading are not
# limited to the ones used below.

result = mean_age('Airline_Dataset.csvv')
assert result == 0
result = mean_age('Airline_Dataset.csv')
difference = abs( 43.42857142857143 - result)
assert difference < 0.01

### Part C (Numpy Indexing), 30 points
An HR dataset stores the hour at which each employee starts work on each day of the week. The values cover a single week, so the dataset has exactly seven columns. The first column contains the starting hour on Monday for all the employees, the second column is for Tuesday, and so on. The number of employees varies from week to week.

You need to write a function that converts the work starting hours to the correct shift character, `A`, `B` or `C`: `A` is for the hours 8 - 15 (military time), `B` is for 16 - 23, and `C` is for the rest.

Name the function `work_shift`. It takes an array as its input and performs the following:

- Convert the array to a Numpy matrix.
- Create the **output** Numpy matrix using the same shape of the input matrix. You can use the [empty](https://numpy.org/doc/stable/reference/generated/numpy.empty.html) function to create an empty matrix and to specify its data type, which is string for your solution.
- Use Numpy Boolean indexing to create a mask for all the items that fall into the shift `A` category. Use the mask to assign `A` to the matching items in the output array. Note that the mask is a Boolean array with the same shape as the input, holding `True` at every position that matches the filter.
- Repeat the previous step for the `B` and `C` shifts.
- Return the Numpy matrix with all items holding a string value of either `A`, `B` or `C`.
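
The masking technique in the steps above can be sketched on unrelated data. The temperatures, thresholds, and labels below are invented for illustration and are not part of the assignment.

```python
import numpy as np

# Hypothetical temperatures; the thresholds and labels are invented for illustration.
temps = np.array([[3, 18, 25],
                  [30, 10, 21]])

# Reuse the input's shape so the output fits any number of rows.
# np.empty leaves the contents uninitialised until they are assigned.
labels = np.empty(temps.shape, dtype='<U1')

cold = temps < 15                      # Boolean array with the same shape as temps
mild = (temps >= 15) & (temps < 25)    # combine conditions with &, each in parentheses
hot = temps >= 25

# Assigning through a mask changes only the positions where the mask is True.
labels[cold] = 'L'
labels[mild] = 'M'
labels[hot] = 'H'
```

Because the three masks together cover every possible value, no uninitialised entries remain in `labels` after the assignments.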


#### Grading Rubric

- Correct import (5 points)
- Function header (5 points)
- Converts to Numpy (5 points)
- Correct output matrix (5 points)
- Correct Numpy masks (5 points)
- Passes test cases (5 points)


In [None]:
# Begin your solution below this line

import numpy as np

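def work_shift(arr):
    arr = np.array(arr)
    # Match the input's shape so any number of employees is supported
    arr2 = np.empty(arr.shape, dtype=str)
    # Shift C: hours 0 - 7
    mask = arr < 8
    arr2[mask] = 'C'
    # Shift A: hours 8 - 15
    mask = (arr < 16) & (arr > 7)
    arr2[mask] = 'A'
    # Shift B: hours 16 - 23
    mask = arr > 15
    arr2[mask] = 'B'
    return arr2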

# End your solution, do not change anything below this line

In [None]:
# These assert lines help in testing your solution
# However, the test cases used during grading are not
# limited to the ones used below.

week1_start_hour = [[8, 13, 7, 21, 16, 19, 20],
[23, 1, 11, 19, 22, 23, 10],
[23, 12, 9, 9, 18, 19, 6],
[4, 6, 17, 1, 20, 10, 19],
[0, 14, 1, 11, 22, 5, 9]]

expected = [['A', 'A', 'C', 'B', 'B', 'B', 'B'],
 ['B', 'C', 'A', 'B', 'B', 'B', 'A'],
 ['B', 'A', 'A', 'A', 'B', 'B', 'C'],
 ['C', 'C', 'B', 'C', 'B', 'A', 'B'],
 ['C', 'A', 'C', 'A', 'B', 'C', 'A']]

import numpy as np
expected = np.array(expected)

result = work_shift(week1_start_hour)
np.testing.assert_array_equal(result , expected)
