## Introduction

This assignment focuses on:

- Understanding Named Tuples
- Using the Faker Library to generate random data for testing
- Comparing the performance of NamedTuples vs Dictionary implementations

## Task 1: Use the **Faker** library to generate 10000 random profiles. Using namedtuple, calculate the largest (most frequent) blood type, the mean current location, the oldest person's age, and the average age.

### Profile

A named tuple representing a profile with the following fields:
- `blood_type`: Blood type of the person.
- `latitude`: Latitude of the current location.
- `longitude`: Longitude of the current location.
- `age`: Age of the person.
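
As a quick illustration of how such a named tuple behaves, here is a minimal sketch. The field layout matches the list above; the values are hypothetical and chosen only for the example.

```python
from collections import namedtuple

# Same field layout as the Profile described above.
Profile = namedtuple('Profile', 'blood_type latitude longitude age')

# Hypothetical values, chosen only for illustration.
p = Profile(blood_type='O+', latitude=12.5, longitude=-45.25, age=34)

by_name = p.age              # attribute access by field name
by_index = p[3]              # positional access, like a plain tuple
older = p._replace(age=35)   # returns a new Profile; p itself is unchanged
as_dict = p._asdict()        # mapping keyed by field name

print(by_name, by_index, older.age, as_dict['blood_type'])
```

Because a namedtuple is immutable, `_replace` is the usual way to derive a modified copy.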

In [2]:
# Generating random profiles using Faker module
from datetime import date
from faker import Faker
from collections import namedtuple, Counter


# Create Faker instance
fake = Faker()

# Define namedtuple
Profile = namedtuple('Profile', 'blood_type latitude longitude age')

def generate_profiles_namedtuple(n: int) -> tuple:
    """
    Generates `n` profiles using namedtuple.

    Args:
        n (int): Number of profiles to generate.

    Returns:
        tuple: A tuple containing `n` Profile namedtuples.
    """
    profiles = []
    append = profiles.append
    for _ in range(n):
        profile = fake.profile()
        birthdate = profile['birthdate']
        append(Profile(
            blood_type=profile['blood_group'],
            latitude=profile['current_location'][0],
            longitude=profile['current_location'][1],
            age=date.today().year - birthdate.year
        ))
    return tuple(profiles)
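
Note that the year difference used for `age` above counts a birthday that has not happened yet this year, so it can overstate an age by one. A minimal sketch of a full-years calculation follows; `age_on` is a hypothetical helper, not part of the assignment code.

```python
from datetime import date

def age_on(birthdate: date, today: date) -> int:
    """Full years elapsed between birthdate and today (hypothetical helper)."""
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (birthdate.month, birthdate.day)
    return today.year - birthdate.year - before_birthday

# The plain year difference would give 34 here.
print(age_on(date(1990, 12, 31), date(2024, 1, 1)))  # 33
```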

In [3]:
# Timer function to time executions
from functools import wraps
from time import perf_counter

def timing_decorator(func):
    """Decorator to measure execution time of a function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = perf_counter()
        result = func(*args, **kwargs)
        end = perf_counter()
        print(f"Function {func.__name__} Execution Time: {end - start:.6f} seconds")
        return result
    return wrapper
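
A single timed call can be noisy, since other activity on the machine affects it. One possible variation, sketched below, repeats the call and reports the best time. The name `timing_decorator_repeat` and its `repeat` parameter are assumptions for this example, and the approach only suits functions without side effects.

```python
from functools import wraps
from time import perf_counter

def timing_decorator_repeat(repeat: int = 5):
    """Hypothetical variant: run the function `repeat` times, report the best time."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            best = float('inf')
            for _ in range(repeat):
                start = perf_counter()
                result = func(*args, **kwargs)
                best = min(best, perf_counter() - start)
            print(f"Function {func.__name__} best of {repeat}: {best:.6f} seconds")
            return result
        return wrapper
    return decorator

@timing_decorator_repeat(repeat=3)
def total(values):
    """Sum a sequence of numbers."""
    return sum(values)

answer = total(range(1000))
```

As in the original decorator, `wraps` keeps the wrapped function's `__name__` and docstring intact.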

In [5]:
# Defining functions as per assignment
from typing import Optional, Tuple

@timing_decorator
def largest_blood_type_namedtuple(profiles: Tuple) -> Optional[str]:
    """
    Returns the blood type with the highest frequency.

    Args:
        profiles (Tuple): A tuple of Profile namedtuples.

    Returns:
        Optional[str]: The blood type with the highest frequency. If multiple blood types have the same frequency, one of them is returned.
        Returns None if the profiles list is empty.
    """
    if not profiles:
        return None
    blood_type_counts = Counter(profile.blood_type for profile in profiles)
    return blood_type_counts.most_common(1)[0][0]

@timing_decorator
def mean_current_location_namedtuple(profiles: Tuple) -> Tuple[float, float]:
    """
    Returns the mean latitude and longitude from a tuple of Profile namedtuples.

    Args:
        profiles (Tuple): A tuple of Profile namedtuples.

    Returns:
        Tuple[float, float]: A tuple containing the mean latitude and mean longitude as floats.
    """
    total_lat = sum(profile.latitude for profile in profiles)
    total_long = sum(profile.longitude for profile in profiles)
    count = len(profiles)
    return (float(total_lat / count), float(total_long / count))

@timing_decorator
def oldest_person_age_namedtuple(ages: Tuple[int, ...]) -> int:
    """
    Returns the age of the oldest person.

    Args:
        ages (Tuple[int, ...]): A tuple of integers representing ages.

    Returns:
        int: The age of the oldest person.
    """
    return int(max(ages))

@timing_decorator
def average_age_namedtuple(ages: Tuple[int, ...]) -> float:
    """
    Returns the average of the given ages, rounded to two decimal places.

    Args:
        ages (Tuple[int, ...]): A tuple of integers representing ages.

    Returns:
        float: The average age, rounded to two decimal places.
    """
    return round(sum(ages) / len(ages), 2)


Calculating metrics using Named Tuple operations
Function largest_blood_type_namedtuple Execution Time: 0.002114 seconds
Largest blood group in randomly generate 10000 profiles is AB+ 

Function mean_current_location_namedtuple Execution Time: 0.003581 seconds
Average Location in randomly generate 10000 profiles is (0.4331223516, -0.5974053573) 

Function oldest_person_age_namedtuple Execution Time: 0.000186 seconds
Oldest persons age in randomly generate 10000 profiles is 116 

Function average_age_namedtuple Execution Time: 0.000325 seconds
Average age of people in randomly generate 10000 profiles is 58 


 Total Elapsed time for named Tuple calculation 0.0068


### Task 1 test functions

In [23]:
# Checking functionality

n=10000

profiles=generate_profiles_namedtuple(n)

ages=tuple(profile.age for profile in profiles)

start=perf_counter()

print("Calculating metrics using Named Tuple operations")

print("Largest blood group in randomly generate {0} profiles is {1} \n".format(n,largest_blood_type_namedtuple(profiles)))

print("Average Location in randomly generate {0} profiles is {1} \n".format(n,mean_current_location_namedtuple(profiles)))

print("Oldest persons age in randomly generate {0} profiles is {1} \n".format(n,oldest_person_age_namedtuple(ages)))

print("Average age of people in randomly generate {0} profiles is {1:.0f} \n".format(n,average_age_namedtuple(ages)))

end=perf_counter()

total_elapsed_named_tuple=end-start

print("\n Total Elapsed time for named Tuple calculation {:.4f}".format(total_elapsed_named_tuple))


Calculating metrics using Named Tuple operations
Function largest_blood_type_namedtuple Execution Time: 0.002927 seconds
Largest blood group in randomly generate 10000 profiles is A- 

Function mean_current_location_namedtuple Execution Time: 0.004393 seconds
Average Location in randomly generate 10000 profiles is (-0.42134796865, -0.3013190974) 

Function oldest_person_age_namedtuple Execution Time: 0.000186 seconds
Oldest persons age in randomly generate 10000 profiles is 116 

Function average_age_namedtuple Execution Time: 0.000083 seconds
Average age of people in randomly generate 10000 profiles is 58 


 Total Elapsed time for named Tuple calculation 0.0083


In [16]:
 # Test 1: The function returns the correct number of profiles as specified by the input parameter `n`.
 # Test 2: The returned value is a tuple.
 # Test 3: Each element within the returned tuple is an instance of the Profile namedtuple.

def test_generate_profiles_namedtuple():
    """
    Tests the generate_profiles_namedtuple function to ensure it generates the correct output.

    This test verifies the following:
    1. The function returns the correct number of profiles as specified by the input parameter `n`.
    2. The returned value is a tuple.
    3. Each element within the returned tuple is an instance of the Profile namedtuple.

    The test will raise an assertion error if any of these conditions are not met.
    If all checks pass, a confirmation message is printed.

    """
    
    profiles = generate_profiles_namedtuple(100)
    
    assert len(profiles) == 100, "Profile count does not match"
    
    assert isinstance(profiles, tuple), "Profiles should be a tuple"
    
    assert all(isinstance(profile, Profile) for profile in profiles), "All elements should be of type Profile"
    
    print("test_generate_profiles_namedtuple passed")

test_generate_profiles_namedtuple()

test_generate_profiles_namedtuple passed


In [17]:
#Test 4: Tests the largest_blood_type_namedtuple function to ensure it returns the correct output type.

def test_largest_blood_type_namedtuple():
    """
    Tests the largest_blood_type_namedtuple function to ensure it returns the correct output type.

    This test verifies the following:
    1. The function returns a value of type `str`, which indicates the blood type that occurs most frequently in the provided profiles.

    The test will raise an assertion error if the returned value is not a string.
    If the check passes, a confirmation message is printed.

    Example:
        >>> test_largest_blood_type_namedtuple()
        test_largest_blood_type_namedtuple passed
    """
    profiles = generate_profiles_namedtuple(100)
    result = largest_blood_type_namedtuple(profiles)
    
    assert isinstance(result, str), "Result should be a string"
    
    print("test_largest_blood_type_namedtuple passed")

test_largest_blood_type_namedtuple()

Function largest_blood_type_namedtuple Execution Time: 0.000050 seconds
test_largest_blood_type_namedtuple passed


In [18]:
# A few tests for the mean_current_location_namedtuple function.
#Test 5 : The function returns a value of type `tuple`.
#Test 6 : The returned tuple contains exactly two elements.
#Test 7 : Both elements of the tuple are of type `float`, representing the mean latitude and mean longitude.
def test_mean_current_location_namedtuple():
    """
    Tests the mean_current_location_namedtuple function to ensure it returns the correct output format.

    This test verifies the following:
    1. The function returns a value of type `tuple`.
    2. The returned tuple contains exactly two elements.
    3. Both elements of the tuple are of type `float`, representing the mean latitude and mean longitude.

    The test will raise an assertion error if any of these conditions are not met.
    If all checks pass, a confirmation message is printed.

    Example:
        >>> test_mean_current_location_namedtuple()
        test_mean_current_location_namedtuple passed
    """

    profiles = generate_profiles_namedtuple(100)
    result = mean_current_location_namedtuple(profiles)
    
    assert isinstance(result, tuple), "Result should be a tuple"
    
    assert len(result) == 2, "Result tuple should have two elements"
    
    assert isinstance(result[0], float) and isinstance(result[1], float), "Tuple elements should be floats"
    
    print("test_mean_current_location_namedtuple passed")

test_mean_current_location_namedtuple()

Function mean_current_location_namedtuple Execution Time: 0.000054 seconds
test_mean_current_location_namedtuple passed


In [19]:
# Test 8: Tests the oldest_person_age_namedtuple function to ensure it returns the correct output format.

def test_oldest_person_age_namedtuple():
    """
    Tests the oldest_person_age_namedtuple function to ensure it returns the correct output format.

    This test verifies the following:
    1. The function returns a value of type `int`, representing the age of the oldest person.
    2. The returned age is a non-negative integer.

    The test will raise an assertion error if any of these conditions are not met.
    If all checks pass, a confirmation message is printed.

    Example:
        >>> test_oldest_person_age_namedtuple()
        test_oldest_person_age_namedtuple passed
    """
    profiles = generate_profiles_namedtuple(100)
    ages = tuple(profile.age for profile in profiles)
    result = oldest_person_age_namedtuple(ages)
    
    assert isinstance(result, int), "Result should be an integer"
    assert result >= 0, "Age should be non-negative"
    
    print("test_oldest_person_age_namedtuple passed")

test_oldest_person_age_namedtuple()


Function oldest_person_age_namedtuple Execution Time: 0.000005 seconds
test_oldest_person_age_namedtuple passed


In [20]:
#Test 9:  Tests the average_age_namedtuple function to ensure it returns the correct output format.

def test_average_age_namedtuple():
    """
    Tests the average_age_namedtuple function to ensure it returns the correct output format.

    This test verifies the following:
    1. The function returns a value of type `float`, representing the average age.
    2. The returned average age is a non-negative float.

    The test will raise an assertion error if any of these conditions are not met.
    If all checks pass, a confirmation message is printed.

    Example:
        >>> test_average_age_namedtuple()
        test_average_age_namedtuple passed
    """
    profiles = generate_profiles_namedtuple(100)
    ages=tuple(profile.age for profile in profiles)
    result = average_age_namedtuple(ages)
    
    assert isinstance(result, float), "Result should be a float"
    assert result >= 0, "Average age should be non-negative"
    
    print("test_average_age_namedtuple passed")

test_average_age_namedtuple()


Function average_age_namedtuple Execution Time: 0.000011 seconds
test_average_age_namedtuple passed


In [21]:
# Test 10: Tests the functions using a predefined list of namedtuple profiles with known values to check that they return the expected output

Profile = namedtuple('Profile', 'blood_type latitude longitude age')

# Sample profiles with known values
sample_profiles = [
    Profile('A+', -40.7, -74.0, 30),  
    Profile('B-', 34.0, 118.2, 25), 
    Profile('AB+', 41.8, -87.6, 35), 
    Profile('O-', -37.7, -122.4, 28), 
    Profile('A+', 40.7, 74.0, 40),  
    Profile('B-', -34.0, -118.2, 22), 
    Profile('AB+', 41.8, 87.6, 31), 
    Profile('A+', 37.7, -122.4, 29), 
]

expected_largest_blood_type = 'A+'
expected_mean_current_location = (10.45, -30.6)
expected_oldest_person_age = 40
expected_average_age = 30

def test_functions_with_sample_data():
    """
    Tests the functions using a predefined list of namedtuple profiles with known values.

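    This test verifies that the following functions return the expected results when applied to a
    set of sample profiles:
    - largest_blood_type_namedtuple
    - mean_current_location_namedtuple
    - oldest_person_age_namedtuple
    - average_age_namedtuple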

    The test will raise an assertion error if any of these functions do not return the expected 
    results. If all assertions pass, a confirmation message is printed.

    Example:
        >>> test_functions_with_sample_data()
        All tests passed
    """
    # Sample data
    profiles = sample_profiles
    ages = tuple(profile.age for profile in profiles)
    
    # Test largest_blood_type_namedtuple
    result_largest_blood_type = largest_blood_type_namedtuple(profiles)
    assert result_largest_blood_type == expected_largest_blood_type, (
        f"Expected {expected_largest_blood_type}, got {result_largest_blood_type}"
    )
    
    # Test mean_current_location_namedtuple
    result_mean_location = mean_current_location_namedtuple(profiles)
    assert result_mean_location == expected_mean_current_location, (
        f"Expected {expected_mean_current_location}, got {result_mean_location}"
    )
    
    # Test oldest_person_age_namedtuple
    result_oldest_age = oldest_person_age_namedtuple(ages)
    assert result_oldest_age == expected_oldest_person_age, (
        f"Expected {expected_oldest_person_age}, got {result_oldest_age}"
    )
    
    # Test average_age_namedtuple
    result_average_age = average_age_namedtuple(ages)
    assert result_average_age == expected_average_age, (
        f"Expected {expected_average_age}, got {result_average_age}"
    )

    print("All tests passed")

test_functions_with_sample_data()


Function largest_blood_type_namedtuple Execution Time: 0.000024 seconds
Function mean_current_location_namedtuple Execution Time: 0.000008 seconds
Function oldest_person_age_namedtuple Execution Time: 0.000002 seconds
Function average_age_namedtuple Execution Time: 0.000005 seconds
All tests passed
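
The mean-location check above compares floats with `==`. It passes for this sample, but exact float equality can break with different values or summation order. A tolerance-based comparison is a common alternative; the sketch below assumes a hypothetical helper `locations_close` and an assumed tolerance of `1e-9`.

```python
import math

def locations_close(actual, expected, abs_tol=1e-9):
    """Hypothetical helper: compare two (latitude, longitude) pairs within a tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, abs_tol=abs_tol) for a, e in zip(actual, expected)
    )

# Latitudes from the sample profiles above.
latitudes = (-40.7, 34.0, 41.8, -37.7, 40.7, -34.0, 41.8, 37.7)
mean_latitude = sum(latitudes) / len(latitudes)

print(math.isclose(mean_latitude, 10.45, abs_tol=1e-9))
print(locations_close((10.46, -30.6), (10.45, -30.6)))
```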


## Task 2: Dictionaries Implementation.

## Do the same thing as above using a dictionary. Prove that namedtuple is faster. - 250 (including 5 test cases)

In [25]:
from typing import Dict, List

# Create Faker instance
fake = Faker()

def generate_profiles_dict(n: int) -> List[Dict]:
    """
    Generates `n` profiles using dictionaries.

    Args:
        n (int): Number of profiles to generate.

    Returns:
        List[Dict]: A list of dictionaries, each representing a profile with keys 
                              'blood_type', 'latitude', 'longitude', and 'age'.
    """
    profiles = []
    append = profiles.append 
    for _ in range(n):
        profile = fake.profile()
        birthdate = profile['birthdate']
        append({
            'blood_type': profile['blood_group'],
            'latitude': profile['current_location'][0],
            'longitude': profile['current_location'][1],
            'age': date.today().year - birthdate.year
        })
    return profiles

@timing_decorator
def largest_blood_type_dict(profiles: List[Dict]) -> str:
    """
    Returns the blood type with the highest frequency using dictionary operations.

    Args:
        profiles (List[Dict]): A list of dictionaries representing profiles.

    Returns:
        str: The blood type that occurs most frequently.
    """
    blood_type_counts = {}
    
    for profile in profiles:
        blood_type = profile['blood_type']
        blood_type_counts[blood_type] = blood_type_counts.get(blood_type, 0) + 1
    
    # Find the blood type with the maximum frequency
    return max(blood_type_counts, key=blood_type_counts.get)

@timing_decorator
def mean_current_location_dict(profiles: List[Dict]) -> Tuple[float, float]:
    """
    Returns the mean latitude and longitude using dictionary operations.

    Args:
        profiles (List[Dict]): A list of dictionaries representing profiles.

    Returns:
        Tuple[float, float]: A tuple containing the mean latitude and longitude.
    """
    latitude_sum = 0
    longitude_sum = 0
    count = len(profiles)
    
    for profile in profiles:
        latitude_sum += profile['latitude']
        longitude_sum += profile['longitude']
    
    return (float(latitude_sum / count), float(longitude_sum / count))

@timing_decorator
def oldest_person_age_dict(profiles: List[Dict]) -> int:
    """
    Returns the age of the oldest person using dictionary operations.

    Args:
        profiles (List[Dict]): A list of dictionaries representing profiles.

    Returns:
        int: The age of the oldest person.
    """
    return max(profile['age'] for profile in profiles)

@timing_decorator
def average_age_dict(profiles: List[Dict]) -> float:
    """
    Returns the average age using dictionary operations.

    Args:
        profiles (List[Dict]): A list of dictionaries representing profiles.

    Returns:
        float: The average age.
    """
    age_sum = sum(profile['age'] for profile in profiles)
    return round(age_sum / len(profiles),2)


### Task 2 test functions

In [26]:
# Test 1: The function returns the correct number of profiles as specified by the input parameter `n`.
def test_generate_profiles_dict():
    profiles = generate_profiles_dict(100)
    assert len(profiles) == 100, "Profile count does not match"
    print("Test 1 Passed: The function returns the correct number of profiles.")

# Test 2: The function returns a list.
def test_generate_profiles_dict_returns_list():
    profiles = generate_profiles_dict(100)
    assert isinstance(profiles, list), "Profiles should be a list"
    print("Test 2 Passed: The function returns a list.")

# Test 3: Each element within the returned list is a dictionary.
def test_generate_profiles_dict_elements_are_dict():
    profiles = generate_profiles_dict(100)
    assert all(isinstance(profile, dict) for profile in profiles), "All elements should be of type dict"
    print("Test 3 Passed: Each element within the returned list is a dictionary.")

# Test 4: The function largest_blood_type_dict returns a string.
def test_largest_blood_type_dict():
    profiles = generate_profiles_dict(100)
    result = largest_blood_type_dict(profiles)
    assert isinstance(result, str), "Result should be a string"
    print("Test 4 Passed: The function largest_blood_type_dict returns a string.")

# Test 5: The function mean_current_location_dict returns a tuple.
# Test 6: The returned tuple contains exactly two elements.
# Test 7: Both elements of the tuple are of type `float`, representing the mean latitude and mean longitude.
def test_mean_current_location_dict():
    profiles = generate_profiles_dict(100)
    result = mean_current_location_dict(profiles)
    assert isinstance(result, tuple), "Result should be a tuple"
    assert len(result) == 2, "Result tuple should have two elements"
    assert isinstance(result[0], float) and isinstance(result[1], float), "Tuple elements should be floats"
    print("Test 5, 6, 7 Passed: The function mean_current_location_dict returns a tuple of two float elements.")

# Test 8: The function oldest_person_age_dict returns an integer.
# Test 9: The returned age is a non-negative integer.
def test_oldest_person_age_dict():
    profiles = generate_profiles_dict(100)
    result = oldest_person_age_dict(profiles)
    assert isinstance(result, int), "Result should be an integer"
    assert result >= 0, "Age should be non-negative"
    print("Test 8, 9 Passed: The function oldest_person_age_dict returns a non-negative integer.")

# Test 10: The function average_age_dict returns a float.
# Test 11: The returned average age is a non-negative float.
def test_average_age_dict():
    profiles = generate_profiles_dict(100)
    result = average_age_dict(profiles)
    assert isinstance(result, float), "Result should be a float"
    assert result >= 0, "Average age should be non-negative"
    print("Test 10, 11 Passed: The function average_age_dict returns a non-negative float.")

# Test 12: Tests the functions using a predefined list of dictionary profiles with known values.
def test_functions_with_sample_data():
    sample_profiles = [
        {'blood_type': 'A+', 'latitude': -40.7, 'longitude': -74.0, 'age': 30},
        {'blood_type': 'B-', 'latitude': 34.0, 'longitude': 118.2, 'age': 25},
        {'blood_type': 'AB+', 'latitude': 41.8, 'longitude': -87.6, 'age': 35},
        {'blood_type': 'O-', 'latitude': -37.7, 'longitude': -122.4, 'age': 28},
        {'blood_type': 'A+', 'latitude': 40.7, 'longitude': 74.0, 'age': 40},
        {'blood_type': 'B-', 'latitude': -34.0, 'longitude': -118.2, 'age': 22},
        {'blood_type': 'AB+', 'latitude': 41.8, 'longitude': 87.6, 'age': 31},
        {'blood_type': 'A+', 'latitude': 37.7, 'longitude': -122.4, 'age': 29},
    ]

    expected_largest_blood_type = 'A+'
    expected_mean_current_location = (10.45, -30.6)
    expected_oldest_person_age = 40
    expected_average_age = 30.0

    result_largest_blood_type = largest_blood_type_dict(sample_profiles)
    assert result_largest_blood_type == expected_largest_blood_type, (
        f"Expected {expected_largest_blood_type}, got {result_largest_blood_type}"
    )
    print("Test 12.1 Passed: largest_blood_type_dict returns the expected result.")

    result_mean_location = mean_current_location_dict(sample_profiles)
    assert result_mean_location == expected_mean_current_location, (
        f"Expected {expected_mean_current_location}, got {result_mean_location}"
    )
    print("Test 12.2 Passed: mean_current_location_dict returns the expected result.")

    result_oldest_age = oldest_person_age_dict(sample_profiles)
    assert result_oldest_age == expected_oldest_person_age, (
        f"Expected {expected_oldest_person_age}, got {result_oldest_age}"
    )
    print("Test 12.3 Passed: oldest_person_age_dict returns the expected result.")

    result_average_age = average_age_dict(sample_profiles)
    assert result_average_age == expected_average_age, (
        f"Expected {expected_average_age}, got {result_average_age}"
    )
    print("Test 12.4 Passed: average_age_dict returns the expected result.")

# Run all tests
test_generate_profiles_dict()
test_generate_profiles_dict_returns_list()
test_generate_profiles_dict_elements_are_dict()
test_largest_blood_type_dict()
test_mean_current_location_dict()
test_oldest_person_age_dict()
test_average_age_dict()
test_functions_with_sample_data()


Test 1 Passed: The function returns the correct number of profiles.
Test 2 Passed: The function returns a list.
Test 3 Passed: Each element within the returned list is a dictionary.
Function largest_blood_type_dict Execution Time: 0.000042 seconds
Test 4 Passed: The function largest_blood_type_dict returns a string.
Function mean_current_location_dict Execution Time: 0.000055 seconds
Test 5, 6, 7 Passed: The function mean_current_location_dict returns a tuple of two float elements.
Function oldest_person_age_dict Execution Time: 0.000029 seconds
Test 8, 9 Passed: The function oldest_person_age_dict returns a non-negative integer.
Function average_age_dict Execution Time: 0.000027 seconds
Test 10, 11 Passed: The function average_age_dict returns a non-negative float.
Function largest_blood_type_dict Execution Time: 0.000006 seconds
Test 12.1 Passed: largest_blood_type_dict returns the expected result.
Function mean_current_location_dict Execution Time: 0.000004 seconds
Test 12.2 Passed:

In [30]:
n=10000

profiles = generate_profiles_dict(n)

start = perf_counter()

print("Calculating metrics using Dictionary operations")

print("Largest blood group in randomly generated {0} profiles is {1} \n".format(
    n, largest_blood_type_dict(profiles)
))

print("Average Location in randomly generated {0} profiles is {1} \n".format(
    n, mean_current_location_dict(profiles)
))

print("Oldest person's age in randomly generated {0} profiles is {1} \n".format(
    n, oldest_person_age_dict(profiles)
))

print("Average age of people in randomly generated {0} profiles is {1:.0f} \n".format(
    n, average_age_dict(profiles)
))

end = perf_counter()

total_elapsed_dict = end - start

print("Total Elapsed time for Dictionary calculation {:.4f}".format(total_elapsed_dict))


Calculating metrics using Dictionary operations
Function largest_blood_type_dict Execution Time: 0.002150 seconds
Largest blood group in randomly generated 10000 profiles is B- 

Function mean_current_location_dict Execution Time: 0.003143 seconds
Average Location in randomly generated 10000 profiles is (0.524005751, 0.9614474934) 

Function oldest_person_age_dict Execution Time: 0.001240 seconds
Oldest person's age in randomly generated 10000 profiles is 116 

Function average_age_dict Execution Time: 0.000825 seconds
Average age of people in randomly generated 10000 profiles is 58 

Total Elapsed time for Dictionary calculation 0.0080


### Checking whether NamedTuple calculations are faster than Dictionary

In [31]:

n=10000

profiles_tuples=generate_profiles_namedtuple(n)

# Build ages from this cell's profiles so the age metrics use the same data
ages=tuple(profile.age for profile in profiles_tuples)

start=perf_counter()

print("Calculating metrics using Named Tuple operations")

largest_blood_type_namedtuple(profiles_tuples)
mean_current_location_namedtuple(profiles_tuples)
oldest_person_age_namedtuple(ages)
average_age_namedtuple(ages)

end=perf_counter()

total_elapsed_named_tuple=end-start

print("\n Total Elapsed time for named Tuple calculation {:.4f} \n".format(total_elapsed_named_tuple))

profiles_dict = generate_profiles_dict(n)

start = perf_counter()
print("Calculating metrics using Dictionary operations")

largest_blood_type_dict(profiles_dict)
mean_current_location_dict(profiles_dict)
oldest_person_age_dict(profiles_dict)
average_age_dict(profiles_dict)

end = perf_counter()
total_elapsed_dict = end - start

print("Total Elapsed time for Dictionary calculation {:.4f} \n".format(total_elapsed_dict))

if (total_elapsed_dict / total_elapsed_named_tuple)>1:
    print(f"Namedtuple is {total_elapsed_dict / total_elapsed_named_tuple:.2f} times faster than Dictionary")

else:
    print(f"Dictionary is {total_elapsed_named_tuple / total_elapsed_dict:.2f} times faster than NamedTuple")


Calculating metrics using Named Tuple operations
Function largest_blood_type_namedtuple Execution Time: 0.002095 seconds
Function mean_current_location_namedtuple Execution Time: 0.003904 seconds
Function oldest_person_age_namedtuple Execution Time: 0.000182 seconds
Function average_age_namedtuple Execution Time: 0.000083 seconds

 Total Elapsed time for named Tuple calculation 0.0067 

Calculating metrics using Dictionary operations
Function largest_blood_type_dict Execution Time: 0.002340 seconds
Function mean_current_location_dict Execution Time: 0.003537 seconds
Function oldest_person_age_dict Execution Time: 0.000857 seconds
Function average_age_dict Execution Time: 0.000767 seconds
Total Elapsed time for Dictionary calculation 0.0080 

Namedtuple is 1.19 times faster than Dictionary
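
The ratio above comes from one run over two different random datasets, and the namedtuple age functions receive pre-extracted ages while the dictionary versions extract them inside the timed call. A fairer comparison might time the same operation on the same data several times. The sketch below is one way to do that with `timeit`. It uses synthetic stand-in data instead of Faker so it is self-contained, and the helper names `average_age_nt` and `average_age_dict_same_data` are assumptions for this example. It makes no claim about which structure wins on a given machine.

```python
import random
import timeit
from collections import namedtuple

Profile = namedtuple('Profile', 'blood_type latitude longitude age')

# Synthetic stand-in data (an assumption for self-containment, not Faker output).
rng = random.Random(0)
blood_types = ('A+', 'A-', 'B+', 'B-', 'AB+', 'AB-', 'O+', 'O-')
rows = [
    (rng.choice(blood_types), rng.uniform(-90, 90), rng.uniform(-180, 180), rng.randint(0, 116))
    for _ in range(10_000)
]
profiles_nt = tuple(Profile(*row) for row in rows)
profiles_as_dicts = [p._asdict() for p in profiles_nt]  # identical values in both containers

def average_age_nt(profiles):
    return sum(p.age for p in profiles) / len(profiles)

def average_age_dict_same_data(profiles):
    return sum(p['age'] for p in profiles) / len(profiles)

# The best of several repeats is less sensitive to background noise than a single run.
best_nt = min(timeit.repeat(lambda: average_age_nt(profiles_nt), number=20, repeat=5))
best_dict = min(timeit.repeat(lambda: average_age_dict_same_data(profiles_as_dicts), number=20, repeat=5))
print(f"namedtuple: {best_nt:.4f}s  dict: {best_dict:.4f}s  ratio: {best_dict / best_nt:.2f}")
```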


## Task 3: Fake Stock Exchange

## Create fake data (you can use Faker for company names) for an imaginary stock exchange for the top 100 companies (name, symbol, open, high, close). Assign a random weight to each company. Stock market value would be the sum of each_stock_value * random_weight / (sum of random weights) over the 100 companies.
## Calculate and show what value the stock market started at, what the highest value during the day was, and where it ended. Make sure your open, high, and close are not totally random. You can only use namedtuple. - 500  (including 10 test cases)
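
To make the weighted-sum formula concrete, here is a small worked example with three hypothetical companies. The prices and raw weights are made up for illustration.

```python
# Hypothetical prices and raw random weights for three companies.
prices = (100.0, 200.0, 50.0)
raw_weights = (1.0, 3.0, 4.0)

total_weight = sum(raw_weights)  # 8.0

# Each stock contributes price * weight / (sum of weights):
# 100 * 1/8 = 12.5, 200 * 3/8 = 75.0, 50 * 4/8 = 25.0
index_value = sum(price * w / total_weight for price, w in zip(prices, raw_weights))
print(index_value)  # 112.5
```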

In [34]:
import random
import re
# Define the Stock namedtuple
Stock = namedtuple('Stock', ['name', 'symbol', 'open', 'high', 'low', 'close', 'weight'])

fake = Faker()

def generate_stock_data(num_stocks: int = 100, start_range: int = 10, end_range: int = 500) -> Tuple[Stock, ...]:
    """
    Generates fake stock data for a specified number of stocks.

    Args:
        num_stocks (int): The number of stocks to generate. Default is 100.
        start_range (int): The minimum value for the stock price. Default is 10.
        end_range (int): The maximum value for the stock price. Default is 500.

    Returns:
        Tuple[Stock, ...]: A tuple of Stock namedtuples, each containing:
            - name (str): The name of the company.
            - symbol (str): The stock ticker symbol.
            - open (float): The opening price of the stock.
            - high (float): The highest price of the stock during the day.
            - low (float): The lowest price of the stock during the day.
            - close (float): The closing price of the stock.
            - weight (float): The weight of the stock in a portfolio, normalized so the weights sum to
              approximately 1 (each weight is rounded to four decimal places).

    Raises:
        ValueError: If the start_range is not less than the end_range.
    """
    if start_range >= end_range:
        raise ValueError("start_range must be less than end_range")

    stocks = []
    total_weight = 0.0
    weights = []
    used_symbols = set() 

    while len(stocks) < num_stocks:
        name = fake.company()
        x = ''.join(set(re.sub(r'[^a-zA-Z]', '', name.upper())))

        for _ in range(20):
            symbol = ''.join(random.choices(x, k=3))
            if symbol not in used_symbols:
                used_symbols.add(symbol)
                break
        else:
            continue  # Retry with a new company name if no unique symbol was found

        open_price = round(random.uniform(start_range, end_range), 4)
        high_price = round(random.uniform(1.001, 1.15) * open_price, 4)
        low_price = round(random.uniform(0.85, 1) * open_price, 4)
        close_price = round(random.uniform(low_price, high_price), 4)
        
        weight = round(random.random(), 4)
        weights.append(weight)
        total_weight += weight
        
        stock = Stock(name, symbol, open_price, high_price, low_price, close_price, weight)
        stocks.append(stock)
    
    # Normalize weights so they sum to 1
    normalized_stocks = tuple(stock._replace(weight=round(stock.weight / total_weight, 4)) for stock in stocks)
    
    return normalized_stocks

def calculate_market_values(stocks: Tuple[Stock, ...]) -> Tuple[float, float, float]:
    """
    Calculate the weighted market values for open, high, and close prices.

    Args:
        stocks (Tuple[Stock, ...]): A tuple of Stock namedtuples, each containing the
                                    open, high, close prices, and a weight.

    Returns:
        Tuple[float, float, float]: The weighted market open, high, and close values,
                                    rounded to four decimal places.
    """
    market_open = sum(stock.open * stock.weight for stock in stocks)
    market_high = sum(stock.high * stock.weight for stock in stocks)
    market_close = sum(stock.close * stock.weight for stock in stocks)
    return round(market_open, 4), round(market_high, 4), round(market_close, 4)
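

The symbol loop in `generate_stock_data` relies on Python's `for ... else`: the `else` branch runs only when the loop finishes without hitting `break`, which here means that none of the 20 attempts produced an unused symbol. The sketch below isolates that pattern in a hypothetical helper, `make_symbol`, which is not part of the assignment code; it returns `None` where the original moves on to a new company name.

```python
import random
import re

def make_symbol(name, used_symbols, attempts=20):
    """Hypothetical helper: pick an unused 3-letter symbol from the letters of `name`."""
    letters = ''.join(set(re.sub(r'[^a-zA-Z]', '', name.upper())))
    if not letters:
        return None  # no letters to draw from
    for _ in range(attempts):
        symbol = ''.join(random.choices(letters, k=3))
        if symbol not in used_symbols:
            used_symbols.add(symbol)
            break  # success: the else branch is skipped
    else:
        return None  # every attempt collided with an existing symbol
    return symbol

used = set()
print(make_symbol('Wood LLC', used))   # some 3-letter symbol built from W, O, D, L, C
print(make_symbol('A', {'AAA'}))       # None: 'AAA' is the only candidate and is taken
```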


In [37]:
# Generate sample data
x = generate_stock_data(10, start_range=10, end_range=200)
x

(Stock(name='Owen-Davenport', symbol='REE', open=89.0643, high=100.1531, low=85.6091, close=86.4635, weight=0.0379),
 Stock(name='Jones-Harrell', symbol='LLS', open=197.5444, high=205.2763, low=195.3802, close=197.2844, weight=0.1638),
 Stock(name='Mcguire PLC', symbol='EIP', open=82.7411, high=86.9341, low=79.8272, close=83.7532, weight=0.1732),
 Stock(name='Hernandez, White and Maddox', symbol='HHM', open=107.7362, high=117.2611, low=99.0844, close=107.7344, weight=0.0229),
 Stock(name='Bryan PLC', symbol='NRR', open=164.9593, high=184.2199, low=158.1642, close=172.2794, weight=0.0041),
 Stock(name='Wood LLC', symbol='WWC', open=10.8148, high=11.056, low=9.2178, close=11.0497, weight=0.0938),
 Stock(name='Vazquez-Melton', symbol='UUL', open=178.8545, high=183.3704, low=174.5044, close=181.1915, weight=0.108),
 Stock(name='Ortega Inc', symbol='CIR', open=85.0351, high=94.0285, low=77.3011, close=78.5892, weight=0.1616),
 Stock(name='Nguyen-Taylor', symbol='YAE', open=140.4736, high=15

In [38]:
# Example usage of market index
stock_data = generate_stock_data(50, 100, 2000)

# Calculate market values
market_open, market_high, market_close = calculate_market_values(stock_data)

# Display market values
print(f"Market Open Value: {market_open}")
print(f"Highest Value During the Day: {market_high}")
print(f"Market Close Value: {market_close}")


Market Open Value: 1029.3822
Highest Value During the Day: 1099.8399
Market Close Value: 1031.4504
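

To make the index arithmetic concrete, here is a minimal worked example of the same weighted sum on two hand-written stocks. The company names, symbols, and prices are made up for illustration, and `weighted_value` is a hypothetical helper rather than a function from the assignment; the `Stock` fields follow the layout shown in the sample output above.

```python
from collections import namedtuple

# Same field layout as the Stock namedtuple in the sample output
Stock = namedtuple('Stock', 'name symbol open high low close weight')

# Two made-up stocks whose weights already sum to 1
toy_stocks = (
    Stock('Alpha Inc', 'ALP', 100.0, 110.0, 90.0, 105.0, 0.25),
    Stock('Beta LLC', 'BET', 200.0, 220.0, 180.0, 190.0, 0.75),
)

def weighted_value(stocks, field):
    """Weighted sum of one price field across all stocks."""
    return sum(getattr(s, field) * s.weight for s in stocks)

toy_open = weighted_value(toy_stocks, 'open')    # 100*0.25 + 200*0.75 = 175.0
toy_high = weighted_value(toy_stocks, 'high')    # 110*0.25 + 220*0.75 = 192.5
toy_close = weighted_value(toy_stocks, 'close')  # 105*0.25 + 190*0.75 = 168.75
print(toy_open, toy_high, toy_close)
```

Because every stock satisfies high >= open and high >= close, and the weights are non-negative, the weighted high can never fall below the weighted open or close.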


### Tests for Fake Stock Exchange

In [39]:
# Test 1: Check if the return type is tuple 
# Test 2: Check if all elements of the tuple are Stock namedtuples
def test_return_type(stock_data):
    """
    Test that generate_stock_data returns a tuple of Stock namedtuples.
    """
    assert isinstance(stock_data, tuple), "The returned data is not a tuple."
    assert all(isinstance(stock, Stock) for stock in stock_data), "Not all elements are Stock namedtuples."
    print("Test 1 and 2 Passed: The function returns a tuple of Stock namedtuples.")

# Test 3: Check if Company name is of string type
def test_company_name_is_string(stock_data):
    """
    Test that the company name is a string.
    """
    for stock in stock_data:
        assert isinstance(stock.name, str), f"Company name {stock.name} for stock {stock.symbol} is not a string."
    print("Test 3 Passed: The company name is a string.")

# Test 4: Check if the stock symbol contains only alphabetic characters (no digits or special characters)
# Test 5: Check if length of all ticker symbols generated is of length 3
def test_symbol_is_alphabetic_and_length(stock_data):
    """
    Test that the symbol contains only alphabetic characters and is exactly 3 characters long.
    """
    for stock in stock_data:
        assert stock.symbol.isalpha(), f"Symbol {stock.symbol} contains non-alphabetic characters."
        assert len(stock.symbol) == 3, f"Symbol {stock.symbol} is not exactly 3 characters long."
    print("Test 4 and 5 Passed: The stock symbol is alphabetic and exactly 3 characters long.")

# Test 6: Check if all stock symbols generated are unique
def test_unique_stock_symbols(stock_data):
    """
    Test that all stock symbols are unique.
    """
    symbols = [stock.symbol for stock in stock_data]
    assert len(symbols) == len(set(symbols)), "Duplicate stock symbols found."
    print("Test 6 Passed: All stock symbols are unique.")

# Test 7: Check constraints on stock prices (high >= open, close <= high, close >= low, low <= high)
def test_stock_value_constraints(stock_data):
    """
    Test stock value constraints to ensure high >= open, close <= high and close >= low.
    """
    for stock in stock_data:
        assert stock.high >= stock.open, f"High price {stock.high} is less than open price {stock.open} for stock {stock.symbol}."
        assert stock.high >= stock.close, f"High price {stock.high} is less than close price {stock.close} for stock {stock.symbol}."
        assert stock.close >= stock.low, f"Close price {stock.close} is less than low price {stock.low} for stock {stock.symbol}."
        assert stock.low <= stock.high, f"Low price {stock.low} is greater than high price {stock.high} for stock {stock.symbol}."
    print("Test 7 Passed: Stock value constraints are satisfied.")

# Test 8: Check market value constraints (market high >= market open; market close between the lowest stock low and market high)
def test_market_value_constraints(stock_data):
    """
    Test market values to ensure market high >= market open and that market close
    lies between the minimum stock low and the market high.
    """
    market_open, market_high, market_close = calculate_market_values(stock_data)
    assert market_high >= market_open, f"Market high value {market_high} is less than market open value {market_open}."
    assert market_close >= min(stock.low for stock in stock_data), f"Market close value {market_close} is less than the minimum low value."
    assert market_close <= market_high, f"Market close value {market_close} is greater than market high value {market_high}."
    print("Test 8 Passed: Market value constraints are satisfied.")

# Test 9: Check if ValueError is raised when start_range is not less than end_range
def test_value_error_on_invalid_range():
    """
    Test that a ValueError is raised when start_range is not less than end_range.
    """
    try:
        generate_stock_data(10, 200, 100)
    except ValueError:
        print("Test 9 Passed: ValueError is raised correctly for an invalid range.")
    else:
        # Fail loudly instead of only printing, so a missing ValueError cannot go unnoticed
        raise AssertionError("ValueError was not raised for an invalid range.")

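# Test 10: Check if weights are normalized to sum to 1
def test_weight_normalization(stock_data):
    """
    Test that the weights are normalized to sum to 1, within rounding tolerance.
    """
    total_weight = sum(stock.weight for stock in stock_data)
    # Each weight is rounded to 4 decimals, so allow up to 0.00005 of rounding error per stock.
    # round(total_weight) == 1.0 would accept any total between 0.5 and 1.5.
    tolerance = len(stock_data) * 5e-5 + 1e-9
    assert abs(total_weight - 1.0) <= tolerance, f"Total weight {total_weight} does not sum to 1.0."
    print("Test 10 Passed: Weights are normalized to sum to 1.")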

# Test 11: Check if the length of the stock data tuple matches the expected number of stocks
def test_stock_data_length():
    """
    Test that the length of the stock data tuple matches the expected number of stocks.
    """
    expected_length = 100
    stock_data = generate_stock_data(expected_length, 100, 2000)
    assert len(stock_data) == expected_length, f"Length of stock data tuple is {len(stock_data)}, expected {expected_length}."
    print("Test 11 Passed: Length of the stock data tuple matches the expected number of stocks.")

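n_stocks = random.randint(10, 10000)
# generate_stock_data requires start_range < end_range, so keep the bounds strictly ordered
start_range = random.randint(10, 9999)
end_range = random.randint(start_range + 1, 10000)

stock_data = generate_stock_data(n_stocks, start_range, end_range)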


test_return_type(stock_data)
test_company_name_is_string(stock_data)
test_symbol_is_alphabetic_and_length(stock_data)
test_unique_stock_symbols(stock_data)
test_stock_value_constraints(stock_data)
test_market_value_constraints(stock_data)
test_value_error_on_invalid_range()
test_weight_normalization(stock_data)
test_stock_data_length()


Test 1 and 2 Passed: The function returns a tuple of Stock namedtuples.
Test 3 Passed: The company name is a string.
Test 4 and 5 Passed: The stock symbol is alphabetic and exactly 3 characters long.
Test 6 Passed: All stock symbols are unique.
Test 7 Passed: Stock value constraints are satisfied.
Test 8 Passed: Market value constraints are satisfied.
Test 9 Passed: ValueError is raised correctly for an invalid range.
Test 10 Passed: Weights are normalized to sum to 1.
Test 11 Passed: Length of the stock data tuple matches the expected number of stocks.
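

Because `generate_stock_data` rounds each normalized weight to four decimal places, the weights in the returned tuple need not sum to exactly 1. The sketch below uses three made-up raw weights to show the size of the residual and one possible tolerance for comparing the total against 1: half a unit in the fourth decimal place per stock. The variable names are illustrative only.

```python
# Three made-up raw weights; each normalizes to 1/3, which rounds down to 0.3333
raw_weights = [1.0, 1.0, 1.0]
total = sum(raw_weights)

normalized = [round(w / total, 4) for w in raw_weights]
residual = abs(sum(normalized) - 1.0)   # roughly 0.0001 for this input

# Rounding to 4 decimals moves each weight by at most 0.00005
tolerance = len(normalized) * 5e-5

print(normalized)
print(residual <= tolerance)
```

The tolerance grows with the number of stocks, so for large portfolios it is a loose bound; a tighter check would require keeping the unrounded weights.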


## End of Assignment