## **CS22B Exam 3 Colab notebook**
Spring 2025

You will be using the US Health Insurance dataset found at 'https://raw.githubusercontent.com/csbfx/cs133/main/us_health_insurance_2020.csv'.

The exam will cover OOP, generators, and exception handling. Be sure that your Python code is written so that it is reusable and modular.

# **Dataset - US Health Insurance**

SAHIE is the Small Area Health Insurance Estimates program of the U.S. Census Bureau. SAHIE produces estimates of health insurance coverage for counties and states, and publishes state and county estimates of the population with and without health insurance coverage.

Here are the descriptions of this dataset:
- `year` - Year of Estimate
- `statefips` - Unique FIPS code for each state
- `countyfips` - Unique FIPS code for each county within a state
- `geocat` - Geography category, 40 – State geographic identifier, 50 – County geographic identifier
- `agecat` - Age category
- `racecat` - Race category
- `sexcat` - Sex category
- `iprcat` - Income category
- `NIPR` - Number in demographic group for income category
- `NUI` - Number uninsured
- `NIC` - Number insured
- `PCTUI` - Percent uninsured in demographic group for income category
- `PCTIC` - Percent insured in demographic group for income category
- `PCTELIG` - Percent uninsured in demographic group for all income levels
- `PCTLIIC` - Percent insured in demographic group for all income levels
- `state_name` - State name

Data Source Credit: Small Area Health Insurance Estimates Program, U.S. Census Bureau.



# Load the dataset with exception handling
Use try-except-else to load the dataset. If the dataset fails to load, raise an exception with custom message "Failed to load dataset". Otherwise, print "Dataset loaded successfully" and print the first 5 lines of the dataset.

In [14]:
import pandas as pd
link = 'https://raw.githubusercontent.com/csbfx/cs133/main/us_health_insurance_2020.csv'

try:
    df = pd.read_csv(link)
except Exception as e:
    raise Exception("Failed to load dataset") from e
else:
    print("Dataset loaded successfully")
    print(df.head())


Dataset loaded successfully
   year  statefips  countyfips  geocat          agecat  \
0  2020          1           0      40  18 to 64 years   
1  2020          1           0      40  18 to 64 years   
2  2020          1           0      40  18 to 64 years   
3  2020          1           0      40  18 to 64 years   
4  2020          1           0      40  18 to 64 years   

                     racecat sexcat                          iprcat    NIPR  \
0  White alone, not Hispanic   Male     At or below 200% of poverty  204748   
1  White alone, not Hispanic   Male     At or below 250% of poverty  274778   
2  White alone, not Hispanic   Male     At or below 138% of poverty  123241   
3  White alone, not Hispanic   Male     At or below 400% of poverty  488459   
4  White alone, not Hispanic   Male  Between 138% - 400% of poverty  365218   

      NUI     NIC  PCTUI  PCTIC  PCTELIG  PCTLIIC  \
0   58255  146493   28.5   71.5      6.3     15.8   
1   72162  202616   26.3   73.7      7.8  

### Follow up questions. Answer below:  
Q1: Why is it better to handle exceptions using try-except block rather than allowing the program to crash?

*Your answer:*
Exception handling lets a program respond to errors instead of crashing. This keeps the program running predictably even when an unexpected issue occurs.
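
As a minimal sketch of this idea, the example below catches a failed file read and lets the program continue with an empty result instead of stopping. The helper name `load_rows` and the file name are hypothetical and are not part of the exam code; the sketch uses the standard `csv` module so that it runs without network access.

```python
import csv

def load_rows(path):
    """Return the CSV rows as a list of dicts, or an empty list if the file cannot be read."""
    try:
        with open(path, newline="") as handle:
            rows = list(csv.DictReader(handle))
    except OSError as err:
        # The error is reported, but the caller still gets a usable value.
        print(f"Failed to load dataset: {err}")
        return []
    else:
        print("Dataset loaded successfully")
        return rows

# Hypothetical path that does not exist, to trigger the except branch.
rows = load_rows("this_file_does_not_exist.csv")
print(f"Continuing with {len(rows)} rows")
```

Whether to recover like this or to re-raise, as the loading cell above does, depends on whether the rest of the program can do anything useful without the data.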

## OOP Integration of State Insurance Data Management System

In this coding challenge, you are tasked with creating a State Insurance Data Management System using Python's Object-Oriented Programming (OOP) concepts.
  

Create a class called `StateInsuranceData` that stores information about a U.S. state's health insurance coverage:

### **State Data Generator Class:**

**Attributes:**  

state (str, public): State  

insured_percent (float, public): Insured percentage  

uninsured_percent (float, public): Uninsured percentage   


**Methods:**

Constructor to initialize the state, insured %, and uninsured %.  

`__init__()` to initialize the attributes  

`is_high_risk()`: If the uninsured % is > 10, `is_high_risk()` returns True, otherwise False

In [15]:
class StateInsuranceData:
    def __init__(self, state, insured_percent, uninsured_percent):
        self.state = state
        self.insured_percent = insured_percent
        self.uninsured_percent = uninsured_percent

    def is_high_risk(self):
        return self.uninsured_percent > 10



### Follow-up questions   
Q2: If you were to create a subclass RegionalInsuranceData from StateInsuranceData, what attributes or methods might you override or extend?  
Q3: Imagine this program grows to handle global health data. How would an OOP structure help with scaling and reusability?

Your answer:
Question 2: I might add a region name and a list of the states in the region to represent the grouped state data. I could override the `is_high_risk` method to assess risk based on the average uninsured rate across a specific region, which would make it simpler and more efficient.

Question 3: The OOP structure allows the program to scale up using inheritance, making it easier to adapt to new regions. Polymorphism may be used as well for better scaling.
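
One possible shape for such a subclass is sketched below. The constructor signature of `RegionalInsuranceData`, the unweighted average, and the sample values are all assumptions made for illustration; the exam does not specify them, and the numbers are not taken from the dataset.

```python
class StateInsuranceData:
    def __init__(self, state, insured_percent, uninsured_percent):
        self.state = state
        self.insured_percent = insured_percent
        self.uninsured_percent = uninsured_percent

    def is_high_risk(self):
        return self.uninsured_percent > 10


class RegionalInsuranceData(StateInsuranceData):
    """Hypothetical subclass that groups several states into one region."""

    def __init__(self, region_name, states):
        self.states = list(states)
        if not self.states:
            raise ValueError("A region needs at least one state")
        count = len(self.states)
        # Unweighted averages: a simplification that ignores population size.
        super().__init__(
            state=region_name,
            insured_percent=sum(s.insured_percent for s in self.states) / count,
            uninsured_percent=sum(s.uninsured_percent for s in self.states) / count,
        )

    def is_high_risk(self):
        # Override: judge the region by the average uninsured rate of its states.
        average = sum(s.uninsured_percent for s in self.states) / len(self.states)
        return average > 10


# Made-up values for illustration only.
state_a = StateInsuranceData("State A", 92.0, 8.0)
state_b = StateInsuranceData("State B", 86.0, 14.0)
region = RegionalInsuranceData("Region X", [state_a, state_b])

# Polymorphism: the same call works on states and regions alike.
for item in (state_a, state_b, region):
    print(item.state, item.is_high_risk())
```

Because the region is itself a `StateInsuranceData`, any code written for states (such as the loop above) keeps working when regions are added.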


## Generator function
Write a generator function called 'state_data_generator' that iterates through each row in the health insurance dataframe and yields the state ('state_name'), population ('NIPR'), insured ('NIC'), uninsured ('NUI'), insured % ('PCTIC'), and uninsured % ('PCTUI').

Wrap the object creation in a try-except block to catch `KeyError` and print a custom message "Missing column in row" when a required column is missing.
  
Run the generator, and return the first 3 values.

In [19]:
## Your code
def state_data_generator(dataframe):
    for index, row in dataframe.iterrows():
        try:
            yield {
                'state_name': row['state_name'],  # column name uses an underscore
                'population': row['NIPR'],
                'insured': row['NIC'],
                'uninsured': row['NUI'],
                'insured_percent': row['PCTIC'],
                'uninsured_percent': row['PCTUI']
            }
        except KeyError:
            # A required column is missing from this row
            print("Missing column in row")

# health_df is the health insurance dataframe; it is assumed to be defined
# earlier in the notebook session (the loading cell above names it df)
generator = state_data_generator(health_df)
first_three = [next(generator) for _ in range(3)]

for item in first_three:
    print(item)


{'state_name': 'Alabama                                                               ', 'population': 370519, 'insured': 323128, 'uninsured': 47391, 'insured_percent': 87.2, 'uninsured_percent': 12.8}
{'state_name': 'Alabama                                                               ', 'population': 132034, 'insured': 92850, 'uninsured': 39184, 'insured_percent': 70.3, 'uninsured_percent': 29.7}
{'state_name': 'Alabama                                                               ', 'population': 167011, 'insured': 119742, 'uninsured': 47269, 'insured_percent': 71.7, 'uninsured_percent': 28.3}


Compare your generator code to the list code below.
```
try:
    states_list = [
        {
            "state": row["state_name"],
            "population": row["NIPR"],
            "insured": row["NIC"],
            "uninsured": row["NUI"]
        }
        for i, row in health_df.iterrows()
    ]
    print(states_list[0])
except KeyError as e:
    print(f"Missing expected column: {e}")
```

### Follow up questions. Answer below:  
Q4: Why does indexing work with a list but not a generator?  
Q5: When would a list approach be more useful than a generator?

*Your answer*  
Q4 sample:
Indexing works with a list because it stores all elements in memory, allowing random access to any element, whereas a generator produces one value at a time and does not store the entire sequence.

Q5 sample:
A list is more useful than a generator when you need to access items multiple times, sort them, or access elements randomly. It is ideal for small to moderate-sized datasets.
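
The difference can be seen in a short sketch. The rows and the `row_generator` helper below are made up for illustration and are not taken from the dataset or the exam code.

```python
from itertools import islice

# Made-up rows for illustration only.
rows = [
    {"state_name": "State A", "NIPR": 100},
    {"state_name": "State B", "NIPR": 200},
    {"state_name": "State C", "NIPR": 300},
]

def row_generator(records):
    """Yield one record at a time without building a list."""
    for record in records:
        yield record

# A list holds every item in memory, so it supports indexing.
as_list = list(row_generator(rows))
print(as_list[1]["state_name"])

# A generator has no stored sequence to index into.
gen = row_generator(rows)
try:
    gen[1]
except TypeError as err:
    print(f"Cannot index a generator: {err}")

# islice reaches a position by consuming items, so earlier items are gone afterwards.
second = next(islice(gen, 1, 2))
print(second["state_name"])
```

After the `islice` call, the generator has already moved past the first two rows, which is why a list is the better fit when items must be revisited.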


### Use the object-based generator you previously defined.
Implement a try-except exception handler to iterate through the dataframe and yield a value. If the generator has been exhausted (i.e., it reaches the last item), raise the built-in exception StopIteration.

In [22]:
def state_object_generator(dataframe):
    for index, row in dataframe.iterrows():
        try:
            yield StateInsuranceData(
                state=row['state_name'],
                insured_percent=row['PCTIC'],
                uninsured_percent=row['PCTUI']
            )
        except KeyError:
            print("Missing column in row")


obj_gen = state_object_generator(df) # Assuming 'df' is the dataframe loaded earlier

while True:
    try:
        obj = next(obj_gen)
        print(f"{obj.state}: {obj.insured_percent}% insured, High risk? {obj.is_high_risk()}")
    except StopIteration:
        print("All states processed – generator exhausted.")
        break

Streaming output truncated to the last 5000 lines.
Florida                                                               : 69.8% insured, High risk? True
Florida                                                               : 67.2% insured, High risk? True
Florida                                                               : 73.9% insured, High risk? True
Florida                                                               : 77.0% insured, High risk? True
Florida                                                               : 77.7% insured, High risk? True
Florida                                                               : 78.9% insured, High risk? True
Florida                                                               : 76.8% insured, High risk? True
Florida                                                               : 81.4% insured, High risk? True
Florida                                                               : 83.8% insured, High risk? True
Florida 

Follow-up question:  
Q6: What happens when a generator is exhausted? Why is it important to handle that case with StopIteration?

*Your answer*
When a generator is exhausted, it raises a StopIteration exception to signal that there is nothing else to yield. It's important to handle this case so that the loop ends cleanly and runtime errors are avoided when consuming generator values.
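
A small sketch of the three common ways to deal with exhaustion follows. The `count_up_to` generator is a hypothetical stand-in for the dataframe generator, chosen so that the example runs on its own.

```python
def count_up_to(limit):
    """Yield 1, 2, ..., limit and then finish."""
    for number in range(1, limit + 1):
        yield number

gen = count_up_to(2)
first = next(gen)
second = next(gen)

# 1. Calling next() on an exhausted generator raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
    print("Generator exhausted")

# 2. next() accepts a default that is returned instead of raising.
fallback = next(gen, None)

# 3. A for loop handles StopIteration internally and simply ends.
collected = [number for number in count_up_to(3)]
print(first, second, fallback, collected)
```

A `for` loop is usually the simplest choice; an explicit `try`/`except StopIteration` is only needed when values are pulled manually with `next()`.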

Bonus Questions (extra credit):
1. What was the most useful thing you learned in this course?
2. What is something we did not cover that you wished we had? Or wish we had covered in more detail?
3. Any suggestions for this course moving forward?

*Your answer*
1. I found pandas and matplotlib to be the most useful, as they make it easier to explore and visualize real-world data in a meaningful way.

2. I wish we had spent more time in class going through object-oriented design, because it helps organize code clearly and makes working with real data applications easier.

3. I would suggest adding a mini project or case study in the middle of the semester.
