# Python Programming Final

## Part 1 – Housing Data Analysis (60 pts)
1. Create a dataframe from the housing.csv file.
2. Create a smaller dataframe containing only lot size and price.
3. Compute the mean values of both columns.
4. Compute recentered data series: value minus mean.
5. Compute the correlation coefficient manually using the recentered values.
6. Use .corr() to verify the correlation matches.

## Part 2 – Histogram Equalization (40 pts)

Apply histogram equalization to the following 5×5 image matrix with gray levels 0–7:

```
3 4 2 0 0
1 1 4 0 0
0 2 4 1 1
0 0 1 4 2
0 0 1 2 5
```

## Answer to Part 1:

In [2]:
import pandas as pd
# Answer to 1: Create a DataFrame from the housing.csv data
# pd.read_csv already returns a DataFrame, so no extra conversion is needed
df = pd.read_csv('housing.csv')

# Answer to 2: Make a smaller DataFrame with only 'lotsize' and 'price'
small_df = df[['lotsize', 'price']]

# Answer to 3: Compute the mean values
mean_lotsize = small_df['lotsize'].mean()
mean_price = small_df['price'].mean()

# Answer to 4: Compute recentered data series
recentered_lotsize = small_df['lotsize'] - mean_lotsize
recentered_price = small_df['price'] - mean_price

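# Answer to 5: Compute correlation coefficient manually
# (Pearson's r: sum of products of recentered values divided by the
# product of the square roots of the summed squares)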
numerator = (recentered_lotsize * recentered_price).sum()
denominator = (recentered_lotsize**2).sum()**0.5 * (recentered_price**2).sum()**0.5
correlation_manual = numerator / denominator

# Answer to 6: Compute correlation matrix using .corr()
correlation_matrix = small_df.corr()
correlation_corr_method = correlation_matrix.loc['lotsize', 'price']

# Print all results
print(f"Mean lot size: {mean_lotsize:.4f}")
print(f"Mean price: {mean_price:.4f}")
print(f"Manual correlation coefficient: {correlation_manual:.4f}")
print(f"Correlation from .corr() method: {correlation_corr_method:.4f}")

Mean lot size: 5150.2656
Mean price: 68121.5971
Manual correlation coefficient: 0.5358
Correlation from .corr() method: 0.5358


In [None]:
# Conclusion: The manually computed correlation coefficient (0.5358) matches
# the value returned by pandas' .corr() method to four decimal places.
# This agreement indicates that the manual implementation of the Pearson
# formula is correct.
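
The manual calculation in Answer 5 can also be checked without `housing.csv`. The following is a minimal, dependency-free sketch of the same formula; the function name `pearson_r` and the sample values are made up for illustration and do not come from the housing data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from recentered values (value minus mean)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Recenter both series, as in Answer 4
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Same numerator and denominator as in Answer 5
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return numerator / denominator


# Hypothetical sample values, for illustration only (not taken from housing.csv)
lot_sizes = [1.0, 2.0, 3.0, 4.0]
prices = [2.0, 4.0, 6.0, 8.0]
print(f"{pearson_r(lot_sizes, prices):.4f}")
```

Because the sample prices are an exact linear function of the sample lot sizes, the result should be 1 up to floating-point rounding. Reversing one of the lists should give -1.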

## Answer to Part 2:

In [3]:
# Step 1: Define the image and its dimensions
image = [
    [3, 4, 2, 0, 0],
    [1, 1, 4, 0, 0],
    [0, 2, 4, 1, 1],
    [0, 0, 1, 4, 2],
    [0, 0, 1, 2, 5]
]

rows = len(image)
cols = len(image[0])
L = 8  # number of gray levels: 0-7 from task

# Step 2: Compute histogram
hist = [0] * L
for row in image:
    for pixel in row:
        hist[pixel] += 1

# Step 3: Compute cumulative histogram (CDF); it is normalized in Step 4
total_pixels = rows * cols
cdf = [0] * L
cdf[0] = hist[0]
for i in range(1, L):
    cdf[i] = cdf[i - 1] + hist[i]

# Step 4: Normalize the CDF to [0, L-1]
cdf_min = next(c for c in cdf if c > 0)
equalized = [round((cdf[i] - cdf_min) / (total_pixels - cdf_min) * (L - 1)) for i in range(L)]

# Step 5: Map original image using equalized values
equalized_image = [[equalized[pixel] for pixel in row] for row in image]

# Print result
print("Original Image:")
for row in image:
    print(row)

print("\nEqualized Image:")
for row in equalized_image:
    print(row)

Original Image:
[3, 4, 2, 0, 0]
[1, 1, 4, 0, 0]
[0, 2, 4, 1, 1]
[0, 0, 1, 4, 2]
[0, 0, 1, 2, 5]

Equalized Image:
[5, 7, 4, 0, 0]
[3, 3, 7, 0, 0]
[0, 4, 7, 3, 3]
[0, 0, 3, 7, 4]
[0, 0, 3, 4, 7]
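
To make the output above easier to verify by hand, the intermediate values (histogram, CDF, and the resulting level mapping) can be printed as a table. This is a compact sketch of the same steps using `collections.Counter` and `itertools.accumulate` from the standard library; the name `LEVELS` is chosen here for illustration and plays the role of `L` in the cell above.

```python
from collections import Counter
from itertools import accumulate

image = [
    [3, 4, 2, 0, 0],
    [1, 1, 4, 0, 0],
    [0, 2, 4, 1, 1],
    [0, 0, 1, 4, 2],
    [0, 0, 1, 2, 5],
]
LEVELS = 8  # gray levels 0-7

# Histogram: number of pixels at each gray level
counts = Counter(pixel for row in image for pixel in row)
hist = [counts.get(level, 0) for level in range(LEVELS)]

# Cumulative histogram (running sum of the histogram)
cdf = list(accumulate(hist))

# Same normalization as Step 4 above. This assumes the image is not
# constant; otherwise total == cdf_min and the division would fail.
cdf_min = next(c for c in cdf if c > 0)
total = cdf[-1]
mapping = [round((c - cdf_min) / (total - cdf_min) * (LEVELS - 1)) for c in cdf]

print("level  count  cdf  new")
for level in range(LEVELS):
    print(f"{level:>5}  {hist[level]:>5}  {cdf[level]:>3}  {mapping[level]:>3}")
```

Levels 6 and 7 do not occur in the image, so their rows have a count of 0 and their mapped values are never used.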


In [None]:
# Conclusion: Histogram equalization increased the contrast of the image.
# In the original image most pixels had low gray levels (19 of the 25
# pixels are 0, 1, or 2), and only levels 0-5 were used. After
# equalization the values span the full range 0-7, so the differences
# between the darker levels are larger (for example, 0 and 1 became 0 and 3).
# The number of distinct levels did not increase: levels 4 and 5 both map
# to 7, so the image went from six distinct levels to five. Equalization
# spreads the existing levels apart; it cannot add detail that was not in
# the original image. The printed result agrees with the mapping computed
# from the CDF, so the algorithm works as intended.
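
The contrast claim can be checked numerically by comparing the range and the set of distinct gray levels before and after equalization. The sketch below hard-codes the two matrices printed above; the helper name `summarize` is hypothetical.

```python
original = [
    [3, 4, 2, 0, 0],
    [1, 1, 4, 0, 0],
    [0, 2, 4, 1, 1],
    [0, 0, 1, 4, 2],
    [0, 0, 1, 2, 5],
]
equalized = [
    [5, 7, 4, 0, 0],
    [3, 3, 7, 0, 0],
    [0, 4, 7, 3, 3],
    [0, 0, 3, 7, 4],
    [0, 0, 3, 4, 7],
]


def summarize(img):
    """Return the min, max, and sorted distinct gray levels of a 2D image."""
    pixels = [pixel for row in img for pixel in row]
    return {"min": min(pixels), "max": max(pixels), "distinct": sorted(set(pixels))}


for name, img in (("original", original), ("equalized", equalized)):
    info = summarize(img)
    print(f"{name}: range {info['min']}-{info['max']}, levels {info['distinct']}")
```

The range and the list of levels give a quick view of how far the gray values were stretched and whether any levels were merged by the mapping.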