### Instructions:
* Write your code in this Jupyter notebook only.
* Download this notebook and import it into JupyterLab.
* Write the partial code for Step 0 to Step 8, marked with the prefix ##.
* Fill in the blanks where the comments instruct you to.
* Leave the other code and the structure as they are.
* Follow all the instructions given in the cell comments.
* Upload this Jupyter notebook, with your partial code, after completion.
* Also upload the resulting image showing all the selected points and the boundary lines between them after LDA analysis.
* Due time: 1:30 PM

In [10]:
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
matplotlib.use('TkAgg')

##---------------------------------------------------
## Step 0: Install any other dependencies that are reported missing at run time (module not found).
##---------------------------------------------------

In [None]:
Number_of_points = 25  ## Number of points you want to select from each strip. Recommended >= 20

img = cv2.imread('Indian_Flag.jpg') ## Read the given image

def select_points(img, title):
    fig, ax = plt.subplots()
    #------------------------------------------
    ## step 1: Convert the img from BGR to RGB using cv2 and display it using plt.imshow
    image_RGB = (cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.imshow(image_RGB)

    ## step 2: Put title of the image
    plt.title(title)

    ##-----------------------------------------
    
    # Name the figure window and attach a red crosshair cursor to the axes
    fig.canvas.manager.set_window_title('Select Points')
    cursor = matplotlib.widgets.Cursor(ax, useblit=True, color='red', linewidth=1)
    plt.show(block=False)  # Show the image without blocking

    k = 0
    points = [] ## Create here an empty list to store points 

    while k < Number_of_points:
        xy = plt.ginput(1, timeout=0)  # Wait for one click; timeout=0 means never time out
        if len(xy) > 0:
            col, row = map(int, xy[0])  # Convert to integer
            ##-----------------------------------------------
            ## Step 3: Collect RGB values at the clicked positions (col, row) and print them.
            ##-----------------------------------------------
            # Note: img comes from cv2.imread, so this triplet is in (B, G, R) order.
            rgb_value = img[row, col]
            print(f"RGB value at ({row}, {col}): {rgb_value}")
            
            k += 1
            points.append([row, col, rgb_value])  # Store the position and color in the points list.
            
            # Display colored dot on the image
            plt.scatter(col, row, c='black', marker='o', s=10)

            # Redraw the image to include the dot
            plt.draw()

    plt.close()  # Close the window after all points are collected
    return points ## Fill this blank


In [15]:
##-----------------------------------------------------------------
## Step 4: Fill the blanks for selected points from the saffron strip
pts_saffron = select_points(img, "Select Points from Saffron Strip")
## Step 5: Fill the blanks for selected points from the white strip
pts_white = select_points(img, "Select Points from White Strip")
## Step 6: Fill the blanks for selected points from the green strip
pts_green = select_points(img, "Select Points from Green Strip")
##-----------------------------------------------------------------

RGB value at (269, 536): [ 24  89 251]
RGB value at (274, 535): [ 26  91 253]
RGB value at (278, 535): [ 16  87 245]
RGB value at (284, 533): [ 23  85 245]
RGB value at (279, 538): [ 14  87 245]
RGB value at (279, 544): [ 18  88 242]
RGB value at (279, 553): [ 19  89 243]
RGB value at (281, 564): [ 24  96 250]
RGB value at (280, 584): [ 26  94 255]
RGB value at (278, 605): [  9  79 240]
RGB value at (276, 622): [ 14  79 242]
RGB value at (276, 635): [  6  76 243]
RGB value at (275, 640): [  9  80 247]
RGB value at (288, 619): [  5  84 247]
RGB value at (292, 563): [ 16  89 247]
RGB value at (290, 554): [ 18  88 242]
RGB value at (288, 570): [ 20  88 249]
RGB value at (292, 574): [ 18  84 243]
RGB value at (282, 575): [  6  76 237]
RGB value at (290, 595): [ 11  82 246]
RGB value at (287, 608): [  6  80 246]
RGB value at (281, 630): [ 10  80 247]
RGB value at (287, 629): [ 15  85 252]
RGB value at (286, 641): [  4  75 242]
RGB value at (278, 651): [  4  75 242]
RGB value at (309, 532): 

In [16]:
# Convert the sampled colors to Lab color space.
# The stored triplets are in OpenCV's BGR order (img was read with cv2.imread),
# so the conversion uses COLOR_BGR2Lab rather than COLOR_RGB2Lab.
def bgr_to_lab(bgr):
    return cv2.cvtColor(np.uint8([[bgr]]), cv2.COLOR_BGR2Lab)[0][0]

saffron_lab = np.array([bgr_to_lab(bgr) for _, _, bgr in pts_saffron])
white_lab = np.array([bgr_to_lab(bgr) for _, _, bgr in pts_white])
green_lab = np.array([bgr_to_lab(bgr) for _, _, bgr in pts_green])

## Step7: Extract a* and b* components from Lab color space
a_features = np.hstack((saffron_lab[:, 1], white_lab[:, 1], green_lab[:, 1]))
b_features = np.hstack((saffron_lab[:, 2], white_lab[:, 2], green_lab[:, 2]))

In [17]:
# Map class labels to numeric values
class_mapping = {'Saffron': 0, 'White': 1, 'Green': 2}
labels = ['Saffron'] * Number_of_points + ['White'] * Number_of_points + ['Green'] * Number_of_points
y = np.array([class_mapping[label] for label in labels])

plt.figure()
# Marker colors follow the strip they represent (gray stands in for white on a white background)
plt.scatter(a_features[:Number_of_points], b_features[:Number_of_points], c='orange', marker='o', s=50, label='Saffron')
plt.scatter(a_features[Number_of_points:2*Number_of_points], b_features[Number_of_points:2*Number_of_points], c='gray', marker='^', s=50, label='White')
plt.scatter(a_features[2*Number_of_points:], b_features[2*Number_of_points:], c='green', marker='*', s=50, label='Green')
plt.legend(loc='best')
plt.xlabel('a*')  ## Provide x label
plt.ylabel('b*') ## Provide y label
plt.title('Lab Color Space') ## Provide title
plt.grid()
plt.show()

##------------------------------------------------------------
# Step 8: Perform LDA analysis using LinearDiscriminantAnalysis() and lda.fit()
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis()
lda.fit(np.column_stack((a_features, b_features)), y)

##-----------------------------------------------------------



In [18]:
# Plot LDA boundaries
plt.figure()
# Marker colors follow the strip they represent (gray stands in for white on a white background)
plt.scatter(a_features[:Number_of_points], b_features[:Number_of_points], c='orange', marker='o', s=50, label='Saffron')
plt.scatter(a_features[Number_of_points:2*Number_of_points], b_features[Number_of_points:2*Number_of_points], c='gray', marker='^', s=50, label='White')
plt.scatter(a_features[2*Number_of_points:], b_features[2*Number_of_points:], c='green', marker='*', s=50, label='Green')

plt.xlabel('a*')  ## Provide x label
plt.ylabel('b*') ## Provide y label
plt.title('LDA boundaries (linear model) for Colors of the Indian Flag')

# Plot the decision boundaries
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()

xx, yy = np.meshgrid(np.linspace(xlim[0], xlim[1], 100), np.linspace(ylim[0], ylim[1], 100))
Z = lda.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contour(xx, yy, Z, colors='k', linewidths=2, linestyles='solid')
plt.legend(loc='best')
plt.grid()
plt.show()

## Answer the following questions:
1. What are the key assumptions underlying LDA, and how do these assumptions influence the model's performance?
2. What are the hyperparameters in LDA, and how do they affect the outcome of the model?
3. What methods can be employed to assess the effectiveness of an LDA model in terms of the separation of topics and the coherence of generated topics?
4. What are some common challenges or limitations associated with LDA, and how can they be addressed or mitigated?
5. What practical applications does this assignment have in real-world situations, and what benefits does it offer in those specific scenarios?

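1. Key assumptions in LDA include: (a) within each class, the features follow a multivariate Gaussian distribution, (b) all classes share the same covariance matrix. Feature independence is not required; the shared covariance matrix may contain correlations between features. Under these assumptions the optimal decision boundaries are linear. When the assumptions are violated, for example when class covariances differ strongly or the data are heavily skewed, the linear boundaries can be misplaced and points near them misclassified.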
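
The shared-covariance assumption can be inspected by estimating one covariance matrix per class and comparing them. The following is a minimal sketch on synthetic data; the class means, the covariance `shared_cov`, and the helper `class_covariances` are illustrative assumptions, not values from the flag image.

```python
import numpy as np

def class_covariances(X, y):
    """Return a dict mapping each class label to its sample covariance matrix."""
    return {int(label): np.cov(X[y == label], rowvar=False) for label in np.unique(y)}

rng = np.random.default_rng(0)

# Synthetic stand-in for (a*, b*) features: three classes, one shared covariance (assumed values)
shared_cov = np.array([[4.0, 1.0], [1.0, 3.0]])
means = [(150, 170), (128, 128), (100, 150)]
X = np.vstack([rng.multivariate_normal(m, shared_cov, size=200) for m in means])
y = np.repeat([0, 1, 2], 200)

covs = class_covariances(X, y)
for label, cov in covs.items():
    # Similar matrices across classes support the equal-covariance assumption
    print(label, np.round(cov, 2))
```

With the notebook's own data, `X` would be `np.column_stack((a_features, b_features))`; with only 25 points per class the estimates are noisy, so small differences between the matrices are expected.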

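2. Hyperparameters in LDA (as exposed by scikit-learn's `LinearDiscriminantAnalysis`) include: (a) the prior probabilities of the classes (`priors`), which shift the decision boundaries away from the more probable classes and help with imbalanced data, (b) the regularization of the covariance estimate (`shrinkage`, available with the `lsqr` and `eigen` solvers), which limits overfitting when there are few samples relative to the number of features, (c) the `solver` itself and, when LDA is used for dimensionality reduction, `n_components`.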
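
As a hedged illustration of the prior and regularization settings, the sketch below fits scikit-learn's `LinearDiscriminantAnalysis` twice on synthetic data: once with defaults and once with explicit `priors` and automatic `shrinkage`. The class centers and the chosen prior values are assumptions made for the example.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for (a*, b*) features, 25 points per class (assumed centers)
centers = [(150, 170), (128, 128), (100, 150)]
X = np.vstack([rng.normal(loc=c, scale=2.0, size=(25, 2)) for c in centers])
y = np.repeat([0, 1, 2], 25)

# Defaults: priors are estimated from the class frequencies, no shrinkage
default_lda = LinearDiscriminantAnalysis().fit(X, y)

# Explicit priors plus automatic shrinkage (shrinkage requires the 'lsqr' or 'eigen' solver)
tuned_lda = LinearDiscriminantAnalysis(
    solver='lsqr', shrinkage='auto', priors=[0.5, 0.25, 0.25]
).fit(X, y)

print("default priors:", default_lda.priors_)
print("tuned priors:  ", tuned_lda.priors_)
print("training accuracy:", default_lda.score(X, y), tuned_lda.score(X, y))
```

On well-separated classes like these the two models should behave almost identically; the settings matter more when classes overlap or samples are scarce.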

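3. The wording of this question (topics, coherence) refers to Latent Dirichlet Allocation, the topic model that shares the LDA acronym. For that model, common measures are (a) perplexity on held-out documents, which measures predictive fit, together with topic-coherence scores, and (b) visualization techniques such as t-SNE to inspect topic separation. For the Linear Discriminant Analysis used in this notebook, effectiveness is better assessed with classification accuracy (ideally cross-validated), a confusion matrix, and a visual check of the decision boundaries against the scatter plot.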
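
For the Linear Discriminant Analysis classifier fitted in this notebook, class separation can be quantified with cross-validated accuracy and a confusion matrix. This is a minimal sketch on synthetic stand-in features (the class centers are assumed); with the notebook's data, `X` and `y` would be the stacked a*/b* features and the numeric labels.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for (a*, b*) features, 25 points per class (assumed centers)
centers = [(150, 170), (128, 128), (100, 150)]
X = np.vstack([rng.normal(loc=c, scale=2.0, size=(25, 2)) for c in centers])
y = np.repeat([0, 1, 2], 25)

# 5-fold cross-validation: each fold is predicted by a model that never saw it
scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5)
pred = cross_val_predict(LinearDiscriminantAnalysis(), X, y, cv=5)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y, pred)

print("fold accuracies:", scores)
print("confusion matrix:\n", cm)
```

Off-diagonal entries in the confusion matrix show which pair of strips is being confused, which is more informative than a single accuracy figure.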

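4. Common challenges in LDA include: (a) sensitivity to its assumptions, since non-Gaussian data, unequal class covariances, or outliers can bias the boundaries, (b) the restriction to linear boundaries, which cannot separate classes that overlap in a non-linear way, (c) unstable covariance estimates when the number of features is large relative to the number of samples. These can be addressed by checking the assumptions on the data, regularizing the covariance estimate (shrinkage), reducing the dimensionality first, or switching to a more flexible model such as quadratic discriminant analysis.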
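
The sensitivity to assumptions can be made concrete with a deliberately unfavorable case: two classes with nearly the same mean but very different spread. The sketch below uses synthetic data (a compact cluster inside a ring, both assumed for illustration) and compares LDA with scikit-learn's `QuadraticDiscriminantAnalysis`, which drops the shared-covariance assumption.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(0)

# Class 0: compact cluster at the origin. Class 1: noisy ring of radius 5 around it.
inner = rng.normal(0.0, 1.0, size=(200, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
ring = np.column_stack((5.0 * np.cos(angles), 5.0 * np.sin(angles)))
outer = ring + rng.normal(0.0, 0.3, size=(200, 2))

X = np.vstack((inner, outer))
y = np.repeat([0, 1], 200)

# Training accuracy is enough here to show the qualitative gap
lda_acc = LinearDiscriminantAnalysis().fit(X, y).score(X, y)
qda_acc = QuadraticDiscriminantAnalysis().fit(X, y).score(X, y)

print(f"LDA accuracy: {lda_acc:.2f}")
print(f"QDA accuracy: {qda_acc:.2f}")
```

No straight line separates a cluster from the ring around it, so LDA is expected to do little better than chance here, while the quadratic model can draw a closed boundary.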

5. Practical applications of this assignment include: (a) image segmentation, such as separating the strips of a flag, (b) color-based classification in other domains. The benefits are fast, interpretable classification and pattern recognition in areas such as manufacturing and image processing.
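
To sketch how a fitted LDA model could drive segmentation, the example below trains on a few labeled color samples and then labels every pixel of a small image. Everything here is synthetic: the image is a hypothetical 30x40 array of two color features per pixel with three horizontal strips, and the class centers are assumed values rather than measurements from the flag.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Assumed (a*, b*) centers for the three strips
centers = np.array([[150.0, 170.0], [128.0, 128.0], [100.0, 150.0]])

# Training set: 25 noisy samples per strip, like the clicked points
X_train = np.vstack([c + rng.normal(0.0, 2.0, size=(25, 2)) for c in centers])
y_train = np.repeat([0, 1, 2], 25)
lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

# Synthetic image: three horizontal strips, two features per pixel
height, width = 30, 40
true_labels = np.tile(np.repeat([0, 1, 2], height // 3)[:, None], (1, width))
features = centers[true_labels] + rng.normal(0.0, 2.0, size=(height, width, 2))

# Classify every pixel, then restore the image shape to get a label map
segmentation = lda.predict(features.reshape(-1, 2)).reshape(height, width)
accuracy = (segmentation == true_labels).mean()

print("label map shape:", segmentation.shape)
print("pixel accuracy:", accuracy)
```

For a real photograph the per-pixel features would come from converting the whole image to Lab and taking its a* and b* channels, and the label map could be displayed with `plt.imshow`.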