
Added column vector util func, fixed clustering
- added column() function to utils to create a column vector for a 1D list
- fixed typos in density clustering equations
- made hard k-means algorithm also return the cluster label for each data point
- improved type hints and docstrings
jerela committed Nov 29, 2023
1 parent 683f97b commit 50fdb9b
Showing 9 changed files with 71 additions and 40 deletions.
18 changes: 5 additions & 13 deletions documentation/mola.clustering.html
@@ -41,7 +41,7 @@
Arguments:<br>
p1&nbsp;--&nbsp;list:&nbsp;the&nbsp;first&nbsp;point<br>
p2&nbsp;--&nbsp;list:&nbsp;the&nbsp;second&nbsp;point</tt></dd></dl>
<dl><dt><a name="-find_c_means"><strong>find_c_means</strong></a>(data: mola.matrix.Matrix, num_centers=2, max_iterations=100, distance_function=&lt;function distance_euclidean_pow at 0x0000023FFD4614C0&gt;, initial_centers=None)</dt><dd><tt>Return&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;and&nbsp;the&nbsp;membership&nbsp;matrix&nbsp;of&nbsp;points&nbsp;using&nbsp;soft&nbsp;k-means&nbsp;clustering&nbsp;(also&nbsp;known&nbsp;as&nbsp;fuzzy&nbsp;c-means).<br>
<dl><dt><a name="-find_c_means"><strong>find_c_means</strong></a>(data: mola.matrix.Matrix, num_centers=2, max_iterations=100, distance_function=&lt;function distance_euclidean_pow at 0x000002B30AB56670&gt;, initial_centers=None)</dt><dd><tt>Return&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;and&nbsp;the&nbsp;membership&nbsp;matrix&nbsp;of&nbsp;points&nbsp;using&nbsp;soft&nbsp;k-means&nbsp;clustering&nbsp;(also&nbsp;known&nbsp;as&nbsp;fuzzy&nbsp;c-means).<br>
&nbsp;<br>
Fuzzy&nbsp;c-means&nbsp;clustering&nbsp;is&nbsp;an&nbsp;iterative&nbsp;algorithm&nbsp;that&nbsp;finds&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;by&nbsp;first&nbsp;assigning&nbsp;each&nbsp;point&nbsp;to&nbsp;each&nbsp;cluster&nbsp;center&nbsp;with&nbsp;a&nbsp;certain&nbsp;membership&nbsp;value&nbsp;(0&nbsp;to&nbsp;1)&nbsp;and&nbsp;then&nbsp;updating&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;to&nbsp;be&nbsp;the&nbsp;weighted&nbsp;mean&nbsp;of&nbsp;the&nbsp;points&nbsp;assigned&nbsp;to&nbsp;them.&nbsp;This&nbsp;process&nbsp;is&nbsp;repeated&nbsp;for&nbsp;a&nbsp;set&nbsp;number&nbsp;of&nbsp;iterations&nbsp;or&nbsp;until&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;converge.&nbsp;The&nbsp;initial&nbsp;cluster&nbsp;centers&nbsp;are&nbsp;either&nbsp;randomized&nbsp;or&nbsp;given&nbsp;by&nbsp;the&nbsp;user.<br>
A&nbsp;major&nbsp;difference&nbsp;between&nbsp;hard&nbsp;k-means&nbsp;clustering&nbsp;and&nbsp;fuzzy&nbsp;c-means&nbsp;clustering&nbsp;is&nbsp;that&nbsp;in&nbsp;fuzzy&nbsp;c-means&nbsp;clustering,&nbsp;the&nbsp;points&nbsp;may&nbsp;belong&nbsp;partially&nbsp;to&nbsp;several&nbsp;clusters&nbsp;instead&nbsp;of&nbsp;belonging&nbsp;completely&nbsp;to&nbsp;one&nbsp;cluster,&nbsp;like&nbsp;in&nbsp;hard&nbsp;k-means&nbsp;clustering.&nbsp;Therefore,&nbsp;this&nbsp;algorithm&nbsp;is&nbsp;well-suited&nbsp;to&nbsp;cluster&nbsp;data&nbsp;that&nbsp;is&nbsp;not&nbsp;clearly&nbsp;separable&nbsp;into&nbsp;distinct&nbsp;clusters&nbsp;(e.g.,&nbsp;symmetric&nbsp;distribution&nbsp;of&nbsp;data&nbsp;points).<br>
@@ -59,9 +59,9 @@
Arguments:<br>
data&nbsp;--&nbsp;Matrix:&nbsp;the&nbsp;data&nbsp;containing&nbsp;the&nbsp;points&nbsp;to&nbsp;be&nbsp;clustered<br>
num_centers&nbsp;--&nbsp;int:&nbsp;the&nbsp;number&nbsp;of&nbsp;cluster&nbsp;centers&nbsp;to&nbsp;be&nbsp;found&nbsp;(default&nbsp;2)<br>
beta&nbsp;--&nbsp;float:&nbsp;the&nbsp;width&nbsp;of&nbsp;the&nbsp;Gaussian&nbsp;function&nbsp;(default&nbsp;0.5)<br>
sigma&nbsp;--&nbsp;float:&nbsp;the&nbsp;width&nbsp;of&nbsp;the&nbsp;Gaussian&nbsp;function&nbsp;(default&nbsp;0.5)</tt></dd></dl>
<dl><dt><a name="-find_k_means"><strong>find_k_means</strong></a>(data: mola.matrix.Matrix, num_centers=2, max_iterations=100, distance_function=&lt;function distance_euclidean_pow at 0x0000023FFD4614C0&gt;, initial_centers=None)</dt><dd><tt>Return&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;using&nbsp;hard&nbsp;k-means&nbsp;clustering.<br>
beta&nbsp;--&nbsp;float:&nbsp;the&nbsp;width&nbsp;of&nbsp;the&nbsp;Gaussian&nbsp;function&nbsp;(default&nbsp;0.5)&nbsp;used&nbsp;to&nbsp;destruct&nbsp;the&nbsp;mountain&nbsp;function<br>
sigma&nbsp;--&nbsp;float:&nbsp;the&nbsp;width&nbsp;of&nbsp;the&nbsp;Gaussian&nbsp;function&nbsp;(default&nbsp;0.5)&nbsp;used&nbsp;to&nbsp;construct&nbsp;the&nbsp;mountain&nbsp;function</tt></dd></dl>
<dl><dt><a name="-find_k_means"><strong>find_k_means</strong></a>(data: mola.matrix.Matrix, num_centers=2, max_iterations=100, distance_function=&lt;function distance_euclidean_pow at 0x000002B30AB56670&gt;, initial_centers=None) -&gt; mola.matrix.Matrix</dt><dd><tt>Return&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;using&nbsp;hard&nbsp;k-means&nbsp;clustering.<br>
&nbsp;<br>
K-means&nbsp;clustering&nbsp;is&nbsp;an&nbsp;iterative&nbsp;algorithm&nbsp;that&nbsp;finds&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;by&nbsp;first&nbsp;assigning&nbsp;each&nbsp;point&nbsp;to&nbsp;the&nbsp;closest&nbsp;cluster&nbsp;center&nbsp;and&nbsp;then&nbsp;updating&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;to&nbsp;be&nbsp;the&nbsp;mean&nbsp;of&nbsp;the&nbsp;points&nbsp;assigned&nbsp;to&nbsp;them.&nbsp;This&nbsp;process&nbsp;is&nbsp;repeated&nbsp;for&nbsp;a&nbsp;set&nbsp;number&nbsp;of&nbsp;iterations&nbsp;or&nbsp;until&nbsp;the&nbsp;cluster&nbsp;centers&nbsp;converge.&nbsp;The&nbsp;initial&nbsp;cluster&nbsp;centers&nbsp;are&nbsp;either&nbsp;randomized&nbsp;or&nbsp;given&nbsp;by&nbsp;the&nbsp;user.<br>
&nbsp;<br>
@@ -74,13 +74,5 @@
distance_function&nbsp;--&nbsp;function:&nbsp;the&nbsp;distance&nbsp;function&nbsp;to&nbsp;be&nbsp;used&nbsp;(default&nbsp;Euclidean&nbsp;distance);&nbsp;options&nbsp;are&nbsp;squared&nbsp;Euclidean&nbsp;distance&nbsp;(distance_euclidean_pow)&nbsp;and&nbsp;taxicab&nbsp;distance&nbsp;(distance_taxicab)<br>
initial_centers&nbsp;--&nbsp;Matrix:&nbsp;the&nbsp;initial&nbsp;cluster&nbsp;centers;&nbsp;if&nbsp;not&nbsp;specified,&nbsp;they&nbsp;are&nbsp;initialized&nbsp;randomly&nbsp;(default&nbsp;None)</tt></dd></dl>
<dl><dt><a name="-random"><strong>random</strong></a>()<font color="#909090"><font face="helvetica, arial"> method of <a href="random.html#Random">random.Random</a> instance</font></font></dt><dd><tt><a href="#-random">random</a>()&nbsp;-&gt;&nbsp;x&nbsp;in&nbsp;the&nbsp;interval&nbsp;[0,&nbsp;1).</tt></dd></dl>
</td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#55aa55">
<td colspan=3 valign=bottom>&nbsp;<br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Data</strong></big></font></td></tr>

<tr><td bgcolor="#55aa55"><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt></td><td>&nbsp;</td>
<td width="100%"><strong>INFINITE</strong> = 4294967295<br>
<strong>INFINITY</strong> = inf</td></tr></table>
</td></tr></table>
</body></html>
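The docstrings above describe hard k-means as an iterative assign/update loop, and this commit changes `find_k_means` to return the per-point cluster labels alongside the centers. A minimal self-contained sketch of that behavior, using plain Python tuples instead of mola's `Matrix` (every name here is an illustrative stand-in, not the library's actual code):

```python
import random

def find_k_means(points, num_centers=2, max_iterations=100, seed=0):
    """Illustrative hard k-means returning (centers, labels),
    mirroring the commit's change of also returning each point's label."""
    rng = random.Random(seed)
    centers = rng.sample(points, num_centers)  # randomized initial centers
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # assignment step: each point joins its closest center
        for i, p in enumerate(points):
            labels[i] = min(
                range(num_centers),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
            )
        # update step: each center becomes the mean of its assigned points
        new_centers = []
        for k in range(num_centers):
            members = [p for i, p in enumerate(points) if labels[i] == k]
            if not members:
                new_centers.append(centers[k])  # keep an empty cluster's center
                continue
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        if new_centers == centers:  # converged before max_iterations
            break
        centers = new_centers
    return centers, labels
```

For two well-separated clumps this converges in a few iterations regardless of which points are drawn as initial centers.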
4 changes: 2 additions & 2 deletions documentation/mola.matrix.html
@@ -270,7 +270,7 @@

<dl><dt><a name="LabeledMatrix-make_identity"><strong>make_identity</strong></a>(self) -&gt; None</dt><dd><tt>Set&nbsp;all&nbsp;diagonal&nbsp;elements&nbsp;of&nbsp;the&nbsp;matrix&nbsp;to&nbsp;1&nbsp;and&nbsp;all&nbsp;non-diagonal&nbsp;elements&nbsp;to&nbsp;0.</tt></dd></dl>

<dl><dt><a name="LabeledMatrix-norm_Euclidean"><strong>norm_Euclidean</strong></a>(self)</dt><dd><tt>Return&nbsp;the&nbsp;Euclidean&nbsp;norm&nbsp;of&nbsp;the&nbsp;matrix.</tt></dd></dl>
<dl><dt><a name="LabeledMatrix-norm_Euclidean"><strong>norm_Euclidean</strong></a>(self) -&gt; float</dt><dd><tt>Return&nbsp;the&nbsp;Euclidean&nbsp;norm&nbsp;of&nbsp;the&nbsp;matrix.</tt></dd></dl>

<dl><dt><a name="LabeledMatrix-row_is_zeros"><strong>row_is_zeros</strong></a>(self, r: int) -&gt; bool</dt><dd><tt>Return&nbsp;true&nbsp;if&nbsp;all&nbsp;elements&nbsp;in&nbsp;the&nbsp;row&nbsp;are&nbsp;zero-valued.&nbsp;Otherwise,&nbsp;return&nbsp;false.<br>
&nbsp;<br>
@@ -493,7 +493,7 @@

<dl><dt><a name="Matrix-make_identity"><strong>make_identity</strong></a>(self) -&gt; None</dt><dd><tt>Set&nbsp;all&nbsp;diagonal&nbsp;elements&nbsp;of&nbsp;the&nbsp;matrix&nbsp;to&nbsp;1&nbsp;and&nbsp;all&nbsp;non-diagonal&nbsp;elements&nbsp;to&nbsp;0.</tt></dd></dl>

<dl><dt><a name="Matrix-norm_Euclidean"><strong>norm_Euclidean</strong></a>(self)</dt><dd><tt>Return&nbsp;the&nbsp;Euclidean&nbsp;norm&nbsp;of&nbsp;the&nbsp;matrix.</tt></dd></dl>
<dl><dt><a name="Matrix-norm_Euclidean"><strong>norm_Euclidean</strong></a>(self) -&gt; float</dt><dd><tt>Return&nbsp;the&nbsp;Euclidean&nbsp;norm&nbsp;of&nbsp;the&nbsp;matrix.</tt></dd></dl>

<dl><dt><a name="Matrix-print"><strong>print</strong></a>(self, precision=4)</dt><dd><tt>Print&nbsp;a&nbsp;string&nbsp;that&nbsp;describes&nbsp;the&nbsp;matrix.<br>
Rows&nbsp;are&nbsp;delimited&nbsp;by&nbsp;semicolons&nbsp;and&nbsp;newlines.&nbsp;Elements&nbsp;in&nbsp;a&nbsp;single&nbsp;row&nbsp;are&nbsp;delimited&nbsp;by&nbsp;commas.<br>
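The `-> float` hint added to `norm_Euclidean` documents what the method computes: the square root of the sum of squared elements (the Frobenius norm). A one-line sketch of that computation over a hypothetical list-of-lists stand-in for mola's `Matrix`:

```python
import math

def norm_euclidean(rows):
    """Euclidean (Frobenius) norm: sqrt of the sum of squared elements.
    'rows' is a list of lists standing in for a Matrix."""
    return math.sqrt(sum(x * x for row in rows for x in row))
```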
4 changes: 2 additions & 2 deletions documentation/mola.regression.html
@@ -17,7 +17,7 @@
<font color="#ffffff" face="helvetica, arial"><big><strong>Functions</strong></big></font></td></tr>

<tr><td bgcolor="#eeaa77"><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt></td><td>&nbsp;</td>
<td width="100%"><dl><dt><a name="-fit_nonlinear"><strong>fit_nonlinear</strong></a>(independent_values, dependent_values, h, J, initial=None, max_iters=100)</dt><dd><tt>Return&nbsp;the&nbsp;estimated&nbsp;parameters&nbsp;of&nbsp;a&nbsp;nonlinear&nbsp;model&nbsp;using&nbsp;the&nbsp;Gauss-Newton&nbsp;iteration&nbsp;algorithm.<br>
<td width="100%"><dl><dt><a name="-fit_nonlinear"><strong>fit_nonlinear</strong></a>(independent_values: mola.matrix.Matrix, dependent_values: mola.matrix.Matrix, h: mola.matrix.Matrix, J: mola.matrix.Matrix, initial=None, max_iters=100)</dt><dd><tt>Return&nbsp;the&nbsp;estimated&nbsp;parameters&nbsp;of&nbsp;a&nbsp;nonlinear&nbsp;model&nbsp;using&nbsp;the&nbsp;Gauss-Newton&nbsp;iteration&nbsp;algorithm.<br>
&nbsp;<br>
The&nbsp;algorithm&nbsp;uses&nbsp;Gauss-Newton&nbsp;iteration&nbsp;to&nbsp;find&nbsp;the&nbsp;parameters&nbsp;that&nbsp;minimize&nbsp;the&nbsp;least&nbsp;squares&nbsp;criterion&nbsp;||y-h(theta)||^2,&nbsp;where&nbsp;y&nbsp;is&nbsp;the&nbsp;vector&nbsp;of&nbsp;dependent&nbsp;values,&nbsp;h&nbsp;is&nbsp;the&nbsp;model&nbsp;function,&nbsp;and&nbsp;theta&nbsp;is&nbsp;the&nbsp;vector&nbsp;of&nbsp;the&nbsp;function's&nbsp;parameters.&nbsp;The&nbsp;estimates&nbsp;are&nbsp;improved&nbsp;iteratively&nbsp;by&nbsp;evaluating&nbsp;the&nbsp;gradient&nbsp;of&nbsp;the&nbsp;least&nbsp;squares&nbsp;criterion&nbsp;and&nbsp;using&nbsp;that&nbsp;gradient&nbsp;to&nbsp;update&nbsp;the&nbsp;parameter&nbsp;estimates&nbsp;in&nbsp;small&nbsp;steps.&nbsp;The&nbsp;gradient&nbsp;is&nbsp;approximated&nbsp;by&nbsp;Jacobian&nbsp;matrices.<br>
&nbsp;<br>
@@ -28,7 +28,7 @@
J&nbsp;--&nbsp;Matrix:&nbsp;the&nbsp;Jacobian&nbsp;matrix&nbsp;of&nbsp;the&nbsp;model&nbsp;function<br>
initial&nbsp;--&nbsp;Matrix:&nbsp;the&nbsp;initial&nbsp;guess&nbsp;of&nbsp;the&nbsp;parameters&nbsp;(default&nbsp;None,&nbsp;in&nbsp;which&nbsp;case&nbsp;they&nbsp;are&nbsp;randomized)<br>
max_iters&nbsp;--&nbsp;int:&nbsp;the&nbsp;maximum&nbsp;number&nbsp;of&nbsp;iterations&nbsp;(default&nbsp;100)</tt></dd></dl>
<dl><dt><a name="-fit_univariate_polynomial"><strong>fit_univariate_polynomial</strong></a>(independent_values, dependent_values, degrees=[1], intercept=True, weights=None, regularization_coefficient=None)</dt><dd><tt>Return&nbsp;the&nbsp;parameters&nbsp;of&nbsp;an&nbsp;nth-order&nbsp;polynomial&nbsp;in&nbsp;a&nbsp;tuple.<br>
<dl><dt><a name="-fit_univariate_polynomial"><strong>fit_univariate_polynomial</strong></a>(independent_values: mola.matrix.Matrix, dependent_values: mola.matrix.Matrix, degrees=[1], intercept=True, weights=None, regularization_coefficient=None)</dt><dd><tt>Return&nbsp;the&nbsp;parameters&nbsp;of&nbsp;an&nbsp;nth-order&nbsp;polynomial&nbsp;in&nbsp;a&nbsp;tuple.<br>
The&nbsp;algorithm&nbsp;uses&nbsp;least&nbsp;squares&nbsp;regression&nbsp;to&nbsp;minimize&nbsp;the&nbsp;term&nbsp;||y-H*theta||^2,&nbsp;where&nbsp;y&nbsp;is&nbsp;the&nbsp;vector&nbsp;of&nbsp;dependent&nbsp;values,&nbsp;H&nbsp;is&nbsp;the&nbsp;observation&nbsp;matrix,&nbsp;and&nbsp;theta&nbsp;is&nbsp;the&nbsp;vector&nbsp;of&nbsp;parameters.<br>
The&nbsp;parameters&nbsp;are&nbsp;the&nbsp;coefficients&nbsp;of&nbsp;the&nbsp;polynomial&nbsp;function.<br>
Optional&nbsp;arguments&nbsp;allow&nbsp;including&nbsp;intercept&nbsp;in&nbsp;the&nbsp;parameters,&nbsp;weighting&nbsp;certain&nbsp;data&nbsp;points&nbsp;over&nbsp;others,&nbsp;and&nbsp;L2&nbsp;(Tikhonov)&nbsp;regularization.<br>
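The criterion ||y - H*theta||^2 minimized by `fit_univariate_polynomial` can be illustrated for the simplest case, a first-degree polynomial with intercept, where H has rows [1, x] and the 2x2 normal equations H^T H theta = H^T y can be solved by hand. This sketch is a stand-in for the idea only, not mola's implementation:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope*x via the
    normal equations [[n, sx], [sx, sxx]] @ [intercept, slope] = [sy, sxy]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # nonzero whenever the xs are not all equal
    intercept = (sy * sxx - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope
```

Fitting data drawn from y = 2x + 1 recovers the intercept 1 and slope 2 exactly.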
15 changes: 13 additions & 2 deletions documentation/mola.utils.html
@@ -25,7 +25,14 @@
<font color="#ffffff" face="helvetica, arial"><big><strong>Functions</strong></big></font></td></tr>

<tr><td bgcolor="#eeaa77"><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt></td><td>&nbsp;</td>
<td width="100%"><dl><dt><a name="-equals_approx"><strong>equals_approx</strong></a>(left, right, precision=1e-12) -&gt; bool</dt><dd><tt>Return&nbsp;true&nbsp;if&nbsp;the&nbsp;compared&nbsp;objects&nbsp;are&nbsp;roughly&nbsp;equal&nbsp;elementwise.&nbsp;Otherwise,&nbsp;return&nbsp;false.<br>
<td width="100%"><dl><dt><a name="-column"><strong>column</strong></a>(data: list) -&gt; mola.matrix.Matrix</dt><dd><tt>Return&nbsp;a&nbsp;column&nbsp;vector&nbsp;Matrix&nbsp;object&nbsp;constructed&nbsp;from&nbsp;a&nbsp;one-dimensional&nbsp;list.<br>
This&nbsp;is&nbsp;the&nbsp;same&nbsp;as&nbsp;calling&nbsp;Matrix(data).get_transpose()&nbsp;with&nbsp;a&nbsp;check&nbsp;to&nbsp;make&nbsp;sure&nbsp;the&nbsp;list&nbsp;is&nbsp;one-dimensional.<br>
&nbsp;<br>
Arguments:<br>
data&nbsp;--&nbsp;list:&nbsp;the&nbsp;1D&nbsp;list&nbsp;to&nbsp;be&nbsp;used&nbsp;as&nbsp;the&nbsp;data&nbsp;of&nbsp;the&nbsp;matrix<br>
&nbsp;<br>
Raises&nbsp;an&nbsp;exception&nbsp;if&nbsp;the&nbsp;list&nbsp;is&nbsp;multidimensional.</tt></dd></dl>
<dl><dt><a name="-equals_approx"><strong>equals_approx</strong></a>(left, right, precision=1e-12) -&gt; bool</dt><dd><tt>Return&nbsp;true&nbsp;if&nbsp;the&nbsp;compared&nbsp;objects&nbsp;are&nbsp;roughly&nbsp;equal&nbsp;elementwise.&nbsp;Otherwise,&nbsp;return&nbsp;false.<br>
&nbsp;<br>
Arguments:<br>
left&nbsp;--&nbsp;Matrix,&nbsp;list,&nbsp;tuple,&nbsp;or&nbsp;a&nbsp;single&nbsp;value:&nbsp;the&nbsp;object&nbsp;on&nbsp;the&nbsp;left&nbsp;side&nbsp;of&nbsp;the&nbsp;comparison<br>
@@ -45,7 +45,11 @@
cols&nbsp;--&nbsp;unsigned&nbsp;integer:&nbsp;width&nbsp;of&nbsp;the&nbsp;matrix&nbsp;(default&nbsp;None)<br>
&nbsp;<br>
If&nbsp;'cols'&nbsp;is&nbsp;not&nbsp;specified,&nbsp;the&nbsp;matrix&nbsp;is&nbsp;assumed&nbsp;to&nbsp;have&nbsp;the&nbsp;same&nbsp;number&nbsp;of&nbsp;columns&nbsp;as&nbsp;the&nbsp;number&nbsp;of&nbsp;rows.</tt></dd></dl>
<dl><dt><a name="-norm"><strong>norm</strong></a>(data)</dt></dl>
<dl><dt><a name="-norm"><strong>norm</strong></a>(data: mola.matrix.Matrix) -&gt; float</dt><dd><tt>Return&nbsp;the&nbsp;Euclidean&nbsp;norm&nbsp;of&nbsp;a&nbsp;matrix.<br>
You&nbsp;could&nbsp;also&nbsp;just&nbsp;call&nbsp;data.norm_Euclidean()&nbsp;directly,&nbsp;but&nbsp;this&nbsp;is&nbsp;a&nbsp;wrapper&nbsp;function&nbsp;for&nbsp;convenience.<br>
&nbsp;<br>
Arguments:<br>
data&nbsp;--&nbsp;Matrix:&nbsp;the&nbsp;matrix&nbsp;whose&nbsp;Euclidean&nbsp;norm&nbsp;is&nbsp;to&nbsp;be&nbsp;returned</tt></dd></dl>
<dl><dt><a name="-ones"><strong>ones</strong></a>(height: int, width: int) -&gt; mola.matrix.Matrix</dt><dd><tt>Return&nbsp;a&nbsp;matrix&nbsp;where&nbsp;all&nbsp;elements&nbsp;are&nbsp;1.<br>
&nbsp;<br>
Arguments:<br>
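The new `column()` utility documented above turns a one-dimensional list into a column vector and rejects nested input. A sketch of that contract, with a list of single-element rows standing in for the `Matrix` return value:

```python
def column(data):
    """Build a column vector (one value per row) from a 1D list.
    Raises ValueError if the list is multidimensional."""
    if any(isinstance(x, (list, tuple)) for x in data):
        raise ValueError("column() expects a one-dimensional list")
    return [[x] for x in data]
```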
33 changes: 20 additions & 13 deletions mola/clustering.py
@@ -119,7 +119,7 @@ def find_k_means(data: Matrix, num_centers = 2, max_iterations = 100, distance_f
if iteration == max_iterations-1:
print("WARNING: k-means centers did not converge in " , str(max_iterations), " iterations. Consider increasing the maximum number of iterations or using fuzzy k-means.")

return centers
return centers, closest_center



@@ -219,8 +219,8 @@ def find_density_clusters(data: Matrix, num_centers = 2, beta = 0.5, sigma = 0.5
Arguments:
data -- Matrix: the data containing the points to be clustered
num_centers -- int: the number of cluster centers to be found (default 2)
beta -- float: the width of the Gaussian function (default 0.5)
sigma -- float: the width of the Gaussian function (default 0.5)
beta -- float: the width of the Gaussian function (default 0.5) used to destruct the mountain function
sigma -- float: the width of the Gaussian function (default 0.5) used to construct the mountain function
"""

# get the number of data points (samples) and the dimension of each data point
@@ -232,26 +232,30 @@
mountain_func = [0 for x in range(n_samples)]

# construct mountain function value for each data sample
# iterate through centers
# calculate the sum of Gaussian functions centered at each data point
for i in range(n_samples):
# iterate through data points
for k in range(n_samples):
mountain_func[i] = mountain_func[i] + math.exp( - ( pow(distance_euclidean(data[i,:],data[k,:]),2) ) / (2*sigma**sigma) )
mountain_func[i] += math.exp( - ( pow(distance_euclidean(data[i,:],data[k,:]),2) ) / (2*sigma*sigma) )

# select cluster centers and destruct mountain functions
mountain_func_new = deepcopy(mountain_func)
mountain_func_prev = deepcopy(mountain_func)
mountain_func_current = deepcopy(mountain_func)

c_subtractive = zeros(num_centers,dim)

# iterate through the number of labels (assumption is that there are 2 clusters)
for k in range(num_centers):

#mountain_func_current = deepcopy(mountain_func_prev)


# select cluster centers
peak = 0;
peak_i = 0;
for i in range(n_samples):
if mountain_func_new[i] > peak:
print('For cluster ' + str(k) + ' found peak ' + str(mountain_func_new[i]) + ' at ' + str(data[i,0]) + ',' + str(data[i,1]))
peak = mountain_func_new[i]
if mountain_func_current[i] > peak:
#print('For cluster ' + str(k) + ' found peak ' + str(mountain_func_current[i]) + ' at ' + str(data[i,0]) + ',' + str(data[i,1]))
peak = mountain_func_current[i]
peak_i = i;


@@ -260,9 +260,12 @@ def find_density_clusters(data: Matrix, num_centers = 2, beta = 0.5, sigma = 0.5
# save cluster centers
c_subtractive[k,:] = data[peak_i,:]

# destruct mountain functions
print('For cluster ' + str(k) + ' found peak ' + str(mountain_func_current[peak_i]) + ' at ' + str(data[peak_i,0]) + ',' + str(data[peak_i,1]))

# destruct mountain functions at the current cluster center (peak of highest mountain function)
for i in range(n_samples):
mountain_func_new[i] = mountain_func_new[i] - mountain_func_new[peak_i] * math.exp( - ( pow(distance_euclidean(data[i,:],c_subtractive[k,:]),2)) / (2*beta**beta) )
mountain_func_current[i] -= math.exp( - ( pow(distance_euclidean(data[i,:],c_subtractive[k,:]),2)) / (2*beta*beta) )
#mountain_func_current[i] -= mountain_func_current[k]*math.exp( - ( pow(distance_euclidean(data[i,:],c_subtractive[k,:]),2)) / (2*beta*beta) )

# assign all data points to a cluster depending on the distance
labeled_subtractive = [0 for x in range(n_samples)]
Expand All @@ -275,5 +282,5 @@ def find_density_clusters(data: Matrix, num_centers = 2, beta = 0.5, sigma = 0.5
cluster = k;
labeled_subtractive[i] = cluster

return labeled_subtractive
return c_subtractive, labeled_subtractive
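The corrected equations in this diff use Gaussian denominators 2*sigma*sigma for constructing the mountain function and 2*beta*beta for destructing it, and the function now returns the cluster centers alongside the labels. A self-contained sketch of that subtractive-clustering scheme (plain tuples instead of `Matrix`; this variant scales the destructing Gaussian by the peak's height, the classical form that the diff shows only as a commented-out alternative, so treat it as an illustration rather than mola's exact code):

```python
import math

def find_density_clusters(points, num_centers=2, beta=0.5, sigma=0.5):
    """Illustrative subtractive clustering returning (centers, labels)."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    n = len(points)
    # construct: each sample's mountain value is a sum of Gaussians
    # of width sigma centered at every data point
    mountain = [
        sum(math.exp(-dist2(p, q) / (2 * sigma * sigma)) for q in points)
        for p in points
    ]
    centers = []
    for _ in range(num_centers):
        peak_i = max(range(n), key=lambda i: mountain[i])
        peak_val = mountain[peak_i]
        centers.append(points[peak_i])
        # destruct: subtract a Gaussian of width beta centered at the peak,
        # scaled by the peak's height, so the next peak lies elsewhere
        for i in range(n):
            mountain[i] -= peak_val * math.exp(
                -dist2(points[i], points[peak_i]) / (2 * beta * beta)
            )
    # assign each point to its nearest center
    labels = [
        min(range(num_centers), key=lambda k: dist2(p, centers[k]))
        for p in points
    ]
    return centers, labels
```

On two tight, well-separated clumps the two densest points are picked as centers and the labels split the clumps cleanly.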

2 changes: 1 addition & 1 deletion mola/matrix.py
@@ -962,7 +962,7 @@ def __type_two_row_operation(self,operable_row,scalar):
# for j in cols_list:
# self.data[i][j] = matrix[i-rows_first][j-cols_first]

def norm_Euclidean(self):
def norm_Euclidean(self) -> float:
"""Return the Euclidean norm of the matrix."""
norm = 0
for i in range(self.n_rows):
