Another distance metric #8
Comments
Hello neonntt,
Regarding the second question, do you mean you want to apply a weighted Euclidean distance? Cheers,
Eren, thank you so much for your reply. I will try changing the metric as you suggested. There must be some issue with my data. Regarding the second question, I would like to try a weighted distance with both the Euclidean and the Mahalanobis metric. Let's say we have another parameter in the data, for example speed, and we would like the speed value to be given a higher weight than the others when calculating the distance. Can you please guide me on how this can be implemented?
Sure, you can adapt the code. Something like the following should work:

def fit(self, X):
    """
    Apply the ST DBSCAN algorithm
    ----------
    X : 2D numpy array with
        The first element of the array should be the time
        attribute as float. The following positions in the array are
        treated as spatial coordinates. The structure should look like this [[time_step1, x, y], [time_step2, x, y]..]
        For example 2D dataset:
        array([[0,0.45,0.43],
        [0,0.54,0.34],...])
    Returns
    -------
    self
    """
    # check if input is correct
    X = check_array(X)
    if not self.eps1 > 0.0 or not self.eps2 > 0.0 or not self.min_samples > 0.0:
        raise ValueError('eps1, eps2, minPts must be positive')
    n, m = X.shape
    # Compute squared form Euclidean Distance Matrix for 'time' attribute and the spatial attributes
    time_dist = pdist(X[:, 0].reshape(n, 1), metric=self.metric)
    # --------
    # --------
    # Line changed here:
    # np.array of weights, one entry per column of X[:, 1:]
    weights = np.array([0.5, 1, 0.2, 0.3])  # weights for the features
    euc_dist = pdist(X[:, 1:], 'wminkowski', p=2, w=weights)
    # afterwards the same code snippets
    # --------
    # --------
    # filter the euc_dist matrix using the time_dist
    dist = np.where(time_dist <= self.eps2, euc_dist, 2 * self.eps1)
    db = DBSCAN(eps=self.eps1,
                min_samples=self.min_samples,
                metric='precomputed')
    db.fit(squareform(dist))
    self.labels = db.labels_
    return self

Cheers,
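A note for readers on a recent SciPy: as far as I know, the 'wminkowski' metric was deprecated and later removed (around SciPy 1.8), and the suggested replacement is 'minkowski' with the w argument. The two apply the weights differently, so the sketch below assumes the old-style weights have to be raised to the power p to give the same distances. The data and the name X_spatial are made up for illustration; the weight values are the ones from the snippet above.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Hypothetical spatial block: 6 points, 4 features (one weight per feature).
X_spatial = rng.random((6, 4))

# Weights in the old 'wminkowski' convention, where they scale the differences:
# d(u, v) = sqrt(sum((w_i * (u_i - v_i)) ** 2))
old_weights = np.array([0.5, 1.0, 0.2, 0.3])

# 'minkowski' with w= puts the weight outside the power:
# d(u, v) = sqrt(sum(w_i * (u_i - v_i) ** 2)), so the old weights are squared.
euc_dist = pdist(X_spatial, 'minkowski', p=2, w=old_weights ** 2)

# Equivalent formulation: rescale the columns, then use plain Euclidean distance.
scaled_dist = pdist(X_spatial * old_weights, 'euclidean')
```

Either euc_dist or scaled_dist could then take the place of euc_dist in the fit method above. Giving a feature such as speed a larger weight than the others makes differences in that feature count more in the distance.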
Thanks a ton, Eren. I will try it and reach out to you in case I need more help.
Easy, just reopen the issue in that case. Cheers,
Hi! Thanks so much for this implementation.
I wanted some guidance on how to use a distance metric other than the default Euclidean.
I have data with multiple features and want to use another distance metric, such as Mahalanobis.
Would the implementation be as follows?
st_dbscan = ST_DBSCAN(eps1 = 0.4, eps2 = 5, min_samples = 5, metric = 'mahalanobis')
I did try the above, but got a "Singular matrix" error. However, when I checked the correlation, it seemed to be OK.
Also, if I want to use a different weight for each feature when calculating the distance, how should I go about it?
Would be grateful if you could please help out.
Thanks
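On the "Singular matrix" error: with metric='mahalanobis', pdist has to invert the covariance matrix of the columns it is given, and that fails when the covariance is rank-deficient, for example when one feature is constant or is an exact linear combination of others. Pairwise correlations that look fine do not rule this out. The sketch below is only one possible workaround, not part of ST_DBSCAN: it checks the rank of the covariance and passes an explicit pseudo-inverse through pdist's VI argument. The data and the name X_feat are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Hypothetical feature block: two varying features plus one constant feature,
# which makes the covariance matrix singular.
X_feat = np.column_stack([rng.random((20, 2)), np.ones(20)])

cov = np.cov(X_feat.T)
# A rank below the number of features means the covariance cannot be inverted.
rank = np.linalg.matrix_rank(cov)

# Pass an explicit "inverse" covariance; the pseudo-inverse is defined even
# when the covariance matrix is singular.
VI = np.linalg.pinv(cov)
maha_dist = pdist(X_feat, 'mahalanobis', VI=VI)
```

Note that the fit method quoted in this thread also applies self.metric to the time column on its own, so using this inside ST_DBSCAN would mean changing the pdist calls to accept a VI, which is an assumption about how one might adapt the class rather than something it supports out of the box.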