Another distance metric #8

Closed
neonntt opened this issue May 7, 2022 · 5 comments

neonntt commented May 7, 2022

Hi! Thanks so much for this implementation.
I'd like some guidance on using a distance metric other than the default Euclidean.
My data has multiple features, and I want to use another metric such as Mahalanobis.
Would the call look like this?
st_dbscan = ST_DBSCAN(eps1 = 0.4, eps2 = 5, min_samples = 5, metric = 'mahalanobis')

I tried the above but got a "Singular matrix" error, even though the correlations between the features look fine when I check them.

Also, if I want to weight each feature differently when computing the distance, how should I go about it?
I'd be grateful for any help.

Thanks

Owner

eren-ck commented May 10, 2022

Hello neonntt,
I can't reproduce the issue. Changing the metric in the provided demo notebook works for me.
For example, it runs without errors if you change the fourth cell in the demo notebook to the following code snippet:

st_dbscan = ST_DBSCAN(eps1 = 0.05, eps2 = 10, min_samples = 5, metric = 'mahalanobis')
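
One possible cause of a "Singular matrix" error with the Mahalanobis metric is a covariance matrix that cannot be inverted, for example when one feature is constant or is an exact linear combination of several others (pairwise correlations can look fine in that case). The sketch below is one way to check for this, assuming the distances are computed with SciPy's `pdist`. The helper `covariance_is_singular` and the toy arrays are hypothetical and not part of this repository, and the pseudo-inverse at the end is only one possible workaround.

```python
import numpy as np
from scipy.spatial.distance import pdist


def covariance_is_singular(features):
    """Return True if the sample covariance of `features`
    (shape: n_samples x n_features) is rank deficient."""
    cov = np.atleast_2d(np.cov(features, rowvar=False))
    return np.linalg.matrix_rank(cov) < cov.shape[0]


rng = np.random.default_rng(0)
good = rng.normal(size=(50, 3))
# Third column is an exact linear combination of the first two,
# so the covariance matrix has rank 2 instead of 3.
bad = np.column_stack([good[:, 0], good[:, 1], 2 * good[:, 0] - good[:, 1]])

print(covariance_is_singular(good))
print(covariance_is_singular(bad))

# Possible workaround (an assumption, not what pdist does by default):
# pass the pseudo-inverse of the covariance as VI.
VI = np.linalg.pinv(np.cov(bad, rowvar=False))
d = pdist(bad, metric='mahalanobis', VI=VI)
```

Dropping or combining the redundant feature is usually the simpler fix. Passing `VI` only helps where `pdist` is called directly.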

Regarding the second question: do you mean you want to apply a weighted Euclidean distance?
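
To make the term concrete, here is a small sketch of what a weighted Euclidean distance computes, using SciPy's `pdist`. The data, the column meaning (x, y, speed) and the weights are made up for illustration. It also shows that rescaling columns does not change the Mahalanobis distance when the covariance is re-estimated from the rescaled data, so weighting Mahalanobis needs a different approach than simply scaling features.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(42)
X = rng.normal(size=(30, 3))      # hypothetical features, e.g. x, y, speed
w = np.array([1.0, 1.0, 4.0])     # hypothetical weights; speed counts most

# Weighted Euclidean: sqrt(sum_i w_i * (u_i - v_i)**2)
d_weighted = pdist(X, metric='minkowski', p=2, w=w)

# The same distances come from plain Euclidean on columns scaled by sqrt(w_i).
d_scaled = pdist(X * np.sqrt(w), metric='euclidean')
print(np.allclose(d_weighted, d_scaled))

# Mahalanobis estimates the covariance from the data it is given,
# so the column scaling cancels out and the distances stay the same.
d_maha = pdist(X, metric='mahalanobis')
d_maha_scaled = pdist(X * np.sqrt(w), metric='mahalanobis')
print(np.allclose(d_maha, d_maha_scaled))
```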

Cheers,
Eren

Author

neonntt commented May 11, 2022

Eren, thank you so much for your reply. I will try changing the metric as you suggested. There must be some issue with my data.

Regarding the second question, I would like to try a weighted distance with both the Euclidean and the Mahalanobis metric. Let's say the data has another feature, for example speed, and we would like speed to be given a higher weight than the other features when calculating the distance. Could you please explain how this can be implemented?
Thanks again,

Owner

eren-ck commented May 12, 2022

Sure, you can adapt the code. Something like the following change to the fit method should work:

    def fit(self, X):
        """
        Apply the ST-DBSCAN algorithm.

        Parameters
        ----------
        X : 2D numpy array
            The first column should be the time attribute as float.
            The remaining columns are treated as spatial coordinates.
            The structure should look like this: [[time_step1, x, y], [time_step2, x, y], ...]
            For example 2D dataset:
            array([[0,0.45,0.43],
            [0,0.54,0.34],...])
        Returns
        -------
        self
        """
        # check if input is correct
        X = check_array(X)

        if not self.eps1 > 0.0 or not self.eps2 > 0.0 or not self.min_samples > 0.0:
            raise ValueError('eps1, eps2, minPts must be positive')

        n, m = X.shape

        # Compute the condensed pairwise distance vector for the 'time' attribute
        time_dist = pdist(X[:, 0].reshape(n, 1), metric=self.metric)

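        # --------
        # Lines changed here:
        # one weight per feature column in X[:, 1:]
        weights = np.array([0.5, 1, 0.2, 0.3])
        # 'wminkowski' was removed in SciPy 1.8; 'minkowski' with w replaces it.
        # 'minkowski' weights |u - v|**p instead of (u - v), so the weights are
        # squared here to keep the old p=2 behavior.
        euc_dist = pdist(X[:, 1:], 'minkowski', p=2, w=weights ** 2)
        # the rest of the method is unchanged
        # --------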

        # filter the euc_dist matrix using the time_dist
        dist = np.where(time_dist <= self.eps2, euc_dist, 2 * self.eps1)

        db = DBSCAN(eps=self.eps1,
                    min_samples=self.min_samples,
                    metric='precomputed')
        db.fit(squareform(dist))

        self.labels = db.labels_

        return self
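
For trying this outside the class, the same steps can be sketched as a standalone function. This is a minimal sketch under a few assumptions: `st_dbscan_weighted` is a hypothetical helper that is not part of this repository, the time distance is fixed to Euclidean, the weights follow the `minkowski` convention of current SciPy (applied to the squared differences for `p=2`), and the toy data uses the columns time, x, y, speed.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import DBSCAN


def st_dbscan_weighted(X, eps1, eps2, min_samples, weights):
    """Hypothetical helper: ST-DBSCAN-style clustering with per-feature weights.

    X has the time attribute in column 0 and the features in the other columns.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (X.shape[1] - 1,):
        raise ValueError('need exactly one weight per non-time column')

    # Condensed pairwise distances for time and for the weighted features.
    time_dist = pdist(X[:, :1], metric='euclidean')
    feat_dist = pdist(X[:, 1:], metric='minkowski', p=2, w=weights)

    # Pairs that are too far apart in time get a distance above eps1,
    # so DBSCAN never treats them as neighbors.
    dist = np.where(time_dist <= eps2, feat_dist, 2 * eps1)

    db = DBSCAN(eps=eps1, min_samples=min_samples, metric='precomputed')
    return db.fit(squareform(dist)).labels_


# Toy data: two groups at the same place and times, differing only in speed.
X = np.array([
    [0, 0.00, 0.00, 1.0],
    [1, 0.01, 0.00, 1.0],
    [2, 0.00, 0.01, 1.0],
    [0, 0.00, 0.00, 5.0],
    [1, 0.01, 0.00, 5.0],
    [2, 0.00, 0.01, 5.0],
])
labels = st_dbscan_weighted(X, eps1=0.5, eps2=3, min_samples=2, weights=[1, 1, 4])
print(labels)
```

With these settings the speed difference alone separates the two groups, since x, y and time overlap.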

Cheers,
Eren

Author

neonntt commented May 12, 2022

Thanks a ton, Eren. I'll try it and reach out if I need more help.
Regards

Owner

eren-ck commented May 13, 2022

No problem, just reopen the issue in that case.

Cheers,
Eren

eren-ck closed this as completed May 13, 2022