**Project 🚧**

One of the main pain points that Uber's team found is that sometimes drivers are not around when users need them. For example, a user might be in San Francisco's Financial District while Uber drivers are looking for customers in the Castro.

(If you are not familiar with the bay area, check out <a href="https://www.google.com/maps/place/San+Francisco,+CA,+USA/@37.7515389,-122.4567213,13.43z/data=!4m5!3m4!1s0x80859a6d00690021:0x4a501367f076adff!8m2!3d37.7749295!4d-122.4194155" target="_blank">Google Maps</a>)

Even though the two neighborhoods are not that far apart, users would still have to wait 10 to 15 minutes before being picked up, which is too long. Uber's research shows that users are willing to wait 5 to 7 minutes; beyond that, they cancel their ride.

Therefore, Uber's data team would like to work on a project where **their app would recommend hot zones in major cities for drivers to be in at any given time of day.**

**Goals 🎯**

Uber already has data about pickups in major cities. Your objective is to create algorithms that determine where the hot zones are that drivers should be in. Therefore you will:

* Create an algorithm to find hot zones 
* Visualize results on a nice dashboard

# Basic information

Let's import our libraries:

In [None]:
import pandas as pd
import numpy as np
import os

import plotly.express as px
import plotly.graph_objects as go
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
import missingno as msno

from sklearn.cluster import KMeans, MiniBatchKMeans, DBSCAN
from sklearn.metrics import silhouette_score

import warnings
warnings.filterwarnings("ignore")

Our data is split across several files. Let's bring it together:

In [None]:
def get_name(filename):
    """Prepares batch extraction of our fragmented csv files"""
    name_without_ext = filename.replace('.csv', '')
    name = name_without_ext.split('-')[-1]
    return name
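
As a quick check of what `get_name` returns, the sketch below runs it on a few file names. The names `uber-raw-data-apr14.csv` and `taxi-zone-lookup.csv` are assumptions about the folder's contents, chosen because they would yield the `apr14` and `lookup` keys used later in this notebook; the third name is the 2015 file loaded further down.

```python
def get_name(filename):
    """Return the last dash-separated token of a CSV file name, without extension."""
    name_without_ext = filename.replace('.csv', '')
    return name_without_ext.split('-')[-1]

# Hypothetical file names, assumed to follow the dataset's naming pattern
examples = ['uber-raw-data-apr14.csv', 'taxi-zone-lookup.csv', 'uber-raw-data-janjune-15.csv']
keys = {name: get_name(name) for name in examples}
print(keys)
```

Note that the 2015 file would get the short key `15`, which is one reason to load it separately by its full path.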

In [None]:
uber_dict = {}

for file in os.listdir('../uber-trip-data'):
    if file.endswith('.csv'):
        key = get_name(file)
        uber_dict[key] = pd.read_csv(f'../uber-trip-data/{file}')
        print(f"Added: {key}")
    else:
        print(f'{file} is not a csv!')

print(f"\nLoaded CSV files: {len(uber_dict)}")

In [None]:
df_avsep14 = pd.concat([uber_dict['apr14'], uber_dict['may14'], uber_dict['jun14'],
                        uber_dict['jul14'], uber_dict['aug14'], uber_dict['sep14']], ignore_index=True)

In [None]:
df_avsep14.info()

In [None]:
df_avsep14.head()

Our data for 2014 has been gathered into a single dataframe with more than 4.5 million entries. Now for 2015:

In [None]:
df_jajun15 = pd.read_csv(f'../uber-trip-data/raw-data-15/uber-raw-data-janjune-15.csv')

In [None]:
df_jajun15.info()

In [None]:
df_jajun15.head()

# Cleaning

### 2014

First we'll work on our 2014 dataset by making it more convenient to manipulate, and giving its time value an easier format to work on:

In [None]:
df_2014 = df_avsep14.copy(deep=True)
df_2014.rename(columns={key:str.lower(key) for key in df_2014.columns}, inplace=True)
df_2014.sort_values(by='date/time', inplace=True)

df_2014['date'] = df_2014['date/time'].str.split(" ").str[0]
df_2014['time'] = df_2014['date/time'].str.split(" ").str[1]
df_2014 = df_2014.drop('date/time', axis=1)

In [None]:
df_2014['date'] = pd.to_datetime(df_2014['date'])
df_2014['time'] = pd.to_datetime(df_2014['time']).dt.time
df_2014['year'] = df_2014['date'].dt.year
df_2014['month'] = df_2014['date'].dt.month
df_2014['day'] = df_2014['date'].dt.day
df_2014['dayofweek'] = df_2014['date'].dt.day_of_week

df_2014.head()
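
The string split above can also be sketched as a single parse followed by `.dt` accessors. The two sample timestamps and the `%m/%d/%Y %H:%M:%S` format are assumptions about how the 2014 files encode `date/time`; adjust the format if the real data differs.

```python
import pandas as pd

# Toy rows; the timestamp format is an assumption about the 2014 files
toy = pd.DataFrame({'date/time': ['4/1/2014 0:11:00', '9/30/2014 22:58:00']})

# Parse once, then derive every calendar field from the parsed column
parsed = pd.to_datetime(toy['date/time'], format='%m/%d/%Y %H:%M:%S')
toy['date'] = parsed.dt.normalize()
toy['time'] = parsed.dt.time
toy['year'] = parsed.dt.year
toy['month'] = parsed.dt.month
toy['day'] = parsed.dt.day
toy['dayofweek'] = parsed.dt.day_of_week  # Monday=0 ... Sunday=6

print(toy.drop(columns='date/time'))
```

Passing an explicit `format` avoids ambiguity between day-first and month-first dates.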

### 2015

Next comes the 2015 dataset: we first make sense of its content, then clean it up:

In [None]:
df_lookup = uber_dict['lookup']
df_lookup.rename(columns={key:str.lower(key) for key in df_lookup.columns}, inplace=True)
df_jajun15.rename(columns={key:str.lower(key) for key in df_jajun15.columns}, inplace=True)
df_2015 = df_jajun15.merge(uber_dict['lookup'], on='locationid')
df_2015.head()
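
To make the join concrete, here is a minimal sketch of the same `merge` on toy data. The rows are illustrative values rather than an extract of the real lookup table; only the column names `locationid`, `borough` and `zone` come from the notebook.

```python
import pandas as pd

# Toy trips and lookup table; values are illustrative only
trips = pd.DataFrame({'pickup_date': ['2015-01-01 00:05:00',
                                      '2015-01-01 00:09:00',
                                      '2015-01-01 00:12:00'],
                      'locationid': [2, 1, 2]})
lookup = pd.DataFrame({'locationid': [1, 2],
                       'borough': ['EWR', 'Queens'],
                       'zone': ['Newark Airport', 'Jamaica Bay']})

# Default inner join: each trip gains the borough and zone of its locationid
merged = trips.merge(lookup, on='locationid')
print(merged)
```

An inner join silently drops trips whose `locationid` has no match in the lookup table; `how='left'` would keep them with empty `borough` and `zone`.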

Some pickup areas carry the placeholder value "Unknown", which we'll replace with an empty string instead:

In [None]:
df_2015[['borough', 'zone']] = df_2015[['borough', 'zone']].replace('Unknown', '')
print(df_2015['borough'].unique())
print(sorted(df_2015['zone'].unique()))

##### GPS affiliation

One thing sorely missing from our 2015 data is GPS coordinates, which we'll need for our work:

In [None]:
unique_places = df_2015[["borough", "zone"]].drop_duplicates()
unique_places.info()

In [None]:
geolocator = Nominatim(timeout=10, user_agent="uber_app")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

In [None]:
def get_location(row):
    """Geocode a borough/zone pair; fall back to each half of an 'A/B' zone name."""
    try:
        zone = row['zone']
        # Full zone name first, then the parts on either side of a '/'
        candidates = [zone] + zone.split('/')[:2]
        for candidate in candidates:
            location = geocode(f"USA, New York, {row['borough']}, {candidate}")
            if location:
                return pd.Series([location.latitude, location.longitude])
    except Exception:
        # Any failure (bad input, geocoder error) is treated as "not found"
        pass
    return pd.Series([None, None])
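
The fallback logic above boils down to trying a short list of query strings in order. As a sketch that avoids any network call, the hypothetical helper `candidate_queries` below only builds that list; it is not part of geopy, and the borough and zone values are made-up inputs. Unlike the function above, it skips the duplicate query when the zone name contains no `/`.

```python
def candidate_queries(borough, zone):
    """Build the geocoding queries to try in order: full zone, then each side of '/'."""
    # Hypothetical helper for illustration; not part of geopy
    zones = [zone]
    if '/' in zone:
        zones += zone.split('/')[:2]
    return [f"USA, New York, {borough}, {z}" for z in zones]

# Made-up inputs, one with a compound zone name and one without
compound = candidate_queries('Some Borough', 'Zone A/Zone B')
simple = candidate_queries('Some Borough', 'Zone C')
print(compound)
print(simple)
```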

In [None]:
unique_places[["lat", "lon"]] = unique_places.apply(get_location, axis=1)

In [None]:
df_2015 = df_2015.merge(unique_places, on=["borough", "zone"], how="left")
df_2015.head()

In [None]:
df_2015.info()

##### Missing values

More than half of the values were missing before we added the fallback queries on each part of a zone name to our geopy request. Here is a quick view of what we have in the end:

In [None]:
msno.matrix(df_2015)

In [None]:
df_2015 = df_2015.dropna()
df_2015.info()
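
To quantify how much a `dropna()` discards, one can compare row counts before and after. This is a minimal sketch on a toy frame with made-up values; it restricts the check to `lat` and `lon` through `subset`, whereas the cell above drops rows with a missing value in any column.

```python
import numpy as np
import pandas as pd

# Toy frame: two of five rows lack coordinates (illustrative values)
toy = pd.DataFrame({'zone': ['a', 'b', 'c', 'd', 'e'],
                    'lat': [40.7, np.nan, 40.6, 40.8, np.nan],
                    'lon': [-74.0, np.nan, -73.9, -73.9, np.nan]})

rows_before = len(toy)
cleaned = toy.dropna(subset=['lat', 'lon'])
lost_share = 1 - len(cleaned) / rows_before

print(f"Rows lost: {rows_before - len(cleaned)} ({lost_share:.0%})")
```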

In the end, we have lost less than 10% of our data. We can thus keep the information that will match our 2014 dataset after the last transformations:

In [None]:
df_2015 = df_2015.drop(columns=['borough', 'zone', 'locationid', 'affiliated_base_num'], axis = 1).reset_index()
df_2015 = df_2015.rename(columns={'pickup_date': 'date/time', 'dispatching_base_num' : 'base' })
df_2015.head()

In [None]:
df_2015.sort_values(by='date/time', inplace=True)
df_2015 = df_2015.drop(columns='index')

df_2015['date'] = df_2015['date/time'].str.split(" ").str[0]
df_2015['time'] = df_2015['date/time'].str.split(" ").str[1]
df_2015 = df_2015.drop('date/time', axis=1)

df_2015['date'] = pd.to_datetime(df_2015['date'])
df_2015['time'] = pd.to_datetime(df_2015['time']).dt.time
df_2015['year'] = df_2015['date'].dt.year
df_2015['month'] = df_2015['date'].dt.month
df_2015['day'] = df_2015['date'].dt.day
df_2015['dayofweek'] = df_2015['date'].dt.day_of_week

df_2015.head()

### Merging 2014+2015

In [None]:
df_data = pd.concat([df_2014, df_2015], ignore_index=True)
df_data.info()

In [None]:
# Checkpoint creation. Careful with file size!
#df_data.to_csv('../uber-trip-data/merge/uber_data.csv')

# Clustering

With our data now unified, we can start working on our clustering model:

In [None]:
df_data = pd.read_csv('../uber-trip-data/merge/uber_data.csv')
df_data = df_data.drop('Unnamed: 0', axis=1)
df_data.head()

### Dissociating days

Since we ultimately want to define hot zones per day, we will split our data into dictionary entries, one for each day of the week:

In [None]:
day_of_week = {
    0 : 'Monday',
    1 : 'Tuesday',
    2 : 'Wednesday',
    3 : 'Thursday',
    4 : 'Friday',
    5 : 'Saturday',
    6 : 'Sunday'
}

In [None]:
day_dict = {}
day_dict_mini = {}

for i in range(0,7):
    day_dict[i] = df_data[df_data['dayofweek'] == i]
    day_dict_mini[i] = day_dict[i].sample(10_000, random_state=42)
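
An equivalent way to build the per-day dictionary is `groupby`, sketched here on toy rows with illustrative coordinates. Unlike the loop above, it only creates entries for weekdays that actually appear in the data.

```python
import pandas as pd

# Toy data with the same 'dayofweek' convention (Monday=0 ... Sunday=6)
toy = pd.DataFrame({'lat': [40.70, 40.71, 40.72, 40.73],
                    'lon': [-74.00, -74.01, -73.99, -73.98],
                    'dayofweek': [0, 0, 5, 6]})

# One sub-frame per weekday present in the data, keyed by a plain int
by_day = {int(day): group for day, group in toy.groupby('dayofweek')}

print({day: len(group) for day, group in by_day.items()})
```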

In [None]:
day_dict[0].head()

### MiniBatchKMeans

This is why we drew a 10,000-row sample for each day: to compare results with the full dataset.

##### Estimate daily clusters

In [None]:
def get_clusters_mkm(dayofweek):
    """Plot WCSS (elbow) and silhouette score for k = 2..20 on the day's sample."""
    X = day_dict_mini[dayofweek][['lat', 'lon']]

    sil = []
    k = []
    wcss = []

    for i in range(2, 21):
        minikmeans = MiniBatchKMeans(n_clusters=i)
        minikmeans.fit(X)
        # elbow: within-cluster sum of squares
        wcss.append(minikmeans.inertia_)

        # silhouette score for this k
        sil.append(silhouette_score(X, minikmeans.predict(X)))
        k.append(i)

    # == plot WCSS (line) and silhouette score (bars) against k ==

    fig = go.Figure()

    fig.add_trace(go.Scatter(
        x=k,
        y=wcss, 
        mode='lines+markers', 
        name= 'WCSS', 
        yaxis='y1'
    ))

    fig.add_trace(go.Bar(
        x=k,
        y=sil, 
        name='sil_score',
        opacity=0.3,
        yaxis='y2'
    ))

    fig.update_layout(
        title=f"WCSS and Silhouette Score for {day_of_week[dayofweek]}",
        xaxis=dict(title="Number of Clusters (k)"),
        yaxis=dict(
            title="WCSS",
            showgrid=False,
            side="left"
        ),
        yaxis2=dict(
            title="Silhouette Score",
            overlaying="y",
            side="right",
            showgrid=False
        ),
        title_x=0.5,
        template="plotly_white"
    )

    fig.show()

In [None]:
for i in range(0,7):
    get_clusters_mkm(i)

##### Backup: MiniK scores
![minik_sc_monday](./scores/minikmeans/minik_monday.png)
![minik_sc_tuesday](./scores/minikmeans/minik_tuesday.png)
![minik_sc_wednesday](./scores/minikmeans/minik_wednesday.png)
![minik_sc_thursday](./scores/minikmeans/minik_thursday.png)
![minik_sc_friday](./scores/minikmeans/minik_friday.png)
![minik_sc_saturday](./scores/minikmeans/minik_saturday.png)
![minik_sc_sunday](./scores/minikmeans/minik_sunday.png)

##### Display clusters

The difficulty in reading the graphs above is deciding which information matters most: we want a balance between both scores.

For business purposes, even though New York is a metropolis, it doesn't make much sense to dispatch drivers over 20 hot zones with several close to one another.

We will cap the number of clusters at 10. Thus we choose the following values:

* Monday: 5 clusters
* Tuesday: 6 clusters
* Wednesday: 4 clusters
* Thursday: 5 clusters
* Friday: 6 clusters
* Saturday: 5 clusters
* Sunday: 5 clusters
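
The values above were picked by eye, balancing the elbow and the silhouette score. As a point of comparison only, the sketch below automates a simpler rule: keep the `k` with the highest silhouette score under the cap. The synthetic blobs and the `MAX_K` constant are assumptions for illustration, not the notebook's data or method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for pickups: three tight, well-separated blobs
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.2, size=(100, 2)) for c in centers])

MAX_K = 10  # business cap on the number of hot zones (assumed constant name)
scores = {}
for k in range(2, MAX_K + 1):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Keep the k with the highest silhouette score
best_k = max(scores, key=scores.get)
print(f"Best k under the cap: {best_k}")
```

On real pickup data the silhouette curve is usually much flatter than on these blobs, which is why a manual trade-off with the WCSS elbow remains reasonable.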

In [None]:
def show_cluster_mkm(dayofweek, cluster):
    print('===== MiniKmeans =====')
    print(f'show for {day_of_week[dayofweek]}')
    minikmeans = MiniBatchKMeans(n_clusters=cluster, random_state=42)
    minikmeans.fit(day_dict_mini[dayofweek][['lat', 'lon']])
    day_dict_mini[dayofweek]['cluster'] = minikmeans.predict(day_dict_mini[dayofweek][['lat', 'lon']])

    fig2 = px.scatter_map(day_dict_mini[dayofweek], 
                          lat='lat', 
                          lon='lon', 
                          color='cluster', 
                          color_continuous_scale='Bluered')
    fig2.update_layout(
        title = f"All {cluster} clusters for {day_of_week[dayofweek]}",
        title_x=0.5
        )
    fig2.show()

In [None]:
show_cluster_mkm(0,5)

In [None]:
show_cluster_mkm(1,6)

In [None]:
show_cluster_mkm(2,4)

In [None]:
show_cluster_mkm(3,5)

In [None]:
show_cluster_mkm(4,6)

In [None]:
show_cluster_mkm(5,5)

In [None]:
show_cluster_mkm(6,5)

##### Backup: MiniK plots
![minik_5c_monday](./plots/minikmeans/minik_5c_monday.png)
![minik_6c_tuesday](./plots/minikmeans/minik_6c_tuesday.png)
![minik_4c_wednesday](./plots/minikmeans/minik_4c_wednesday.png)
![minik_5c_thursday](./plots/minikmeans/minik_5c_thursday.png)
![minik_6c_friday](./plots/minikmeans/minik_6c_friday.png)
![minik_5c_saturday](./plots/minikmeans/minik_5c_saturday.png)
![minik_5c_sunday](./plots/minikmeans/minik_5c_sunday.png)

##### Observations

With the MiniBatch version of KMeans, we've been able to narrow down potential daily hot zones by covering vast areas of the city.

Most of the time, Manhattan takes two of those, with a third around EWR airport.

Although the clusters seem to fit the overall shape of the city well, the model knows nothing about geography, which would become an issue in real conditions: as a driver, crossing a metropolis in under 10 minutes can prove difficult if you must cross a bridge to reach your customer.

Now, how does the classic algorithm compare to the MiniBatch version?

### KMeans

##### Estimate daily clusters

We'll keep using our sample, both for the sake of computation time (10,000 rows against 17+ million) and to have the same input:

In [None]:
def get_clusters_km(dayofweek):
    """Plot WCSS (elbow) and silhouette score for k = 2..10 on the day's sample."""
    X = day_dict_mini[dayofweek][['lat', 'lon']]

    sil = []
    k = []
    wcss = []

    for i in range(2, 11):
        kmeans = KMeans(n_clusters=i)
        kmeans.fit(X)
        # elbow: within-cluster sum of squares
        wcss.append(kmeans.inertia_)

        # silhouette score for this k
        sil.append(silhouette_score(X, kmeans.predict(X)))
        k.append(i)

    # == plot WCSS (line) and silhouette score (bars) against k ==

    fig = go.Figure()

    fig.add_trace(go.Scatter(
        x=k,
        y=wcss, 
        mode='lines+markers', 
        name= 'WCSS', 
        yaxis='y1'
    ))

    fig.add_trace(go.Bar(
        x=k,
        y=sil, 
        name='sil_score',
        opacity=0.3,
        yaxis='y2'
    ))

    fig.update_layout(
        title=f"WCSS and Silhouette Score for {day_of_week[dayofweek]}",
        xaxis=dict(title="Number of Clusters (k)"),
        yaxis=dict(
            title="WCSS",
            showgrid=False,
            side="left"
        ),
        yaxis2=dict(
            title="Silhouette Score",
            overlaying="y",
            side="right",
            showgrid=False
        ),
        title_x=0.5,
        template="plotly_white"
    )

    fig.show()

In [None]:
for i in range(0,7):
    get_clusters_km(i)

##### Backup: KMeans scores
![kmeans_sc_monday](./scores/kmeans/kmeans_monday.png)
![kmeans_sc_tuesday](./scores/kmeans/kmeans_tuesday.png)
![kmeans_sc_wednesday](./scores/kmeans/kmeans_wednesday.png)
![kmeans_sc_thursday](./scores/kmeans/kmeans_thursday.png)
![kmeans_sc_friday](./scores/kmeans/kmeans_friday.png)
![kmeans_sc_saturday](./scores/kmeans/kmeans_saturday.png)
![kmeans_sc_sunday](./scores/kmeans/kmeans_sunday.png)

##### Display clusters

* Monday: 5 clusters
* Tuesday: 5 clusters
* Wednesday: 5 clusters
* Thursday: 4 clusters
* Friday: 4 clusters
* Saturday: 5 clusters
* Sunday: 5 clusters

In [None]:
def show_cluster_km(dayofweek, cluster):
    print('===== Kmeans =====')
    print(f'show for {day_of_week[dayofweek]}')
    kmeans = KMeans(n_clusters=cluster, random_state=42)
    kmeans.fit(day_dict_mini[dayofweek][['lat', 'lon']])
    day_dict_mini[dayofweek]['cluster'] = kmeans.predict(day_dict_mini[dayofweek][['lat', 'lon']])

    fig2 = px.scatter_map(day_dict_mini[dayofweek], 
                          lat='lat', 
                          lon='lon', 
                          color='cluster', 
                          color_continuous_scale='Bluered')
    fig2.update_layout(
        title = f"All {cluster} clusters for {day_of_week[dayofweek]}",
        title_x=0.5
        )
    fig2.show()

In [None]:
show_cluster_km(0,5)

In [None]:
show_cluster_km(1,5)

In [None]:
show_cluster_km(2,5)

In [None]:
show_cluster_km(3,4)

In [None]:
show_cluster_km(4,4)

In [None]:
show_cluster_km(5,5)

In [None]:
show_cluster_km(6,5)

##### Backup: KMeans plots
![kmeans_5c_monday](./plots/kmeans/kmeans_5c_monday.png)
![kmeans_5c_tuesday](./plots/kmeans/kmeans_5c_tuesday.png)
![kmeans_5c_wednesday](./plots/kmeans/kmeans_5c_wednesday.png)
![kmeans_4c_thursday](./plots/kmeans/kmeans_4c_thursday.png)
![kmeans_4c_friday](./plots/kmeans/kmeans_4c_friday.png)
![kmeans_5c_saturday](./plots/kmeans/kmeans_5c_saturday.png)
![kmeans_5c_sunday](./plots/kmeans/kmeans_5c_sunday.png)

##### Observations

Although the results change slightly, the clustering trends remain the same between both algorithms, and so do the issues.
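
One way to put a number on how similar two clusterings are is the adjusted Rand index (ARI), which ignores how the clusters happen to be numbered. The sketch below is an assumption-laden illustration on synthetic blobs, not a result from the Uber data.

```python
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.metrics import adjusted_rand_score

# Synthetic points: three tight blobs, far apart from each other
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.1, size=(200, 2)) for c in centers])

km_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
mbk_labels = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=42).fit_predict(X)

# ARI is 1.0 for identical partitions (up to label renaming), about 0 for random ones
ari = adjusted_rand_score(km_labels, mbk_labels)
print(f"Adjusted Rand index: {ari:.3f}")
```

Applied to a daily sample, the same call on the two `cluster` columns would tell how closely MiniBatchKMeans tracks KMeans for a given `k`.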

We've covered two versions of KMeans. Let's try a different algorithm now!

### DBSCAN

##### Estimate daily clusters

In [None]:
def get_clusters_dbs(dayofweek):    
    dbs_results = []

    for eps in np.arange(0.005, 0.5, 0.05):
        for min_samples in range(10, 1000, 100):
            db = DBSCAN(eps=eps, min_samples=min_samples, metric="euclidean")
            labels = db.fit_predict(day_dict_mini[dayofweek][['lat', 'lon']])
            n_clusters = len(set(labels)) - (1 if -1 in labels else 0)

            # business purpose: keep only runs that yield between 4 and 10 clusters
            if 4 <= n_clusters <= 10:
                day_dict_mini[dayofweek]['cluster'] = db.labels_
                max_item = max(day_dict_mini[dayofweek]['cluster'].value_counts())
                min_item = min(day_dict_mini[dayofweek]['cluster'].value_counts())

                dbs_results.append({'eps': eps, 'min_samp' : min_samples, 'n_clusters': n_clusters, 'max_item' : max_item, 'min_item' : min_item})
    
    dbs_results = pd.DataFrame(dbs_results, columns=['eps', 'min_samp', 'n_clusters','max_item', 'min_item'])
    # dbs_results = dbs_results.sort_values(by="max_item", ascending=False)
    # print(f'results for {day_of_week[dayofweek]}')
    # print(dbs_results.head())
    return dbs_results
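
Because the grid above feeds raw latitude and longitude to a Euclidean metric, `eps` is expressed in degrees. The arithmetic below gives a rough idea of what the smallest value, 0.005, means on the ground. The 111.32 km per degree of latitude and the reference latitude of 40.7 for New York are approximations introduced here for illustration.

```python
import math

EPS_DEGREES = 0.005          # smallest eps tried in the grid above
KM_PER_DEGREE_LAT = 111.32   # approximate length of one degree of latitude
NYC_LATITUDE = 40.7          # assumed reference latitude for New York City

# A degree of longitude shrinks with the cosine of the latitude
eps_km_north_south = EPS_DEGREES * KM_PER_DEGREE_LAT
eps_km_east_west = EPS_DEGREES * KM_PER_DEGREE_LAT * math.cos(math.radians(NYC_LATITUDE))

print(f"north-south: {eps_km_north_south:.2f} km, east-west: {eps_km_east_west:.2f} km")
```

The neighbourhood is therefore slightly elongated north-south rather than circular, which is worth keeping in mind when reading the clusters.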

In [None]:
results_dbs_day = {}

for i in range (0,7):
    results_dbs_day[i]= get_clusters_dbs(i)

In [None]:
for i in range (0,7):
    print(fr"The possible clusters on {day_of_week[i]} are: {results_dbs_day[i]['n_clusters'].unique()}")

In [None]:
for i in range (0,7):
    results_dbs_day[i] = results_dbs_day[i].sort_values(by="max_item", ascending=False)
    print('='*50)
    print(f'day {day_of_week[i]}')
    print(results_dbs_day[i].head())


##### Backup: training output

In [None]:
"""
==================================================
day Monday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           5      6575       118
3  0.005       410           4      6376       183
1  0.005       210           7      5263        43
2  0.005       310           6      4670       323
==================================================
day Tuesday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           4      6883       189
2  0.005       410           4      5635       199
1  0.005       210           6      5560        45
==================================================
day Wednesday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           5      7161       120
2  0.005       410           4      5886       246
1  0.005       210           5      5817        44
==================================================
day Thursday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           5      6900       154
1  0.005       410           5      5241       371
==================================================
day Friday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           6      6655       127
3  0.005       410           4      5933       450
1  0.005       210           4      5753       267
2  0.005       310           4      4662        99
==================================================
day Saturday
     eps  min_samp  n_clusters  max_item  min_item
0  0.005       110           5      6339       125
3  0.005       410           6      5557       416
1  0.005       210           4      5196       239
2  0.005       310           4      5191        98
==================================================
day Sunday
     eps  min_samp  n_clusters  max_item  min_item
2  0.005       410           6      6126         7
0  0.005       110           8      5990       112
1  0.005       310           4      5378       332
"""

##### Display clusters

In [None]:
def show_cluster_dbs(dayofweek, eps_value, min_samp_value):
    db = DBSCAN(eps=eps_value, min_samples=min_samp_value, metric="euclidean")
    db.fit(day_dict_mini[dayofweek][['lat', 'lon']])
    day_dict_mini[dayofweek]['cluster'] = db.labels_
    # labels = db.fit_predict(day_dict_mini[dayofweek][['lat', 'lon']])
    # count clusters, leaving out DBSCAN's noise label (-1)
    n_clusters = len(set(db.labels_)) - (1 if -1 in db.labels_ else 0)
    print('===== DBScan =====')
    print(f'show for {day_of_week[dayofweek]}')
    print(day_dict_mini[dayofweek]['cluster'].value_counts())

    fig3 = px.scatter_map(day_dict_mini[dayofweek], 
                          lat='lat', 
                          lon='lon', 
                          color='cluster', 
                          color_continuous_scale='Bluered')
    fig3.update_layout(
        title = f"All {n_clusters} clusters for {day_of_week[dayofweek]}",
        title_x=0.5
        )
    fig3.show()

In [None]:
show_cluster_dbs(0,0.005,110)

In [None]:
show_cluster_dbs(1,0.005,110)

In [None]:
show_cluster_dbs(2,0.005,110)

In [None]:
show_cluster_dbs(3,0.005,110)

In [None]:
show_cluster_dbs(4,0.005,110)

In [None]:
show_cluster_dbs(5,0.005,110)

In [None]:
show_cluster_dbs(6,0.005,110)

##### Backup: DBSCAN plots
![dbscan_5c_monday](./plots/dbscan/dbscan_5c_monday.png)
![dbscan_4c_tuesday](./plots/dbscan/dbscan_4c_tuesday.png)
![dbscan_5c_wednesday](./plots/dbscan/dbscan_5c_wednesday.png)
![dbscan_5c_thursday](./plots/dbscan/dbscan_5c_thursday.png)
![dbscan_6c_friday](./plots/dbscan/dbscan_6c_friday.png)
![dbscan_5c_saturday](./plots/dbscan/dbscan_5c_saturday.png)
![dbscan_8c_sunday](./plots/dbscan/dbscan_8c_sunday.png)

##### Observations

While KMeans is really handy for carving the city into areas, playing around with DBSCAN's settings lets us pinpoint the zones with the most customer pick-up requests.

DBSCAN is very efficient at that, and thus better at adapting to geography (such as the Hudson River crossing here), but it can conversely lose itself in an especially dense zone such as Manhattan. This is best explained with the numbers we produced while working on this, taking Sunday as a last example:

![DBscan_example](./plots/dbscan_example.jpg)

These numbers mean that not only does Manhattan's cluster cover 60% of our pickups, but each of the remaining clusters represents only between 1 and 2%.

This is exactly the kind of extra information needed to dispatch drivers in proportion to projected demand!
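
As a sketch of how such proportions can be read off the labels, the example below computes each cluster's share of pickups with `value_counts(normalize=True)`. The label array is made up to mimic one dominant cluster, a few small ones and some noise; it is not the notebook's output.

```python
import numpy as np
import pandas as pd

# Made-up DBSCAN-style labels: -1 is noise, 0 is a dominant cluster
labels = np.array([0] * 60 + [1] * 2 + [2] * 1 + [-1] * 37)

# Share of all pickups per label, ordered by label
shares = pd.Series(labels).value_counts(normalize=True).sort_index()

# Leave noise out of the dispatch proportions
cluster_shares = shares.drop(index=-1)

print(cluster_shares)
```

Drivers could then be allocated to each hot zone in proportion to these shares.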

It is certainly possible to optimize this further by tuning DBSCAN's parameters some more. For the time being, although not optimal because of geography, these points make for interesting locations where a driver can stand by until a customer requests a ride!
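
One possible direction for that tuning, which is not what this notebook does, is to swap the Euclidean metric on raw degrees for scikit-learn's `haversine` metric, so that `eps` can be reasoned about in kilometres. The sketch below runs on synthetic points; `EPS_KM`, `min_samples=10` and the coordinates are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0
EPS_KM = 0.5  # assumed neighbourhood radius, for illustration only

# Two synthetic groups of pickups roughly 11 km apart (lat, lon in degrees)
rng = np.random.default_rng(0)
group_a = np.array([40.75, -73.99]) + rng.normal(scale=0.001, size=(50, 2))
group_b = np.array([40.65, -73.95]) + rng.normal(scale=0.001, size=(50, 2))
coords_deg = np.vstack([group_a, group_b])

# The haversine metric expects [lat, lon] in radians and eps as an angle in radians
db = DBSCAN(eps=EPS_KM / EARTH_RADIUS_KM, min_samples=10,
            metric='haversine', algorithm='ball_tree')
labels = db.fit_predict(np.radians(coords_deg))

# Count clusters, leaving out the noise label (-1)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"clusters found: {n_clusters}")
```

Haversine distance is still measured as the crow flies, so it does not solve the bridge problem mentioned earlier; it only makes the radius uniform in every direction.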

# Concluding

Both algorithms, KMeans and DBSCAN, have their ups and downs. While the former always partitions the city into whole areas, the latter can be tuned to pinpoint intensity.

Although they can complement one another, this project's objective is to produce *hot zones*. Quantity of pickups is key here, which gives DBSCAN the edge.

It would take a lot more testing of its settings to possibly produce even more accurate locations for drivers to be in for a quick response.

DBSCAN's results can also be more complex or diverse; an example for Sundays is provided below. For now we'll stick with this algorithm, but we should keep KMeans in mind to avoid forgetting about more distant customers, who could be made to wait well over 10 or 15 minutes!

On Sunday, 8 clusters versus 6:
![dbscan_8c_sunday](./plots/dbscan/dbscan_8c_sunday.png)
![dbscan_6c_sunday](./plots/dbscan_6c_sunday.png)