## Description:

This notebook analyzes several measured features, such as depth, temperature, and salinity, with respect to their location on the map. The data is inspected and cleaned at the start of the notebook and then visualized with interactive plots at the end.

The dataset used in this notebook covers 1996 through 1999; rows dated 2000-01-01 or later are filtered out when the data is loaded.

## How to use:

The notebook is simple to use. You need Jupyter Notebook to open the .ipynb file; alternatively, you can upload the file to Google Drive and open it there. Make sure the following libraries are installed:

- pandas
- NumPy
- scikit-learn (imported as `sklearn`)
- Folium
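
Before running the notebook, a quick check along these lines can confirm that the packages are importable. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the notebook, and the module names are the ones the notebook imports.

```python
from importlib.util import find_spec

def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]

# Import names used by the notebook (scikit-learn is imported as "sklearn").
required = ["numpy", "pandas", "sklearn", "folium"]
print("Missing packages:", missing_packages(required) or "none")
```

Any name reported as missing would need to be installed before the import cell can run.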

In [None]:
import numpy as np
import pandas as pd
import folium
from sklearn import preprocessing
from folium.plugins import HeatMap

In [None]:
data_btl = pd.read_csv('cst_btl_FinalDataset.csv')
data_btl = data_btl.loc[data_btl['Date']<'2000-01-01']
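
The filter above compares the `Date` column, which holds strings, against the string `'2000-01-01'`. This works because dates written as `YYYY-MM-DD` sort the same way as text and as calendar dates. The following standard-library sketch illustrates the idea; the sample strings and the name `cutoff` are made up for illustration.

```python
from datetime import date

cutoff = "2000-01-01"
samples = ["1996-01-30", "1999-12-31", "2000-01-01", "2000-06-15"]

# Plain string comparison, as in the filter above.
kept_as_text = [s for s in samples if s < cutoff]

# The same comparison after parsing the strings into real dates.
kept_as_dates = [s for s in samples
                 if date.fromisoformat(s) < date.fromisoformat(cutoff)]

print(kept_as_text)                   # ['1996-01-30', '1999-12-31']
print(kept_as_text == kept_as_dates)  # True
```

If the dates were stored in another format (for example `DD/MM/YYYY`), string comparison would no longer follow calendar order, and converting the column with `pd.to_datetime` first would probably be the safer choice.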

In [None]:
data_btl.head()

| | Cst_Cnt | Date | latitude | longitude | Btl_Cnt | Depthm | T_degC | Salnty | ChlorA | PO4uM | NO2uM | NO3uM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 28108 | 1996-01-30 | 32.953333 | -117.306666 | 691492 | 0 | 13.86 | 33.39 | 3.04 | 0.22 | 0.01 | 0.0 |
| 1 | 28108 | 1996-01-30 | 32.953333 | -117.306666 | 691493 | 2 | 13.86 | 33.398 | 3.04 | 0.22 | 0.01 | 0.0 |
| 2 | 28108 | 1996-01-30 | 32.953333 | -117.306666 | 691494 | 5 | 13.85 | 33.398 | 2.87 | 0.22 | 0.01 | 0.0 |
| 3 | 28108 | 1996-01-30 | 32.953333 | -117.306666 | 691495 | 10 | 13.75 | 33.398 | 3.04 | 0.23 | 0.01 | 0.0 |
| 4 | 28108 | 1996-01-30 | 32.953333 | -117.306666 | 691496 | 15 | 13.33 | 33.409 | 6.84 | 0.36 | 0.08 | 0.8 |


In [None]:
data_btl.describe()

| | Cst_Cnt | latitude | longitude | Btl_Cnt | Depthm | T_degC | Salnty | ChlorA | PO4uM | NO2uM | NO3uM |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 | 32014.0 |
| mean | 28704.521178 | 32.757421 | -121.029442 | 709555.298713 | 154.673205 | 11.504198 | 33.720812 | 0.458638 | 1.40717 | 0.041537 | 16.237593 |
| std | 363.46541 | 1.30062 | 1.942072 | 10944.053858 | 147.091699 | 3.849297 | 0.36032 | 0.989607 | 1.004078 | 0.087261 | 14.403165 |
| min | 28108.0 | 29.83 | -124.335 | 691492.0 | 0.0 | 3.13 | 31.96 | 0.0 | 0.09 | 0.0 | 0.0 |
| 25% | 28394.0 | 31.911666 | -122.665 | 700222.25 | 40.0 | 8.28 | 33.44025 | 0.11 | 0.35 | 0.0 | 0.3 |
| 50% | 28672.0 | 32.848333 | -121.053333 | 708736.5 | 100.0 | 10.99 | 33.684 | 0.34 | 1.33 | 0.01 | 15.7 |
| 75% | 29010.0 | 33.656666 | -119.48 | 718856.75 | 230.0 | 14.59 | 34.044 | 0.481185 | 2.26 | 0.03 | 29.1 |
| max | 29338.0 | 36.82 | -117.303333 | 728476.0 | 1319.0 | 23.42 | 34.543 | 31.28 | 4.39 | 1.39 | 43.6 |


In [None]:
data_btl.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 32014 entries, 0 to 32013
Data columns (total 12 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Cst_Cnt    32014 non-null  int64  
 1   Date       32014 non-null  object 
 2   latitude   32014 non-null  float64
 3   longitude  32014 non-null  float64
 4   Btl_Cnt    32014 non-null  int64  
 5   Depthm     32014 non-null  int64  
 6   T_degC     32014 non-null  float64
 7   Salnty     32014 non-null  float64
 8   ChlorA     32014 non-null  float64
 9   PO4uM      32014 non-null  float64
 10  NO2uM      32014 non-null  float64
 11  NO3uM      32014 non-null  float64
dtypes: float64(8), int64(3), object(1)
memory usage: 3.2+ MB


In [None]:
mean_lat = data_btl['latitude'].mean()
mean_long = data_btl['longitude'].mean()
print(mean_lat)
print(mean_long)

32.75742084379031
-121.02944196088752


In [None]:
def generateBaseMap(default_location=[mean_lat, mean_long], default_zoom_start=6):
    base_map = folium.Map(location=default_location, control_scale=True, zoom_start=default_zoom_start)
    return base_map

def AddPinsToMap(data, base_map, color, column):
    # Iterate by position so this also works when the index is not 0..n-1
    # (for example after rows have been filtered out with .loc).
    for lat, lon, date, value in zip(data['latitude'], data['longitude'], data['Date'], data[column]):
        # Label the value with the column name instead of a hard-coded "Cast Count".
        popup = "<b>"+str(lat)[:5]+","+str(lon)[:6]+"\n"+date+"\n"+column+":"+str(value)+"\n"+"</b>"
        folium.Marker([lat, lon], popup=popup,
                      icon=folium.Icon(color=color, icon="record"), tooltip=str(value),
        ).add_to(base_map)
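
The popup text is easier to check when it is built by a small helper. A possible sketch follows; `build_popup` is a hypothetical helper, not part of the notebook. It joins the lines with `<br>` on the assumption that the popup is rendered as HTML, where a plain `\n` does not produce a line break, and it rounds the coordinates instead of slicing their string form.

```python
def build_popup(lat, lon, date, label, value):
    """Build a bold, multi-line HTML popup string for one marker."""
    lines = [
        f"{lat:.2f}, {lon:.2f}",  # coordinates rounded to two decimals
        str(date),
        f"{label}: {value}",
    ]
    return "<b>" + "<br>".join(lines) + "</b>"

print(build_popup(32.953333, -117.306666, "1996-01-30", "Depthm", 15))
# <b>32.95, -117.31<br>1996-01-30<br>Depthm: 15</b>
```

The returned string could be passed as the `popup` argument of `folium.Marker`.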


In [None]:
data_dpt = data_btl[['latitude', 'longitude', "Depthm", 'Date']]

In [None]:
data_dpt.head()

| | latitude | longitude | Depthm | Date |
|---|---|---|---|---|
| 0 | 32.953333 | -117.306666 | 0 | 1996-01-30 |
| 1 | 32.953333 | -117.306666 | 2 | 1996-01-30 |
| 2 | 32.953333 | -117.306666 | 5 | 1996-01-30 |
| 3 | 32.953333 | -117.306666 | 10 | 1996-01-30 |
| 4 | 32.953333 | -117.306666 | 15 | 1996-01-30 |


In [None]:
data_dpt.describe()

| | latitude | longitude | Depthm |
|---|---|---|---|
| count | 32014.0 | 32014.0 | 32014.0 |
| mean | 32.757421 | -121.029442 | 154.673205 |
| std | 1.30062 | 1.942072 | 147.091699 |
| min | 29.83 | -124.335 | 0.0 |
| 25% | 31.911666 | -122.665 | 40.0 |
| 50% | 32.848333 | -121.053333 | 100.0 |
| 75% | 33.656666 | -119.48 | 230.0 |
| max | 36.82 | -117.303333 | 1319.0 |


In [None]:
data_dpt.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 32014 entries, 0 to 32013
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   latitude   32014 non-null  float64
 1   longitude  32014 non-null  float64
 2   Depthm     32014 non-null  int64  
 3   Date       32014 non-null  object 
dtypes: float64(2), int64(1), object(1)
memory usage: 1.2+ MB


In [None]:
base_map_Dep_Cnt = generateBaseMap(default_location = [mean_lat, mean_long])
HeatMap(data=data_dpt[['latitude', 'longitude', "Depthm"]].groupby(['latitude', 'longitude']).sum().reset_index().values.tolist(), radius=8, max_zoom=13).add_to(base_map_Dep_Cnt)
AddPinsToMap(data_dpt, base_map_Dep_Cnt, 'orange', "Depthm")

loc = "Depth in Meters Pins"
title_html = '''
                 <h3 align="center" style="font-size:16px"><b>{}</b></h3>
                 '''.format(loc)
base_map_Dep_Cnt.get_root().html.add_child(folium.Element(title_html))

<branca.element.Element at 0x7fd49b7d33d0>
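
The `HeatMap` call above groups the rows by coordinate pair and sums `Depthm`, so each location contributes one `[latitude, longitude, weight]` triple. The following plain-Python sketch mirrors that aggregation on the five rows shown by `data_dpt.head()` plus one made-up row at a second location; the function name `sum_by_location` is hypothetical.

```python
from collections import defaultdict

def sum_by_location(rows):
    """Sum the weights of all rows that share a (latitude, longitude) pair."""
    totals = defaultdict(float)
    for lat, lon, weight in rows:
        totals[(lat, lon)] += weight
    # Sort by latitude, then longitude, like the grouped result.
    return [[lat, lon, total] for (lat, lon), total in sorted(totals.items())]

rows = [
    (32.953333, -117.306666, 0),
    (32.953333, -117.306666, 2),
    (32.953333, -117.306666, 5),
    (32.953333, -117.306666, 10),
    (32.953333, -117.306666, 15),
    (30.0, -120.0, 7),  # made-up row at a second location, for illustration only
]
print(sum_by_location(rows))
# [[30.0, -120.0, 7.0], [32.953333, -117.306666, 32.0]]
```

Because the weights are sums, a location with more samples or deeper samples gets a larger weight; whether a sum or a mean is the better weight depends on what the map is meant to show.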

In [None]:
base_map_Dep_Cnt

## Conclusion:

A few patterns can be observed here; for example, the temperature falls as we move from the coast out to sea.