<a href="https://colab.research.google.com/github/shaunwbell/GoogleCoLabArchive/blob/main/Collab_and_Plotly_to_plot_cruise_stations_from_excel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using Jupyter Notebooks for Geophysical Data Exploration

First, some key references:
+ conda: a package manager and an alternative to pip
+ Anaconda Python: a prepackaged, scientific-analysis-oriented Python distribution (comes with conda)

Together, these two tools make it fairly easy to create your own locally hosted jupyter-lab environment (assuming you have some basic familiarity with command-line tools, Linux/Unix, and Python).
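
Once such an environment exists, a quick way to confirm it has what this notebook needs is to check that the packages can be found. The sketch below is an illustration rather than part of the original notebook: the helper name `missing_packages` is made up, and the package list assumes `openpyxl` is the engine pandas uses to read `.xlsx` files.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Packages this notebook relies on; openpyxl is assumed as the .xlsx reader.
required = ["pandas", "plotly", "openpyxl"]
missing = missing_packages(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Anything reported as missing can be installed with conda or pip before starting jupyter-lab.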

## Use Pandas to read in an Excel file

So we have a file; let's ingest it and take a look at it.

In [5]:
import pandas as pd

In [6]:
df = pd.read_excel('data/DY2103_sideopssummary.xlsx',sheet_name=0)

In [7]:
df.drop(index=[89,90],inplace=True) # because I know there is a summary line at the bottom of this Excel sheet
df.fillna(0,inplace=True)
df.sample(5)

| | CTD # | Lat. | Unnamed: 3 | Lon. | Unnamed: 5 | Site | Station Number | No. of Nutrients | Sal. Samples | O2 Samples | Chl Samples | Number of DIC | Gann Sample | Abs. Sample | FCM Sample | Size Frac Chlor | Latitude | Longitude | Unnamed: 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 88 | CTD088 | 54.0 | 20.22 | 165.0 | 25.62 | UBS4 | s87h1 | 9.0 | 0.0 | 0.0 | 6.0 | 9.0 | 0.0 | 0.0 | 1.0 | 1.0 | 54.337 | -165.427 | 0 |
| 72 | CTD072 | 54.0 | 21.58 | 165.0 | 55.41 | UBW1 | s71h1 | 11.0 | 0.0 | 1.0 | 6.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 54.359667 | -165.9235 | 0 |
| 14 | CTD015 | 57.0 | 19.64 | 166.0 | 19.04 | 70M10 | s15h1 | 7.0 | 1.0 | 1.0 | 6.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 57.327333 | -166.317333 | 0 |
| 6 | CTD007 | 56.0 | 52.54 | 164.0 | 4.36 | M2C | s7h1 | 7.0 | 1.0 | 7.0 | 6.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 56.875667 | -164.072667 | 0 |
| 77 | CTD077 | 54.0 | 48.93 | 165.0 | 51.97 | UBN2 | s76h1 | 9.0 | 0.0 | 1.0 | 6.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 54.8155 | -165.866167 | 0 |
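
The `Latitude` and `Longitude` columns in this sample appear to be decimal-degree versions of the degree and minute columns beside `Lat.` and `Lon.` (for example, 54° 20.22′ gives 54.337). A minimal sketch of that conversion follows, assuming the unnamed columns hold decimal minutes and that every longitude on this cruise is west of Greenwich; `dm_to_decimal` is a hypothetical helper, not something from the original notebook.

```python
def dm_to_decimal(degrees, minutes, negative=False):
    """Convert degrees plus decimal minutes to signed decimal degrees."""
    value = degrees + minutes / 60.0
    return -value if negative else value


# Values taken from the CTD088 row of the sample above.
lat = dm_to_decimal(54.0, 20.22)                  # north, so positive
lon = dm_to_decimal(165.0, 25.62, negative=True)  # assumed west, so negative

print(round(lat, 3), round(lon, 3))
```

Because the function uses only arithmetic, the same call should also work element-wise when whole pandas columns are passed in place of single numbers.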


In [8]:
import plotly.express as px
fig = px.scatter_geo(
    data_frame=df,
    lat='Latitude',
    lon='Longitude',
    hover_name="Site",
    hover_data=["Station Number","Sal. Samples","O2 Samples","Chl Samples"],
)

fig.update_layout(
    title_text = 'DY2103 Cruise CTD Stations',
    showlegend = False,
    geo = dict(
        resolution = 50,
        showland = True,
        showlakes = True,
        landcolor = 'rgb(204, 204, 204)',
        countrycolor = 'rgb(204, 204, 204)',
        lakecolor = 'rgb(255, 255, 255)',
        projection_type = "equirectangular",
        coastlinewidth = 2,
        lataxis = dict(
            range = [50, 70],
            showgrid = True,
            dtick = 10
        ),
        lonaxis = dict(
            range = [-180, -150],
            showgrid = True,
            dtick = 20
        ),
    )
)
fig.show()
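
The latitude and longitude ranges above are hard-coded. One alternative is to derive them from the station coordinates themselves. The sketch below is only an illustration: `padded_range` and the 2-degree padding are hypothetical choices, and the coordinates are the five station positions from the sample shown earlier rather than the full table.

```python
def padded_range(values, pad=2.0):
    """Return [min - pad, max + pad] for use as a map axis range."""
    values = list(values)
    if not values:
        raise ValueError("need at least one coordinate")
    return [min(values) - pad, max(values) + pad]


# Station positions copied from the five sample rows shown earlier.
lats = [54.337, 54.359667, 57.327333, 56.875667, 54.8155]
lons = [-165.427, -165.9235, -166.317333, -164.072667, -165.866167]

lat_range = padded_range(lats)
lon_range = padded_range(lons)
print(lat_range, lon_range)
```

With the full table, `padded_range(df['Latitude'])` and `padded_range(df['Longitude'])` could be passed as the `range` values of `lataxis` and `lonaxis`. This simple version does not handle station sets that cross the 180° meridian.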

## More Trivial Examples


In [10]:
df.describe()

| | Lat. | Unnamed: 2 | Lon. | Unnamed: 4 | No. of Nutrients | Sal. Samples | O2 Samples | Chl Samples | Number of DIC | Gann Sample | Abs. Sample | FCM Sample | Size Frac Chlor | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 | 89.0 |
| mean | 56.685393 | 29.910449 | 167.640449 | 29.664157 | 7.078652 | 0.224719 | 1.089888 | 5.382022 | 0.988764 | 0.52809 | 0.05618 | 1.382022 | 0.932584 | 57.183901 | -168.134852 |
| std | 2.318873 | 17.837133 | 3.057209 | 17.846384 | 2.585909 | 0.419762 | 1.302449 | 1.818501 | 2.253756 | 0.545433 | 0.231573 | 1.361029 | 1.513558 | 2.286659 | 3.082843 |
| min | 53.0 | 0.95 | 163.0 | 0.22 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 53.632167 | -173.828333 |
| 25% | 54.0 | 15.1 | 165.0 | 14.83 | 7.0 | 0.0 | 1.0 | 6.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 54.753833 | -170.326667 |
| 50% | 57.0 | 27.07 | 166.0 | 30.12 | 7.0 | 0.0 | 1.0 | 6.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 57.327333 | -166.821 |
| 75% | 59.0 | 46.14 | 170.0 | 44.71 | 8.0 | 0.0 | 1.0 | 6.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 59.109 | -165.673 |
| max | 61.0 | 59.9 | 173.0 | 59.93 | 11.0 | 1.0 | 11.0 | 6.0 | 9.0 | 2.0 | 1.0 | 7.0 | 7.0 | 61.251667 | -163.84 |


## Now what if I want to share this with someone else?

- Google Colab: if your data is publicly available, you could just invite them to the Colab project (like a shared Google Doc).
- If you just want them to see the results, you could host the notebook on GitHub and link to it.
- You can also download the raw Python or the notebook file and share it that way.