<a href="https://colab.research.google.com/github/shaunwbell/GoogleCoLabArchive/blob/main/Collab_and_Plotly_to_plot_cruise_stations_from_excel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using Jupyter Notebooks for Geophysical Data Exploration

First, some key references:
+ conda - a package manager, an alternative to pip
+ Anaconda Python - a prepackaged, scientific-analysis-oriented Python distribution (comes with conda)

These two products make it fairly easy to create your own locally hosted JupyterLab environment (assuming you have some basic familiarity with command-line tools, Linux/Unix, and Python).

**But Wait!!** I don't have the ability or knowledge to run my own JupyterLab environment.

## Google Hosted Jupyter Notebooks to the rescue

No need to set up JupyterLab, but you will need to install any packages that are not already available and, more importantly, you will need to know how to access the data you want to use.
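As a rough sketch of the "install any needed package" step, the snippet below checks which packages are already importable in the current runtime and prints a `pip install` line for the rest. The helper name `missing_packages` and the package list are illustrative assumptions, not part of the original notebook; in a Colab cell the printed command would typically be run with a leading `!`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this runtime."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of top-level module names this notebook relies on.
# (openpyxl is the engine pandas uses to read .xlsx files.)
needed = ["pandas", "plotly", "openpyxl"]

todo = missing_packages(needed)
if todo:
    print("pip install " + " ".join(todo))
else:
    print("all packages already available")
```

Note that the import name and the name used with `pip` are the same for these three packages, which is not true of every package.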

In [2]:
## this won't work with a NOAA account because it's disabled, but it will work on a personal account

from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive

ValueError: ignored
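The `ValueError` above is what the mount attempt produced on the restricted account. A minimal sketch of a guarded mount, which reports the failure and lets the notebook carry on instead of stopping, might look like this. The function name `try_mount_drive` is a hypothetical helper; only `drive.mount` comes from the original cell.

```python
def try_mount_drive(mount_point="/gdrive"):
    """Try to mount Google Drive; return True on success, False otherwise."""
    try:
        # Only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return False
    try:
        drive.mount(mount_point)
    except ValueError as err:
        # Assumption: a blocked or failed mount surfaces as ValueError,
        # as in the cell output above.
        print(f"Could not mount Drive: {err}")
        return False
    return True


drive_available = try_mount_drive()
```

Later cells could then check `drive_available` to decide whether to read from Drive or ask for a manual upload.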

In [3]:
## this will work with NOAA, but it's tough to share/collaborate
# because you have to upload the file each time you rerun the notebook

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Saving DY2103_sideopssummary.xlsx to DY2103_sideopssummary (3).xlsx
User uploaded file "DY2103_sideopssummary.xlsx" with length 19957 bytes


In [4]:
uploaded.keys()

dict_keys(['DY2103_sideopssummary.xlsx'])
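`files.upload()` returned a dictionary keyed by file name, with the raw file contents as `bytes` values. A small sketch of how one might pick the spreadsheet out of that dictionary without hard-coding its name is shown here. The helper `excel_uploads` and the stand-in dictionary are assumptions for illustration; the check relies on the fact that `.xlsx` files are ZIP archives.

```python
import io
import zipfile


def excel_uploads(uploaded):
    """Map each uploaded .xlsx name to a file-like object over its bytes."""
    result = {}
    for name, data in uploaded.items():
        buffer = io.BytesIO(data)
        # .xlsx workbooks are ZIP containers, so reject anything that is not one.
        if name.lower().endswith(".xlsx") and zipfile.is_zipfile(buffer):
            buffer.seek(0)  # rewind so a reader starts at the beginning
            result[name] = buffer
    return result


# Stand-in for the dictionary returned by files.upload(): a tiny ZIP archive
# named like a workbook (not a real spreadsheet) and a plain-text file.
zip_bytes = io.BytesIO()
with zipfile.ZipFile(zip_bytes, "w") as zf:
    zf.writestr("placeholder.txt", "x")
example = {"stations.xlsx": zip_bytes.getvalue(), "notes.txt": b"hello"}

print(list(excel_uploads(example)))
```

Each returned value is a file-like object, which `pd.read_excel` accepts in place of a file path.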

So we have a file uploaded; let's ingest it and take a look at it.

In [5]:
# pandas comes preinstalled on Colab
import pandas as pd

In [38]:
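import io  # files.upload() returns raw bytes, so wrap them in a file-like object for pandas
df = pd.read_excel(io.BytesIO(uploaded['DY2103_sideopssummary.xlsx']), sheet_name=0)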

In [39]:
df.drop(index=[89, 90], inplace=True)  # because I know I have a summary line at the bottom of this Excel sheet
df.fillna(0, inplace=True)  # replace empty cells with 0
df.sample(5)  # show five random rows

Unnamed: 0,CTD #,Lat.,Unnamed: 3,Lon.,Unnamed: 5,Site,Station Number,No. of Nutrients,Sal. Samples,O2 Samples,Chl Samples,Number of DIC,Gann Sample,Abs. Sample,FCM Sample,Size Frac Chlor,Latitude,Longitude,Unnamed: 18
12,CTD013,57.0,15.76,165.0,44.7,70M8,s13h1,7.0,0.0,1.0,6.0,0.0,1.0,0.0,1.0,1.0,57.262667,-165.745,0
45,CTD045,59.0,53.83,172.0,10.44,M5W,s45h1,7.0,1.0,1.0,6.0,0.0,1.0,0.0,1.0,1.0,59.897167,-172.174,0
14,CTD015,57.0,19.64,166.0,19.04,70M10,s15h1,7.0,1.0,1.0,6.0,0.0,0.0,0.0,1.0,1.0,57.327333,-166.317333,0
27,CTD027,57.0,51.88,168.0,53.48,M4C,s27h1,7.0,0.0,1.0,6.0,7.0,1.0,0.0,1.0,1.0,57.864667,-168.891333,0
3,CTD004,56.0,46.09,164.0,19.84,M2W,s4h1,7.0,0.0,1.0,6.0,0.0,1.0,0.0,1.0,1.0,56.768167,-164.330667,0
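In the sampled rows, the `Latitude` and `Longitude` columns appear to be decimal-degree versions of the degrees (`Lat.`, `Lon.`) and decimal-minutes (`Unnamed: 3`, `Unnamed: 5`) columns, with longitude negated for the western hemisphere. That reading is an inference from the numbers shown, not something the spreadsheet states. A minimal sketch of the conversion, with `dm_to_decimal` as a hypothetical helper name:

```python
def dm_to_decimal(degrees, minutes, negative=False):
    """Convert degrees plus decimal minutes to signed decimal degrees."""
    value = degrees + minutes / 60.0
    return -value if negative else value


# Values taken from the CTD013 row in the sample above.
lat = dm_to_decimal(57.0, 15.76)
lon = dm_to_decimal(165.0, 44.7, negative=True)  # assumed west longitude

print(round(lat, 6), round(lon, 6))
```

Because the function only uses arithmetic, it should also work element-wise on pandas columns, e.g. `dm_to_decimal(df['Lat.'], df['Unnamed: 3'])`.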


In [41]:
import plotly.express as px
fig = px.scatter_geo(
    data_frame=df,
    lat='Latitude',
    lon='Longitude',
    hover_name="Site",
    hover_data=["Station Number", "Sal. Samples", "O2 Samples", "Chl Samples"],
)

fig.update_layout(
    title_text = 'DY2103 Cruise CTD Stations',
    showlegend = False,
    geo = dict(
        resolution = 50,
        showland = True,
        showlakes = True,
        landcolor = 'rgb(204, 204, 204)',
        countrycolor = 'rgb(204, 204, 204)',
        lakecolor = 'rgb(255, 255, 255)',
        projection_type = "equirectangular",
        coastlinewidth = 2,
        lataxis = dict(
            range = [50, 70],
            showgrid = True,
            dtick = 10
        ),
        lonaxis = dict(
            range = [-180, -150],
            showgrid = True,
            dtick = 20
        ),
    )
)
fig.show()
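The map extent above is hard-coded to `[50, 70]` latitude and `[-180, -150]` longitude. As one possible alternative, the ranges could be derived from the station positions themselves. The sketch below uses a hypothetical helper, `padded_range`, and an assumed padding of 2 degrees; the coordinates are rounded values from the sample rows shown earlier.

```python
def padded_range(values, pad=2.0, lower=-180.0, upper=180.0):
    """Return [min - pad, max + pad], clipped to the given bounds."""
    values = list(values)
    return [max(min(values) - pad, lower), min(max(values) + pad, upper)]


# Rounded station positions from the sample rows shown earlier.
lats = [57.26, 59.90, 57.33, 57.86, 56.77]
lons = [-165.75, -172.17, -166.32, -168.89, -164.33]

lat_range = padded_range(lats, lower=-90.0, upper=90.0)
lon_range = padded_range(lons)
print(lat_range, lon_range)
```

These two lists could be passed as the `range` values for `lataxis` and `lonaxis` in `fig.update_layout`, and the same call works on the `Latitude` and `Longitude` columns of the DataFrame.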

## Now what if I want to share this with someone else?

- If your data is publicly available, you could just invite them to the Colab project (like a shared Google Doc)
- If you just want them to see the results, you could host it on GitHub and link to that
- You can also download the raw Python or the notebook and share it that way