# Python `1-liners` to start you on your data exploration journey

<div class = "alert alert-info" role = "alert" 
     style = "font-size: 1.7em; padding: 15px; margin: 10px 0; text-align: center; background-color: #d9edf7; 
    border-color: #bce8f1; color: #31708f; border-radius: 8px;">
    In this tutorial, we kick off exploratory data analysis using 5 simple but essential 1-liners.

</div>

### First key step: Import the Python libraries you'll need (basic setup)

In [None]:
# Data Handling and Manipulation
import numpy as np
import pandas as pd

# Data Visualisation and Plots
import plotly.express as px

# Suppress warnings (not errors)
import warnings
warnings.filterwarnings('ignore')

### Load data using pandas' `read_csv` function

In [None]:
# Let's give our input data file a short name so we can use it more easily later:
data_file = 'P2_TestData_CMEMS.csv'

In [None]:
# Import our named text file (data_file) as a dataframe (df)
# comment='#' tells pandas to ignore everything after a '#', so commented header lines are not read as data
df = pd.read_csv(data_file, comment='#')

# Look inside your dataframe using 'print'
print(df)
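
To see what `comment='#'` does without needing the real file, here is a minimal sketch that reads a small in-memory CSV. The file contents and values below are invented for illustration; only the column names `Date`, `SST` and `SSH` come from this tutorial, and `sample_text` and `sample_df` are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical file contents: the header comments and values are made up
sample_text = """# Example header line (skipped by pandas)
# Another commented line (also skipped)
Date,SST,SSH
2020-01-01,18.2,0.31
2020-01-02,18.4,0.33
2020-01-03,18.1,0.30
"""

# io.StringIO lets read_csv treat a string as if it were a file
sample_df = pd.read_csv(io.StringIO(sample_text), comment='#')

# The two commented lines are ignored, so the first real line becomes the header
print(sample_df.shape)
print(list(sample_df.columns))
```

If the commented lines were not skipped, pandas would try to use the first `#` line as the column header, which is usually not what you want.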

### Exploring your data using 1-liners

<div class="alert alert-info" role="alert" 
     style="font-size: 1.1em; padding: 10px; margin: 10px 0; text-align: left;">

     That looks good! Now you are going to use 5 brilliant 1-liners to explore your data more fully.
    
       1) df.info() 
       2) df.describe()
       3) df.value_counts()
       4) df.corr()
       5) px.line & px.scatter (with an OLS trendline)
</div>

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df[['SST']].describe()

In [None]:
df[['SST']].value_counts()
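
For a continuous variable such as `SST`, most values occur only once, so exact-value counts can be hard to read. One option, sketched here on invented numbers, is the `bins` argument of `Series.value_counts`, which groups values into equal-width intervals first. The series `sst` and its values are assumptions made for illustration, not taken from the data file.

```python
import pandas as pd

# Invented SST-like values (degrees C), for illustration only
sst = pd.Series([18.2, 18.4, 18.1, 19.0, 19.3, 20.1, 20.4, 18.2], name='SST')

# Exact-value counts: sorted from most to least frequent
exact_counts = sst.value_counts()

# Binned counts: split the range into 3 equal-width intervals,
# and keep the intervals in ascending order rather than sorting by count
binned_counts = sst.value_counts(bins=3, sort=False)

print(exact_counts.head())
print(binned_counts)
```

Note that `bins` is available on a Series (`df['SST']`), not on a dataframe selection such as `df[['SST']]`.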

In [None]:
df[['SSH', 'SST', 'SSS', 'MLD']].corr()

In [None]:
df[['SSH', 'SST', 'SSS', 'MLD']].corr().round(2)
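
To get a feel for how to read the correlation table, here is a small sketch on synthetic data. By default `corr()` returns Pearson correlation coefficients between -1 and 1. The relationships below (`SSH` rising with `SST`, `MLD` falling with `SST`) are invented purely to produce one strong positive and one strong negative value. They are not claims about the real dataset, and `demo_df` is a hypothetical name.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable

# Synthetic columns: names match the tutorial, values are invented
sst = np.linspace(15.0, 25.0, 50)
demo_df = pd.DataFrame({
    'SST': sst,
    'SSH': 0.02 * sst + rng.normal(0.0, 0.005, size=sst.size),  # rises with SST
    'MLD': -3.0 * sst + rng.normal(0.0, 2.0, size=sst.size),    # falls with SST
})

corr_table = demo_df.corr().round(2)
print(corr_table)
```

The diagonal is always 1 (each column correlates perfectly with itself), and the table is symmetric, so you only need to read one triangle of it.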

### Step 5: Line and scatter plots of `SST`, plus an `SST` vs `SSH` trendline

In [None]:
# Line plot
px.line(df, x = 'Date', y = 'SST')

In [None]:
# Scatter plot
px.scatter(df, x = 'Date', y = 'SST')

In [None]:
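# Scatter plot with an ordinary least squares (OLS) trendline
# Note: trendline = 'ols' requires the statsmodels package to be installed
px.scatter(df, x = 'SST', y = 'SSH', 
           trendline = 'ols', trendline_color_override = 'navy')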

<div class="alert alert-info" role="alert" 
     style="font-size: 1.1em; padding: 10px; margin: 10px 0; text-align: center;">
 
     Neat trick: Hover over the OLS trendline to see the linear equation (`y = mx + c`) as well as the `R²` value.

</div>
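
If you want to cross-check the hover values, the slope `m`, intercept `c` and `R²` of a straight-line fit can also be computed with NumPy. This is a minimal sketch on synthetic data, not the method Plotly uses internally (Plotly relies on statsmodels). The `sst` and `ssh` arrays are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic data: SSH built as a noisy straight line in SST (values invented)
sst = np.linspace(15.0, 25.0, 50)
ssh = 0.02 * sst + 0.1 + rng.normal(0.0, 0.005, size=sst.size)

# Least-squares straight-line fit: returns [slope, intercept] for deg=1
slope, intercept = np.polyfit(sst, ssh, deg=1)

# R² = 1 - (residual sum of squares / total sum of squares)
predicted = slope * sst + intercept
ss_res = np.sum((ssh - predicted) ** 2)
ss_tot = np.sum((ssh - ssh.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"SSH = {slope:.4f} * SST + {intercept:.4f}, R² = {r_squared:.3f}")
```

For a straight-line fit with one predictor, `R²` equals the square of the Pearson correlation that `df.corr()` reports for the same two columns.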