# Jupyter Notebook Demo @DELix 2020, Instituto Superior Técnico (12.2.2020)

Jacinto Estima
- https://jestima.github.io/
- https://github.com/jestima

URL: https://jupyter.org/

__Purpose:__ Project Jupyter exists to develop open-source software, open standards, and services for interactive computing across dozens of programming languages.

Notebooks are Web documents that can mix together:
- Live code
- Text
- Formulas
- Among others

Notebooks contain 2 types of cells:
- Code cell:
    - Several languages supported (__Python__, R, Julia, among many others)
- Markdown cell:
    - Markdown language
    - HTML
    - LaTeX (including math formulas)

Notebooks have the extension __*.ipynb__
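
Under the hood, an `.ipynb` file is a JSON document whose `cells` list records the type and source of each cell. The sketch below builds a tiny, made-up notebook in memory (the cell contents are illustrative, not taken from this demo) and counts its cells by type using only the standard library.

```python
import json
from collections import Counter

# A minimal, hypothetical notebook in the nbformat 4 JSON layout.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('Hello Jupyter!')"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["3+3"]},
    ],
})

# For a real file, the JSON text would come from reading the .ipynb from disk.
notebook = json.loads(notebook_json)
cell_counts = Counter(cell["cell_type"] for cell in notebook["cells"])
print(dict(cell_counts))
```

Because the format is plain JSON, notebooks can be inspected or processed with ordinary tools, although diffs of executed notebooks tend to be noisy since outputs are stored alongside the code.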

## Let's start with something very basic

To run the code inside a cell, either click __*Run*__ in the top menu or press __*Shift + Enter*__.

In [None]:
print('Hello Jupyter!')

In [None]:
3+3

In [None]:
a = 3

We can incorporate formulas with Markdown and LaTeX

\begin{align}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{align}

$$\begin{eqnarray}
x' &=& &x \sin\phi &+& z \cos\phi \\
z' &=& - &x \cos\phi &+& z \sin\phi \\
\end{eqnarray}$$

We can create interactive charts

In [None]:
from ipywidgets import interact
import numpy as np
import matplotlib.pyplot as plt

In [None]:
def func_plot(k=1):
    x = np.linspace(-2, 2, 200)
    plt.plot(x, np.sin(2*np.pi*k*x))

In [None]:
interact(func_plot, k=(0.5,10))
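
The tuple `(0.5, 10)` is a shorthand that ipywidgets turns into a float slider running from the first value to the second, and the slider value is passed to `func_plot` as `k`. To see what `k` does without any widget, the computation can be separated from the drawing. The helper `sine_wave` below is a hypothetical name introduced only for this sketch.

```python
import numpy as np

def sine_wave(k=1.0, n_points=200):
    """Return x in [-2, 2] and y = sin(2*pi*k*x), the curve drawn by func_plot."""
    x = np.linspace(-2, 2, n_points)
    y = np.sin(2 * np.pi * k * x)
    return x, y

# k acts as a frequency: the period of the wave is 1/k,
# so larger values of k pack more oscillations into the same interval.
x, y = sine_wave(k=0.5)
print(x.shape, y.min(), y.max())
```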

We can incorporate videos from YouTube

In [None]:
webscale_id = 'HW29067qVWk'
from IPython.display import YouTubeVideo
YouTubeVideo(webscale_id)

# Let's now work with a dataset on local accommodation (alojamento local) in Portugal

We will use Pandas, a library for data manipulation and analysis, to read a CSV and analyze the data.

1. Let's import Pandas

In [None]:
import pandas as pd

2. Now we need to read the CSV file into a dataframe, which is a Pandas data structure much like a table

In [None]:
alojamentos = pd.read_csv('https://raw.githubusercontent.com/jestima/jupyter-notebook-demo/master/PORDATA_Alojamentos.csv', sep=';', encoding = "cp1252")
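
The `sep` and `encoding` arguments matter here: the file uses semicolons instead of commas, and its accented column names are stored in a Windows code page rather than UTF-8. A minimal sketch of the same call on an in-memory file, with made-up rows that only mimic the layout:

```python
import io
import pandas as pd

# Hypothetical content, encoded as cp1252 to mimic the real file.
raw_bytes = "Anos;Hotéis;Pensões\n2015;10;5\n2016;12;4\n".encode("cp1252")

sample = pd.read_csv(io.BytesIO(raw_bytes), sep=";", encoding="cp1252")
print(list(sample.columns))
print(sample["Hotéis"].sum())
```

With the default `sep=','` the whole header would be read as a single column, and with the wrong encoding the accented names would either fail to decode or come out garbled.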

3. Let's explore the dataframe

In [None]:
alojamentos

In [None]:
alojamentos.head()

In [None]:
alojamentos.describe()
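
As a rough guide to what these calls return: `head()` gives the first rows (five by default) and `describe()` gives summary statistics for each numeric column. A small sketch on a hypothetical dataframe, with values invented only to show the shape of the results:

```python
import pandas as pd

# Hypothetical values, not taken from the PORDATA file.
sample = pd.DataFrame({
    "Anos": [2014, 2015, 2016, 2017],
    "Total": [100, 120, 140, 160],
})

first_rows = sample.head(2)   # first 2 rows; head() without arguments returns 5
summary = sample.describe()   # count, mean, std, min, quartiles and max per numeric column

print(first_rows)
print(summary.loc[["count", "mean", "min", "max"], "Total"])
```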

4. Now let's draw some plots

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.plot(alojamentos.Anos, alojamentos.Total)

In [None]:
plt.plot(alojamentos.Anos, alojamentos['Hotéis'])
plt.plot(alojamentos.Anos, alojamentos['Pensões'])
plt.plot(alojamentos.Anos, alojamentos['Alojamento Local'])

## Let's work on a new dataset representing the expenses of Portuguese families by category between 1995 and 2017

1. Since we have already imported Pandas, we can directly read the CSV file into a dataframe

In [None]:
df = pd.read_csv("../datasets/despesa-familias.csv", sep = ';', encoding='latin1')

2. Let's print the dataframe to have a look at the data

In [None]:
df

In [None]:
df.head()

In [None]:
df.describe()

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.plot(df.Anos, df.Alimentacao_bebidas_tabaco, label='Alimentação, bebidas e tabaco')
plt.plot(df.Anos, df.Vestuario_calcado, label='Vestuário e Calçado')
plt.legend(shadow=True, fancybox=True)

In [None]:
# Keep only the category columns: drop the year and the overall total
data = df.drop(['Anos', 'Total'], axis=1)
corrs = data.corr()
corrs.round(2)
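
By default, `DataFrame.corr()` computes the Pearson correlation coefficient for every pair of columns: the covariance of the two columns divided by the product of their standard deviations. The sketch below checks one pair by hand on hypothetical figures; the column names are reused from the dataset only for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical spending figures, invented for the example.
sample = pd.DataFrame({
    "Anos": [2015, 2016, 2017, 2018],
    "Total": [30.0, 36.0, 41.0, 50.0],
    "Saude": [10.0, 12.0, 13.0, 17.0],
    "Educacao": [20.0, 24.0, 28.0, 33.0],
})

# Drop both non-category columns in a single call.
categories = sample.drop(columns=["Anos", "Total"])

# Pearson's r by hand: sum of co-deviations over the root of the
# product of the summed squared deviations.
x = categories["Saude"].to_numpy()
y = categories["Educacao"].to_numpy()
r_manual = ((x - x.mean()) * (y - y.mean())).sum() / np.sqrt(
    ((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum()
)

r_pandas = categories.corr().loc["Saude", "Educacao"]
print(round(r_manual, 4), round(r_pandas, 4))
```

A value close to 1 means the two categories tend to rise and fall together over the years. With time series that all grow over time, high correlations are common and do not by themselves imply a causal link.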

In [None]:
import seaborn as sns

In [None]:
plt.figure(figsize=(15,10))
sns.heatmap(corrs.round(2), vmin=0.5, vmax=1, annot=True, cmap=sns.color_palette("Reds"))

In [None]:
sns.regplot(x=df.Mobiliario_artigos_decoracao_equipamento_domestico_manutencao_habitacao,
            y=df.Habitacao_agua_eletricidade_gas_outros_combustiveis)

In [None]:
sns.regplot(x=df.Saude, y=df.Educacao)

In [None]:
import squarify

In [None]:
df[df.Anos == 2002]

In [None]:
volume = df[df.Anos == 2017].drop(['Anos','Total'], axis=1)

labels = volume.columns

volume = volume.iloc[0, :]


plt.figure(figsize=(15,10))
squarify.plot(sizes=volume, label=labels, alpha=0.7)
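
In a treemap, the area of each rectangle is proportional to its value, so the plot shows how one year's spending splits across categories. The sketch below reproduces the data preparation on hypothetical figures and builds labels that include each category's share; the resulting `labels` list could be passed as `label=` in the `squarify.plot` call.

```python
import pandas as pd

# Hypothetical figures; the real dataset has more categories.
sample = pd.DataFrame({
    "Anos": [2016, 2017],
    "Total": [100.0, 200.0],
    "Saude": [40.0, 50.0],
    "Educacao": [60.0, 150.0],
})

# Select the single row for one year and keep only the category columns.
row = sample[sample.Anos == 2017].drop(columns=["Anos", "Total"]).iloc[0]

# Share of each category in that year's spending.
shares = row / row.sum()
labels = [f"{name}\n{share:.0%}" for name, share in shares.items()]
print(labels)
```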

# This notebook is available online on my GitHub account

https://github.com/jestima/jupyter-notebook-demo

######################################################################################################

# New stuff

### Still under testing (do not use)

In [None]:
import ipywidgets as widgets
from ipywidgets import interact, interact_manual

In [None]:
@interact
def show_articles_more_than(column='Total', x=5000):
    return df.loc[df[column] > x]

## Trying new stuff with more interactivity

In [None]:
from plotly.offline import plot as py
import cufflinks as cf
import pandas as pd
import numpy as np
print(cf.__version__)

In [None]:
@interact
def scatter_plot(x=list(df.select_dtypes('float64').columns), 
                 y=list(df.select_dtypes('float64').columns)[1:]):
    
    
    df.iplot(kind='scatter', x=x, y=y, mode='markers')


In [None]:
from plotly.offline import plot
import plotly.graph_objects as go

import numpy as np

s = np.linspace(0, 2 * np.pi, 240)
t = np.linspace(0, np.pi, 240)
tGrid, sGrid = np.meshgrid(s, t)

r = 2 + np.sin(7 * sGrid + 5 * tGrid)  # r = 2 + sin(7s+5t)
x = r * np.cos(sGrid) * np.sin(tGrid)  # x = r*cos(s)*sin(t)
y = r * np.sin(sGrid) * np.sin(tGrid)  # y = r*sin(s)*sin(t)
z = r * np.cos(tGrid)                  # z = r*cos(t)

surface = go.Surface(x=x, y=y, z=z)
data = [surface]

layout = go.Layout(
    title='Parametric Plot',
    scene=dict(
        xaxis=dict(
            gridcolor='rgb(255, 255, 255)',
            zerolinecolor='rgb(255, 255, 255)',
            showbackground=True,
            backgroundcolor='rgb(230, 230,230)'
        ),
        yaxis=dict(
            gridcolor='rgb(255, 255, 255)',
            zerolinecolor='rgb(255, 255, 255)',
            showbackground=True,
            backgroundcolor='rgb(230, 230,230)'
        ),
        zaxis=dict(
            gridcolor='rgb(255, 255, 255)',
            zerolinecolor='rgb(255, 255, 255)',
            showbackground=True,
            backgroundcolor='rgb(230, 230,230)'
        )
    )
)

fig = go.Figure(data=data, layout=layout)
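# `py` is the plotly.offline.plot function and has no `iplot` attribute,
# so render the figure inline with the figure's own show() method.
fig.show()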
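
One detail of the cell above is worth checking: `np.meshgrid(s, t)` returns the grid for its first argument first, so the array named `tGrid` actually holds the values of `s`, and `sGrid` holds the values of `t`. The surface still renders, but the names are swapped relative to the comments. A small sketch with deliberately tiny arrays makes the shapes and ordering visible; the names `first` and `second` are introduced only for this example.

```python
import numpy as np

s = np.linspace(0, 2 * np.pi, 4)   # 4 samples, small on purpose
t = np.linspace(0, np.pi, 3)       # 3 samples
first, second = np.meshgrid(s, t)

# With the default indexing, both grids have shape (len(t), len(s)).
print(first.shape, second.shape)

# The first grid repeats s along every row; the second repeats t down every column.
print(np.allclose(first[0], s), np.allclose(second[:, 0], t))
```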