*Unfortunately, interactive plotly charts do not render on naucse.python.cz. To view the notebook including the charts, you can use nbviewer: https://nbviewer.jupyter.org/github/coobas/pydataladies-dashboard/blob/main/notebooks/dashboardy-1.ipynb or run the notebook locally. All the files are in the repository https://github.com/coobas/pydataladies-dashboard.*

# Interactive visualizations and applications
When working with data, interactivity is useful in many situations. In visualizations, it helps to zoom in and out, select a sub-area, show the plotted values, etc. Or we may want to make the results of our great analysis available to "non-technical" colleagues or friends who can't (yet) run a Jupyter notebook.
Here we will show how to handle such tasks with two tools: [plotly](https://plotly.com/python/), or more specifically [plotly express](https://plotly.com/python/plotly-express/), and [streamlit](https://www.streamlit.io/).
Other tools provide similar options; a detailed overview is at https://pyviz.org/tools.html. For interactive visualizations, these are mainly [holoviews](http://github.com/pyviz/holoviews) and [altair](http://github.com/altair-viz/altair). For "dashboarding", there are [dash](http://github.com/plotly/dash), [panel](https://panel.holoviz.org/), [voila](http://github.com/QuantStack/voila) and [justpy](https://justpy.io).
Each of these tools has, as usual, its advantages and disadvantages. The most widely used is [Dash](http://github.com/plotly/dash), from the makers of plotly, who also provide enterprise solutions for running applications. Dash is definitely a good choice, as you can see in the [talk from the Prague PyData Meetup](https://www.youtube.com/watch?v=dewrzMPPLDU). Panel (and also Voila) differs from Dash in that it can be used inside a Jupyter notebook, and the notebook can then serve directly as an application. The biggest advantage of `voila` is how simply it turns a notebook into a dashboard: see the [documentation](https://voila.readthedocs.io/en/stable/using.html).
The two biggest advantages of Streamlit are the speed (simplicity) of application development and an attractive default appearance.
A few articles and talks related to the topic:

* [Going beyond Jupyter notebooks](https://www.intelligencerefinery.io/post/making-python-apps/)
* [How to Build a Reporting Dashboard using Dash and Plotly](https://towardsdatascience.com/how-to-build-a-complex-reporting-dashboard-using-dash-and-plotl-4f4257c18a7f)
* [Turn any Notebook into a Deployable Dashboard | SciPy 2019 | James Bednar](https://www.youtube.com/watch?v=L91rd1D6XTA)


## Pros and cons
However, all the approaches mentioned have significant disadvantages and limits, and they are not suitable for large and complex applications. Interaction options in the application are limited and can also be slow. Robust scaling to many users (high traffic) is generally more complicated. So when will we primarily use what we show here?

* For a small application with a limited number of users (a dashboard for colleagues).
* For rapid prototyping.

And what if we want to build a large (web) application?

* We will get a development team to build a beautiful and fast "front-end" in modern JavaScript tools such as React or Vue.js, while we build a "back-end" in Python that communicates with the front-end, for example through a JSON API. We'll see that in our API lesson.
* If we don't have such a team, we will learn to program in JavaScript ... No, we prefer to type in JavaScript ...
* ... we'd better hire the development team :-)

## Install and import graphics libraries
If you do not have the plotly library installed, uncomment and run the line below.

In [None]:
# install plotly
# %pip install plotly

The abbreviation `px` is commonly used for plotly express, and we will use it too.

In [None]:
import plotly.express as px

## Interactive data visualization
Let's take a closer look at the data we worked with in the previous machine learning lessons.

### Fish measurements
Let's start with the fish dimensions that we used to demonstrate regression and classification. It is definitely worth looking at the data first. (We probably won't be able to draw what the fish look like, the data isn't enough for that :)

In [None]:
import pandas as pd

In [None]:
fish_data = pd.read_csv("fish_data.csv", index_col=0)

Instead of the usual table of numbers, we will try to plot the data straight away. We already know that the data has quite a few columns. We can render them all at once using `scatter_matrix`.

In [None]:
px.scatter_matrix(fish_data)

This is not bad at all. We can already see a categorical variable, Species, several continuous variables with dimensions, and an irrelevant ID. We can also see that some variables are strongly correlated.
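The correlations visible in the scatter matrix can also be checked numerically. The following is a minimal sketch, assuming a small made-up table that only borrows a few column names from the fish data (the values are illustrative and are not taken from `fish_data.csv`):

```python
import pandas as pd

# Made-up stand-in for fish_data; only the column names follow the lesson.
sample = pd.DataFrame(
    {
        "Species": ["Bream", "Bream", "Roach", "Roach", "Pike"],
        "Weight": [240.0, 290.0, 40.0, 70.0, 200.0],
        "Length1": [23.0, 24.0, 13.0, 16.0, 30.0],
        "Length2": [25.0, 26.5, 14.0, 17.5, 32.5],
    }
)

# Keep only the numeric columns, then compute pairwise Pearson correlations.
correlations = sample.select_dtypes(include="number").corr()
print(correlations.round(2))
```

With the real data, replacing `sample` by `fish_data` should give the same kind of table; values close to 1 correspond to the narrow diagonal clouds in the scatter matrix.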
We could already use the interactive elements: try moving the cursor over the points in the chart, or use the tools in the upper right corner to change the scale or select the data to display. But it will be even better to improve the chart a bit: drop the ID and mark the species with a color.

In [None]:
px.scatter_matrix(
    fish_data,
    dimensions=["Weight", "Length1", "Length2", "Length3", "Height", "Width"],
    color="Species",
    opacity=1,
    hover_data=["Species"],
)

In addition to the colors, a legend has been added on the right. And an interactive legend at that! A single click hides or shows individual categories, and a double click shows only one category. Try it! Selecting data with Box Select or Lasso Select can also be useful.

**Task:** Use the Weight column for the color, make the symbols partially transparent with the `opacity` argument (range 0 to 1), and show all columns in the tooltip displayed when the cursor moves over a point (the `hover_data` argument will help).

If we then want to look at the statistical properties of one particular variable (column), we can use one of the functions that display its distribution or some of its properties (moments).
We can start with the commonly used box plot. The bonus of plotly lies mainly in the interactive display of numerical values: medians and quantiles, as well as the identification of (probable) outliers.

In [None]:
px.box(fish_data, x="Species", color="Species", y="Height", points="all", notched=False)

In [None]:
px.histogram(fish_data, color="Species", x="Height", opacity=0.5)
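The numbers that the box plot shows on hover can be approximated by hand. This is a minimal sketch on made-up heights (not values from the fish data), using the common 1.5 × IQR rule to flag possible outliers. Plotly may compute quartiles with a slightly different method than pandas, so the exact values can differ a little.

```python
import pandas as pd

# Made-up heights for one species; the last value is deliberately extreme.
heights = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0])

# Quartiles and median, using pandas' default linear interpolation.
q1, median, q3 = heights.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Points farther than 1.5 * IQR from the quartiles are flagged as suspicious.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = heights[(heights < lower_fence) | (heights > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}")
print("possible outliers:", outliers.tolist())
```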

To display the relationship between two variables, it can also be useful to look at the density of points using contours. In this chart, we can also display the so-called marginal distributions: the distribution of each variable on its own, regardless of the value of the other.

In [None]:
px.density_contour(
    fish_data,
    color="Species",
    x="Height",
    y="Length3",
    marginal_x="histogram",
    marginal_y="box",
)

**Task:** Try displaying other quantities (other columns) in the charts than just Height and Length3. Try changing the type of the marginal plots.

## What's next?

### Report for the boss
We also have colleagues who do not (yet) use Python and who would still appreciate a report with such beautiful, interactive visualizations instead of a static one. For this purpose, exporting the notebook to HTML using `nbconvert` is useful.
`nbconvert` is run from the command line with the `jupyter nbconvert` command. To export to HTML, add `--to html`, and don't forget to specify which notebook (i.e. file) to convert.

In [None]:
# Uncommenting runs the command in the command line (thanks to the exclamation mark)
# Your file may be named differently than dashboardy-1, so use the current file name
# !jupyter nbconvert dashboardy-1.ipynb --to html

We can also export only the outputs and "hide" the source code using `--no-input`:

In [None]:
# !jupyter nbconvert dashboardy-1.ipynb --to html --no-input

### Analysis of new data
Everyone liked our visualizations, and since more data needed to be analyzed, the task fell to us. This time it's not about fish, it's about penguins.

**Task:** Pick the chart you like best and use the penguin data instead of the fish data.

In [None]:
penguins = pd.read_csv("penguins_size_nona.csv")

## Creating an application
A very common pattern has emerged from our work in the notebook: similar visualizations and analyses, in which only the data and a few key parameters change. This is an opportunity to create an application that makes this possible for us and for a circle of initiated users.

Let's define our simple application:

1. Load data from a CSV file.
2. Draw a scatter matrix where we can choose the dimensions, the column for color, and the opacity.
3. For a selected column, display its distribution using a histogram, box plot, or violin plot.

### Preparing in a notebook
Let's sketch it out here in the notebook. First we prepare the user inputs.

In [None]:
# input 1: data set selection
data_file_path = "penguins_size_nona.csv"
# input 2: scatter matrix parameter selection
dimensions = ['culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g']
color = "sex"
opacity = 0.5
# column selection to display the data distribution
interesting_column = "body_mass_g"
# function selection to display the distribution
dist_plot = px.violin

And this is our application: we use the same functions and parameters as at the beginning of our work with plotly, only parameterized by the inputs from the previous cell.

In [None]:
# retrieve data
data = pd.read_csv(data_file_path)

In [None]:
# scatter matrix plot
px.scatter_matrix(data, dimensions=dimensions, color=color, opacity=opacity)

In [None]:
# display of the distribution
dist_plot(data, x=interesting_column, color=color)

And now let's turn it into an interactive web application! We will not do this here in the notebook, but in a "regular" .py file with Python code.
We have the application ready in the file `app.py`; here in the notebook we can view the file:

In [None]:
%cat ../app.py

The key change is that we have rewritten the user inputs from the form `variable = value` to the form `variable = value_widget(...)`. Here is how a streamlit application is created:

* We write the application essentially as a linear script (of course we can structure the source code into functions / modules / classes as we see fit, but streamlit will always run the application step by step, like a script).
* We read user inputs from the return values of the `st.some_widget` functions; Streamlit makes sure that the widget works properly and that the return value is always the current one.
* The application elements (outputs) are displayed to the user with `st.write`.

*Widgets* are auxiliary "things" used in graphical user interfaces (GUIs): tools for selecting options, setting variable values, entering text or a date, etc.

### Starting
You probably don't have Streamlit installed yet. It is installed in the usual way via pip:

    pip install streamlit

or, if you use conda:

    conda install -c conda-forge streamlit

You then start the application on your computer with the `streamlit run` command followed by the name of the application file. In our case:

    streamlit run app.py

If everything is OK, something like this will appear:
```
  You can now view your Streamlit app in your browser.

  Network URL: http://192.168.2.103:8800
  External URL: http://85.207.123.46:8800
```

Follow the instructions and open the link (the first one) in a browser. Most likely, our freshly created data visualization application will appear.

## Publishing on the Internet
In principle, we could run the application on our own computer so that other users can use it. This would be easy on an internal network (home, work), although security settings could prevent it on a work network and work computer; access from the external Internet would be more complicated.
Fortunately, we are not the only ones in this situation :) So there are more or less complex and sophisticated ways to run the application on a server (in the cloud) and make it accessible from the Internet. We'll show how it works on [Heroku](https://heroku.com). Similar services are offered by AWS (Elastic Beanstalk), Google App Engine, or Dokku on Digital Ocean. The advantage of Heroku is its simplicity and the option of free services; the disadvantages are the steeply rising price and limited options.
There is also the option to publish the application using [Streamlit Sharing](https://blog.streamlit.io/introducing-streamlit-sharing/). This service offers even simpler publishing from public GitHub repositories, but does not have as many options as, for example, Heroku.
We will basically follow the guide [Getting started with Python](https://devcenter.heroku.com/articles/getting-started-with-python).

### Heroku Registration and Client
Before we begin, we need to:

1. Create a free account at https://signup.heroku.com/signup/dc.
2. Install Git:
   * Windows: https://gitforwindows.org/
   * Mac OS: https://sourceforge.net/projects/git-osx-installer/files/ (or other options described at https://www.atlassian.com/git/tutorials/install-git).
   * Linux: It will probably be there already, or you know how to install it :)
3. Do at least the basic configuration of Git. In the command line (*fill in your name and email*):
```
git config --global user.name "My Name"
git config --global user.email "muj@email.com"
```
4. Install the Heroku client:
   * Follow https://devcenter.heroku.com/articles/heroku-cli

We verify the installation from the command line:

1. Git:
```
git config --list
```
Something like this should appear:
```
user.name=My Name
user.email=muj@email.com
```

2. Heroku:
```
heroku --version
```
The output should look roughly like this:
```
heroku/7.39.2 darwin-x64 node-v12.13.0
```

### Go-live

We will now work on the command line for a while. First, create a new folder for the published application and copy the files `app.py`, `requirements.txt`, `runtime.txt` and `Procfile` into it.
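If the three configuration files are not at hand, they can also be generated with a short script. This is only a sketch: the folder name `dashboard-app` is a made-up example, and the file contents mirror the ones listed in the steps that follow (note the double dash in `--server.port`). The application file `app.py` still has to be copied into the folder separately.

```python
from pathlib import Path

# Hypothetical folder name for the published application.
app_dir = Path("dashboard-app")
app_dir.mkdir(exist_ok=True)

deployment_files = {
    # Tells Heroku which command starts the web process.
    "Procfile": "web: streamlit run --server.headless 1 --server.port $PORT app.py\n",
    # Python version Heroku should install.
    "runtime.txt": "python-3.8.10\n",
    # Packages installed before the application starts.
    "requirements.txt": "pandas==1.2.5\nstreamlit==1.2.0\nplotly==4.14.3\n",
}

for name, content in deployment_files.items():
    (app_dir / name).write_text(content, encoding="utf-8")

print(sorted(path.name for path in app_dir.iterdir()))
```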
Now, in the command line, with the newly created folder (the one you copied the files into) as the current folder:
1. Create a Git repository:
```
git init
```

2. Heroku needs a file called `Procfile`. It says very simply what we actually want to run: the `streamlit run` command (just as we ran the application locally) with a few extra switches. The `Procfile` must contain this line:
```
web: streamlit run --server.headless 1 --server.port $PORT app.py
```

3. It is definitely a good idea to specify the Python version, otherwise the default is used (3.6 at the time of writing). In the `runtime.txt` file, the line `python-3.8.10` is enough.
3. The next file we need is `requirements.txt`. You may have heard of it: it contains a list of the Python packages that the project needs. Heroku uses this file to install everything needed before running the application. For our dashboard, `requirements.txt` must read:
```
pandas==1.2.5
streamlit==1.2.0
plotly==4.14.3
```

4. The files are ready; now we need to add them to the Git repository. To do this, use the following two commands:
```
git add app.py Procfile runtime.txt requirements.txt
git commit -v -m "first dashboard version"
```

5. And now we can publish. First, log in using
```
heroku login
```
(Follow the instructions: a login page will open in your web browser, prompting you to enter your login information.)
6. Then create a Heroku application using
```
heroku create
```

7. Finally, launch the application on the Internet using
```
git push heroku main
```



If all goes well, this step will take a while. Heroku will gradually list what it is doing for us:
```
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 4 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 940.72 KiB | 2.40 MiB/s, done.
Total 4 (delta 2), reused 0 (delta 0)
remote: Compressing source files... done.
remote: Building source:
remote:
remote: -----> Python app detected
remote: -----> Installing python-3.8.6
remote: -----> Installing pip 20.1.1, setuptools 47.1.1 and wheel 0.34.2
remote: -----> Installing SQLite3
remote: -----> Installing requirements with pip
remote:        Collecting pandas==1.1.4
remote:          Downloading pandas-1.1.4-cp36-cp36m-manylinux1_x86_64.whl (9.5 MB)
remote:        Collecting streamlit==0.71.0
remote:          Downloading streamlit-0.71.0-py2.py3-none-any.whl (7.4 MB)
remote:        Collecting plotly==4.13.0
remote:          Downloading plotly-4.13.0-py2.py3-none-any.whl (13.1 MB)
...
```

Eventually, something like this should appear:
```
remote: -----> Launching...
remote:        Released v6
remote:        https://cryptic-ocean-20431.herokuapp.com/ deployed to Heroku
remote:
remote: Verifying deploy... done.
To https://git.heroku.com/cryptic-ocean-20431.git
```


The link below the "Released v6" line is important. This is the address where our dashboard is accessible "from anywhere on the Internet". Either copy it into the browser or use the `heroku open` command.

**Task 1:** Send a link to your running application :)

**Task 2:** Use `st.title` to add an application title. Try it locally, then commit the change to Git and update the application on Heroku (using `git push heroku main`).