Welcome to HoloViews!

This 'Getting Started' guide aims to get you using HoloViews productively as quickly as possible. It is designed as an entrypoint for new users that will introduce the core concepts necessary to get you working productively with your own data. We recommend reading this guide in order if you wish to get an overview of what is offered by HoloViews. For detailed documentation, please consult our [User Guide] which we will link to from the appropriate sections of this guide.

I-Introduction </br>
II-Customization </br>
III-Datasets </br>
IV-Live_Data </br>
V-Pipelines </br>
VI-Principles </br>


# Why HoloViews?

HoloViews is a BSD-licensed package for data analysis and visualization in Python 2 and 3. There are already plenty of excellent tools available for Python, such as numpy, pandas and xarray for raw data processing, as well as excellent plotting libraries such as matplotlib, bokeh and plotly. So why is there a need for another plotting library?

As will be made clear over the course of this guide, HoloViews takes an approach to visualization that is quite distinct from the traditional plotting paradigm. Instead of building plots and writing plotting code, you describe your data with a small amount of semantic information. This then enables immediate, automatic visualization that can be effortlessly requested at any time as your data evolves. Without requiring any traditional plotting code, HoloViews brings your data to life with your favorite plotting library, whether it is matplotlib, bokeh or plotly.

HoloViews is data-centric while traditional plotting libraries are visualization-centric. Instead of a 'Data Visualization' library that visualizes data at a snapshot in time, HoloViews is about giving your data the power of self-visualization as it is explored and transformed. For this reason, HoloViews can be viewed as a 'Visualization Data' library and by the end of this guide this distinction should be clear.


## Tabulated data: subway stations

We will now introduce HoloViews and demonstrate some of its most compelling features using real data relating to transportation in Manhattan, New York. First let's run some imports to make [numpy] and [pandas] accessible for loading our transportation data. We will start with a table of subway station information loaded from a CSV file with pandas, and later in this section we will load some data relating to taxi dropoffs using numpy:

In [None]:
import pandas as pd
import numpy as np
import holoviews as hv
hv.extension('bokeh')

This is the standard way to make the numpy and pandas libraries available in the namespace. We recommend always importing HoloViews as ``hv``. If you haven't already installed HoloViews, check out our [installation page].

Note that after importing HoloViews as ``hv`` we run ``hv.extension('bokeh')`` to load the bokeh plotting extension, allowing us to generate visualizations with [Bokeh]. In the next section we will see how you can use other plotting libraries such as [matplotlib] and even how you can mix and match between them.

Now let's load our subway data using pandas:

In [None]:
station_info = pd.read_csv('../assets/station_info.csv')
station_info.head()

We see that this table contains the subway station name, its latitude and longitude, the year it was opened, the number of services available from the station and their names, and finally the yearly ridership (in millions for 2015).

## ``Elements`` of visualization

We can immediately visualize some of the data in this table as a scatter plot. Let's view how ridership varies with the number of services offered at each station:

In [None]:
scatter = hv.Scatter(station_info, kdims=['services'], vdims=['ridership'])
scatter

Here we passed our dataframe to ``hv.Scatter`` to create an *object* called *scatter*. This object is independent of any plotting library and is a simple wrapper around our dataframe that knows that the 'services' column is to be plotted along the x-axis and the 'ridership' column is to be plotted along the y-axis. These are our *dimensions*, which we will describe in more detail a little later.

Given that we have the handle ``scatter`` on our ``Scatter`` object, we can show that it is indeed an object and not a plot by printing it:

In [None]:
print(scatter)

The bokeh plot above is simply the rich, visual representation of ``scatter`` which is plotted automatically by HoloViews and displayed automatically in [the Jupyter notebook]. Although HoloViews itself is independent of notebooks, this convenience makes working with HoloViews easiest in the notebook environment.

## Compositional ``Layouts``

The class ``hv.Scatter`` is a subclass of ``Element``. Elements are the simplest viewable components in HoloViews, as shown in our [element gallery]. Now that we have a handle on ``scatter``, we can demonstrate the compositionality of these objects:

In [None]:
layout = scatter + hv.Histogram(np.histogram(station_info['opened'], bins=24), kdims=['opened'])
layout

In a single line using the ``+`` operator, we created a new, compositional object called a ``Layout`` built from our scatter visualization and a ``Histogram`` that shows how many subway stations opened in Manhattan since 1900. Note that once again, all the plotting is happening behind the scenes and ``layout`` is a new object that exists independently of any given plotting system:

In [None]:
print(layout)

## Array data: taxi dropoffs

So far we have visualized data in a [pandas DataFrame] but ``HoloViews`` is as agnostic to data formats as it is to plotting libraries; see [User Guide] for more information. This means we can work with array data as easily as we can work with tabular data. To demonstrate this, here are some [numpy arrays] relating to taxi dropoff locations in Manhattan:

In [None]:
taxi_dropoffs = {hour:arr for hour, arr in np.load('../assets/hourly_taxi_data.npz').items()}
#print('Hours: {hours}'.format(hours=', '.join(taxi_dropoffs.keys())))
print('Taxi data contains {num} arrays (one per hour).\nDescription of the first array:\n'.format(num=len(taxi_dropoffs)))
np.info(taxi_dropoffs['0'])

As we can see, this dataset contains 24 arrays (one for each hour of the day) of taxi dropoff locations (by latitude and longitude), aggregated over one month in 2015, where the array shown above contains the accumulated dropoffs for the first hour of the day.
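For readers unfamiliar with the ``.npz`` format, the following sketch writes two hypothetical arrays to an in-memory archive and loads them back. It illustrates why the keys are strings such as ``'0'`` rather than integers; the array contents are placeholders, not taxi data:

```python
import io
import numpy as np

# Write two hypothetical hourly arrays to an in-memory .npz archive
buffer = io.BytesIO()
np.savez(buffer, **{'0': np.zeros((2, 3)), '1': np.ones((2, 3))})
buffer.seek(0)

# Loading yields a mapping from string keys to arrays
loaded = {hour: arr for hour, arr in np.load(buffer).items()}

# Convert the string keys to integers when numeric hours are needed
hours = sorted(int(hour) for hour in loaded)
print(hours, loaded['0'].shape)
```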

## Compositional ``Overlays``

Once again, we can easily visualize this data with HoloViews by passing our array to ``hv.Image`` to create the ``image`` object which has the spatial extent of the data declared as the ``bounds`` in terms of the corresponding range of latitudes and longitudes.

In [None]:
bounds = (-74.05, 40.70, -73.90, 40.80)
image = hv.Image(taxi_dropoffs['0'], bounds=bounds, kdims=['lon','lat'])

HoloViews supports ``numpy``, ``xarray``, ``iris`` and ``dask`` when working with array data (see [User Guide]) and we can compose elements containing array data with those containing tabular data. To illustrate, let's pass our tabular station data to a ``Points`` element, which is used to mark positions in two-dimensional space:

In [None]:
points = hv.Points(station_info, kdims=['lon','lat'])
image + image * points


On the left, we have the visual representation of the ``image`` object we declared. Using ``+`` we put it into a ``Layout`` together with a new compositional object created with the ``*`` operator called an ``Overlay``. This particular overlay displays the station positions on top of our image which works correctly as both elements contain data that exist in the same space, namely Manhattan.

This overlay on the right lets us see the location of all the subway stations in relation to our midnight taxi dropoffs. Note that HoloViews allows you to visually express more of the available information with our points, for instance, you could represent the ridership of each subway by point color or point size. For more information see [User Guide].

## Effortlessly exploring data

You can keep composing data structures together until there are more dimensions than can fit simultaneously on your screen. For instance, you can visualize a dictionary of ``Images`` (one for every hour of the day) by declaring a ``HoloMap``:

In [None]:
dictionary = {int(hour):hv.Image(arr, bounds=bounds, kdims=['lon','lat']) for hour, arr in taxi_dropoffs.items()}
hv.HoloMap(dictionary, kdims=['Hour'])

This is yet another object which is rendered by the HoloViews plotting system with Bokeh behind the scenes:

In [None]:
holomap = hv.HoloMap(dictionary, kdims=['Hour'])
print(holomap)

As this ``HoloMap`` is a container for our ``Image`` elements, we can use the methods it offers to return new containers. For instance, in the next cell we select three different hours of the morning from the ``HoloMap`` and display them as a ``Layout``:

In [None]:
holomap.select(Hour={3,6,9}).layout()

Here the ``select`` method picks values from the specified 'Hour' *dimension*. We have seen dimensions used in the ``kdims`` and ``vdims`` arguments when declaring our elements (``scatter``, ``histogram`` and ``image`` above) and when declaring the ``HoloMap`` of ``Image``s above. These are the *key dimensions* (``kdims``) and *value dimensions* (``vdims``) used to express the space in which our data lives.

Note how the ``Image`` elements from which the holomap is constructed are declared using ``kdims=['lon','lat']``, which describes the fact that Manhattan is being viewed in terms of longitude and latitude. This semantic information is automatically mapped to our visualization by the HoloViews plotting system, which sets the x-axis and y-axis labels accordingly. In the case of the ``HoloMap`` we used ``kdims=['Hour']`` to declare that the interactive slider ranges over the hours of the day.

## Data as visualization

Holomaps can be composed with elements and other holomaps into overlays and layouts just as easily as you compose two elements together. Here is one such composition where we select a range of longitudes and latitudes from our ``Points`` before we overlay them:

In [None]:
%%opts Image [xrotation=90] Points (color='deepskyblue' marker='v' size=6)
hotspot = points.select(lon=(-73.99, -73.96), lat=(40.75,40.765))
composition = holomap * hotspot
composition

The line starting with ``%%opts`` used to specify the visual style is part of the HoloViews options system described in the next 'Getting started' section which also describes how to achieve the same effect with standard Python syntax.

In the cell above we created and styled a composite object within a few short lines of code. Furthermore, this composite object relates tabular and array data and is immediately presented in a way that can be explored interactively. This way of working enables highly productive exploration, allowing new insights to be gained easily. For instance, after exploring with the slider we notice a hotspot of taxi dropoffs at 7am which we can select as follows:

In [None]:
composition.select(Hour=7)

We can now see that the slice of subway locations was chosen in relation to the hotspot in taxi dropoffs around 7am. This area of Manhattan just south of Central Park contains many popular tourist attractions, including Times Square, and we can infer that tourists often take short taxi rides from the subway stations into this area.

At this point it may appear that HoloViews is about easily generating explorative, interactive visualizations *from* your data. In fact, as we have been building these visualizations we have been working *with* our data, as we can show by examining the ``.data`` attribute of our sliced subway locations:

In [None]:
hotspot.data

We see that slicing the HoloViews ``Points`` object in the visualization sliced the underlying data, with the structure of the table left intact. We can see that the Times Square 42nd Street station is indeed one of the subway stations surrounding our taxi dropoff hotspot.

## Onwards

Advanced topics:

* For geographical areas much larger than Manhattan, the curvature of the Earth becomes important. This is handled by a HoloViews extension called GeoViews.
* The taxi array data was derived from a very large tabular dataset and rasterized using datashader which is also supported by HoloViews.