# Data

## Data types

Vizzu currently supports two types of data series: dimensions and measures. Dimensions slice the data cube Vizzu uses, whereas measures are values within the cube.

Dimensions are categorical series that can contain strings and numbers, but both will be treated as strings. Temporal data such as dates or timestamps should also be added as dimensions. Vizzu will draw the elements on the chart in the order they are provided in the data set by default. Thus we suggest adding temporal data in a sorted format from oldest to newest.
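As a minimal sketch of the sorting suggestion above, the following plain-Python snippet orders some temporal records from oldest to newest before they would be added as series. The sample rows and the variable names (`raw`, `ordered`, `dates`, `values`) are made up for illustration and are not part of ipyvizzu.

```python
from datetime import date

# Hypothetical sample: (ISO date string, value) pairs in arbitrary order.
raw = [
    ("2021-03-01", 78),
    ("2021-01-01", 114),
    ("2021-02-01", 96),
]

# Sort oldest to newest; date.fromisoformat parses "YYYY-MM-DD" strings.
ordered = sorted(raw, key=lambda row: date.fromisoformat(row[0]))

# Split into the value lists a dimension and a measure series would take.
dates = [d for d, _ in ordered]
values = [v for _, v in ordered]
```

The dates stay strings, which matches how dimension values are treated; only the order changes.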

Measures can only be numerical in the current beta phase.

### Adding data

There are multiple ways you can add data to Vizzu:

* Specified by series: column after column, if you think of a spreadsheet.
* Specified by records: row after row.
* In data cube form.

Elements with a missing value should contain the number zero. `null`, `undefined` and empty cells will result in an error. For dimensions, add an empty string (`''`) as the value to create a category without a name.
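One way to apply this rule before handing values to Vizzu is sketched below. It assumes that missing entries show up as Python `None`; `fill_missing` is a hypothetical helper written for this example, not an ipyvizzu function.

```python
def fill_missing(values, series_type):
    """Replace None with 0 for measures or "" for dimensions (hypothetical helper)."""
    if series_type not in ("measure", "dimension"):
        raise ValueError(f"unknown series type: {series_type!r}")
    default = 0 if series_type == "measure" else ""
    return [default if v is None else v for v in values]


# Sample values containing gaps.
popularity = fill_missing([114, None, 78], "measure")
genres = fill_missing(["Pop", None, "Jazz"], "dimension")
```

The cleaned lists can then be passed to `add_series` in the usual way.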

In the first two cases, data has to be in first normal form. Here is an example of that:

| Genres | Types | Popularity |
|---|---|---|
| Pop | Hard | 114 |
| Rock | Hard | 96 |
| Jazz | Hard | 78 |
| Metal | Hard | 52 |
| Pop | Smooth | 56 |
| Rock | Smooth | 36 |
| Jazz | Smooth | 174 |
| Metal | Smooth | 121 |
| Pop | Experimental | 127 |
| Rock | Experimental | 83 |
| Jazz | Experimental | 94 |
| Metal | Experimental | 58 |
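The series and record forms hold the same flat table, read either column by column or row by row. A small plain-Python sketch of that relationship, using the first four rows of the table (the variable names are illustrative only):

```python
# Series form: one list per column.
genres = ["Pop", "Rock", "Jazz", "Metal"]
types = ["Hard", "Hard", "Hard", "Hard"]
popularity = [114, 96, 78, 52]

# Record form: one list per row, obtained by zipping the columns together.
records = [list(row) for row in zip(genres, types, popularity)]

# Transposing the records gives the columns back.
columns = [list(col) for col in zip(*records)]
```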

In the `type` parameter, you can set whether a series is a dimension or a measure. Adding the `type` parameter is optional. If omitted, Vizzu selects the type automatically, depending on the first element of the values array, using the `typeof` operator. If all items are numbers, the series is declared as a measure; in any other case, as a dimension.
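The "all numbers means measure" rule can be imitated in Python as follows. This is only an illustration of the rule as described, not Vizzu's actual detection code. `infer_series_type` is a hypothetical helper, and its handling of booleans and empty lists is an assumption made for this sketch.

```python
from numbers import Number


def infer_series_type(values):
    """Return "measure" if every value is a number, otherwise "dimension"."""

    def is_number(value):
        # bool is a subclass of int in Python; it is excluded here (assumption).
        return isinstance(value, Number) and not isinstance(value, bool)

    # An empty list falls back to "dimension" (assumption).
    if values and all(is_number(v) for v in values):
        return "measure"
    return "dimension"
```

Passing `type` explicitly, as in the examples that follow, avoids relying on any automatic detection.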


Data specified by series:

```python
from ipyvizzu import Data

data_series = Data()
data_series.add_series(
    "Genres",
    ["Pop", "Rock", "Jazz", "Metal",
     "Pop", "Rock", "Jazz", "Metal",
     "Pop", "Rock", "Jazz", "Metal"],
    type="dimension"
)
data_series.add_series(
    "Types",
    ["Hard", "Hard", "Hard", "Hard",
     "Smooth", "Smooth", "Smooth", "Smooth",
     "Experimental", "Experimental", "Experimental", "Experimental"],
    type="dimension"
)
data_series.add_series(
    "Popularity",
    [114, 96, 78, 52, 56, 36, 174, 121, 127, 83, 94, 58],
    type="measure"
)
```

Data specified by records:

```python
from ipyvizzu import Data

data_records = Data()

# Declare the series first, without values.
data_records.add_series('Genres', type='dimension')
data_records.add_series('Types', type='dimension')
data_records.add_series('Popularity', type='measure')

# Add a single record.
record = ['Pop', 'Hard', 114]

data_records.add_record(record)

# Add several records at once.
records = [
    ['Rock', 'Hard', 96],
    ['Jazz', 'Hard', 78],
    ['Metal', 'Hard', 52],
    ['Pop', 'Smooth', 56],
    ['Rock', 'Smooth', 36],
    ['Jazz', 'Smooth', 174],
    ['Metal', 'Smooth', 121],
    ['Pop', 'Experimental', 127],
    ['Rock', 'Experimental', 83],
    ['Jazz', 'Experimental', 94],
    ['Metal', 'Experimental', 58],
]

data_records.add_records(records)
```

Data cube:

```python
from ipyvizzu import Data

data_cube = Data()

data_cube.add_dimension('Genres', ['Pop', 'Rock', 'Jazz', 'Metal'])
data_cube.add_dimension('Types', ['Hard', 'Smooth', 'Experimental'])

data_cube.add_measure(
    'Popularity',
    [
        [114, 96, 78, 52],
        [56, 36, 174, 121],
        [127, 83, 94, 58],
    ]
)
```
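To see how the nested measure array lines up with the two dimensions, the cube can be flattened back into records in plain Python. In the example data, each inner list corresponds to one `Types` value and each position inside it to one `Genres` value. The snippet below is a sketch of that mapping and does not use ipyvizzu.

```python
genres = ["Pop", "Rock", "Jazz", "Metal"]
types = ["Hard", "Smooth", "Experimental"]

# cube[t][g] is the Popularity for types[t] and genres[g].
cube = [
    [114, 96, 78, 52],
    [56, 36, 174, 121],
    [127, 83, 94, 58],
]

# Flatten into [genre, type, popularity] rows, in the same order as the table.
records = [
    [genre, type_, cube[t][g]]
    for t, type_ in enumerate(types)
    for g, genre in enumerate(genres)
]
```

The resulting rows match the first-normal-form table shown earlier, so the cube form is a more compact way of writing the same data.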

Data can also be loaded from a pandas DataFrame.

Note: the arguments of `Data().add_data_frame()` are:

* `data_frame` (mandatory): a pandas DataFrame object.
* `infer_types` (default: `None`): if not given, ipyvizzu tries to infer the type of each column (dimension or measure). The type of a column can be set explicitly in the format `infer_types={"column_name": "infer_type"}`.
* `default_measure_value` (default: `0`): ipyvizzu fills missing measure values with this value.
* `default_dimension_value` (default: `""`): ipyvizzu fills missing dimension values with this value.

```python
import pandas as pd

from ipyvizzu import Data

data_frame = pd.DataFrame(
    {
        "Genres": [
            "Pop", "Rock", "Jazz", "Metal",
            "Pop", "Rock", "Jazz", "Metal",
            "Pop", "Rock", "Jazz", "Metal",
        ],
        "Types": [
            "Hard", "Hard", "Hard", "Hard",
            "Smooth", "Smooth", "Smooth", "Smooth",
            "Experimental", "Experimental", "Experimental", "Experimental",
        ],
        "Popularity": [
            114, 96, 78, 52,
            56, 36, 174, 121,
            127, 83, 94, 58,
        ],
    }
)

data_pd = Data()
data_pd.add_data_frame(data_frame)
```

Data can also be loaded from a JSON file.

Content of `./music_data.json` (in this example, the data is stored in the data cube format):

```JSON
{
    "dimensions": [
        {"name": "Genres", "values": [ "Pop", "Rock", "Jazz", "Metal"]},
        {"name": "Types", "values": [ "Hard", "Smooth", "Experimental"]}
    ],
    "measures": [
        {
            "name": "Popularity",
            "values":  [
                [114, 96, 78, 52],
                [56, 36, 174, 121],
                [127, 83, 94, 58]
            ]
        }
    ]
}
```

```python
from ipyvizzu import Data

data_json = Data.from_json("./music_data.json")
```
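If the JSON file does not exist yet, it can be produced with Python's standard `json` module. The sketch below writes the structure shown above and reads it back. It writes to a temporary directory (an assumption made to keep the example free of side effects), whereas the example above expects the file at `./music_data.json`.

```python
import json
import os
import tempfile

# Same structure as the music_data.json content shown above.
music_data = {
    "dimensions": [
        {"name": "Genres", "values": ["Pop", "Rock", "Jazz", "Metal"]},
        {"name": "Types", "values": ["Hard", "Smooth", "Experimental"]},
    ],
    "measures": [
        {
            "name": "Popularity",
            "values": [
                [114, 96, 78, 52],
                [56, 36, 174, 121],
                [127, 83, 94, 58],
            ],
        }
    ],
}

# Write the file into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "music_data.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(music_data, f, indent=4)

# Read it back to confirm the round trip preserves the structure.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

The resulting `path` could then be passed to `Data.from_json` in place of `"./music_data.json"`.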