# Automatic cube creation with Atoti

[Atoti](https://www.atoti.io/) is a free Python BI analytics platform that helps quants, data analysts, data scientists and business users collaborate better, analyze faster and translate their data into business KPIs.  
This notebook uses [ipywidgets](https://ipywidgets.readthedocs.io/en/stable/) to interactively upload a CSV file of the user's choice and spin up a BI application in which users can build their own analytics dashboards.  

<img src="https://data.atoti.io/notebooks/auto-cube/spin-up-cube.gif" width="70%" />

__NOTE:__
- This is a simplified use case with a single Atoti table (created from the uploaded CSV).
- The CSV should be UTF-8 encoded.
- For the best experience, choose a dataset with a fair number of numeric and non-numeric columns, e.g. the [Data Science Job Salaries dataset](https://www.kaggle.com/datasets/ruchi798/data-science-job-salaries) from Kaggle:  
    - non-numeric columns are translated into hierarchies
    - a SUM and a MEAN measure are automatically created for each numeric, non-key column
- When selecting keys for the Atoti table, choose the columns that ensure data uniqueness.
    - When unsure, skip key selection.
    - Non-unique keys result in fewer rows being loaded: only the last occurrence of each duplicate is kept.
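
The rules in the notes above can be previewed with plain pandas before uploading a file. The following is a minimal sketch, not Atoti's actual implementation: the helper name `preview_cube_inputs` and the sample data are hypothetical, and it assumes that the numeric/non-numeric split roughly matches pandas' numeric dtypes.

```python
import pandas as pd


def preview_cube_inputs(df, keys=None):
    """Approximate, with pandas only, how the notes above treat a DataFrame.

    Hypothetical helper for illustration; it does not call Atoti.
    """
    keys = list(keys or [])
    numeric = df.select_dtypes(include="number").columns.tolist()
    # Non-numeric columns are the ones expected to become hierarchies.
    hierarchy_candidates = [c for c in df.columns if c not in numeric]
    # Numeric, non-key columns are the ones expected to get SUM and MEAN measures.
    measure_candidates = [c for c in numeric if c not in keys]
    # With keys, only the last row of each duplicated key combination is kept.
    if keys:
        rows_kept = len(df.drop_duplicates(subset=keys, keep="last"))
    else:
        rows_kept = len(df)
    return {
        "hierarchy_candidates": hierarchy_candidates,
        "measure_candidates": measure_candidates,
        "rows_kept": rows_kept,
    }


# Made-up sample data: employee_id 1 appears twice.
sample = pd.DataFrame(
    {
        "employee_id": [1, 1, 2],
        "job_title": ["Analyst", "Data Scientist", "ML Engineer"],
        "salary": [50000.0, 70000.0, 90000.0],
    }
)
preview = preview_cube_inputs(sample, keys=["employee_id"])
print(preview)
```

Comparing `rows_kept` with `len(df)` is a quick way to tell whether a candidate key is unique: if the two differ, some rows would be dropped on load.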
    

To understand more about multidimensional datacubes, check out the [Atoti tutorial](https://docs.atoti.io/latest/getting_started/tutorial/tutorial.html).  

<div style="text-align: center;" ><a href="https://www.atoti.io/?utm_source=gallery&utm_content=auto-cube" target="_blank" rel="noopener noreferrer"><img src="https://data.atoti.io/notebooks/banners/Discover+Atoti+now.jpg" alt="Try Atoti"></a></div>

In [1]:
import functools
import io
import webbrowser

import atoti as tt
import pandas as pd
from IPython.display import Markdown, display
import ipywidgets as widgets

Since Atoti is a Python library, we can use it alongside other libraries such as ipywidgets and pandas.  
We use `FloatProgress` from ipywidgets to track the loading progress of the web application.

In [2]:
out = widgets.Output()
fp = widgets.FloatProgress(min=0, max=5)

## Atoti processes

The following function defines the key steps to create an Atoti web application:
- Instantiate an Atoti session (the web application is created upon instantiation)
- Create an Atoti table by loading the pandas DataFrame (Atoti also accepts other data sources such as CSV, Parquet, SQL and Spark DataFrames)
- Create a cube from the Atoti table

It is possible to create and join multiple Atoti tables. However, in our use case, we only create one Atoti table.  
We use the [webbrowser](https://docs.python.org/3/library/webbrowser.html) module to launch the web application in a new browser tab.

In [3]:
def create_cube(df, keys=None, port=9090):
    """Start a session, load `df` into a single table, build a cube and open the web app."""
    print(f"-- Creating session on port {port}")
    fp.value = 2
    session = tt.Session(port=port)

    print("--- Loading data into table")
    fp.value = 3
    tbl = session.read_pandas(df, table_name="table", keys=keys)

    print("---- Creating cube")
    fp.value = 4
    cube = session.create_cube(tbl)

    fp.value = 5
    print(f"----- Launching web application: {session._local_url}")
    webbrowser.open(session._local_url)

    print("======================================================")
    print(f"Number of records loaded: {len(tbl)}")
    print("Table schema: ")
    display(cube.schema)

    print()
    display(Markdown("### Access web application"))
    display(
        Markdown(
            "__Click on this URL if web application is not automatically launched:__"
        ),
        session.link(),
    )
    print()
    print("======================================================")

We trigger the creation of the cube upon selection of a CSV.  
Note that the session is recreated whenever a new CSV is selected, so the previous dataset is no longer accessible.
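
The upload handler in this notebook passes the uploaded bytes straight to pandas. As a minimal sketch of how the UTF-8 requirement could be checked explicitly, the snippet below decodes the content first so that a file in another encoding fails with a clear message. The helper name `read_uploaded_csv` and the sample data are hypothetical and not part of the notebook; the content is assumed to arrive as `bytes` or a `memoryview`.

```python
import io

import pandas as pd


def read_uploaded_csv(content):
    """Decode uploaded content as UTF-8 and parse it into a DataFrame.

    Hypothetical helper for illustration only.
    """
    raw = bytes(content)  # accepts bytes as well as a memoryview
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as error:
        raise ValueError("The uploaded CSV is not valid UTF-8") from error
    return pd.read_csv(io.StringIO(text))


# Made-up CSV content containing a non-ASCII character.
sample_bytes = "city,amount\nZürich,10\nParis,20\n".encode("utf-8")
uploaded_df = read_uploaded_csv(memoryview(sample_bytes))
print(uploaded_df)
```
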

In [4]:
@out.capture()
def on_key_change(b, _df, _keys=None):
    b.disabled = True
    keys = list(_keys.value)
    keys = [] if "None" in keys else keys

    create_cube(_df, keys)
    displayFileLoader()

In [5]:
@out.capture()
def on_upload_change(change):
    out.clear_output()
    display(fp)

    # The new value holds the uploaded files; take the first (and only) one.
    input_file = change["new"][0]
    print("Starting cube creation for", input_file["name"])

    fp.value = 0
    print("- Reading file")
    content = input_file["content"]
    df = pd.read_csv(io.BytesIO(content))

    fp.value = 1
    columns = ["None"] + df.columns.tolist()
    keys_selection = widgets.SelectMultiple(
        options=columns,
        value=["None"],
        description="Choose key(s)",
        disabled=False,
    )

    button = widgets.Button(
        description="Submit",
        disabled=False,
        button_style="",
        tooltip="Submit selected keys",
        icon="check",  # (FontAwesome names without the `fa-` prefix)
    )

    display(keys_selection, button)
    button.on_click(functools.partial(on_key_change, _df=df, _keys=keys_selection))
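
The click callback above receives only the button, so `functools.partial` is used to pre-bind the DataFrame and the key selector. The sketch below illustrates that binding pattern without any widgets; `on_submit` and the placeholder strings are hypothetical stand-ins for the notebook's handler and objects.

```python
import functools


def on_submit(button, _df, _keys):
    """Stand-in click handler: `button` is passed by the caller, the rest are pre-bound."""
    selected = list(_keys)
    # Treat the "None" placeholder option as "no keys selected".
    selected = [] if "None" in selected else selected
    return (button, _df, selected)


# Bind the extra arguments up front; the result takes a single positional argument.
handler = functools.partial(on_submit, _df="dataframe placeholder", _keys=("None",))
result = handler("button placeholder")
print(result)
```

Because the extra arguments are bound by keyword, the single positional argument supplied at call time always lands in the first parameter.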

In [6]:
def displayFileLoader():
    uploader = widgets.FileUpload(
        accept=".csv",
        multiple=False,
    )

    uploader.observe(on_upload_change, "value")
    with out:
        display(uploader)

Feel free to select another CSV file to try out different datasets.

In [7]:
displayFileLoader()
out

Output()

<div style="text-align: center;" ><a href="https://www.atoti.io/?utm_source=gallery&utm_content=auto-cube" target="_blank" rel="noopener noreferrer"><img src="https://data.atoti.io/notebooks/banners/Try+Atoti.jpg" alt="Try Atoti"></a></div>