# Guest Session Feature Demo

This notebook is a demo of a feature that WhyLabs is working on called Guest Sessions. The feature builds on top of our open source library, whylogs, to make it easier to visualize datasets and identify data issues.

Typically when using whylogs, you log data using one of the `whylogs.log()` APIs to generate JSON and/or binary profiles locally that summarize various statistics about the dataset. That feature isn't going anywhere, don't worry! Guest Sessions take those generated statistics and upload them to WhyLabs.ai to populate our dashboard, without the need to sign in or share any personally identifiable information.

## How it works

The rest of the notebook will go through how the feature works and how to use it, starting with imports.

In [1]:
import pandas as pd
from datetime import datetime
from whylogs.app.session import start_whylabs_session
from tqdm.auto import tqdm

### Load the data

Next, we'll load a sample dataset for logging with whylogs. The dataset is based on a set [from Kaggle](https://www.kaggle.com/yugagrawal95/sample-media-spends-data) covering various advertising campaigns. We'll use only the subset of data from April.

In [2]:
csv_file = "data/april.csv"
csv_dataframe = pd.read_csv(csv_file)
csv_dataframe

| Unnamed: 0 | Division | Calendar_Week | Paid_Views | Organic_Views | Google_Impressions | Email_Impressions | Facebook_Impressions | Affiliate_Impressions | Overall_Views | Sales |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | A | 4/7/2021 | 604 | 22 | 130736 | 418862.0356 | 91874 | 6575 | 737 | 53473 |
| 1 | A | 4/14/2021 | 342 | 432 | 208099 | 508059.5563 | 61025 | 7733 | 550 | 59351 |
| 2 | A | 4/21/2021 | 833 | 10 | 209285 | 412036.2647 | 38548 | 8358 | 792 | 72647 |
| 3 | A | 4/28/2021 | 868 | 750 | 232531 | 473344.0824 | 27777 | 8208 | 905 | 62975 |
| 4 | A | 4/6/2021 | 3179 | 8439 | 432125 | 364178.8038 | 57047 | 8476 | 11929 | 79718 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 211 | Z | 4/28/2021 | 684 | 950 | 266994 | 692721.3919 | 26112 | 17965 | 139 | 77031 |
| 212 | Z | 4/6/2021 | 2003 | 8666 | 506927 | 533365.3933 | 65380 | 14159 | 10794 | 73578 |
| 213 | Z | 4/13/2021 | 2501 | 8047 | 620729 | 517477.9072 | 104872 | 13415 | 10091 | 70892 |
| 214 | Z | 4/20/2021 | 4893 | 9131 | 604834 | 562746.4443 | 168324 | 11114 | 13666 | 84341 |


### Group the data

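WhyLabs currently aggregates data daily, so we'll group the data by day using pandas' `DataFrame.groupby()`. Despite its name, the `Calendar_Week` column holds one date string per row, so each group corresponds to a single day.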

In [3]:
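# Group by the column name as a plain string so that each group key is a date string.
# With a one-element list, pandas 2.0+ yields 1-tuples as keys, which strptime cannot parse.
grouped_data = csv_dataframe.groupby("Calendar_Week")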
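
To make the grouping step concrete without the CSV file, here is a minimal, dependency-free sketch of what grouping by the `Calendar_Week` date string does. The inline rows are a small excerpt of the sample output above (three columns only), and `group_rows_by_day` is a hypothetical helper written for this illustration, not part of pandas or whylogs.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the sample output above (only three columns kept).
SAMPLE_CSV = """Division,Calendar_Week,Sales
A,4/7/2021,53473
A,4/14/2021,59351
A,4/6/2021,79718
Z,4/6/2021,73578
"""


def group_rows_by_day(rows, date_column="Calendar_Week"):
    """Hypothetical helper: bucket rows by the raw date string in date_column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[date_column]].append(row)
    return dict(groups)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
groups = group_rows_by_day(rows)

# One line per day: the date string and how many rows fall on that day.
for day_string, rows_for_day in groups.items():
    print(day_string, len(rows_for_day))
```

Each key plays the role of `day_string` in the logging loop, and each list of rows plays the role of `dataframe_for_day`.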

### Create a Guest Session

Next, we get to the main feature. We'll start a Guest Session with the `start_whylabs_session` API. The important thing to notice about this API is that it takes a consent flag for data uploading.

```python
start_whylabs_session(data_collection_consent=True)
```

To be clear, this does upload statistical data to WhyLabs.ai. We never upload raw data, only the profiles that we generate via logging. The flag has to be `True`, or the feature will bail out.
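
If you would rather not hard-code the consent flag in a shared notebook, one option is to read it from the environment. The sketch below is only an illustration of that pattern: the `consent_from_env` helper and the `WHYLABS_UPLOAD_CONSENT` variable name are hypothetical and are not something whylogs or WhyLabs defines.

```python
import os


def consent_from_env(var_name="WHYLABS_UPLOAD_CONSENT"):
    """Hypothetical helper: treat only an explicit opt-in value as consent."""
    value = os.environ.get(var_name, "")
    return value.strip().lower() in {"1", "true", "yes"}


consent = consent_from_env()
if consent:
    print("Consent given: profiles may be uploaded.")
else:
    print("No consent: skip starting the Guest Session.")

# With whylogs installed, the flag would then be passed along as in this notebook:
# session = start_whylabs_session(data_collection_consent=consent)
```

Anything other than an explicit opt-in value counts as no consent, which matches the opt-in spirit of the flag.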

In [13]:
session = start_whylabs_session(data_collection_consent=True, report_progress=True)

WARN: Missing config


### Log the data

Next we'll actually log the data and generate the statistical profiles. This will generate a profile for each day in the data that we're logging.

In [14]:
with tqdm(grouped_data) as t:
    # Iterate over the groups; each key is the date string from the Calendar_Week column
    for day_string, dataframe_for_day in t:
        # This dataset has dates of the form 4/7/2021 (month/day/year, no zero padding)
        dt = datetime.strptime(day_string, "%m/%d/%Y")
        t.set_description(f"Logging data for {day_string}")

        # whylogs loggers are specific to the dataset's timestamp, so we use a different one for each
        # date in our dataset.
        logger = session.logger(dataset_timestamp=dt)

        # Log the data to the logger. The logger writes this data out in binary form when it closes, which
        # happens at the end of the with block in the session's internal logic.
        logger.log_dataframe(dataframe_for_day)


  0%|          | 0/8 [00:00<?, ?it/s]
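
The `%m/%d/%Y` format accepts the unpadded months and days used in this dataset. The small sketch below uses a few dates from the sample output (the `day_strings` list is illustrative) to show the parse, and also why string group keys, which `groupby` sorts by default, do not come out in chronological order. That ordering is harmless here because each logger is created with its own `dataset_timestamp`.

```python
from datetime import datetime

# Date strings in the same format as the Calendar_Week column.
day_strings = ["4/13/2021", "4/6/2021", "4/20/2021", "4/7/2021"]

# Sorting the raw strings compares characters, so "4/13/2021" lands before "4/6/2021".
print(sorted(day_strings))

# Parsing first gives real chronological order.
parsed = sorted(datetime.strptime(s, "%m/%d/%Y") for s in day_strings)
print([dt.date().isoformat() for dt in parsed])
```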

### Close the session and upload the generated profiles

Finally, we close the session, which uploads the profiles to WhyLabs.ai. The progress bar reflects the upload process. When it's done, it prints a URL that you can use to view the data in the WhyLabs.ai dashboard.

In [15]:
with session:
    print("done")

done


  0%|          | 0/8 [00:00<?, ?it/s]

You can explore your data in Observatory here: https://hub.whylabsapp.com/models/model-1/profiles/?sessionToken=session-79313
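
The printed URL appears to identify the Guest Session through its `sessionToken` query parameter. If you want to keep that token alongside your notebook outputs, a minimal sketch using only the standard library could look like this (the variable names are illustrative, and the URL is the one printed above):

```python
from urllib.parse import parse_qs, urlparse

# URL copied from the output above.
url = "https://hub.whylabsapp.com/models/model-1/profiles/?sessionToken=session-79313"

parsed = urlparse(url)
query = parse_qs(parsed.query)

# parse_qs returns a list of values per key, so take the first one.
session_token = query["sessionToken"][0]
print(parsed.netloc, parsed.path, session_token)
```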


### View your data in WhyLabs

Clicking the URL above will bring you to the WhyLabs dashboard. You'll see a banner at the top that indicates that we may still be processing data in our backend. You'll land on our profile viewer page and you'll be able to see the uploaded profiles from here, possibly before all of the data finishes processing.