# Guest Session Feature Demo

This notebook is a demo of a feature that WhyLabs is working on called Guest Sessions. The feature builds on top of our open source library, whylogs, to make it easier to visualize datasets and identify data issues.

Typically when using whylogs, you log data with one of the `whylogs.log()` APIs to generate JSON and/or binary profiles locally that summarize various statistics about the dataset. That feature isn't going anywhere, don't worry! Guest Sessions take those generated statistics and upload them to WhyLabs.ai to populate our dashboard, without the need to sign in or share any personally identifiable information.

## How it works

The rest of the notebook will go through how the feature works and how to use it, starting with imports.

In [24]:
import pandas as pd
from datetime import datetime, timedelta
from whylogs.app.session import start_whylabs_session
from tqdm.auto import tqdm

### Load the data

Next, we'll load a sample dataset to log with whylogs. The dataset is based on a set [from Kaggle](https://www.kaggle.com/yugagrawal95/sample-media-spends-data) covering various advertising campaigns.

In [25]:
csv_file = "data/sample_media_spend.csv"
csv_dataframe = pd.read_csv(csv_file)
csv_dataframe

| Unnamed: 0 | Division | Days_In_Past | Paid_Views | Organic_Views | Google_Impressions | Email_Impressions | Facebook_Impressions | Affiliate_Impressions | Overall_Views | Sales |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | N | 4 | 30186 | 13332 | 319 | 1.057709e+06 | 267930 | 15162 | 43178 | 213285 |
| 1 | B | 4 | 178914 | 68433 | 2780542 | 3.356824e+06 | 1183683 | 74325 | 246655 | 905326 |
| 2 | T | 4 | 5409 | 4250 | 83965 | 1.720437e+05 | 62845 | 2797 | 8422 | 45115 |
| 3 | C | 4 | 1792 | 1350 | 390 | 9.788965e+04 | 26071 | 1838 | 2772 | 87712 |
| 4 | G | 4 | 49413 | 18101 | 635478 | 1.097394e+06 | 320715 | 20644 | 68333 | 189695 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 373 | S | 1 | 38844 | 34276 | 1447904 | 2.533716e+06 | 95613 | 21997 | 72410 | 184597 |
| 374 | H | 1 | 18626 | 15887 | 397768 | 1.018171e+06 | 41344 | 8045 | 33672 | 68251 |
| 375 | Z | 1 | 52480 | 52321 | 1185817 | 2.675419e+06 | 71610 | 20145 | 104088 | 144794 |
| 376 | O | 1 | 84267 | 56717 | 2616351 | 4.106323e+06 | 158521 | 26292 | 140350 | 243933 |
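
The `Unnamed: 0` column in the output above is what pandas usually produces when a CSV was saved together with its row index. Whether that is what happened to this file is an assumption. If it is, the sketch below shows how `index_col=0` would keep that column out of the data. The CSV text here is a hypothetical excerpt with only a few of the columns, not the real file.

```python
import io

import pandas as pd

# Hypothetical excerpt: a CSV whose first header cell is empty,
# which is what a file written with its row index looks like.
csv_text = ",Division,Days_In_Past,Sales\n0,N,4,213285\n1,B,4,905326\n"

# Default read: the unnamed leading column becomes a regular data column.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that leading column as the row index instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.columns))
print(list(indexed_df.columns))
```

Dropping the column matters only if you don't want whylogs to profile the row counter as a feature; the rest of this notebook works either way.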


### Group the data

As a convenience for this demo, each row is labeled with how many days in the past (relative to today) it was recorded. We'll group the rows by that day offset.

In [26]:
grouped = csv_dataframe.groupby("Days_In_Past")
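
To make the grouping concrete, here is a minimal sketch of what iterating over a pandas `GroupBy` object yields. The `toy` frame and its values are made up for illustration; only the `Days_In_Past` column name comes from the dataset.

```python
import pandas as pd

# Hypothetical miniature frame standing in for csv_dataframe.
toy = pd.DataFrame(
    {
        "Days_In_Past": [4, 4, 1, 1, 2],
        "Sales": [10, 20, 30, 40, 50],
    }
)

toy_grouped = toy.groupby("Days_In_Past")

# Iterating a GroupBy yields (key, sub-DataFrame) pairs,
# ordered by key because groupby sorts keys by default.
group_sizes = {int(key): len(frame) for key, frame in toy_grouped}
print(group_sizes)
```

Each sub-DataFrame holds only the rows for one day offset, which is the shape the logging loop later in the notebook relies on.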

### Create a Guest Session

Next, we get to the main feature. We'll start a Guest Session with the `start_whylabs_session` API. The important thing to notice about this API is that it takes a flag that signals your consent to uploading data.

```python
start_whylabs_session(data_collection_consent=True)
```

To be clear, this does upload statistical data to WhyLabs.ai. We never upload raw data, only the profiles that we generate via logging. The flag has to be `True` or the feature will bail out.

In [27]:
session = start_whylabs_session(data_collection_consent=True, report_progress=True)

WARN: Missing config


### Log the data

Next we'll actually log the data and generate the statistical profiles. This will generate a profile for each day in the data.

In [28]:
for days_in_past, df in tqdm(grouped):
    timestamp = datetime.now() - timedelta(days=days_in_past)
    logger = session.logger(dataset_timestamp=timestamp)
    logger.log_dataframe(df)

  0%|          | 0/4 [00:00<?, ?it/s]
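
The timestamp arithmetic in the loop above can be checked on its own. This sketch uses only the standard library and assumes a fixed reference time and hypothetical offsets of 1 through 4 days, so the results are reproducible; the notebook itself calls `datetime.now()` on each iteration.

```python
from datetime import datetime, timedelta

# Hypothetical fixed "now" so the example is deterministic.
now = datetime(2021, 6, 15, 12, 0, 0)

# Hypothetical Days_In_Past values; the real ones come from the dataset.
offsets = [1, 2, 3, 4]

# Each day offset maps to a dataset timestamp that many days before "now".
timestamps = {days: now - timedelta(days=days) for days in offsets}

for days, ts in timestamps.items():
    print(days, ts.isoformat())
```

Capturing `now` once before the loop, as done here, keeps every profile offset from the same instant. That is a design choice for the sketch rather than something the notebook does.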

### Close the session and upload the generated profiles

Finally, we close the session, which uploads the profiles to WhyLabs.ai. The progress bar reflects the upload progress. When the upload is done, the session prints a URL that you can use to view the data in the WhyLabs.ai dashboard.

In [29]:
with session:
    print("done")

done


  0%|          | 0/4 [00:00<?, ?it/s]

You can explore your data in Observatory here: https://hub.whylabsapp.com/models/model-1/profiles/?sessionToken=session-41833


### View your data in WhyLabs

Clicking the URL above will bring you to the WhyLabs dashboard. You'll see a banner at the top that indicates that we may still be processing data in our backend. You'll land on our profile viewer page and you'll be able to see the uploaded profiles from here, possibly before all of the data finishes processing.