# Visualizing labels produced by Dandelion, using the fbTREX semantic API

First we load the usual libraries and the CSV files produced by labels.py for two different users.

This example uses the sample dataset. You can change the paths in file1 and file2, as well as the number of top labels to keep for each user.

Then we build a dataframe containing the per-day counts of the n top labels for each of the two users.

In [None]:
# import libraries
import pandas as pd
import altair as alt
alt.renderers.enable('notebook')

# configure files location and number of top labels to get
file1 = 'sample_data/user_a_labels.csv'
file2 = 'sample_data/user_b_labels.csv'
top_n = 5

# load the data
df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)
# unfiltered data of both users, used later to compare specific words
df = pd.concat([df1, df2])

# keep only the top_n labels of each user, ranked by total count
keep_list = df1.groupby('word')['count'].sum().nlargest(top_n).index.tolist()
df1 = df1[df1['word'].isin(keep_list)]
keep_list = df2.groupby('word')['count'].sum().nlargest(top_n).index.tolist()
df2 = df2[df2['word'].isin(keep_list)]

# top labels of both users in a single dataframe
top = pd.concat([df1, df2])
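
To see what the filtering step does, here is a minimal sketch on a made-up frame. The column names (`user`, `word`, `count`, `impressionTime`) are the ones this notebook relies on; the values and the helper name `keep_top_labels` are hypothetical and only serve the illustration.

```python
import pandas as pd

# Toy data (made up) with the columns used in this notebook.
toy = pd.DataFrame({
    'user': ['a'] * 6,
    'word': ['Barcelona', 'Barcelona', 'Madrid', 'Madrid', 'Valencia', 'Sevilla'],
    'count': [3, 4, 2, 1, 10, 1],
    'impressionTime': ['2018-10-01', '2018-10-02', '2018-10-01',
                       '2018-10-02', '2018-10-01', '2018-10-02'],
})

def keep_top_labels(frame, n):
    """Keep only the rows whose label is among the n labels with the highest total count."""
    totals = frame.groupby('word')['count'].sum()
    keep = totals.nlargest(n).index.tolist()
    return frame[frame['word'].isin(keep)]

toy_top = keep_top_labels(toy, 2)
print(sorted(toy_top['word'].unique()))  # ['Barcelona', 'Valencia']
```

Note that the ranking uses the total count over the whole period, so a label that is frequent on a single day can still make the cut.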

### Overview of the top labels per user

In [None]:
alt.Chart(top).mark_line().encode(
    x='impressionTime:T',
    y='count:Q',
    color='word:N',
    row='user:N'
).properties(
    width = 600,
    height = 450
)

### Compare how specific words have appeared on the two users' timelines

Choose a list of words (in this example, 'Barcelona' and 'Partido Popular').
We then plot how often each word appeared over time on the two users' timelines.

In [None]:
words_list = ['Barcelona', 'Partido Popular']

filtered = df[df['word'].isin(words_list)]
alt.Chart(filtered).mark_line().encode(
    x='impressionTime:T',
    y='count:Q',
    color='user:N',
    row='word:N'
).properties(
    width = 600,
    height = 300
)
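
If you prefer numbers to a chart, the same comparison can be sketched as a table with one column per user. This is a minimal example on made-up data, assuming the same column names as above; the frame `toy` and the name `side_by_side` are hypothetical.

```python
import pandas as pd

# Toy data (made up) for two users.
toy = pd.DataFrame({
    'user': ['a', 'b', 'a', 'b', 'a'],
    'word': ['Barcelona', 'Barcelona', 'Barcelona', 'Madrid', 'Madrid'],
    'count': [3, 1, 2, 5, 4],
    'impressionTime': ['2018-10-01', '2018-10-01', '2018-10-02',
                       '2018-10-01', '2018-10-02'],
})

# Same filter as in the cell above, restricted to a single word.
one_word = toy[toy['word'].isin(['Barcelona'])]

# One row per day, one column per user; days without impressions become 0.
side_by_side = one_word.pivot_table(
    index='impressionTime',
    columns='user',
    values='count',
    aggfunc='sum',
    fill_value=0,
)
print(side_by_side)
```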

### At a higher level, we can also compare the top contents of the two users' newsfeeds

In [None]:

df1 = df1.sort_values('count', axis=0, ascending=False)
df2 = df2.sort_values('count', axis=0, ascending=False)

# one bar per label, ordered by its total count
user1 = alt.Chart(df1).mark_bar().encode(
    x='count:Q',
    y=alt.Y(
        'word:N',
        sort=alt.EncodingSortField(
            field="count",
            op="sum",
            order="descending"
        )
    )
).properties(title='user 1')

user2 = alt.Chart(df2).mark_bar().encode(
    x='count:Q',
    y=alt.Y(
        'word:N',
        sort=alt.EncodingSortField(
            field="count",
            op="sum",
            order="descending"
        )
    )
).properties(title='user 2')

user1 & user2
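
The dataframes passed to the bar charts hold one row per label and day, so each bar adds up several rows. A minimal sketch, on made-up data, of computing those per-label totals explicitly with pandas, which can help when checking the charts; the frame `toy` and the name `totals` are hypothetical.

```python
import pandas as pd

# Toy data (made up): several rows per label, one per day.
toy = pd.DataFrame({
    'word': ['Barcelona', 'Madrid', 'Barcelona', 'Valencia', 'Madrid'],
    'count': [3, 2, 4, 1, 6],
    'impressionTime': ['2018-10-01', '2018-10-01', '2018-10-02',
                       '2018-10-02', '2018-10-02'],
})

# Sum the counts per label and order the labels from most to least frequent.
totals = (
    toy.groupby('word', as_index=False)['count']
       .sum()
       .sort_values('count', ascending=False)
       .reset_index(drop=True)
)
print(totals)
```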